INTRODUCTION

Premenstrual dysphoric disorder (PMDD) (Steiner and Born, 2000; Eriksson et al, 2002; Freeman, 2003) afflicts 5–8% of women of fertile age (Angst et al, 2001; Wittchen et al, 2002; Halbreich et al, 2003). It is characterized by symptoms appearing during the 2 weeks before menstruation, and disappearing after the onset of menses. Irritability, affect lability, tension, and depressed mood are cardinal symptoms. The condition impacts significantly on functioning and/or relationships; women with PMDD are reported to be as disabled during the luteal phase as are women with depression (Pearlstein et al, 2000).

Many controlled trials have demonstrated serotonin reuptake inhibitors (SRIs) to be effective in PMDD (Steiner et al, 1995; Yonkers et al, 1997; Parry, 2001; Eriksson et al, 2002; Pearlstein, 2002; Wyatt et al, 2002). When these drugs are used for the treatment of depression and anxiety disorders, they require weeks of treatment to cause a marked symptom reduction. In the initial trials of SRIs for PMDD, medication was therefore given continuously throughout the menstrual cycle (Eriksson et al, 1990; Stone et al, 1991; Sundblad et al, 1992; Steiner et al, 1995; Su et al, 1997; Yonkers et al, 1997). However, as first shown for clomipramine (Sundblad et al, 1993), and later confirmed for citalopram (Wikander et al, 1998), fluoxetine (Steiner et al, 1997; Cohen et al, 2002; Miner et al, 2002), paroxetine (Steiner et al, 2005), and sertraline (Halbreich and Smoller, 1997; Young et al, 1998; Jermain et al, 1999; Halbreich et al, 2002; Freeman et al, 2004), effective symptom reduction might also be obtained by intermittent treatment, started around ovulation and discontinued after the onset of menstruation.

The purpose of this study was three-fold: (1) To confirm the efficacy of paroxetine for PMDD. (2) To explore whether the effects of this drug on specific symptoms depend on whether it is administered intermittently or continuously, the a priori hypothesis being that intermittent treatment would be at least as effective as continuous treatment in reducing irritability, but somewhat less effective in reducing somatic complaints. (3) To explore effect sizes for different symptoms, the a priori hypothesis (based on previous studies assessing the effects of intermittent treatment on various premenstrual symptoms) being that the effect size would be larger for irritability than for, for example, somatic symptoms (Sundblad et al, 1992; Wikander et al, 1998; Cohen et al, 2002; Halbreich et al, 2002; Miner et al, 2002).

METHODS

Setting

This was a double blind trial with three parallel groups, undertaken by four investigators at four centers. All patients provided informed consent. Approval by ethics committees was obtained.

Study Population

Women responding to newspaper advertisements were interviewed by phone and then invited to a screening visit. To be included, they should be at least 18 years of age, report regular menstrual cycles (22–35 days), and meet criteria A–C for PMDD in DSM-IV. To be regarded as meeting criterion D, they should display at least a 50% increase in irritability and/or depressed mood during the luteal phase (=the mean rating of the 5 days preceding the first day of menstruation) as compared to the follicular phase (=the mean rating of days 6–10) during two cycles of daily symptom rating using a visual analogue scale (VAS) (0–100 mm). The mean luteal phase rating of the symptom qualifying for inclusion also should not be below 25 mm. Subjects displaying the required cyclicity in the second reference cycle but not in the first were included if they also displayed the required cyclicity during a third cycle.
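The VAS-based cyclicity check described above can be sketched as follows. This is a minimal illustration, not the trial's actual screening procedure: the function name, the reading of the thresholds as "at least 50%" and "at least 25 mm", and the handling of a follicular mean of zero are assumptions.

```python
from statistics import mean


def meets_cyclicity(follicular_mm, luteal_mm,
                    min_increase_pct=50.0, min_luteal_mm=25.0):
    """Check one symptom in one cycle against the VAS-based criterion.

    follicular_mm: daily VAS ratings (0-100 mm) for cycle days 6-10.
    luteal_mm: daily VAS ratings for the 5 days preceding menstruation.
    """
    fol = mean(follicular_mm)
    lut = mean(luteal_mm)
    # The mean luteal rating must not be below 25 mm.
    if lut < min_luteal_mm:
        return False
    if fol == 0:
        # Relative increase is undefined; treated as passing (assumption).
        return True
    # Relative increase from follicular to luteal phase, in percent.
    return (lut - fol) / fol * 100.0 >= min_increase_pct


# A subject would need the criterion to hold in two reference cycles.
cycles = [
    ([10, 12, 8, 10, 10], [40, 50, 45, 55, 60]),
    ([15, 15, 10, 10, 10], [35, 45, 40, 50, 30]),
]
eligible = all(meets_cyclicity(f, l) for f, l in cycles)
```

Here `eligible` is true only if both reference cycles show the required luteal increase for the symptom in question.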

Excluded were subjects (a) fulfilling the DSM-IV criteria for any Axis I disorder other than PMDD in the 12 months before screening, as assessed using the Mini International Neuropsychiatric Interview (Sheehan et al, 1998), (b) displaying a baseline score of >10 on the Montgomery–Åsberg Depression Rating Scale (Montgomery and Åsberg, 1979) in the follicular phase, (c) having tried an SRI for PMDD, (d) taking oral contraceptives, or (e) reporting regular use of any psychoactive drug or any other kind of medication motivating exclusion due to safety reasons.

Treatment

A computer randomization list was used to assign participants to one of three treatment arms. Two blister packs with medication (capsules) for each of the three treatment cycles were provided, the first of which was to be used during the follicular phase, the first day of menstruation being the first day of medication. Time of ovulation was estimated on the basis of normal cycle length for the patient. On this day, the patient switched to a second pack that was to be used for the rest of that cycle. One group was given paroxetine 10 mg/day during the first 4 days and 20 mg/day for the rest of the study, that is paroxetine continuously (PC). After the third treatment cycle, 10 mg/day was administered during 4 days before treatment was discontinued. Patients in the other active treatment group were treated with placebo until the estimated time of ovulation, after which they received paroxetine 10 mg/day during 4 days followed by 20 mg/day for the rest of the luteal phase; they were thus given paroxetine intermittently (PI). During the first 4 days of the follicular phases of each treatment cycle they were given 10 mg/day. The third group received placebo (PBO) capsules throughout the study. In the event of side effects, the patients could reduce the dose from two capsules to one.

Assessments

The patients rated the severity of the symptoms listed in Table 2 using the above-mentioned VAS scales each day during the baseline cycles as well as during the treatment cycles; calculation of luteal and follicular ratings was undertaken as described above. Symptom assessment using the Premenstrual Tension Scale-Observer rated (PMTS-O) (Steiner et al, 1980) was undertaken at baseline and in the luteal phase of treatment cycles 1 and 3. Global improvement was assessed by the investigator using the Clinical Global Impression-Improvement (CGI-I) scale; in addition, patient evaluation of global efficacy (PGE) using a scale corresponding to the CGI-I was rated at the end of each treatment cycle (Guy, 1976). At baseline and at the end of treatment cycles 1 and 3, patients assessed to what extent their symptoms affected work, social life/leisure activities, and family life/home responsibilities using the Sheehan Disability Scale (SDS) (rated 0–10) (Sheehan et al, 1996).

Table 2 Effect Size for VAS Scores

Adverse events were noted in the diary. Also, at each visit the investigator posed a nonleading question regarding side effects.

At the end of treatment cycle 3, the patients were asked whether they found the medication so beneficial that they would want to continue taking it, taking both effect and side effects into consideration (‘Yes’, ‘No’, ‘Don't know’).

Statistical Analyses

The primary efficacy parameters were the percentage change from baseline in luteal phase irritability VAS score and the proportion of responders according to the CGI-I at study end point. In accordance with the protocol, these variables were analyzed for the intention to treat-population (ITT) with the last observation used as end point measure. The three groups were also compared at end point with respect to all other assessments mentioned above. End point was defined as ratings from treatment cycle 3 for subjects completing the trial, and ratings from the last completed cycle for dropouts. The study was powered (n=57 in each group) to have a 90% chance to detect a 30% (SD=49) difference between active drug and placebo in irritability VAS score. Based on the assumption that a difference between groups of 30% in VAS scores would be of interest, the study was also deemed sufficiently large to detect relevant differences between the two active arms, if such differences were at hand for certain symptoms.
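The stated sample size can be approximately reproduced with the standard normal-approximation formula for comparing two means. The sketch below is an illustration only: the two-sided significance level of 0.05 is an assumption (it is not stated above), and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, power=0.90, alpha=0.05):
    """Per-group n for a two-sample comparison of means (normal approximation).

    delta: difference to detect (here, percentage points of VAS change).
    sd: assumed common standard deviation.
    alpha: two-sided significance level (0.05 is an assumption).
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# Difference of 30 with SD 49 and 90% power.
required = n_per_group(30, 49)
```

Under these assumptions the formula gives 57 subjects per group, which agrees with the figure reported above.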

To elucidate possible differences between PC and PI for different symptoms, all VAS-rated variables were analyzed not only for the ITT population but also for completers. Effect sizes for the active treatment groups vs placebo were calculated for all symptoms in the entire ITT population as well as after exclusion of subjects not displaying the symptom in question (<25 mm) at baseline.

The percentage change in symptom score for each treatment cycle was calculated using the formula (Treatment score−Baseline score) × 100/Baseline score.
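As a small worked illustration of this formula, combined with the last-observation end point rule described above, consider the following sketch. The function names and the use of `None` for a cycle that was not completed are assumptions made for the example.

```python
def percent_change(treatment_score, baseline_score):
    """(Treatment score - Baseline score) x 100 / Baseline score."""
    if baseline_score == 0:
        raise ValueError("percentage change is undefined for a zero baseline")
    return (treatment_score - baseline_score) * 100.0 / baseline_score


def endpoint_score(cycle_scores):
    """Return the score of the last completed treatment cycle.

    cycle_scores: luteal scores for treatment cycles 1-3, with None
    marking a cycle that was not completed (assumed encoding).
    """
    completed = [s for s in cycle_scores if s is not None]
    if not completed:
        raise ValueError("no completed treatment cycle")
    return completed[-1]


# A subject with a baseline luteal score of 50 mm who dropped out after cycle 2.
change = percent_change(endpoint_score([20, 10, None]), 50)
```

In this example the end point is the cycle 2 score of 10 mm, so `change` is -80.0, that is, an 80% reduction from baseline.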

For all the different luteal VAS scores, the percentage change was found not to meet assumptions of normality and homogeneity of variance; hence comparisons between groups with respect to these measures were undertaken using van Elteren's test with adjustment for investigator. van Elteren's test is an extension of the Mann–Whitney U-test that facilitates adjusting for a third variable (van Elteren, 1960).
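For readers unfamiliar with the test, a stratified rank-sum statistic of the van Elteren type can be sketched as below, with one stratum per investigator. This is a simplified illustration, not the software used in the study: the variance omits the correction for ties, the P-value relies on a normal approximation, and all names are hypothetical.

```python
from math import erfc, sqrt


def midranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def van_elteren(strata):
    """Stratified Wilcoxon rank-sum test with weights 1 / (n_h + 1).

    strata: list of (group_a_values, group_b_values), one pair per stratum;
    every stratum must contain observations from both groups.
    Returns (z, two_sided_p). No tie correction is applied (simplification).
    """
    numerator = 0.0
    variance = 0.0
    for a, b in strata:
        n1, n2 = len(a), len(b)
        n = n1 + n2
        ranks = midranks(list(a) + list(b))
        rank_sum_a = sum(ranks[:n1])
        weight = 1.0 / (n + 1)
        # Deviation of the rank sum from its null expectation.
        numerator += weight * (rank_sum_a - n1 * (n + 1) / 2.0)
        variance += weight ** 2 * n1 * n2 * (n + 1) / 12.0
    z = numerator / sqrt(variance)
    return z, erfc(abs(z) / sqrt(2.0))


# Two strata in which group A shows uniformly lower (more negative) changes.
z, p = van_elteren([([-90, -80, -70], [-40, -30, -10]),
                    ([-85, -75, -60], [-50, -20, 0])])
```

A negative `z` here indicates that group A tends to have lower values than group B within strata.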

The effect size for the comparisons made with van Elteren's test was calculated after assuming normality. The quotient delta mu/sigma was calculated and multiplied with (n1+n2–2) for each investigator, added together and divided with total number of patients minus 2 × 4 (=number of investigators).
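The pooling step described above amounts to a weighted mean of the per-investigator quotients, with weights n1+n2−2. A minimal sketch follows, assuming that "total number of patients" refers to the two groups being compared and that delta mu and sigma are supplied per investigator; the function name and input layout are hypothetical.

```python
def pooled_effect_size(strata):
    """Combine per-investigator effect sizes as described in the text.

    strata: list of (n1, n2, delta_mu, sigma) tuples, one per investigator,
    where delta_mu is the difference in means between the two groups and
    sigma the standard deviation used for that investigator.
    """
    weighted_sum = sum((n1 + n2 - 2) * (delta_mu / sigma)
                       for n1, n2, delta_mu, sigma in strata)
    total_patients = sum(n1 + n2 for n1, n2, _, _ in strata)
    # Denominator: total number of patients minus 2 x number of investigators.
    return weighted_sum / (total_patients - 2 * len(strata))


# Illustrative (made-up) inputs for two investigators.
example = pooled_effect_size([(10, 10, 1.0, 1.0), (20, 20, 2.0, 1.0)])
```

With four investigators the denominator becomes the total number of patients minus 8, as stated above.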

In order to analyze to what extent the effects on symptoms other than dysphoria were secondary to the effect on symptoms of dysphoria, a logistic regression was undertaken, reductions in different symptoms being the independent variables, and treatment being the dependent variable. For these analyses, subjects not displaying the symptom being investigated at baseline (<25 mm) were excluded.

Changes in PMTS and SDS scores met the normality and homogeneity of variance assumptions for parametric modeling and were hence analyzed using analysis of variance with adjustment for investigator and baseline scores.

Response rates were defined as the number of subjects (a) assessed as very much improved or much improved on the CGI global improvement scale, (b) rating themselves as very much improved or much improved on the PGE scale, (c) displaying a 50% reduction of irritability and depressed mood, respectively, and (d) no longer meeting the VAS-based inclusion criteria. The comparison of groups with respect to these response rates was undertaken using parametric logistic regression adjusting for investigator effects.

No adjustment of P-values for multiple comparisons was undertaken.

RESULTS

Baseline Demographics and Randomization

The numbers of subjects screened and randomized are shown in Figure 1. Age (mean±SD) of randomized subjects was 37±7.1 (PBO), 37±5.9 (PI) and 38±6 (PC). All but one were white. Mean age of onset of PMDD symptoms was 28 (PBO), 27 (PI), and 29 (PC).

Figure 1

Trial flow diagram.

Attrition

The dropout rate was low and did not differ between the three treatment arms (PBO: 8/59, PI: 9/59, PC: 9/60). The reasons for discontinuation after receiving at least one dose were adverse experience (n=1) and other reason (n=7) in the PBO group, adverse experience (n=3) and other reason (n=6) in the PI group, and adverse experience (n=5) and other reason (n=4) in the PC group.

Baseline

Baseline VAS ratings are shown in Table 1. Baseline PMTS-O ratings (mean±SD) were 22±5 (PBO), 20±5 (PI), and 21±5 (PC). Baseline SDS ratings (mean±SD) were 5.7±2.3 (PBO), 4.6±2.5 (PI), and 5.9±2.7 (PC) for the work item, 6.1±2.1 (PBO), 5.2±2.8 (PI), and 6.2±2.7 (PC) for the social life/leisure activities item, and 7.0±1.8 (PBO), 6.6±2.6 (PI), and 7.4±1.9 (PC) for the family life/home responsibilities item.

Table 1 Baseline Characteristics of the Three Groups

Efficacy of Continuous Treatment

Figure 2 displays the percent symptom reduction at end point for the ITT population. PC was superior to PBO with respect to all symptoms. The effect sizes differed for different symptoms, being highest for irritability (Table 2). The displayed effect sizes were calculated on the entire ITT population including subjects who did not display the symptom in question at baseline. To assess the possibility that differences in effect size merely reflected the commonness of the symptoms at baseline, a second calculation of effect size was conducted for each symptom including only subjects reporting the symptom in question (≥25 mm) before treatment. This calculation revealed effect sizes >1 for all symptoms with the exception of food craving (0.3) (data not shown). Differences between symptoms with respect to effect size, however, were observed also after exclusion of subjects not displaying the symptom at baseline, the effect sizes for irritability (1.6) and breast tenderness (1.5) being the highest.

Given the superior effect size for reduction in irritability, it could be argued that the effects on other symptoms might be secondary to the effect of PC on irritability. This possibility was addressed using logistic regression, demonstrating that the reduction in irritability significantly discriminated the PC group from the PBO group (P<0.001). When the other VAS-rated symptoms were added to the model, one at a time, only breast tenderness significantly (P<0.01) enhanced the discrimination as compared to that obtained by irritability only. Thus, adding the reduction in depressed mood (P=0.8), tension (P=0.4), affect lability (P=0.7), mood swings (P=0.5), lack of energy (P=0.3), food craving (P=0.6), or bloating (P=0.6) did not improve the discrimination.

Depending on the definition of response, the response rate in the PC group was between 83 and 95% (Table 3). The outcome of the PC group was better than that of the PBO group on social functioning (Table 4). With respect to PMTS rating, the difference between the PC and PBO groups in mean change (95% CI) from baseline was −6.7 (−9.2, −4.3), P<0.0001.

Table 3 Response Rate
Table 4 Sheehan Disability Scale

Efficacy of Intermittent Treatment

PI was superior to PBO with respect to reduction in all VAS-rated symptoms with the exception of lack of energy and food craving (Figure 2). As was the case for the PC group, the effect size varied for different symptoms (Table 2). After exclusion of subjects not displaying the symptom in question at baseline (see above), effect sizes were >1 for irritability, tension, affect lability, and mood swings, around 1 for bloating and breast tenderness, and <1 for depressed mood, lack of energy and food craving (data not shown).

Figure 2

The percentage reduction of VAS-rated symptoms. Vertical bars represent median percentage reduction for VAS-rated symptoms at end point for the ITT-population. P-values for a global comparison of all three treatment groups were 0.04 for lack of energy and breast tenderness, and <0.001 for all other symptoms. P-values for the difference between continuous paroxetine and placebo were: irritability <0.001, affect lability <0.001, mood swings <0.001, depressed mood <0.001, tension <0.001, lack of energy 0.001, food craving 0.001, breast tenderness <0.001, and bloatedness <0.001. P-values for the difference between intermittent paroxetine and placebo were: irritability <0.001, affect lability <0.001, mood swings <0.001, depressed mood 0.001, tension <0.001, lack of energy 0.12, food craving 0.38, breast tenderness 0.011, and bloatedness 0.001. P-values for the difference between continuous paroxetine and intermittent paroxetine were: irritability 0.62, affect lability 0.72, mood swings 0.89, depressed mood 0.10, tension 0.50, lack of energy 0.079, food craving 0.018, breast tenderness 0.29, and bloatedness 0.075. For statistics, see Methods. For n, see Table 1.

A logistic regression in which all symptoms that were significantly alleviated by intermittent treatment were added revealed that adding symptoms other than irritability never improved the discrimination between active treatment and placebo that was obtained when the analysis was based on the reduction in irritability only (data not shown).

Also with respect to response rate (Table 3) and SDS scores (Table 4), the outcome of the PI group was superior to that of the PBO group. The difference between the PI and PBO groups in mean change (95% CI) from baseline on the PMTS rating was −5.5 (−8.0, −3.0), P<0.0001.

Continuous vs Intermittent Treatment

As shown in Figure 2, the reduction in irritability, affect lability and mood swings was of the same magnitude in the PI group as it was in the PC group, as were also the effect sizes for these symptoms (Table 2). With respect to depressed mood, tension, lack of energy, food craving, bloating, and breast tenderness, the effect of PI appeared somewhat less impressive than that of PC. When the same comparison as that shown in Figure 2 was undertaken on the cohort of completers only, that is excluding subjects in the PC group that, due to early dropout, had actually not received continuous treatment, PC was superior to PI in reducing depressed mood (P<0.05).

There was a tendency for greater improvement in the PC group with respect to SDS scores (Table 4). With respect to PMTS rating, the groups did not differ significantly, the difference between the two groups in mean change (95% CI) from baseline being −1.2 (−3.7, 1.2) (P=0.33).

With respect to response rate, PC was superior to PI when response was defined by means of CGI or PGE, or as a 50% reduction in self-rated depressed mood (VAS) (ITT population). The response rate for the PI group was however equal to that of the PC group when response was defined as a 50% reduction in self-rated irritability (Table 3).

Number of Tablets Taken and Tolerability

In the PI group, 46 subjects were taking two capsules per day (containing 10 mg each) at end point; four subjects failed to report their dose. In the PC group, 45 subjects were taking two capsules per day, three took one capsule only, and three failed to report their dose at end point. In the PBO group, all 51 subjects took two capsules at end point.

The rate of reported adverse events was high in all three groups, including the PBO group, which probably was due to the fact that patients were not only asked to recapitulate adverse events at the visits, but also to note them daily in the diary.

Common side effects are shown in Table 5. In the PI group, nausea was reported by 37% (n=22) in one cycle only, and by 14% (n=8) in more than one cycle. In the PC group, it was reported by 38% (n=23) in one cycle only, and by 5% (n=3) in more than one cycle. In the PBO group, 17% (n=10) reported nausea in one cycle, and none in more than one cycle. Fewer subjects in the PI group reported decreased libido than in the PC group (χ²=5.8, P=0.016).

Table 5 The Number of Patients Reporting the Most Common Side Effects, n (%)

Patient's Global Evaluation of Treatment

Twenty-one out of 58 subjects (36%) in the PBO group, 35 out of 53 subjects (66%) in the PI group, and 48 out of 60 subjects (80%) in the PC group responded affirmatively (‘yes’, in contrast to ‘no’ or ‘don't know’) to the question of whether they would want to continue with the tested drug.

DISCUSSION

In line with previous studies (Eriksson et al, 1995; Cohen et al, 2004), paroxetine was found to be very effective for PMDD. In the ITT population of the PC group, the median percentage reduction in self-rated intensity of all four cardinal symptoms hence was above 90%. When response was defined by means of CGI, PGE, a 50% reduction in key VAS symptoms, or no longer meeting the inclusion criteria, the response rate in this group was between 83 and 95%. Moreover, 80% of the subjects in this group stated that they would like to go on taking this compound. In line with previous SRI studies (Pearlstein et al, 2000; Cohen et al, 2002; Steiner et al, 2003; Cohen et al, 2004), the treatment also resulted in improved social functioning.

Previous authors have emphasized that there is a considerable portion of nonresponders to SRIs among PMDD subjects (van Leusden, 1995). Our results challenge this view, at least if response is defined as percentage reduction of one of the four key symptoms.

The response rate was somewhat higher in this study than in many previous multicenter studies evaluating the effect of SRIs for PMDD (Steiner et al, 1995; Yonkers et al, 1997), despite the fact that our placebo response was not higher than in other recent studies. One explanation for the high response rate may be that only subjects reporting marked irritability and/or depressed mood at baseline were included, and that most subjects turned out to display marked irritability. In studies of less homogeneous PMDD populations, a lower response rate probably reflects the fact that some symptoms might be less responsive to SRIs than others (see below). The high response rate in this trial should hence not lead to the conclusion that any group of subjects meeting PMDD criteria would display a similar response.

The possibility that paroxetine is more effective for PMDD than other SRIs should also not be excluded; to confirm such a difference, however, head-to-head comparisons are required. When comparing the outcome of this trial with that of other trials of comparable size, it should be taken into consideration that this study was undertaken at few centers and with few investigators; such a setting often results in higher effect sizes than those obtained in larger multicenter trials. Moreover, the low dropout rate, which could be related to this factor, may have contributed to the response rate being higher than in other trials, at least in the PC group.

Although PC was superior to placebo for all symptoms assessed, the percentage symptom reduction, as well as the effect size, was higher for irritability than for, for example, physical symptoms. Many authors have suggested irritability to be the cardinal symptom of PMDD (Angst et al, 2001; Eriksson et al, 2002; Hartlage and Arduino, 2002; Landén and Eriksson, 2003). The relative importance of irritability versus other symptoms was not addressed in this trial, but the different effect sizes for different symptoms do suggest that irritability is an important target symptom for SRIs. This notion gains support from the logistic regression, revealing that the effect on irritability clearly separated PC from PBO, and that adding all other VAS-rated symptoms, one at a time, did not improve this discrimination, breast tenderness being the sole exception.

The feasibility of intermittent treatment with SRIs in PMDD is of theoretical importance since it indicates that an influence of these drugs on mood and behavior may be exerted without lag phase, challenging the assumption that SRIs require weeks of treatment to facilitate serotonergic transmission (Artigas et al, 1996). We have previously (Landén and Eriksson, 2003) suggested that symptoms such as irritability and affect lability are more inclined to respond rapidly to serotonin facilitation than, for example, depressed mood, and that the importance of these symptoms in PMDD is a major reason for the feasibility of intermittent treatment. In line with this, the effect of intermittent treatment with paroxetine was as impressive as that of continuous treatment for irritability, affect lability, and mood swings. On the other hand, intermittent treatment was somewhat less effective with respect to reduction in depressed mood, tension, lack of energy, and somatic symptoms. Of interest in this context are previous studies suggesting that symptoms such as anger and affect lability respond rapidly to SRIs in patients with brain injury (Sloan et al, 1992; Muller et al, 1999) or stroke (Nahas et al, 1998; Burns et al, 1999).

It has previously been suggested that SRIs might not influence somatic symptoms per se, and that the reduction in the rating of these symptoms during treatment merely reflects the reduction in dysphoria, making them more tolerable (Steiner et al, 2001). However, the logistic regression indicating that the effect of continuous treatment on breast tenderness was partly independent of the effect on irritability supports a direct effect of paroxetine on breast tenderness. In the PI group, on the other hand, the logistic regression did not support the reduction in breast tenderness to be independent of the effect on irritability. In line with this, the effect of treatment on somatic symptoms generally was less impressive in the PI group than in the PC group. Needless to say, the observation that intermittent treatment was as effective as continuous treatment in reducing irritability, but less effective in reducing somatic symptoms, also indicates that the latter effect, in the PC group, is not merely a consequence of the former. The findings from the logistic regression should, however, be interpreted with due caution, given that the study was not primarily powered for this analysis.

The finding that intermittent treatment is less effective than continuous treatment in reducing somatic complaints gains support from the first study regarding the effect of intermittent SRI administration on premenstrual complaints, in which we found intermittent clomipramine to reduce irritability and depressed mood but not somatic symptoms (Sundblad et al, 1993); previously we had shown continuous clomipramine to reduce both mental and somatic complaints (Sundblad et al, 1992). In line with this, a recent study showed that a low dose of fluoxetine administered intermittently reduced mood symptoms but not somatic symptoms; a higher dose of fluoxetine, however, was effective for somatic symptoms as well (Cohen et al, 2002). Smaller effects of intermittent administration of SRIs on somatic than on mental symptoms have also been reported by others (Halbreich et al, 2002; Miner et al, 2002).

When interpreting the outcome with respect to the effects of the different treatment regimens on different symptoms, it should be noted that VAS-rated symptoms other than irritability were not defined as primary effect parameters, and that no adjustment of P-values was undertaken despite multiple comparisons. These findings should thus be regarded as preliminary until replicated. However, the notion that intermittent and continuous treatment do differ with respect to the influence on certain, but not all, symptoms is well in line with previous studies, and in accordance with our a priori hypothesis.

Nausea, somnolence/fatigue, and sexual dysfunction were more common in the groups given active treatment than in the PBO group. Reduction in libido was reported more often by subjects in the PC group than in the PI group. The lack of frequent reports of vertigo and dizziness in the PI group suggests that recurrent discontinuation symptoms (Black et al, 2000) were not a problem. This may be due to the fact that the dosage was tapered gradually, or to the fact that 2 weeks of treatment may be too short to cause withdrawal symptoms. Supporting the latter alternative, withdrawal symptoms did not constitute a problem in a previous study in which paroxetine was given intermittently, and in which the medication was discontinued abruptly (Steiner et al, 2005). Nausea usually did not reappear during treatment cycles 2 or 3 in the PI group. It hence seems that tolerance to SRI-induced nausea may remain even though patients are off treatment for almost 2 weeks per cycle.

The conclusions of this study are the following: (1) The response rate of continuous administration of paroxetine in PMDD subjects with irritability and/or depressed mood as prominent symptoms is close to 90%. (2) Within the PC group, the effect size was highest for irritability. (3) Intermittent treatment with paroxetine was as effective as continuous treatment with respect to reduction in irritability, affect lability, and mood swings, but somewhat inferior with respect to effects on other symptoms, and with respect to global improvement.