Estimating the smallest worthwhile difference of antidepressants: a cross-sectional survey

Background

Depression is the second leading cause of disability worldwide1 with millions seeking treatment through antidepressant medications.2 Antidepressants are proven efficacious, backed by hundreds of randomised controlled trials and rigorous meta-analyses.3 4 For example, a network meta-analysis including 21 commonly prescribed antidepressants demonstrated response odds ratios (ORs) ranging between 1.49 and 1.85 favouring antidepressants over placebo.3 However, effect size measures such as ORs miss the patient’s perspective on the significance of intervention benefits.5 The patient-deemed worthiness of antidepressant treatments, given the symptom improvement benefits on the one hand and the burdens (harms, expenses and inconveniences) on the other, remains controversial.6 7

The minimum important change (MIC) can help determine whether changes in health outcomes over time are important from the patient viewpoint. The MIC, also known as the minimal important difference or minimal important clinical difference, is the smallest change after treatment in a health outcome perceived as important.8 Defining the MIC is a useful way to interpret patient-reported outcome measures.9 By definition, the MIC is specific to a particular instrument or outcome measure,10 11 generally lacks association with an intervention12 and does not explicitly account for burdens and benefits relative to an alternative.10 12 The MIC for depression scales is estimated to be a 6-point reduction for the Beck Depression Inventory-II13 and a 7-point to 8-point reduction for the Hamilton Depression Rating Scale 17-item.14 Estimates of the MIC concern intraindividual change and the mean change seen in participants in antidepressant trials (both on placebo and antidepressants) is usually larger than the MIC.13 14

A conceptually different approach to facilitate interpretation of patient importance in the context of an intervention is to estimate the smallest worthwhile difference (SWD). The SWD is ‘the smallest beneficial effect of an intervention that justifies the costs, risks and inconveniences of that intervention’ over a treatment alternative.10 The SWD represents a between-treatments assessment reflecting a trade-off of the benefits and burdens of two treatment options. It is patient-derived, intervention-specific, control-specific and expressed as an absolute difference between treatment options.10 12 Two methods have been proposed to estimate the SWD. The discrete choice experiment asks individuals their preferences for hypothetical scenarios where benefits and costs vary. Regression models determine the preference threshold for one treatment over another.15 The benefit-harm trade-off method (BHTM) asks individuals to state their preferences for scenarios, but benefits vary, while burdens remain constant. The BHTM ascertains how many benefits people are willing to trade-off for the expected burdens of one intervention over another.16 The BHTM is easy to understand and has been used to estimate the SWD in treatments for respiratory disease,12 fall prevention15 and pain reduction therapies.17 Despite the debate about whether the effect of antidepressants is large enough to justify their burdens, the SWD of antidepressants has never been estimated.3 18

Methods

Online supplemental 1 presents the prespecified protocol approved by the Kyoto University Graduate School of Medicine Ethics Committee on 9 September 2022 (R3574-1, changes in online supplemental 2), and all participants provided e-consent.

Study design

We conducted a cross-sectional survey using three research participant crowdsourcing services (RPCSs), Prolific, MQ Mental Health (MQ) and Amazon Mechanical Turk (MTurk), between October 2022 and January 2023. We invited participation through RPCSs and linked participants fitting the inclusion criteria to an online survey. While Prolific and MTurk represent a general internet population, the MQ participant pool includes mainly people with lived experiences in mental health and healthcare professionals who volunteer to improve research representation. RPCS participants generally demonstrate high test–retest reliability and high convergent and concurrent validity in psychological tests,19 although there remain concerns of careless or fraudulent responses.20 Following previous methods to increase data quality,19 20 we restricted MTurk participants to those in the USA with a researcher approval rating of 95% or higher and graded at the ‘masters’ level, the highest level of vetting. Prolific participants were limited to the UK or USA. MQ participants were limited to the UK only. These RPCSs provide monetary compensation for participation from their established participant pools. The compensation was £1.20 for Prolific and US$1 for MTurk; MQ participation was voluntary. Remuneration was commensurate with similar length RPCS studies, based on time-to-completion, and above the average amount as evaluated by Prolific.

Primary outcome and its measurement

Our primary outcome was the SWD, representing the patient-preferred efficacy of depression treatment with antidepressants that would be deemed worthwhile compared with no treatment, given the treatment burdens (harms, expenses and inconveniences).

We estimated the individual participant preference through the BHTM. We presented a summary of the symptoms of a major depressive episode and explained the benefits and burdens of antidepressant treatment and of the no treatment/natural course alternative, based on the Diagnostic and Statistical Manual of Mental Disorders, 5th edition (DSM-5) and US Food and Drug Administration (FDA) descriptors (online supplemental 3). We used selective serotonin reuptake inhibitors (SSRIs) and serotonin-norepinephrine reuptake inhibitors (SNRIs) in the antidepressants description because they are most frequently prescribed21 and have similar tolerability/efficacy profiles.3 We estimated that response, defined as 50% or greater reduction in depression severity, occurs in approximately 30 out of 100 people after 2 months without any treatment.22–24

Next, we asked participants if they believed antidepressant treatment was worthwhile given variable hypothetical response rates for antidepressants compared with the response rate for no treatment (ie, 30%) after 2 months. We asked them to weigh the benefits and burdens, then decide if they would accept the drug. Depending on whether they answered yes or no, we next asked the participants about lower or higher response rates, respectively. To discourage reliance on heuristics and careless responding, we randomly assigned participants to two different response algorithms, requiring different attention and concentration levels (online supplemental 4). As we observed no difference in SWD by algorithm, we report the combined results in the following. The difference between the minimum antidepressant response rate at which the participant would consider taking the drug and 30% defines the individual’s SWD.
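The definition of the individual SWD can be illustrated with a minimal Python sketch. The function name and the input check are hypothetical and only for illustration; the two questioning algorithms themselves are described in online supplemental 4 and are not reproduced here.

```python
from typing import Optional

# Response rate (%) assumed for no treatment after 2 months, as stated in the text.
NO_TREATMENT_RESPONSE = 30


def individual_swd(min_acceptable_response: Optional[int]) -> Optional[int]:
    """Return the SWD in percentage points.

    `min_acceptable_response` is the lowest antidepressant response rate (%)
    at which the participant would accept the drug. None stands for a
    participant who would decline even at a 100% response rate.
    """
    if min_acceptable_response is None:
        return None
    # Assumption for this sketch: elicited rates lie between the
    # no-treatment rate and 100%.
    if not NO_TREATMENT_RESPONSE <= min_acceptable_response <= 100:
        raise ValueError("response rate outside the assumed 30-100% range")
    return min_acceptable_response - NO_TREATMENT_RESPONSE


# A participant who would accept the drug only at a 50% response rate or higher
# has an SWD of 20 percentage points.
example_swd = individual_swd(50)
```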

Demographic and clinical variables

We collected demographic information including gender, race, education, employment, country of residence and insurance status. Insurance was categorised as Affordable Care Act (USA only), Medicare/Medicaid (USA only), national healthcare insurance, private health insurance, other and uninsured.

Clinical variables included lifetime depression prevalence, family history of depression, lifetime antidepressant treatment, lifetime psychotherapy, current antidepressant treatment, current psychotherapy, treatment preference (antidepressants vs psychotherapy) and current depression symptoms. Current depression symptom severity was assessed with the Patient Health Questionnaire-9 (PHQ-9).

Participants

We included participants aged 18 or older, residing in the UK or USA, who were fluent in English. We were interested in a general population, but we were most concerned with people experiencing depressive symptoms of at least moderate severity (PHQ-9≥10) but not currently engaged in any treatment. This group most closely resembles potential treatment seekers, and we can expect them to provide more accurate estimates of the SWD for a major depressive episode as depicted in the provided clinical scenario because of their current experiences and potential treatment needs. This group would represent the best estimate of a clinical sample taken from a general internet population. Thus, to explore how different experiences with depression and treatment could be associated with treatment-seeking judgements and SWD estimates, we included participants with four differing profiles: (1) moderate-to-severe depressive symptoms but not in treatment: PHQ-9≥10 and not receiving any treatment, the primary interest group for SWD estimation; (2) currently in treatment: ongoing antidepressant therapy or psychotherapy; (3) absent-to-mild depressive symptoms with treatment experiences: zero-to-mild depression symptoms (PHQ-9<10), no current treatment, but previous antidepressant treatment; and (4) absent-to-mild depressive symptoms without treatment experiences: zero-to-mild depression symptoms and no current or previous antidepressant treatment.

Sample size

We set the sample size to achieve the expected precision in the estimate of the SWD. We assumed that the SD of the SWD would resemble that estimated in a pain study (SD=22 for SWDs of 20%)17 because it is an SWD study investigating a subjective outcome, and no depression SWD study exists on which to base a power analysis. To obtain a 95% CI within 10 percentage points, we needed approximately 80 participants with at least moderate symptoms but not in treatment. We anticipated similar precision for the three other groups. Estimates of depression incidence in RPCSs vary but are demonstrably higher than in the general population.19 We therefore assumed approximately 20% of participants would present with moderate-to-severe depressive symptoms on the PHQ-9. Depending on these subgroup populations, approximately 800 participants may be necessary to reach n=80 in the groups with the smallest populations (online supplemental 5). Recruitment was stopped after all four groups included 80 participants. We accepted responses with no missing outcome variables.
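The precision target can be checked with a short calculation. This is a sketch under two assumptions that the text does not state explicitly: that ‘within 10 percentage points’ refers to the total width of the 95% CI, and that a normal approximation for a mean is adequate. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_for_ci_width(sd: float, total_width: float, confidence: float = 0.95) -> int:
    """Smallest n for which a normal-approximation CI of a mean has at most
    the requested total width: n >= (z * sd / half_width) ** 2."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    half_width = total_width / 2
    return math.ceil((z * sd / half_width) ** 2)


# SD = 22 from the pain study cited in the text; total CI width of 10 points.
n_required = n_for_ci_width(sd=22, total_width=10)
print(n_required)
```

Under these assumptions the calculation gives 75 participants, which is in line with the approximately 80 per group stated above.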

Data analysis

We first presented the SWD with its distribution and percentile ranking for people with moderate-to-severe depressive symptoms but not in treatment and estimated its median and IQR. We then examined the SWD by participant groups for comparison using box-violin plots. Finally, we analysed the entire sample’s SWD using demographic and clinical independent variables in 11 univariable regressions and a single multivariable regression investigating SWD predictors using the least absolute shrinkage and selection operator (LASSO) method.25 Participants who reported they would not accept antidepressant treatment, even if the response were 100%, were removed from the primary analysis because they would not be real-world candidates for antidepressant treatment.12 They may be philosophically opposed to taking mood-altering drugs, or any drugs. To examine the effect of this decision, we conducted sensitivity analyses by assigning them an SWD=71, which is an impossible value representing an antidepressant response rate over 100%. We used SAS V.9.4 (Cary, North Carolina, USA, SAS Institute) and R V.4.2.2 (R Core Team, 2022) for all statistical analyses.
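The sensitivity analysis can be sketched as follows. The data are invented toy values, not study data, and the quantile method is an assumption; the authors’ SAS and R routines may compute quartiles differently.

```python
from statistics import quantiles

# Value assigned to participants who would refuse antidepressants even at a
# 100% response rate: 30% + 71 points would exceed 100%, so it is impossible.
REFUSER_SWD = 71


def summarise(swds):
    """Median and IQR bounds (inclusive quantile method, an assumption)."""
    q1, q2, q3 = quantiles(swds, n=4, method="inclusive")
    return {"median": q2, "iqr": (q1, q3)}


def with_refusers(swds, n_refusers):
    """Append the impossible SWD value once per refusing participant."""
    return list(swds) + [REFUSER_SWD] * n_refusers


toy_swds = [5, 10, 20, 20, 30, 40]  # hypothetical SWDs in percentage points
primary = summarise(toy_swds)  # refusers excluded
sensitivity = summarise(with_refusers(toy_swds, n_refusers=2))
print(primary["median"], sensitivity["median"])
```

Because the assigned value is larger than any possible SWD, including refusers can only leave the median unchanged or raise it.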

Patient and public involvement

We piloted the BHTM script with two members of the patient and public involvement (PPI) group at the Oxford Precision Psychiatry Lab of the University of Oxford. PPI members (SM and RE) reviewed descriptions of a DSM-5 major depressive episode, and the benefits and burdens of antidepressant treatments. These experts by experience provided feedback on clarity, inclusivity and accuracy of patient experiences, and we modified the scripts accordingly. They further collaborated in the interpretation and write-up of the manuscript.

Findings

Participants

The total sample included 935 participants from three RPCSs: MTurk (n=255), Prolific (n=395) and MQ (n=285). Participants had a mean age of 44.1 (SD=13.9), were mostly women (66%) and Caucasian (84%). Table 1 lists the demographic and clinical characteristics for those with moderate-to-severe symptoms but not in treatment and for the full sample.

Table 1

Demographic and clinical characteristics of participants

Ninety-five participants (10.2%; n=20 with moderate-to-severe symptoms but not in treatment, n=20 currently in treatment, n=14 with absent-to-mild symptoms with treatment experiences and n=41 with absent-to-mild symptoms without treatment experiences) reported that they would not consider taking antidepressants even if these drugs achieved 100% response.

SWD for people with moderate-to-severe depressive symptoms but not in treatment

Figure 1 shows the distribution and percentile rank of the SWD as reported by participants with moderate-to-severe symptoms but not in treatment. The median was 20% (IQR=10–30%, n=104).

Figure 1

Frequency distribution of the smallest worthwhile difference (SWD) for people with moderate-to-severe depressive symptoms but not in treatment (n=104). The top row shows the cumulative percentiles of the distribution. The striped bar represents the median SWD.

SWD for participant groups according to depression symptoms and treatment experience

Table 2 shows the average and dispersion data of the SWD for the four groups and the full sample. Distributions of individual estimates overlapped to a great degree across the groups, but dispersion varied considerably (figure 2). The participants currently in treatment comprised the largest proportion of the sample (44%) and showed the largest dispersion.

Table 2

The SWD for participant groups according to depression symptoms and treatment experience

Figure 2

Box-violin plots of the SWD distributions for four participant groups. Boxes represent the second and third quartiles with the median line in the middle. Whiskers represent the range. Violins represent the frequency distribution, enhanced with raw data points. The sample used to estimate the figure does not include 95 participants who reported they would not take antidepressants for depression even if response were 100%. SWD, smallest worthwhile difference.

To further investigate potential correlates of the SWD, we conducted univariable regressions and a single multivariable linear regression between all baseline covariates and the SWD (online supplemental 6). LASSO indicated that only treatment preference could be an important predictor of the SWD. Participants who preferred antidepressants at outset reported a lower SWD (median=20%, IQR=10–35%) than those who preferred psychotherapy (median=25%, IQR=15–35%).

As a sensitivity analysis, we assigned an SWD of 71% to participants unwilling to ever accept antidepressants, which increased the median SWD for the people with moderate-to-severe symptoms but not in treatment (median=25%, IQR=15–42.5%, n=124) and for the entire sample (median=30%, IQR=15–40%, n=935).

Discussion

We used the BHTM to estimate the SWD for initiation of antidepressant treatment for depression in an online cross-sectional survey. The median SWD among participants with moderate-to-severe depressive symptoms but not in treatment was an additional 20 percentage points (IQR: 10–30%) over the assumed natural response rate of 30% for no treatment. Other groups showed a median of 25% with considerable dispersion: people currently in treatment showed the largest variability (IQR=10–40%), and people with absent-to-mild symptoms without treatment experiences showed the smallest variability (IQR=20–30%). About 10% of the participants reported they would not take antidepressants even if these drugs achieved 100% response. Treatment preference, either for drug therapy or for psychotherapy, was the only important predictor of the SWD (median of 20% vs 25%, respectively). These findings, and their implications for existing antidepressants, should be interpreted considering the natural response rate of no treatment (assumed to be 30% in our study) and the average greater response rate of antidepressants (assumed to be about 45% according to the literature, see below).

We asked about the expected difference in response rates between antidepressants and the no treatment natural course because placebo is not a real-world treatment alternative. However, very few studies have examined natural response rates among treatment-seeking patients with depression. The survival curve of 393 incident depressive episodes observed in a naturalistic community cohort study suggests a response rate between 10% and 40% after 2 months.23 For placebo, a systematic review comprising 252 randomised controlled trials (RCTs) showed the average response rate was 37%.22 Some evidence suggests that placebo produces greater response than no treatment, due to the placebo effect. For example, the OR for pill placebo response over no treatment was 1.38 (0.75–2.55) in a network meta-analysis of depression psychotherapy trials.26 One large pragmatic trial compared watchful waiting versus antidepressants among primary care patients and found the watchful waiting response rate to be 29%.24 Taken together, we conservatively set the no treatment response rate to be 30%.

A network meta-analysis of 522 randomised trials of antidepressants showed that the response ORs for SSRIs and SNRIs over placebo ranged between 1.5 and 1.9.3 Assuming placebo response rates of 30 or 40%, these ORs would translate into antidepressants response rates of 39–45% or 50–56%, constituting risk differences between 9% and 16%. However, it is generally believed that more responsive patients are selectively enrolled in RCTs, and antidepressant real-world effectiveness may be smaller than these 9–16% risk differences. Alternatively, a greater probability of receiving placebo in RCTs is associated with lesser response rates for the same antidepressant, with an observed response rate difference up to 10%.27 In practice we have no placebo condition, and the antidepressant response rate may be somewhat higher than the above estimations. A recent individual participant data meta-analysis of 73 388 patients from 232 placebo controlled RCTs reported an overall 15% risk difference for antidepressants over placebo when individual response patterns were considered.28 Everything considered, we assume 15% as the realistic response rate difference between the currently most efficacious antidepressants and no treatment/natural course in the following discussion.
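The translation from ORs to response rates in this paragraph can be reproduced with the standard odds conversion. The sketch below is illustrative; the function name is hypothetical, and the inputs are the rounded ORs (1.5 and 1.9) and placebo response rates (30% and 40%) quoted in the text.

```python
def response_rate_from_or(control_rate: float, odds_ratio: float) -> float:
    """Treated-group response rate implied by an OR and a control response rate."""
    control_odds = control_rate / (1 - control_rate)
    treated_odds = control_odds * odds_ratio
    return treated_odds / (1 + treated_odds)


for control_rate in (0.30, 0.40):
    for odds_ratio in (1.5, 1.9):
        treated_rate = response_rate_from_or(control_rate, odds_ratio)
        risk_difference = treated_rate - control_rate
        print(
            f"placebo {control_rate:.0%}, OR {odds_ratio}: "
            f"response {treated_rate:.0%}, risk difference {risk_difference:.0%}"
        )
```

The four combinations give response rates of about 39%, 45%, 50% and 56%, with risk differences of about 9 to 16 percentage points, matching the ranges above.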

The estimated SWD of 20% (IQR: 10–30%) for people with moderate-to-severe symptoms but not in treatment, or 25% (IQR: 10–35%) for the entire sample, are greater than the 15% greater response rate to be expected on currently available best antidepressant drugs. However, there was wide variability in individual SWDs, both for the participants with moderate-to-severe symptoms but not in treatment and for the entire sample. Approximately one in three (40/104, or 276/840) would be willing to take antidepressants for a depressive episode at the currently expected response rate (ie, 15 percentage points greater than no treatment), in exchange for the potential burdens. Another one-third would need double the current antidepressant effect (ie, 30%) before they initiate antidepressant treatment. The remaining third would need to see greater response rates or fewer burdens.

To explore potential sources of variability in the SWD values, we examined all demographic and clinical variables. Only treatment preference demonstrated a robust association. Preconceived notions about antidepressants can affect confidence in therapies and motivation to seek psychiatric treatment. Patients who are depressed but not yet treated would likely be seeking treatment in the real world. Our findings suggest that these people may show the smallest average SWD, but the group SWD distributions largely overlapped. The fact that the SWD estimates did not substantively differ among those with or without clinical needs, and with or without lived experiences, corroborates the appropriateness of our method co-produced by people with lived experiences. The observed substantial variability in the participants’ requirements to accept antidepressants, coupled with the increasing prescription rates,22 highlights the need for an explanation of the high SWD observed in the current study. We can only speculate, but perhaps there is a lack of understanding from patients and a lack of communication from doctors (understating burdens or overstating efficacy) that factor into hasty decision-making in prescription acceptance. Much more research is needed in this area about actual transactions in the real world and their appropriateness.

The SWD may provide a benchmark for efficacy to be expected for future antidepressant drugs over placebo. The median SWD of a 20–25 percentage points greater response rate than the 30% for no treatment would correspond to an OR between 2.3 and 2.9 and an SMD between 0.47 and 0.58 (online supplemental 7 for calculations).29 Unfortunately, a systematic review of currently approved general medicine and psychiatric drugs shows that only a minority of drugs have evidence for this magnitude of efficacy.30 It should also be noted that this SMD of 0.5–0.6 would translate into a mean difference of 3–4 points on the Hamilton Rating Scale for Depression, because the average SD of this scale is around 7 points.3 The MIC of the same rating scale is reported to be 7–8,14 and is therefore much larger than the SWD. Thus, the MIC is both conceptually and pragmatically distinct from the SWD. We must use the SWD, not the MIC, as a benchmark to evaluate the observed group differences in RCTs and to estimate the sample size of future trials.
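The OR, SMD and Hamilton scale figures in this paragraph can be reproduced approximately with the sketch below. It assumes the commonly used logistic conversion SMD = ln(OR) × √3/π; the authors’ own calculations are in online supplemental 7 and may differ in detail. Function names are hypothetical.

```python
import math


def odds_ratio(control_rate: float, treated_rate: float) -> float:
    """OR of response for treated versus control."""
    treated_odds = treated_rate / (1 - treated_rate)
    control_odds = control_rate / (1 - control_rate)
    return treated_odds / control_odds


def smd_from_or(or_value: float) -> float:
    """Logistic-distribution conversion (an assumption here): ln(OR) * sqrt(3) / pi."""
    return math.log(or_value) * math.sqrt(3) / math.pi


NO_TREATMENT = 0.30  # assumed no-treatment response rate from the text
HAMD_SD = 7  # approximate SD of the Hamilton scale, as stated in the text

for swd in (0.20, 0.25):
    or_value = odds_ratio(NO_TREATMENT, NO_TREATMENT + swd)
    smd = smd_from_or(or_value)
    print(
        f"SWD {swd:.0%}: OR {or_value:.1f}, SMD {smd:.2f}, "
        f"Hamilton difference {smd * HAMD_SD:.1f}"
    )
```

With these inputs the ORs come out at about 2.3 and 2.9 and the SMDs at about 0.47 and 0.58, corresponding to roughly 3 to 4 points on the Hamilton scale.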

Limitations

There were some noteworthy limitations. Foremost is our use of an RPCS sample, which may not represent real-world populations.19 While the SWD from a clinical sample is important and may potentially produce a different SWD, evaluation of a general population serves as a foundation for further research in the perception of people with lived experiences. Our study focused on people observed in an RPCS sample with moderate-to-severe depression symptoms and represents a group who may or may not access psychiatric services. Participants recruited through RPCSs tend to be younger, more educated and report greater psychopathology, when compared with the general population.20 However, RPCSs produce greater generalisability than convenience samples.20 We balanced our population by including two RPCSs based in two different countries and we also sampled from MQ Mental Health, a UK-based charity specifically aimed at improving mental health outcomes research. By sampling the RPCSs with performance restrictions, we were able to focus on those with more pressing clinical needs and evaluate other groups with potentially different needs.20 The fact that the two different sequences of questions in the BHTM yielded the same estimates attests to the attentiveness and trustworthiness of our internet collaborators.

Second, we excluded participants who replied that they would not take antidepressants even if antidepressant response rates were 100%, while the expected natural response rate was 30%. We followed established methods citing they would not be candidates for treatment in clinical practice.12 A sensitivity analysis including them with an improbable SWD, reflecting greater than 100% response, raised the estimate by 5 percentage points. Third, participants with moderate-to-severe symptoms but not in treatment were not necessarily diagnosed with major depression. We asked them to estimate the SWD for people with diagnosed depression as depicted in the clinical scenario. We assumed that while not diagnosed, they might provide personally invested SWD estimates because of their current depressive symptoms and potential treatment needs. Finally, systematic differences in depression treatment burdens must be considered. Policies differ between national healthcare systems, and between individuals within one country, based on insurance status and access to mental health services. The indirect costs (eg, loss in productivity) due to depression may also differ between individuals, and between societies. To evaluate these differences, we conducted exploratory analyses and found no appreciable association between the SWD and individual variables such as age, sex, race, nationality or insurance status.

Clinical implications

One-in-three of those who might consider antidepressant treatment and experience at least moderate depression symptoms would find the currently available antidepressants worthwhile, in exchange for the expected burdens of treatment including side effects, expenses and other inconveniences. Two-in-three require greater response rates or fewer burdens. There was great variability in individual SWD estimates but this variability could not be explained by demographic or clinical variables. Thus, while a minority may be satisfied with the best currently available antidepressants, more effective and/or less burdensome medications are needed. This is the first evaluation of the effectiveness that patients require for initiating antidepressant treatment. Greater value must be placed on the patient perspective for antidepressants in the treatment of depression. This information can help clinicians and researchers to understand patient expectations of therapies, evaluate the worthiness of antidepressants and establish evidence-based benchmarks for all medical research.
