Abstract
All research needs ethical regulation, which is institutionalized in research ethics committees. The patient information sheet, approved by a research ethics committee, sets out what patients need to know to make an informed choice about research participation. However, guidance from research ethics committees is much less explicit about risk communication. In this commentary, the balance of risk in the patient information sheets from protocols of 2 randomized controlled trials (RCTs) of medication reduction in psychosis was compared with numbers needed to treat and harm from the literature. The patient information sheets omitted the risks of excess death and of incomplete recovery following relapse, and overestimated the anticipated benefits. All of these risks were demonstrated in the published results of 1 of the 2 RCTs. Quantifying and tabulating risk might improve patient information sheets.
Introduction
All research must be conducted within a regulatory framework. Research ethics committees (RECs) are regulatory gatekeepers that seek to ensure that research is conducted according to agreed-upon, accepted ethical standards. The idea of voluntary RECs was considered in the United Kingdom as early as 1968. An institutionalized system was developed in the 1990s after a scandal about babies’ organs being stored without parental consent.1,2 Academic focus has remained on value management, power differentials, and governance, while current guidance pays close attention to information provision and consent.3–5 The original Nuremberg declaration was to prevent recurrence of the harmful research practised by the Nazis. Still, concerns regarding its effectiveness led to the subsequent Helsinki declaration, which prioritized informed consent and assessment by consensus, and provided the international basis for regulation across jurisdictions.6 In general, RECs are considered to perform well, if not perfectly, and they do prioritize protecting patients from harm.7 However, the explicit specification of consent but not harm has led to a substantial imbalance in procedural advice. For example, the current UK guidance mentions “consent” 159 times; “risk” 20 times, 15 instances of which referred to risk to participants; and “harm” 7 times.5 This imbalance suggests 2 potential risks given that procedure structures information to ensure transparency.
First, researchers will be much more expert in the risk–benefit balance of their projects than the REC, but will also be at risk of incorrect maximization through (unconscious) self-serving bias, given that “in scenarios characterized by uncertainty, human behaviour is affected by mental strategies which aim to protect or enhance individuals’ self-perceptions.”8 Their skill and knowledge differential may prevent RECs from being able to identify such bias when present.9 This may be exaggerated for psychiatric research, as this makes up only a small proportion of the research submitted to most RECs. For example, in the UK, mental health research is around 6% of total health research by monetary value.10
Second, if undetected, these biases could be included in the patient information sheet despite REC review. If a study is approved, this document determines the information given to patients when they decide to give consent. Inaccurate information about risk could lead patients to make decisions that are not in their best interests.
The current interest in trials of medication reduction for psychosis provides an excellent opportunity to investigate this issue. The motivation for such trials arises from patients frequently experiencing negative changes in their quality of life following medication that persist despite their psychosis having remitted, while the risk of future relapse is not certain.11 The previous literature in this area is unclear. In a study involving patients with recent-onset psychosis, Gitlin and colleagues12 found that, although nearly all patients relapsed within 2 years, only 13% required readmission to hospital. In a 10-year follow-up study involving patients with first-episode psychosis, 20%–30% were in remission without antipsychotic medication at the time of follow-up, with more than four-fifths of those having been in remission at the 5-year follow-up time point.13 However, the last century has seen an improvement in remission rates overall, consistent with public health benefit from antipsychotic medications.14 Both psychosis and the drugs used to manage it have associated risks of harm, so balancing these risks is essential to ethical trial design. Schizophrenia is associated with 15–20 years foreshortened life expectancy, and while antipsychotic medications are protective overall, this protection may not apply to first-generation antipsychotics.15 Quality of life is central to patients, but does not necessarily correlate with symptoms in psychosis that antipsychotics target.16 The combination of reluctance to take medications and the adverse effects of those medications may reduce quality of life.17 Even under uncertainty, the information provided to patients via the patient information sheet must still be as transparent as possible.
To ensure the procedures represented current practice, I identified 3 recent RCTs of medication reduction in psychosis, namely the Research into Antipsychotic Discontinuation and Reduction (RADAR) trial in the UK,18 the Handling Antipsychotic Medication Long-term Evaluation of Targeted Treatment (HAMLETT) trial in Holland,19 and the “reduce” trial in Australia.20 These trials would have been approved by their RECs under current guidance and would each have a current patient information sheet available. The trials are similar, although not identical, in their samples and intervention. The RADAR trial has now been published.21
Effective REC oversight would have 2 indicators. First, the studies’ patient information sheets would be mostly similar, with study design differences accounting for any divergence. Second, the patient information sheets would provide information about the balance of benefit and harm, reflecting available literature, and would enable patients to make an informed decision, unbiased by investigator preference.
Determining risk
I approached authors of all 3 studies for copies of their patient information sheets, and obtained those for RADAR and HAMLETT, but the authors of the “reduce” trial did not respond. The single comparison available still allowed consideration of the first indicator, as including 2 jurisdictions permitted disaggregation of investigator and regulatory effects.
The ordinary process of grant composition does not require a full systematic review, and REC submissions currently present a heavy administrative burden.4 I identified published studies detailing benefits and harms of medication reduction from the trials’ published reference lists, supplemented by citation and topic searches, with likely completeness reviewed by an expert in the field, following procedures currently used in grant and REC application processes. Studies published after the date of regulatory approval were included, as patient information sheets should be updated if the risk is substantially altered.
Where quantifications of benefit or harm were available, I converted them to a common metric, namely numbers needed to treat (NNT) or harm (NNH), which estimate the number of patients treated before 1 additional patient either benefits (NNT) or is harmed (NNH) by the intervention. Formally, they are defined as the reciprocal of the difference between the experimental event rate and the control event rate. They can be calculated from continuous, binomial, or survival data.22,23 A systematic review found that differences in natural frequencies were less likely to lead to overestimation than relative risks, possibly when combined with percentages.24,25 The accuracy of NNT and NNH increases with prevalence; selecting subpopulations with particular characteristics could increase the NNT or NNH and can be personalized by referring to symptoms that increase subpopulation prevalence, making them potentially applicable as individual patient estimates.26
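The definition above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used for the tabulated estimates: the function names and the example inputs are hypothetical, and the survival variant assumes proportional hazards, with the experimental arm's event-free proportion approximated as the control arm's event-free proportion raised to the power of the hazard ratio.

```python
import math


def number_needed(experimental_event_rate: float, control_event_rate: float) -> float:
    """Reciprocal of the absolute difference between two event rates.

    Whether the result is read as an NNT or an NNH depends on whether the
    event is desirable and on which arm has the higher rate.
    """
    difference = experimental_event_rate - control_event_rate
    if difference == 0:
        return math.inf  # no difference between arms: no finite number applies
    return 1.0 / abs(difference)


def number_needed_from_hazard_ratio(control_survival: float, hazard_ratio: float) -> float:
    """Survival-data variant (assumes proportional hazards).

    control_survival is the proportion of the control arm that is event-free
    at the time point of interest.
    """
    experimental_survival = control_survival ** hazard_ratio
    difference = experimental_survival - control_survival
    if difference == 0:
        return math.inf
    return 1.0 / abs(difference)


# Hypothetical inputs, chosen only to show the arithmetic.
print(round(number_needed(0.30, 0.20), 1))
print(round(number_needed_from_hazard_ratio(0.50, 0.50), 1))
```

Returning an unsigned number and leaving the NNT or NNH label to the caller keeps the sketch agnostic about whether the counted event is a benefit or a harm.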
I tabulated and compared the benefits and harms from each patient information sheet. If the harm or benefit was quantified, it was included in the tabulation. I also recorded the steps each study took to protect or mitigate the known risks of medication reduction, and whether patients were alerted to the possibility of the information updating. Where a risk was mentioned, but explicit quantification was not included, I rated risk severity (rare, unlikely, possible, likely, certain) with a colleague experienced in research in psychosis.
I compared the estimates of benefits and harms from the patient information sheet with the NNTs and NNHs calculated from the literature. These were necessarily average NNTs as individual-level adjustments were not possible.26 As trials proceed on the basis of equipoise, a risk or benefit anchor point of 50% was set at the boundary of possible and likely risk or benefit. Benefits, where mentioned but not quantified, were considered at least likely, as it would seem unfair to enrol patients when benefits were less than chance assignation between case and control groups might provide; equipoise was therefore treated as a floor for benefit. Medication reduction was treated as a benefit because patients were willing to risk medication reduction by enrolling.
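The anchoring rule can be expressed as a simple threshold check. The sketch below is illustrative only: the function name is hypothetical, only the single 50% boundary described above is encoded (the boundaries between the other severity categories are not specified), and treating a value of exactly 50% as falling on the "likely" side is an assumption.

```python
EQUIPOISE_ANCHOR = 0.50  # boundary between "possible" and "likely"


def rate_against_anchor(probability: float) -> str:
    """Place a quantified risk or benefit on one side of the equipoise anchor."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    # Assumption: a value exactly at the anchor is counted as "likely or higher".
    if probability >= EQUIPOISE_ANCHOR:
        return "likely or higher"
    return "possible or lower"


# A quantified difference of 37% falls below the anchor.
print(rate_against_anchor(0.37))
```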
I compared predictions of risk from the literature with the adverse effects reported in RADAR to assess their predictive validity.
Analysis of patient information sheets
Table 1 shows that HAMLETT and RADAR were similar in terms of anticipated benefits. In HAMLETT, the anticipated difference in risk of relapse between the 2 arms of the trial was 37%, below the 50% threshold for being a likely outcome. Therefore, the risk of an increase in relapse was rated as possible rather than likely, agreeing with RADAR. However, the quantification of the risk of relapse in HAMLETT appeared to be associated with a more detailed protection plan for early detection than RADAR. The HAMLETT patient information sheet did not mention mitigation separately from protection, but no constraints were placed on the practitioner if alerted. Even so, the 2 studies were similar, with HAMLETT documenting its mitigation strategy implicitly. Both studies’ patient information sheets can thus be summarized as presenting likely benefits balanced against possible harms. However, in quantifying the risk of relapse, the HAMLETT patient information sheet was more cautious than RADAR, reflected in its higher level of protection. The RADAR protocol suggested an increase of up to 10% in risk of relapse (defined as hospitalization), which was considered an acceptable cost for the predicted benefits; this was slightly less than the 11% difference in hospitalization found in their reference.18 This degree of risk was less than that accepted by HAMLETT, which justified their judgment by referring to uncertainty in the research.19 Therefore, RADAR anticipated lower risk of relapse than HAMLETT.
Risk–benefit quantification
Table 2 shows the risks and benefits described in the literature. Neither study elected to include additional psychological treatments to substitute for medication withdrawal. A national cohort study comparing depot and oral medications (which have a higher risk of discontinuation) over 2 years found that the adjusted hazard ratio for discontinuation of depot versus oral medication was 0.41 (95% confidence interval [CI] 0.27–0.61), while the adjusted hazard ratio for hospitalization was 0.36 (95% CI 0.17–0.75).33 The latter is very similar to the hazard ratio for relapse (0.44) reported in a study of maintenance versus guided withdrawal,27 and their surveillance window was 7 years, enabling a 2-year average surveillance time. Therefore, studies reporting discontinuation rates and consequences under ordinary psychiatric care are likely to be good estimators of risk and benefit for RADAR and HAMLETT.
Benefits
The potential benefit of medication reduction could be estimated from 2 publications of the index trial of gradual discontinuation.27,28 After 2 years of the RCT, the chance of successful discontinuation was 20%. In the 7-year follow-up period,28 the major and unexpected benefit was associated with group assignation rather than any measured treatment aspect, other than the medication reduction itself. However, the follow-up study was not a continuation of the previous RCT, but rather an unblinded case–control follow-up study. Members of each group experienced the other’s treatment regimen, and the follow-up raters were not blind to group membership. Inspection of the Kaplan–Meier curves showed a sudden loss of proportional hazard around the time of study termination, with the curve flattening sharply in the medication reduction group only. This result implies a time by group interaction and is consistent with some form of rescue action occurring in the medication reduction group. More intensive engagement could improve this outcome and recollection of the rescue therapies would be hard after 5 years.35 Therefore, the follow-up finding does not confirm a benefit from medication reduction, but rather from membership of the group in that context. The original goal of the RCT was to maintain recovery after discontinuation of medication. Lack of blinding meant the most unbiased measure at follow-up would be the proportion of participants not taking medication at follow-up. Participants’ decisions to cease medication had been taken earlier; presumably, they would have restarted medication after a deterioration. After 7 years, 11 patients from the original dose-reduction group (16.2%) and 6 from the maintenance group (9.5%) were medication-free.
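As a purely arithmetic illustration, the 7-year medication-free proportions quoted above can be turned into a crude number needed to treat. The variable names are hypothetical, the calculation uses only the 2 reported percentages, and it makes no adjustment for the unblinded follow-up design, so the output should be read as an illustration of the method rather than as an estimate reported in the literature.

```python
# Proportions medication-free at 7 years, as quoted in the text.
dose_reduction_rate = 0.162  # 11 patients in the original dose-reduction group
maintenance_rate = 0.095     # 6 patients in the maintenance group

# Absolute difference between arms and its reciprocal (crude, unadjusted).
absolute_difference = dose_reduction_rate - maintenance_rate
crude_nnt = 1.0 / absolute_difference

print(f"absolute difference: {absolute_difference:.3f}")
print(f"crude NNT for being medication-free: {crude_nnt:.1f}")
```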
Both HAMLETT and RADAR used social functioning measures as their primary outcomes. For comparison, no significant difference in this measure was apparent in the index medication reduction RCT, but one was observed in the 7-year follow-up.27,28 The HAMLETT trial identified 2 time points, at 6 months and 3.5 years; they implied using the 3.5-year time point for power calculations. The RADAR trial collected data for 2 years. The difference in social functioning after 2 years could therefore be used to estimate the relevant NNT for social improvement in RADAR and HAMLETT.27 Randomized controlled treatment trials have a better methodology than other study designs, and the time point was comparable with both studies. Seven-year28 and 10-year29 differences are reported for completeness (Table 1). The categories of improvement at 7 years overlap, so numbers without improvement were derived to estimate each arm’s general improvement.
Antipsychotic medication affords a substantial long-term risk to health from weight gain, which medication reduction might be expected to reduce. Pathologically significant weight increase is 7% or more, and Crins and colleagues30 provided a meta-analysis from which the NNT could be estimated, averaging across antipsychotic medications.
Risks
Risk must be detected within the study’s duration for the study’s risk mitigation strategy to be effective. Mean time to relapse was 19.3 (standard deviation 9.7) weeks,34 so 97.5% of relapses would be captured in 38.7 weeks. Thus, the RADAR trial needed to complete all its reductions by the first quarter of its second year and HAMLETT needed to complete these by the third quarter of its second year, although neither protocol specified this. Overall, 97.5% of reductions would be complete within 75.6 weeks.27 Thus, HAMLETT (182 wk), but not RADAR (104 wk), investigators could be confident that all of their relapses would be detectable within the study period. The NNH for relapse at 3.5 years was estimated from the Kaplan–Meier plot in the follow-up study,28 using the hazard ratio from the original RCT,27 as inspection of the plot suggested that the maintenance arm had not experienced a group by time effect. A recent meta-analysis also provided information on untimed relapse risk; it was included for comparison with the other studies.31
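The timing argument can be reproduced with a short calculation. This sketch assumes that the 38.7-week figure is the mean plus 2 standard deviations of an approximately normal time to relapse, which is inferred from the arithmetic rather than stated; the function and variable names are hypothetical.

```python
def upper_bound(mean_weeks: float, sd_weeks: float, multiplier: float = 2.0) -> float:
    """Approximate upper 97.5% bound, assuming a roughly normal distribution."""
    return mean_weeks + multiplier * sd_weeks


# Figures quoted in the text.
relapse_window = upper_bound(19.3, 9.7)  # weeks in which 97.5% of relapses occur
reduction_window = 75.6                  # weeks in which 97.5% of reductions are complete
required_duration = reduction_window + relapse_window

trial_durations = {"RADAR": 104, "HAMLETT": 182}  # weeks

# A trial covers the risk period only if it runs at least as long as the
# time to finish reductions plus the subsequent relapse window.
coverage = {
    trial: duration >= required_duration
    for trial, duration in trial_durations.items()
}

print(f"relapse window: {relapse_window:.1f} weeks")
print(f"required duration: {required_duration:.1f} weeks")
print(coverage)
```

Under these assumptions the required duration exceeds RADAR's 104 weeks but not HAMLETT's 182 weeks, matching the conclusion drawn above.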
Comparison with RADAR
The recent publication of RADAR21 allowed comparison of those benefits and risks predicted from the previous literature that RADAR reported in its findings.
Weight was only available in the appendix table, disaggregated by lockdown status, so RADAR did not report the total weight difference at 24 months. The other benefits and risks reported in the literature were outside RADAR’s duration, so were not compared.
Discussion
The quantification of benefits and risks showed important differences between the literature and the studies’ patient information sheets. Both studies were large enough to risk excess death in their treatment arms, but this was not mentioned in either patient information sheet. Significance values offer false reassurance as the studies were underpowered to detect this difference. Such deaths would be hard to recognize as a study consequence as the risk is for all-cause mortality. Substituting psychological treatment to mitigate this risk may not be sufficient (e.g., for suicide).36,37 The excess in deaths identified in a 10-year follow-up of another discontinuation study (2 deaths per 89 patients during maintenance v. 4 deaths per 89 patients during discontinuation) was consistent with this prediction.29 The proposed improvement in quality of life, measured by social functioning, was not detectable within either study’s time frame. Instead, social functioning was associated with an NNH, although this is not significantly different from equipoise. However, it suggests that the benefits offered are possible, rather than likely. The longer-term benefit in social functioning also seemed questionable as it was based on possibly biased valuations in the context of no overall difference in outcome at 7 years, and a possibly biased nonsignificant gain with worse symptomatic outcomes at 10 years. The risk of relapse in the literature was similar to that reported in HAMLETT’s patient information sheet, although it was higher than the risk anticipated by RADAR. However, RADAR’s outcomes resembled those predicted in HAMLETT and the literature. Neither patient information sheet mentioned the risk that, if relapse occurred, subsequent recovery could be less successful than previously, even with optimal treatment.
Although my quantification used findings published after the patient information sheets were developed, the risk had been previously identified as possible, with empirical support.29,38 Knowledge and acceptance of risk is usually associated with efforts to mitigate it. Factors such as intrafamilial expressed emotion are modifiable risk factors that could have been actively reduced in the absence of medication, but neither study mentioned such approaches.39,40 Neither study considered mitigating the loss of the potential neuroprotective effect of antipsychotics, nor advised of this risk.41
Although broadly comparable, the patient information sheets provided to patients did not adequately reflect available evidence about balance of risk, irrespective of research group or jurisdiction. This suggests a regulatory issue, not one of investigator integrity or individual jurisdictional failings. Although one cannot exclude bad actors, investigators undertake trials because they believe they will improve the lives of patients. Self-serving bias here may result from the investigators’ enthusiasm, backed by years of commitment, scholarship, experience, and effort. Unfortunately, the same characteristics that support the leadership skills necessary for research success increase the risk of such biases.8 Self-serving bias is often unconscious, which is why trials need to be blinded and why systematic reviews are trusted. The same weaknesses were found in 2 nonoverlapping, geographically separate teams who submitted to 2 different RECs in different cultures, so they are unlikely to reflect local issues or bad faith; rather, international standards are poorly optimized to regulate provision of risk information to patients. Tabulating benefits and risks by a common metric seems a simple, effective way of correcting self-serving bias that does not place a great additional burden on applicants and could be readily incorporated into guidance.
Footnotes
Competing interests: David Foreman reports travel support from the Royal College of Psychiatrists and National Institute for Health and Care Excellence (NICE). He sits on the technology appraisal committee with NICE. No other competing interests were declared.
- Received October 2, 2023.
- Revision received October 22, 2023.
- Revision received November 10, 2023.
- Accepted November 13, 2023.
This is an Open Access article distributed in accordance with the terms of the Creative Commons Attribution (CC BY-NC-ND 4.0) licence, which permits use, distribution and reproduction in any medium, provided that the original publication is properly cited, the use is noncommercial (i.e., research or educational use), and no modifications or adaptations are made. See: https://creativecommons.org/licenses/by-nc-nd/4.0/