The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has infected more than 226 million people since its emergence, and more than 5 million deaths are currently directly attributed to coronavirus disease 2019 (COVID-19).1 The disease mainly affects the lungs, and in severe cases causes hypoxaemic respiratory failure. Despite the introduction of dexamethasone and interleukin-6 receptor antagonists for severe COVID-19, treatment options remain largely supportive with the administration of supplemental oxygen being essential for these patients.2-5
Titration of the oxygen therapy for critically ill adults remains a much contested issue and determining a range of safe and possibly superior oxygenation levels for patients with severe COVID-19 has the potential to improve treatment and to ensure a more effective use of available oxygen supplies.6-13
For this reason, the recently published Handling Oxygenation Targets in the ICU (HOT-ICU) trial13 was amended with the Handling Oxygenation Targets in COVID-19 (HOT-COVID) trial, a randomised clinical trial investigating the benefits and harms of oxygen titration to a targeted partial pressure of arterial oxygen (PaO2) of 8 kPa (lower oxygenation target) versus 12 kPa (higher oxygenation target) in acutely admitted ICU patients with COVID-19 and hypoxaemia.14
The primary outcome of the HOT-COVID trial is the absolute number of days alive without life-support at 90 days and will in the primary analysis be assessed through a conventional frequentist statistical approach with results dichotomised as statistically significant or not according to a predefined p-value cut-off at the conventional 5% level.14 Importantly, p-values are indirect probabilities calculated under the assumption that the null hypothesis of no difference is true, and thus do not directly assess the probabilities of the underlying true effect of the intervention and the probabilities of effect sizes different to the null-hypothesis.
To complement the primary analysis, the present protocol outlines the methods and principles for a pre-planned secondary Bayesian analysis of the primary outcome with the purpose of assessing the probability of the between-group difference in the absolute number of days alive without life-support, and of the secondary outcome of 90-day mortality, including probabilities of pre-specified effect sizes. Additionally, as heterogeneous treatment effects (HTE) may exist, we aim to explore the extent of such in Bayesian HTE analyses based on four pre-specified baseline variables being the sequential organ failure assessment score (SOFA score), the PaO2/FiO2 ratio at randomisation, the highest dose of norepinephrine during the 24 h before randomisation, and the plasma concentration of lactate at randomisation.15-17
2 METHODS
2.1 Trial design and conduct
The HOT-COVID trial is an amendment to the HOT-ICU trial.13, 18 It is an investigator-initiated, international, randomised, parallel-group trial with stratification for site of inclusion only. The trial investigates the benefits and harms of a lower versus a higher oxygenation target in adult ICU patients with COVID-19 and hypoxaemic respiratory failure.
Trial participants are randomised 1:1 to the lower oxygenation target (intervention) or the higher oxygenation target (control) for their entire ICU-stay including any re-admissions for up to 90 days. The primary outcome is the number of days alive without life-support (defined as mechanical ventilation, renal replacement therapy, or infusion of vasoactive or inotropic agents) within 90 days from randomisation. The actual numbers are used, and death is not penalised by the score 0. Additionally, we will assess the secondary outcome all-cause mortality at 90 days, as is required for interpretation of the primary outcome.
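To make the scoring of the primary outcome concrete, the following is a minimal sketch of how days alive without life-support could be counted for a single patient. It assumes that the score equals the days alive within the 90-day window minus the days on life-support; the exact counting rules are given in the primary protocol, and the function and argument names are hypothetical.

```python
FOLLOW_UP_DAYS = 90  # follow-up window stated in the protocol


def days_alive_without_life_support(days_on_life_support, day_of_death=None):
    """Count days alive without life-support within the follow-up window.

    Assumption: score = days alive within 90 days - days on life-support.
    `day_of_death` is the number of days survived after randomisation,
    or None if the patient is alive at day 90.
    """
    if day_of_death is None:
        days_alive = FOLLOW_UP_DAYS
    else:
        days_alive = min(day_of_death, FOLLOW_UP_DAYS)
    if not 0 <= days_on_life_support <= days_alive:
        raise ValueError("days on life-support must lie between 0 and days alive")
    # The actual number is kept: a patient who dies is not assigned 0.
    return days_alive - days_on_life_support


survivor = days_alive_without_life_support(days_on_life_support=12)
non_survivor = days_alive_without_life_support(days_on_life_support=4, day_of_death=10)
```

Under this reading, a patient who dies on day 10 after 4 days on life-support contributes 6 days rather than 0, which is why 90-day mortality is reported alongside the primary outcome.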
Further details on trial design, inclusion and exclusion criteria, outcomes, registered variables and methodology are presented in the published primary protocol,14 the supplement and on the trial website (www.cric.nu/hot-covid).
Both the present protocol and the planned analysis will adhere to the Strengthening the Reporting of Observational studies in Epidemiology (STROBE) statement19 and the Reporting of Bayes Used in clinical Studies (ROBUST) guideline.20
All results will be published regardless of findings.
The present protocol for a pre-planned secondary Bayesian analysis has been finalised and submitted prior to randomisation of the last patient in the HOT-COVID trial.
2.2 Approvals, registration and ethics
The HOT-COVID trial is approved as an amendment to the HOT-ICU trial by the Danish Health and Medicines Agency (AAUH-ICU-01), the Committee on Health Research Ethics in the North Denmark Region (N-20170015), the Danish Data Protection Agency (2017-055), and by all required authorities in participating countries. It is prospectively registered at ClinicalTrials.gov (NCT04425031) and in the European clinical trials database (EudraCT 2017-000632-34). The trial adheres to the Helsinki Declaration21 and is externally monitored according to the Good Clinical Practice (GCP) directive. Patients are enrolled after consent has been obtained according to national regulations.
2.3 Sample size and trial status
As of 7 December 2021, 550 of 780 planned patients have been randomised; the sample size justifications are presented in the primary protocol.14
2.4 Statistical analysis
All analyses described in this secondary protocol will be conducted in the intention-to-treat population of the HOT-COVID trial, being all randomised patients with available data on days alive without life-support at 90 days or all-cause mortality at 90 days. All analyses will be adjusted for the stratification variable (trial site) as fixed effects.
Bayesian analyses will be conducted using R (R Core Team, R Foundation for Statistical Computing, Vienna, Austria) and Stan22 through the R-package brms.23
2.5 Outcomes
We will analyse two outcomes, the number of days alive without life-support at 90 days (count outcome) and all-cause mortality at 90 days (binary outcome), and will present mean differences (MDs) and ratios of means (RoMs) for the former, and relative risks (RRs), risk differences (RDs) and odds ratios (ORs) for the latter.
2.6 Priors
For the primary analyses, we will use weakly informative priors, centred on no difference (i.e. MD or RD = 0, RoM, RR or OR = 1) encompassing all plausible effect sizes. These priors will have minimal influence on the results and will allow the trial data to dominate the posterior distribution.
Furthermore, sceptical priors will be applied consistent with expectations based on the small difference generally found between intervention groups in ICU trials and in ICU trials comparing different oxygenation targets.24, 25
Evidence-based priors will be incorporated if new high-quality evidence regarding the subject is published during the trial period. Presently, only limited direct evidence is available on how to titrate supplemental oxygen in ICU patients with COVID-19.8, 26 If more evidence becomes available, we will incorporate it into the priors.
Complete prior definitions are presented in the supplement.
2.7 Posteriors
Posterior probability distributions will be presented visually and summarised using median values of each parameter (either MD, RoM, RR, OR or RD depending on the outcome) with percentile-based 95% credibility intervals (CrIs). Such intervals may be directly interpreted as the 95% most likely values given the model, priors and data. Additionally, the cumulative posterior distributions will be presented visually.
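As an illustration of this summary step, the sketch below computes a posterior median and a percentile-based 95% CrI from a vector of posterior draws. It is written in Python for self-containment, whereas the trial analyses use R and brms; the simulated draws and the function name are purely hypothetical and do not represent trial data.

```python
import random
import statistics


def summarise_draws(draws, level=0.95):
    """Return the median and percentile-based credible interval of draws."""
    ordered = sorted(draws)
    n = len(ordered)

    def percentile(p):
        # Linear interpolation between neighbouring order statistics.
        position = p * (n - 1)
        lower_index = int(position)
        upper_index = min(lower_index + 1, n - 1)
        fraction = position - lower_index
        return ordered[lower_index] * (1 - fraction) + ordered[upper_index] * fraction

    tail = (1 - level) / 2
    return {
        "median": statistics.median(ordered),
        "lower": percentile(tail),
        "upper": percentile(1 - tail),
    }


# Hypothetical posterior draws for a mean difference (in days).
rng = random.Random(2021)
illustrative_md_draws = [rng.gauss(2.0, 3.0) for _ in range(10_000)]
summary = summarise_draws(illustrative_md_draws)
```

The interval bounds are simply the 2.5th and 97.5th percentiles of the draws, which is what allows the direct probabilistic interpretation described above.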
2.8 Bayesian analysis of the primary outcome
Based on the data distribution observed at the interim analysis conducted after randomisation of the first 390 patients, we expect many values of both 0 and 90 days alive without life support, and thus a non-normal data distribution. This limits the use of standard count models such as Poisson or negative binomial regression models.
We opted to not use ordinal regression models as the assumption of proportional odds may not be met, and as results are not presented on a scale that is easily translated to clinically understandable effect sizes. Three-part models (i.e. zero-one inflated beta models) were considered but rejected due to the moderate sample size which combined with the complexity of such a model would make it difficult to define meaningful priors.
Consequently, we will use a Bayesian linear regression model adjusted for the stratification variable ‘site’ to describe the primary outcome of the HOT-COVID trial. Despite the expected non-normal distribution of the data, the model represents a robust approach to estimate a difference in means between the two groups.
We will present the MD for the comparison of absolute number of days and the RoM for a relative comparison. Furthermore, full posterior probability distributions including cumulated distributions will be visually presented to display the entire range of plausible effect sizes; an example is presented in Figure 1.
Figure 1. A mock figure showing the visualisation of posteriors. Here the MD of ‘days alive without life-support at 90 days’ is used as an example. The upper subplot shows the cumulative probability for a MD below (left y-axis) or above (right y-axis) the MD chosen on the x-axis. The lower subplot shows the entire posterior probability distribution for the MD of ‘days alive without life-support at 90 days’. A highlighted blue area corresponding to the pre-defined ROPE is superimposed on the plots. The highlighted red area under the curve corresponds to the 95% credibility interval. MD, mean difference; ROPE, region of practical equivalence, corresponding to the range within the pre-defined minimal clinically relevant difference (MIREDIF), in this case ±1 day.
A MD in days alive without life-support of more than 1 day in either direction is defined as the minimal clinically relevant difference (MIREDIF), since a difference below this probably has little importance for patient health and ICU admission. Correspondingly, the region of practical equivalence (ROPE) is a MD below 1 day in either direction. Exact probabilities will be presented for any benefit (a MD above 0 days), any harm (a MD below 0 days), a clinically relevant benefit (a MD above 1 day), and a clinically relevant harm (a MD below −1 day).
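The probabilities listed above can be read directly off the posterior draws as proportions. The following is a minimal sketch under the assumption that clinically relevant harm corresponds to a MD below −1 day and that a MD of exactly ±1 day is counted as within the ROPE (immaterial for a continuous posterior); the function name and input values are hypothetical.

```python
MIREDIF_DAYS = 1.0  # minimal clinically relevant difference from the protocol


def effect_probabilities(md_draws, miredif=MIREDIF_DAYS):
    """Proportion of posterior draws falling in each pre-specified region."""
    n = len(md_draws)

    def share(condition):
        return sum(1 for draw in md_draws if condition(draw)) / n

    return {
        "any_benefit": share(lambda d: d > 0),
        "any_harm": share(lambda d: d < 0),
        "relevant_benefit": share(lambda d: d > miredif),
        "relevant_harm": share(lambda d: d < -miredif),
        # Assumption: the boundary values belong to the ROPE.
        "rope": share(lambda d: -miredif <= d <= miredif),
    }


# Four hypothetical draws, chosen only to make the arithmetic easy to follow.
example = effect_probabilities([-2.0, -0.5, 0.5, 2.0])
```

With real posterior draws the same proportions give, for example, the probability that the lower oxygenation target yields a clinically relevant benefit.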
2.9 Bayesian analysis of all-cause mortality at 90 days
The binary, secondary outcome of 90-day mortality will be analysed using a Bayesian logistic regression model. Results will be presented as ORs, RRs and RDs calculated using the predicted probabilities for reference patients with the adjustment (and stratification) variable ‘site’ set to represent the most frequent category. Furthermore, full posterior probability distributions including cumulative distributions will be visually presented to display the entire range of plausible effect sizes (Figure 1). A RD in mortality of 2 percentage points or more in either direction is defined as the MIREDIF. Correspondingly, the ROPE is defined as a RD of less than 2 percentage points in either direction. Exact probabilities will be presented for any benefit (a RD below 0 percentage points), any harm (a RD above 0 percentage points), a clinically relevant benefit (a RD below −2 percentage points), and a clinically relevant harm (a RD above 2 percentage points).
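The conversion from logistic regression coefficients to RDs, RRs and ORs for a reference patient can be sketched as follows. The sketch assumes that, for a given posterior draw, the intercept already represents the log-odds of death for a control patient at the most frequent site; the function names and numerical inputs are hypothetical.

```python
import math


def inv_logit(log_odds):
    """Convert log-odds to a probability."""
    return 1 / (1 + math.exp(-log_odds))


def risk_measures(p_intervention, p_control):
    """Risk difference, relative risk and odds ratio from two probabilities."""
    risk_difference = p_intervention - p_control
    relative_risk = p_intervention / p_control
    odds_intervention = p_intervention / (1 - p_intervention)
    odds_control = p_control / (1 - p_control)
    return risk_difference, relative_risk, odds_intervention / odds_control


def measures_for_draw(intercept, beta_treatment):
    """Apply the conversion to one posterior draw of the coefficients.

    Assumption: `intercept` is the log-odds for the reference patient in the
    control group, and `beta_treatment` the log-odds ratio for allocation.
    """
    p_control = inv_logit(intercept)
    p_intervention = inv_logit(intercept + beta_treatment)
    return risk_measures(p_intervention, p_control)


# Hypothetical single draw; repeating this over all draws yields the
# posterior distributions of the RD, RR and OR.
rd, rr, odds_ratio = measures_for_draw(intercept=-0.4, beta_treatment=-0.3)
```

Because the RD and RR depend on the baseline risk, fixing ‘site’ at its most frequent category gives them a well-defined reference, whereas the OR equals the exponentiated treatment coefficient regardless of site.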
2.10 HTE analyses
We will assess the presence of HTE on the continuous scale according to four baseline characteristics: (1) baseline sequential organ failure assessment score (SOFA score) as a measure of severity of illness; (2) baseline PaO2 divided by the FiO2, constituting the PaO2/FiO2 ratio; (3) highest dose of norepinephrine (µg/kg/min) in the 24 h before randomisation; and (4) plasma lactate concentration (mmol/L) at randomisation.
The relation between each of the four baseline parameters above and the intervention allocation will be analysed for both outcomes using a Bayesian linear regression model for the primary count outcome and Bayesian logistic regression models for the secondary binary outcome. All models will be stratified for site, while the models containing the PaO2/FiO2 ratio will also be adjusted for type of oxygenation system at baseline (open [nasal cannula or high-flow nasal cannula] or closed [invasive mechanical ventilation, non-invasive mechanical ventilation (NIV) or continuous positive airway pressure (CPAP)]), with the most common group being considered the reference.
Results will be presented using conditional effects plots after truncation of values below 0 days and above 90 days.
3 MISSING DATA
The completeness of the data will be presented alongside the results. If less than 5% of patients have missing data for variables included in a given analysis, we will perform a complete case analysis without imputation. If more than 5% of the data is missing for at least one variable used in an analysis, multiple imputation with chained equations will be performed as specified in the HOT-ICU protocol.27
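The decision rule for handling missing data can be sketched as a simple check. The two branches above are phrased slightly differently, so this sketch makes the simplifying assumption that the rule is applied to the proportion of patients with at least one missing value among the variables of a given analysis, and that a proportion of exactly 5% leads to imputation; the function and variable names are hypothetical.

```python
MISSING_THRESHOLD = 0.05  # 5% threshold from the protocol


def choose_missing_data_strategy(records, variables):
    """Pick the analysis strategy for one model.

    `records` is a list of dicts (one per patient); None marks a missing value.
    Assumption: the share is computed per patient across `variables`.
    """
    incomplete = sum(
        1 for record in records
        if any(record.get(variable) is None for variable in variables)
    )
    share_missing = incomplete / len(records)
    if share_missing < MISSING_THRESHOLD:
        return "complete_case", share_missing
    return "multiple_imputation", share_missing


# Hypothetical data: 100 patients, 3 of whom lack a lactate measurement.
patients = [{"sofa": 8, "lactate": 1.5} for _ in range(97)]
patients += [{"sofa": 8, "lactate": None} for _ in range(3)]
strategy, share = choose_missing_data_strategy(patients, ["sofa", "lactate"])
```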
4 MODEL DIAGNOSTICS
We will use a similar approach as previously described.27-29 Four chains (using Stan’s default dynamic Hamiltonian Monte Carlo sampler) with at least 10,000 post-warmup draws in total, a bulk and tail effective sample size (ESS) of at least 5000 for the parameters of interest, and no divergent transitions are required. Chain convergence is visually assessed using overlay density and trace plots of the draws.30 Furthermore, Rhat-statistics ≤ 1.01 are required to accept the model of the outcome in question.31 Posterior predictive checks (of predicted means/probabilities) combined with Pareto-smoothed importance sampling leave-one-out cross-validation (focused on the effective number of parameters) are used to evaluate the fit of each model.32
In case multiple imputation is used, each model will be fitted and assessed separately in each imputed dataset before posteriors are pooled. In this case the requirements for the number of post-warmup draws and the ESS apply to the pooled draws.
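To illustrate what the Rhat criterion measures, the sketch below implements the classic split-Rhat, which compares between-chain and within-chain variance. This is a simplified stand-in chosen for transparency: the statistic reported by the analysis software may be a refined (for example rank-normalised) version, and the simulated chains and function name are hypothetical.

```python
import math
import random
import statistics

RHAT_LIMIT = 1.01  # acceptance threshold from the protocol


def split_rhat(chains):
    """Classic split-Rhat for a list of chains (lists of draws)."""
    half = min(len(chain) for chain in chains) // 2
    # Split every chain in two so that within-chain drift is also detected.
    splits = []
    for chain in chains:
        splits.append(chain[:half])
        splits.append(chain[half:2 * half])
    means = [statistics.fmean(split) for split in splits]
    within = statistics.fmean([statistics.variance(split) for split in splits])
    between = half * statistics.variance(means)
    pooled_variance = (half - 1) / half * within + between / half
    return math.sqrt(pooled_variance / within)


rng = random.Random(7)
# Four hypothetical chains sampling the same distribution.
mixed_chains = [[rng.gauss(0.0, 1.0) for _ in range(2500)] for _ in range(4)]
# The same setup with one chain stuck in a different region.
stuck_chains = [chain[:] for chain in mixed_chains[:3]]
stuck_chains.append([rng.gauss(3.0, 1.0) for _ in range(2500)])

rhat_mixed = split_rhat(mixed_chains)
rhat_stuck = split_rhat(stuck_chains)
```

Values close to 1 indicate that the chains agree with each other; a chain exploring a different region inflates the between-chain variance and pushes the statistic well above the 1.01 limit.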
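The pooling step can be sketched as follows, under the assumption that posteriors are pooled by concatenating the draws from the models fitted in each imputed dataset, with an equal number of draws per dataset. Only the draw-count requirement is checked here, as an ESS calculation is beyond the scope of the sketch; all names and inputs are hypothetical.

```python
MIN_TOTAL_DRAWS = 10_000  # minimum number of post-warmup draws from the protocol


def pool_posterior_draws(draws_per_imputation):
    """Concatenate draws across imputed datasets and check the pooled count.

    Assumption: each inner list holds the post-warmup draws of one parameter
    from the model fitted to one imputed dataset.
    """
    pooled = [draw for draws in draws_per_imputation for draw in draws]
    return pooled, len(pooled) >= MIN_TOTAL_DRAWS


# Hypothetical example: five imputed datasets with 2000 draws each.
per_dataset = [[0.1 * index] * 2000 for index in range(5)]
pooled_draws, enough_draws = pool_posterior_draws(per_dataset)
```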
5 DISCUSSION
The aim of this paper was to present the protocol and statistical analysis plan for a secondary, pre-planned Bayesian analysis of the HOT-COVID trial.
5.1 The relevance of a Bayesian approach
The primary analysis of the HOT-COVID trial will use a conventional frequentist statistical approach to determine a potential significant difference in the absolute number of days alive without life-support within 90 days of randomisation between the two oxygen target groups. This approach produces a p-value which does not describe the probability of a treatment effect, but only the percentage of times the observed or a more extreme outcome will occur if the exact same experiment is repeated indefinitely assuming that the null hypothesis is true. Furthermore, the p-value offers no information on the effect size of the intervention, but only its statistical significance as defined by the arbitrarily set alpha level. Consequently, the research question ‘is a lower or higher oxygenation target more beneficial for ICU patients with COVID-19?’ is handled indirectly and lack of statistical significance will not necessarily exclude clinical relevance.33
Bayesian statistics offers an alternative. Fundamentally, it quantifies how probability distributions for a given outcome are changed with the introduction of new knowledge. Here the probability of the observed outcome is directly assessed by updating a pre-set notion of probability for the outcome with new data from an experiment. This results in an updated probability distribution for the outcome reflecting how certain we presently are of the outcome occurring. Using technical terms, a prior probability distribution is updated with data to produce a revised posterior probability distribution.34, 35
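The updating of a prior with data can be illustrated with the simplest conjugate case, a normal prior combined with a normally distributed estimate. This is a didactic sketch only: the planned analyses fit full regression models by Markov chain Monte Carlo, and the prior standard deviations and the observed estimate below are hypothetical values chosen for illustration.

```python
def normal_update(prior_mean, prior_sd, estimate, standard_error):
    """Posterior mean and SD for a normal prior and a normal likelihood."""
    prior_precision = 1 / prior_sd ** 2
    data_precision = 1 / standard_error ** 2
    posterior_precision = prior_precision + data_precision
    # The posterior mean is a precision-weighted average of prior and data.
    posterior_mean = (
        prior_mean * prior_precision + estimate * data_precision
    ) / posterior_precision
    posterior_sd = posterior_precision ** -0.5
    return posterior_mean, posterior_sd


# Hypothetical observed mean difference of 2 days with a standard error of 1.5.
weak_mean, weak_sd = normal_update(0.0, 100.0, estimate=2.0, standard_error=1.5)
sceptical_mean, sceptical_sd = normal_update(0.0, 1.0, estimate=2.0, standard_error=1.5)
```

With the wide prior the posterior is almost identical to the data estimate, whereas the narrow prior centred on no difference pulls the posterior mean towards zero. This is the sense in which a weakly informative prior lets the data dominate and a sceptical prior demands stronger evidence.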
By applying a Bayesian approach, the interpretation of the trial results and the clinical relevance ceases to rely on a self-contained amount of data producing a single number (i.e. the p-value). Instead, results are presented as probability distributions directly informed by both relevant prior beliefs and data obtained from the trial. It should be noted that the prior beliefs in the present study are minimally informative due to the lack of information. Furthermore, by presenting the trial results as probability distributions we hope to openly compel the reader to individually interpret and judge the clinical relevance of the trial results.
The use of Bayesian statistics is presently gaining support in medical science with the American College of Cardiology and the American Heart Association officially supporting its use in creating clinical guidelines, and with an increasing number of randomised clinical trials using this approach.27-29, 35-39
5.2 Choice of outcomes analysed
The absolute number of days alive without life-support at 90 days is the primary outcome of the HOT-COVID trial. The outcome combines an assessment of both mortality and disease severity to provide a numerical composite measure of how different levels of targeted supplemental oxygen impact the disease course of ICU patients with COVID-19. By applying an outcome based on two parameters, we gain the possibility to detect relevant clinical effects besides death. This is especially important in the present trial, since large-scale differences in mortality between a higher versus a lower oxygenation target for ICU patients are less likely based on existing evidence.11-13 Furthermore, moving from a binary outcome such as mortality to a discrete numerical outcome such as days alive without life-support provides a higher degree of information for each participant, resulting in higher power from the same number of participants.40-42
Using an outcome informed by more than one parameter is not without problems. The relative importance of each part in driving the observed effect becomes unclear if the composite outcome is presented exclusively. This is problematic since, for example, death and number of days on life-support are not equally important for the patient. Furthermore, combining more than one parameter to measure the effect of an intervention requires that they are affected in the same direction by the intervention. Based on the available literature it seems likely that a decrease in the use of life-support will be associated with a decrease in mortality,43 but we will also present 90-day all-cause mortality to provide sufficient information for the reader to interpret the results.
5.3 Choice of baseline parameters and model for HTE analyses
The aim of the HTE analyses is to examine if variation in baseline parameters for disease severity translates into differences in the outcomes between the intervention groups.
The baseline variables used in the HTE analyses mirror those used in the Bayesian HTE analysis of the HOT-ICU trial to allow for comparison of the results.27 The SOFA score is a surrogate marker for organ failure and has been linked to an increased mortality in previous studies.44, 45 The PaO2/FiO2 ratio is a marker for severity of hypoxaemic respiratory failure,46-48 while high levels of norepinephrine dosage and plasma lactate are markers of circulatory failure.8 Notably, a pre-planned Bayesian HTE analysis of the HOT-ICU trial found a potential interaction between increasing baseline norepinephrine dosages and the lower oxygenation target, that is, increased mortality with increasing doses in the lower target compared to the higher target group.49
The HTE analyses are based on the models used in the primary Bayesian analysis. For the primary outcome, which is based on a linear regression, this will entail that the effects of each baseline variable and their interactions will be completely linear. While it may not reflect reality completely, the alternative of using a more complex model or introducing splines would allow for higher flexibility at the cost of transparency, methodological consistency and possibly increased uncertainty due to increased model complexity.
6 LIMITATIONS
Despite the advantages of applying a Bayesian framework for clinical trials, shortcomings still exist. A common argument against Bayesian analyses revolves around the introduction of subjectivity by allowing prior beliefs to influence trial results. The problem of subjectivity is present in all statistics, for example, when choosing models or estimating normality within data using a frequentist approach. In this respect, we have presented the rationale behind the chosen statistical model and the chosen priors to strengthen the validity and transparency of this pre-planned Bayesian analysis. Here, the focus on using weakly informative priors as the basis for the analyses will ensure that the data will overwhelm the priors and thus help mitigate the risk of introducing a high degree of subjectivity. Additionally, despite the pre-specification of these analyses before completion of the HOT-COVID trial, all HTE analyses should be interpreted cautiously, as the risk of spurious findings is not eliminated.
7 CONCLUSION
The presented secondary pre-planned Bayesian analysis of the HOT-COVID trial will supplement the conventional frequentist primary analysis. It will provide a direct assessment of the probabilities for clinically relevant differences when comparing a lower versus a higher oxygenation target for patients admitted to an ICU with severe COVID-19. Furthermore, it will investigate how differences in baseline parameters for disease severity might influence the treatment effect of each oxygenation target.
AUTHOR CONTRIBUTION
FMN and BSR drafted the manuscript in close collaboration with TLK and AG. Furthermore, TL, AP and OLS made substantial contributions to the manuscript and provided important scientific input. The HOT-COVID Management Committee consists of TLK, OLS, BSR, AP and TL. FMN functions as coordinating investigator for the HOT-COVID trial.