Estimating Risk Factor Time Paths Among People with Type 2 Diabetes and QALY Gains from Risk Factor Management

2.1 EXSCEL and TECOS Data

We pooled data from two randomised, placebo-controlled clinical cardiovascular outcome trials in people with type 2 diabetes: (1) the Exenatide Study of Cardiovascular Event Lowering [18, 19] (EXSCEL ClinicalTrials.gov NCT01144338), with 14,752 participants conducted between 2010 and 2017, and (2) the Trial Evaluating Cardiovascular Outcomes With Sitagliptin [20] (TECOS ClinicalTrials.gov, NCT00790205), with 14,671 participants conducted between 2009 and 2014. Both trials were pragmatic and allowed any concomitant medications (other than the drug class under investigation) to be used at the discretion of the usual care physician (see Supplementary Material Tables A1–A2 for a comparison between these trials).

Data from both arms of the two trials were combined to maximise generalisability and use all available data. Risk factor measurements performed < 6 months after starting randomised treatment were excluded from the analysis to exclude the initial effect of treatment. Within analyses for each risk factor, we excluded: (1) participants who did not provide any information on that risk factor at randomisation, (2) those who had information on that risk factor at the randomisation visit but not at follow-up visits, (3) those who withdrew from the study on the day of their randomisation visit and (4) and those who had missing data on ethnicity.

2.2 Risk Factors

Our study focused on 11 risk factors: high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), systolic blood pressure (SBP), HbA1c, heart rate, haemoglobin, body mass index (BMI), estimated glomerular filter rate (eGFR) and whether the patient had been diagnosed with peripheral vascular disease (PVD), atrial fibrillation (AF) or micro- or macroalbuminuria (ALB). Neither trial measured white blood cell count or post-randomisation smoking status, which are included in the UKPDS-OM2 [3].

Risk factors were analysed on a yearly basis, taking the average across the measurements in that year was used. Values outside predefined ranges [21] were omitted from the analysis.

2.2.1 Equations for Continuous Risk Factors

We applied linear dynamic model to estimate the time paths for HDL-C, LDL-C, HbA1c, haemoglobin, heart rate, SBP and BMI. Linear dynamic model in this study refers to the inclusion of the value of the risk factors in the previous period to capture the dynamic feature that previous values of risk factors affect current ones [11]. We used this model with random effects because it gave good predictions of risk factor values [within 95% confidence interval (CI) of observed values].Footnote 1

The risk factor value for individual i in year t (\(_\)) was:

$$_=_+_y}_+_}_,0}+_}_+_\text}_+_}_+_\text\left(}_\right)+_+_,$$

(1)

where \(_\) was the previous year’s risk factor value, \(_,0}\) was the first post-randomisation risk factor value and captured effect of the initial risk factors values on the subsequent time-path [11]; \(}_\) was a series of dummy variables, with ‘1’ indicating white, Black or Asian (oriental, Indian or other) and a baseline category of ‘other’ (Hispanic, Australian Aboriginal, Maori, Native Hawaiian, Pacific Islander, American Indian or Alaska Native), and age was age at randomisation. The natural log of \(}_\) was used as this was previously found to improve model fit [11]. The model included random effects by patient (\(_\)), reflecting unobserved time-invariant characteristics, and an error component (\(_\)).

2.2.2 Risk-Factor Equations for PVD, AF, ALB and eGFR

We applied multivariable parametric proportional hazard survival models to estimate the risk of developing PVD, AF, ALB and progressing to eGFR < 60 ml/min/1.73 m2. The underlying assumption is that once an individual progresses to one of these health states they can never leave. Several of the equations predicting diabetic events in UKPDS-OM2 incorporate eGFR as a spline variable with a knot at 60 ml/min/1.73 m2. Hence, to ensure robust predictions, the eGFR predictions must accurately represent not only the time paths of eGFR values but also the proportion of the population below the knot value (60 ml/min/1.73 m2).

We therefore modelled eGFR time paths with a two-part model, predicting whether eGFR was < 60 ml/min/1.73 m2 this year and then the exact eGFR value, conditional on covariates that included last year’s values. Monte Carlo simulation was used to convert probability predictions for individual patients into binary events (Supplementary Material 2). To predict a continuous eGFR value conditional on the eGFR < 60 ml/min/1.73 m2 prediction, we used two additional multivariable random effects Tobit autoregressive models of order one for eGFR values above or below 60. In the two separate Tobit models, we included the same covariates as the other continuous risk factors, and an upper limit of 60 ml/min/1.73 m2, a lower limit of 0 for eGFR <60 ml/min/1.73 m2 and a lower limit of 60 for eGFR ≥ 60 ml/min/1.73 m2. This two-step approach to predicting eGFR values has previously been shown to give the most accurate predictions of eGFR and events within UKPDS-OM2 [11].

For PVD, AF, ALB and the binary variable indicating eGFR < 60 ml/min/1.73 m2 (and ‘0’ otherwise), the parametric form (Weibull, exponential and Gompertz) was examined graphically and model choice was based on AIC. For all four outcomes, we selected a Weibull distribution. The proportional hazards assumption was tested by plotting Schoenfeld residuals and Cox–Snell semi-log graphs [22].

2.3 Selection of Predictors

For all risk factor equations, we selected predictors based on two criteria: (1) as suggested by previous evidence [23], parsimonious models with fewer predictors were preferred over models with more predictors and (2) a good agreement between observed and predicted risk factor values (predicted values should be within 95% CI of observed values).

For the continuous risk equations, we included all covariates shown in Eq. 1 regardless of their statistical significance if they were shown to improve agreement between observed and predicted risk factor values. This ensured a parsimonious model informed by the covariates most likely to be available to other users.

For eGFR and the equations estimating time to PVD, AF and ALB, we followed the same approach as in the UKPDS-OM2 and Leal et al. [11]. The set of candidate covariates was initially informed by literature and expert opinion and included time-invariant factors (sex, age at diagnosis, ethnicity and smoking at baseline) and time varying clinical risk factors (SBP, HbA1c, BMI, HDL, heart rate and LDL). The multivariable models were initially fitted with all covariates; backwards stepwise regression at P < 0.05 was then used to select the significant covariates in the final models.

We explicitly excluded trial treatment allocation as a covariate in all risk equation models because the time path equations are intended to be applied to risk factor values from any diabetes dataset and any treatment. In sensitivity analysis, we re-estimated separate equations for each trial (TECOS and EXSCEL) and evaluated the impact of treatment allocation.

2.3.1 Mapping Out Risk Factor Trajectories

For all risk factors, predicted and observed time paths were plotted for the combined datasets to assess prediction accuracy and internal validity. We considered time paths to have good prediction accuracy if the predicted values lay within the 95% CI of observed values. Duration of diabetes was used as the time scale rather than time from randomisation, which allowed combining data on patients with different durations of diabetes together (similar to period life tables). For PVD, AF and ALB, we assessed calibration by plotting the observed cumulative incidence using the Kaplan–Meier estimator and comparing it with the predicted risk [24].

2.4 QALY Gains Using Current and Previous Risk Equations

To estimate the health gains from improvements in risk factor management, pre-randomisation data for the placebo arms from both trials were extrapolated to 70 years using UKPDS-OM2. This analysis only included patients with non-missing data for all risk factors at both baseline and year 1 (n = 2579). Using coefficients estimated from Eq. (1), we first calculated values of risk factors in year 1 from pre-randomisation values and then simulated values from year 2 onwards. The same data were extrapolated using the time paths estimated by Leal et al. [11] and the difference in QALYs between the two simulations was estimated.

A bespoke version of UKPDS-OM2 was used which enabled the coefficients for binary risk factors to be modified. Baseline white blood cell count was imputed using a published algorithm [21]. Baseline data on smoking and white blood cell counts were extrapolated using the equations estimated by Leal et al. [11] in both scenarios, since these were not re-estimated in the current study. Default values for utilities and other parameters were used within these analyses and we ran 100,000 loops and no bootstraps (see Supplementary Material 3 for further details).

2.5 Reference Simulation

Following recommendations of the Mount Hood Diabetes Challenge Network, we replicated the reference case simulations for the UKPDS-OM2 that were registered on the Mount Hood website (https://www.mthooddiabeteschallenge.com/registry) [25]. The registry includes a set of reference simulations that are intended to enable comparisons of models and increase model transparency [2].

The registry simulations were replicated using last observation carried forward (LOCF), trajectories estimated by Leal et al. [11] and those from Tables 1−2. Supplementary Material 4 gives further details.

Except where otherwise stated, all analyses were conducted using Stata version 17 (StataCorp, College Station, TX).

Comments (0)

No login
gif