Frailty Assessment Can Enhance Current Risk Prediction Tools in Emergency Laparotomy: A Retrospective Cohort Study

This study evaluated the quality of two common risk prediction tools and is, to the best of our knowledge, a pioneering attempt to combine the CFS with the P-POSSUM and NELA RPT to improve the discrimination of each tool. The mortality rate in our study population was slightly lower than that reported by NELA across England and Wales [1].

The discrimination of a model, as measured by the AUC of the ROC curve, reflects how accurately the model distinguishes patients at high risk from those at low risk. The discrimination of both tools was good across the whole study population; however, it declined in the elderly population for both the P-POSSUM and NELA RPT. This is consistent with previously reported data. Joseph et al. showed in their prospective study that, in the elderly population, age was not a strong predictor of mortality, whereas frailty correlated strongly with adverse events after EL [6]. As both the P-POSSUM and NELA RPT account only for patient age and not for frailty, reduced discrimination in the elderly population is to be expected. Frailty alone is also predictive of mortality, as shown in previous studies, with discrimination similar to that in our cohort [4, 20]. However, the AUC for the CFS alone is considerably lower than that of specialized prediction models such as the NELA RPT. Therefore, both frailty assessment alone and current prediction models have their limitations, and an effort should be made to make predictions more accurate.
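To make the notion of discrimination concrete, the following minimal sketch computes the AUC as the probability that a randomly chosen non-survivor received a higher predicted risk than a randomly chosen survivor, with ties counted as one half. This is the standard rank-based interpretation of the AUC and is not necessarily how it was calculated in this study. The function name `auc` and the example values are illustrative assumptions, not study data.

```python
def auc(risks, died):
    """Rank-based AUC: share of (non-survivor, survivor) pairs in which
    the non-survivor has the higher predicted risk; ties count as 0.5."""
    pos = [r for r, d in zip(risks, died) if d]      # patients who died
    neg = [r for r, d in zip(risks, died) if not d]  # patients who survived
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted 30-day mortality risks and outcomes (1 = died).
# These numbers are made up for illustration only.
predicted_risk = [0.02, 0.10, 0.35, 0.60, 0.08, 0.45]
died_30d = [0, 0, 1, 1, 0, 0]

example_auc = auc(predicted_risk, died_30d)
print(f"AUC = {example_auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so a tool whose AUC falls in the elderly subgroup ranks those patients less reliably.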

Reliable risk stratification is fundamental to clinical decision-making. There is clear benefit in routinely admitting high-risk patients to an intensive care unit postoperatively, as this has been shown to reduce adverse events after the operation [1, 3]. Some centers have improved the outcome of elderly surgical patients by routinely involving specialized geriatric teams [23]. It is highly likely that high-risk and frail younger patients would also benefit from a similar approach.

Moreover, there are cases where surgical care is considered futile. This unsuitability for surgery usually arises from a combination of patient frailty, deterioration of physiology, and operation severity. Accurate and reliable risk prediction can empower patients, their relatives, and caregivers to set realistic goals and to avoid a considerable burden of treatment where active care is considered futile.

This study showed that combining the P-POSSUM and NELA RPT with the CFS improved the discrimination of both tools. The modified P-POSSUM score had a slightly better AUC than the non-modified NELA RPT. This improvement in both tools was especially significant in patients who were aged 65 years or older. Previous attempts to improve the NELA RPT using a frailty measure did not show an improvement in its discriminative power. Barazanchi et al. showed the benefit of combining the mFI with the P-POSSUM and ACS NSQIP risk calculators but did not observe an improvement with the NELA RPT; however, they did not perform a subgroup analysis for the elderly [24]. In our cohort, the improvement in AUC when combining the CFS with the NELA RPT did not reach statistical significance across all age groups. However, we observed a statistically significant improvement in the modified NELA RPT model in patients aged ≥ 65 years.

We demonstrated that frailty had an effect on mortality in both age groups. However, in younger patients with less severe frailty, the association with mortality was less statistically significant; for younger patients with higher grades of frailty, the odds ratios were lower than those in the elderly. Furthermore, the proportion of frail patients was significantly lower among patients under 65 years of age in our cohort. This could, in part, explain why the improvement in discrimination for the NELA RPT did not reach significance across the whole study population but did so in patients over 65.

The calibration of a risk prediction model is a measure of how close the predicted risk is to the observed mortality. Previous reports have shown that the NELA RPT tends to have significantly better calibration than the P-POSSUM. However, this excellent goodness of fit of the NELA RPT has not been reproduced in some external validation studies. In our cohort, the NELA RPT showed excellent calibration, with only a slight under-prediction in all age groups. The goodness of fit for the P-POSSUM was poor in our study, and it over-predicted mortality nearly twofold; however, the difference between observed and expected mortality was not statistically significant. This level of over-prediction can lead to a significant unnecessary allocation of healthcare resources.
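One simple way to summarize calibration is the observed-to-expected (O/E) ratio, in which the expected number of deaths is the sum of the individual predicted risks. The sketch below is an illustration under that assumption and does not reproduce the goodness-of-fit analysis used in this study. The function name `observed_expected` and the example values are hypothetical.

```python
import math


def observed_expected(risks, died):
    """Return observed deaths, expected deaths (sum of predicted risks),
    and their ratio. A ratio below 1 indicates over-prediction."""
    observed = sum(died)
    expected = math.fsum(risks)
    if expected == 0:
        raise ValueError("expected deaths must be greater than zero")
    return observed, expected, observed / expected


# Hypothetical predicted risks and outcomes (1 = died), made up so that
# the tool predicts about twice as many deaths as actually occurred.
predicted_risk = [0.1, 0.2, 0.3, 0.4, 0.5, 0.5]
died_30d = [0, 0, 0, 0, 1, 0]

obs, exp, oe_ratio = observed_expected(predicted_risk, died_30d)
print(f"observed = {obs}, expected = {exp:.1f}, O/E = {oe_ratio:.2f}")
```

In this made-up example the O/E ratio is about 0.5, which corresponds to a twofold over-prediction of mortality; a well-calibrated tool would give a ratio close to 1.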

Furthermore, current risk prediction models are designed to predict 30-day mortality. Although mortality is highest during the first month after the operation, there is significant mortality beyond the traditional 30-day milestone. Mortality is clearly only the tip of the iceberg of the human suffering and healthcare costs associated with EL. Further research is needed not only to predict long-term mortality but also to predict quality of health and return to independent functioning after EL.

Limitations

The main limitation of this study comes from its retrospective nature. The CFS has been shown to be reliably assessable based on patient notes [25]. Even so, the quality of the assessment in this case depends on the quality of the notes. In cases where the CFS was missing in the NELA database, we found the occupational therapy notes to be most useful, followed by physiotherapy notes, as they often describe patients’ prior (in-)dependence in activities of daily living in detail. When such information was missing, patients were more likely to be assessed as “not frail.” This could introduce information bias, with a tendency to misclassify patients as fitter than they were; this risk is especially high for CFS grade four [26].

The study was conducted at a single center, and its findings therefore might not be generalizable beyond similar patient populations.
