To construct a data-driven composite indicator from (a subset of) the quality indicators currently used for oesophagogastric cancer surgery, and to evaluate whether this approach improves the reliability of between-hospital outcome comparisons relative to the expert-driven composite indicator ‘textbook outcome’ (TO).
In this retrospective cohort study, we applied Item Response Theory (IRT) to construct a data-driven continuous composite indicator reflecting a single latent variable—the quality of surgical care—and estimated latent variable scores for all individual patients. Reliability was compared between the expert-driven (TO) and data-driven (IRT) composite indicators.
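As an illustration of how a latent variable score can be estimated for an individual patient, the following minimal sketch assumes a two-parameter logistic (2PL) IRT model with binary indicators coded 1 for a favourable result. The model choice, the item parameters, the score grid and the function names are all hypothetical; the abstract does not specify which IRT model or estimation method was used.

```python
import math

def p_favourable(theta, a, b):
    """2PL item response function (assumed model): probability of a
    favourable result on one indicator given latent quality theta,
    discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def estimate_theta(responses, items, grid):
    """Grid-search maximum-likelihood estimate of theta for one patient.
    responses: list of 0/1 indicator results; items: list of (a, b)."""
    def loglik(theta):
        total = 0.0
        for x, (a, b) in zip(responses, items):
            p = p_favourable(theta, a, b)
            total += math.log(p) if x == 1 else math.log(1.0 - p)
        return total
    return max(grid, key=loglik)

# Hypothetical item parameters, for illustration only (not from the study).
items = [(1.0, -1.0), (1.5, -0.5), (0.8, -2.0)]
# Bounded score range from -4.0 to 4.0 in steps of 0.1 (an assumption).
grid = [i / 10 for i in range(-40, 41)]

theta_all_favourable = estimate_theta([1, 1, 1], items, grid)
theta_mixed = estimate_theta([1, 0, 1], items, grid)
```

Under these assumptions, a patient with only favourable results has a likelihood that keeps increasing with theta, so the estimate lands on the upper bound of the grid, whereas a patient with mixed results receives an interior score.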
All Dutch hospitals providing oesophagogastric cancer surgery.
All patients who underwent oesophagectomy (n=3588) or gastrectomy (n=1782) between 2018 and 2022 as registered in the Dutch Upper GI Cancer Audit (DUCA).
We evaluated the reliability of between-hospital comparisons using ‘rankability’, which quantifies the proportion of observed variation in indicator scores between hospitals not attributable to chance.
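A minimal sketch of this calculation is given below. It assumes the commonly used formulation in which rankability is the between-hospital variance divided by the sum of the between-hospital variance and the median squared standard error of the individual hospital estimates; the abstract does not state the exact formula, and the function name and input values are hypothetical.

```python
from statistics import median

def rankability(between_var, hospital_std_errors):
    """Rankability under the assumed formulation:
    tau^2 / (tau^2 + median(s_i^2)), where tau^2 is the between-hospital
    variance (e.g. from a random-effects model) and s_i is the standard
    error of hospital i's estimated effect."""
    if between_var < 0:
        raise ValueError("between-hospital variance must be non-negative")
    within_var = median([se ** 2 for se in hospital_std_errors])
    total = between_var + within_var
    # With no variation at all, no differences can be attributed to hospitals.
    return between_var / total if total > 0 else 0.0

# Hypothetical inputs, for illustration only (not from the study).
example = rankability(0.04, [0.1, 0.2, 0.3])
```

In this formulation, larger hospital-level standard errors (for example, from low case volumes) lower rankability even when the true between-hospital variance is unchanged.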
Seven of the 15 quality indicators were included in the IRT composite indicator. Most patients were assigned the artificial maximum of the continuous quality score (ie, a ceiling effect), resulting in similar average hospital scores. Relative to TO, rankability was higher with the IRT composite for oesophagectomy (57% vs 41%) but lower for gastrectomy (38% vs 47%).
The selected seven quality indicators for oesophageal and gastric cancer surgery represent a single latent variable but are not yet optimal for differentiating surgical care quality due to ceiling effects. Despite using fewer indicators, the continuous IRT score showed a promising increase in rankability for oesophagectomy, suggesting that data-driven composite indicators may enhance hospital benchmarking reliability.