Science and Research

Threshold optimization in AI chest radiography analysis: integrating real-world data and clinical subgroups

BACKGROUND: Manufacturer-defined AI thresholds for chest x-ray (CXR) often lack customization options. Threshold optimization strategies utilizing users' clinical real-world data along with pathology-enriched validation data may better address subgroup-specific and user-specific needs.
MATERIALS AND METHODS: A pathology-enriched dataset (study cohort, 563 CXRs) with pleural effusions, consolidations, pneumothoraces, nodules, and unremarkable findings was analysed by an AI system and six reference radiologists. The same AI model was applied to a routine dataset (clinical cohort, 15,786 consecutive routine CXRs). Iterative receiver operating characteristic analysis linked achievable sensitivities (study cohort) to the resulting AI alert rates in the clinical routine inpatient or outpatient subgroups. "Optimized" thresholds (OTs) were defined as the point at which a further 1% sensitivity increase leads to more than a 1% rise in AI alert rates. Threshold comparisons (OTs versus the AI vendor's default thresholds (AIDTs) versus Youden's thresholds) were based on 400 clinical cohort cases with expert radiologists' reference.
RESULTS: AIDTs, OTs, and Youden's thresholds varied across scenarios, with OTs differing depending on whether they were tailored to inpatient or outpatient CXRs. Lowering the AIDT most reasonably improved sensitivity for pleural effusion, with increases from 46.8% (AIDT) to 87.2% (OT) for outpatients and from 76.3% (AIDT) to 93.5% (OT) for inpatients; similar trends appeared for consolidations. Conversely, for inpatient nodule detection, increasing the threshold improved accuracy from 69.5% (AIDT) to 82.5% (OT) without compromising sensitivity. Graphical analysis supports threshold selection by illustrating estimated sensitivities and clinical routine AI alert rates.
CONCLUSION: An innovative, subgroup-specific AI threshold optimization is proposed that is automatically implemented and transferable to other AI algorithms and varying clinical subgroup settings.
RELEVANCE STATEMENT: Individually customizing thresholds tailored to specific medical experts' needs and patient subgroup characteristics is promising and may enhance diagnostic accuracy and the clinical acceptance of diagnostic AI algorithms.
KEY POINTS:
  • Customizing AI thresholds individually addresses specific user/patient subgroup needs.
  • The presented approach utilizes pathology-enriched and real-world subgroup data for optimization.
  • Potential is shown by comparing individualized thresholds with vendor defaults.
  • Distinct thresholds for in- and outpatient CXR AI analysis may improve perception.
  • The automated pipeline methodology is transferable to other AI models or subgroups.
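
The stopping rule described in the methods can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the per-case AI scores in [0, 1], the starting sensitivity of 50%, and the reading of "1%" as absolute percentage points are all assumptions made for illustration. Sensitivity is estimated on pathology-positive study cohort scores, the alert rate on routine subgroup scores, and the threshold is lowered step by step until the next sensitivity step would cost more than one step in alert rate. Youden's threshold (maximizing sensitivity + specificity - 1) is included for comparison.

```python
import math


def threshold_for_sensitivity(pos_scores, target):
    """Highest threshold at which at least `target` of the positives are flagged."""
    ranked = sorted(pos_scores, reverse=True)
    # Number of positive cases that must score at or above the threshold.
    k = math.ceil(target * len(ranked) - 1e-9)
    k = min(max(k, 1), len(ranked))
    return ranked[k - 1]


def alert_rate(scores, threshold):
    """Fraction of cases whose score reaches the threshold (an AI alert)."""
    return sum(s >= threshold for s in scores) / len(scores)


def optimized_threshold(pos_scores, routine_scores, step=0.01, start=0.5):
    """Assumed reading of the OT rule: raise the target sensitivity in `step`
    increments and stop before the first increment whose alert-rate increase
    in the routine subgroup exceeds `step`."""
    n_steps = int(round((1.0 - start) / step))
    threshold = threshold_for_sensitivity(pos_scores, start)
    for i in range(1, n_steps + 1):
        candidate = threshold_for_sensitivity(pos_scores, start + i * step)
        extra_alerts = (alert_rate(routine_scores, candidate)
                        - alert_rate(routine_scores, threshold))
        if extra_alerts > step:
            break
        threshold = candidate
    return threshold


def youden_threshold(pos_scores, neg_scores):
    """Threshold maximizing Youden's J = sensitivity + specificity - 1."""
    def youden_j(t):
        sensitivity = sum(s >= t for s in pos_scores) / len(pos_scores)
        specificity = sum(s < t for s in neg_scores) / len(neg_scores)
        return sensitivity + specificity - 1
    candidates = sorted(set(pos_scores) | set(neg_scores))
    return max(candidates, key=youden_j)
```

Running `optimized_threshold` once with inpatient routine scores and once with outpatient routine scores would yield subgroup-specific thresholds from the same study cohort, which is the kind of tailoring the abstract describes.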

  • Rudolph, J.
  • Huemmer, C.
  • Preuhs, A.
  • Buizza, G.
  • Dinkel, J.
  • Koliogiannis, V.
  • Fink, N.
  • Goller, S. S.
  • Schwarze, V.
  • Heimer, M.
  • Hoppe, B. F.
  • Liebig, T.
  • Ricke, J.
  • Sabel, B. O.
  • Rueckel, J.

Keywords

  • Humans
  • *Radiography, Thoracic/methods
  • *Artificial Intelligence
  • Female
  • Male
  • Middle Aged
  • Aged
  • Sensitivity and Specificity
  • *Radiographic Image Interpretation, Computer-Assisted/methods
  • Adult
  • Artificial intelligence
  • Lung neoplasms
  • Pleural effusion
  • Pneumothorax
  • Radiography (thoracic)

Publication details
DOI: 10.1186/s41747-025-00632-8
Journal: Eur Radiol Exp
Pages: 95 
Number: 1
Work Type: Original
Location: CPC-M
Disease Area: PLI
Partner / Member: KUM
Access-Number: 40982157

