Science and Research

Artificial intelligence for TNM staging in NSCLC: a critical appraisal of segmentation utility in [¹⁸F]FDG PET/CT

PURPOSE: This study investigates whether a diagnostic AI model can effectively support lesion detection and staging in non-small cell lung cancer (NSCLC) [¹⁸F]FDG PET/CT studies, focusing on the distinction between technical segmentation accuracy and clinically meaningful performance.

METHODS: In this retrospective single-centre study, [¹⁸F]FDG PET/CT scans from 306 treatment-naïve NSCLC patients were reviewed with reference to multidisciplinary team decisions. Tumour lesions were manually segmented for reference and compared with predictions from the top-performing algorithm of the autoPET III challenge. Quantitative segmentation metrics were calculated, and lesion-level errors were assessed for their impact on patient-level TNM and UICC staging.

RESULTS: The algorithm achieved a mean Dice Similarity Coefficient (DSC) of 0.64. Lesion-level sensitivity was 95.8% across all patients, with a precision of 87.5%. False-positive M-category lesions (n = 196) were the most frequent error. Of all false positives, 35.7% were benign and 34.7% were non-oncologic pathologies. UICC staging matched ground truth in 207/306 patients, with most discordances due to upstaging (88/306).

CONCLUSION: Clinically driven metrics and cause-based error analysis offer valuable insight into AI segmentation performance. The evaluated model showed excellent lesion sensitivity but a tendency towards systematic overprediction across TNM categories. At the lesion level, M-category false positives and undersegmentation in the hilar region emerged as the main drivers of clinically relevant upstaging. Despite promising lesion detection sensitivity, only 67.7% of UICC stagings were accurate using AI masks, indicating that diagnostic AI may support, though not yet replace, manual lesion evaluation in NSCLC [¹⁸F]FDG PET/CT.
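For readers unfamiliar with the metrics named in the abstract, the following is a minimal illustrative sketch of how a Dice Similarity Coefficient and lesion-level sensitivity and precision are commonly defined. It is not the authors' evaluation code: the function names, the representation of masks as sets of voxel indices, the empty-mask convention, and the toy numbers are all assumptions made for illustration.

```python
def dice(reference, prediction):
    """Dice Similarity Coefficient between two binary masks.

    Masks are given as collections of voxel indices (hypothetical
    representation chosen for this sketch).
    """
    ref, pred = set(reference), set(prediction)
    if not ref and not pred:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2 * len(ref & pred) / (len(ref) + len(pred))


def lesion_detection_metrics(tp, fp, fn):
    """Lesion-level sensitivity and precision from detection counts.

    tp: reference lesions found by the model
    fp: predicted lesions with no reference counterpart
    fn: reference lesions missed by the model
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, precision


# Toy example (made-up values, not study data).
reference_mask = {(0, 0, 0), (0, 0, 1)}
predicted_mask = {(0, 0, 1), (0, 0, 2)}
dsc = dice(reference_mask, predicted_mask)
sens, prec = lesion_detection_metrics(tp=9, fp=3, fn=1)
```

The sketch also shows why the two kinds of metric can diverge: the DSC depends on voxel overlap, whereas sensitivity and precision depend only on whether each lesion was detected at all.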

  • Heimer, M. M.
  • Dexl, J.
  • Ta, J.
  • Ebner, R.
  • Herr, F. L.
  • Orasanin, L.
  • Jeblick, K.
  • Adams, L. C.
  • Sundar, L. K. S.
  • Tufman, A.
  • Werner, R. A.
  • Sheikh, G.
  • Ricke, J.
  • Ingrisch, M.
  • Fabritius, M. P.
  • Cyran, C. C.

Keywords

  • Artificial intelligence
  • External validation
  • Non-small cell lung cancer
  • Task-specific evaluation
  • [¹⁸F]FDG PET/CT
Publication details
DOI: 10.1007/s00259-025-07677-2
Journal: Eur J Nucl Med Mol Imaging
Work Type: Original
Location: CPC-M
Disease Area: LC
Partner / Member: KUM
Access-Number: 41275455

