Importance: Malignancy prediction models based on participant-related characteristics and imaging parameters from low-dose computed tomography (CT) may improve decision-making regarding nodule management and diagnosis in lung cancer screening.
Objective: To externally validate 5 malignancy prediction models that were developed in screening settings, compared with 3 models that were developed in clinical settings, in terms of discrimination and absolute risk calibration among participants in the German Lung Cancer Screening Intervention trial.
Design, Setting, and Participants: In this population-based diagnostic study, malignancy probabilities were estimated by applying 8 prediction models to data from 1159 participants in the intervention arm of the Lung Cancer Screening Intervention trial, a randomized clinical trial conducted from October 23, 2007, to April 30, 2016, with ongoing follow-up. This analysis considers end points up to 1 year after individuals' last screening visit. Inclusion criteria for participants were at least 1 noncalcified pulmonary nodule detected on any of 5 annual screening visits, receiving a lung cancer diagnosis within the active screening phase of the Lung Cancer Screening Intervention trial, and an unequivocal identification of the malignant nodules. Data analysis was performed from February 1, 2019, through December 5, 2019.
Interventions: Five annual rounds of low-dose multislice CT.
Main Outcomes and Measures: Discrimination ability and calibration of malignancy probabilities estimated by 5 models developed in data from screening studies (4 Pan-Canadian Early Detection of Lung Cancer Study [PanCan] models using a parsimonious approach including nodule spiculation [PanCan-1b] or a comprehensive approach including nodule spiculation [PanCan-2b], and PanCan-2b replacing the nodule diameter variable with mean diameter [PanCan-MD] or volume [PanCan-VOL], as well as a model developed by the UK Lung Cancer Screening trial) and 3 models developed in clinical settings (US Department of Veterans Affairs, Mayo Clinic, and Peking University People's Hospital).
Results: A total of 1159 participants (median [range] age, 57.63 [50.34-71.89] years; 763 [65.8%] men) with 3903 pulmonary nodules were included in this study. For nodules detected in the prevalence round of CT, the PanCan models showed excellent discrimination (PanCan-1b: area under the curve [AUC], 0.93 [95% CI, 0.87-0.99]; PanCan-2b: AUC, 0.94 [95% CI, 0.89-0.99]; PanCan-MD: AUC, 0.94 [95% CI, 0.91-0.98]; PanCan-VOL: AUC, 0.94 [95% CI, 0.90-0.98]), and all of the screening models except PanCan-MD and PanCan-VOL showed acceptable calibration (PanCan-1b: Spiegelhalter z = -1.081; P = .28; PanCan-2b: Spiegelhalter z = 0.436; P = .67; PanCan-MD: Spiegelhalter z = 3.888; P < .001; PanCan-VOL: Spiegelhalter z = 1.978; P = .05; UK Lung Cancer Screening trial: Spiegelhalter z = -1.076; P = .28), whereas the other models showed worse discrimination and calibration, from an AUC of 0.58 (95% CI, 0.46-0.70) for the UK Lung Cancer Screening trial model to an AUC of 0.89 (95% CI, 0.82-0.97) for the Mayo Clinic model.
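The two reported measures can be illustrated with a minimal sketch. It assumes the standard definitions of the AUC (the proportion of malignant/benign nodule pairs that a model ranks correctly, with ties counted as one half) and of the Spiegelhalter z statistic with a two-sided normal P value. The function names and the toy data are hypothetical and are not taken from the trial or from the study's analysis code.

```python
import math


def auc(outcomes, probabilities):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    malignant = [p for y, p in zip(outcomes, probabilities) if y == 1]
    benign = [p for y, p in zip(outcomes, probabilities) if y == 0]
    if not malignant or not benign:
        raise ValueError("need at least one malignant and one benign nodule")
    wins = sum(
        1.0 if m > b else 0.5 if m == b else 0.0
        for m in malignant
        for b in benign
    )
    return wins / (len(malignant) * len(benign))


def spiegelhalter_z(outcomes, probabilities):
    """Spiegelhalter z statistic for calibration of predicted probabilities."""
    numerator = sum(
        (y - p) * (1 - 2 * p) for y, p in zip(outcomes, probabilities)
    )
    variance = sum((1 - 2 * p) ** 2 * p * (1 - p) for p in probabilities)
    if variance == 0:
        raise ValueError("statistic undefined when every term has zero variance")
    return numerator / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided P value of z under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Toy data (made up): 1 = malignant nodule, 0 = benign nodule.
    y = [1, 0, 0, 1, 0]
    p = [0.9, 0.1, 0.7, 0.6, 0.3]
    z = spiegelhalter_z(y, p)
    print("AUC:", auc(y, p))
    print("Spiegelhalter z:", z, "P:", two_sided_p(z))
```

Under this definition, a z value far from 0 (small P value) indicates that the predicted probabilities deviate from the observed outcomes, which is the sense in which the PanCan-MD result (z = 3.888; P < .001) signals poor calibration.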
Conclusions and Relevance: This diagnostic study found that PanCan models showed excellent discrimination and calibration in prevalence screenings, confirming their ability to improve nodule management in screening settings, although calibration for nodules detected in follow-up scans should be improved. The models developed by the Mayo Clinic, Peking University People's Hospital, US Department of Veterans Affairs, and UK Lung Cancer Screening trial did not perform as well.
- Gonzalez Maldonado, S.
- Delorme, S.
- Husing, A.
- Motsch, E.
- Kauczor, H. U.
- Heussel, C. P.
- Kaaks, R.