
Predictive Modeling of Nontuberculous Mycobacterial Pulmonary Disease Epidemiology using German Health Claims Data

OBJECTIVES: Administrative claims data are prone to underestimate the burden of nontuberculous mycobacterial pulmonary disease (NTM-PD).

METHODS: We developed machine learning-based algorithms using historical claims data from cases with NTM-PD to predict patients with a high probability of having previously undiagnosed NTM-PD and to assess actual prevalence and incidence. Adults with incident NTM-PD were classified from a representative 5% sample of the German population covered by statutory health insurance during 2011-2016 by the International Classification of Diseases, 10th revision code A31.0. Pre-diagnosis characteristics (patient demographics, comorbidities, diagnostic and therapeutic procedures, and medications) were extracted and compared with those of a control group without NTM-PD to identify risk factors.

RESULTS: Applying a random forest model (area under the curve 0.847; total error 19.4%) and a risk threshold of >99%, prevalence and incidence rates in 2016 increased 5-fold and 9-fold to 19 and 15 cases/100,000 population, respectively, for both coded and non-coded vs. coded cases alone.

CONCLUSIONS: The use of a machine learning-based algorithm applied to German statutory health insurance claims data predicted a considerable number of previously unreported NTM-PD cases with high probability.
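The arithmetic behind the reported rates can be illustrated with a minimal Python sketch. It assumes each insured person carries a flag for an A31.0 code and a model-predicted risk score; the `Person` record, the helper names, and all numbers in the toy population are hypothetical and are not taken from the study. Uncoded persons whose risk exceeds the threshold are counted as predicted cases, and rates are expressed per 100,000 population.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    coded: bool   # True if the claims data contain an A31.0 diagnosis code
    risk: float   # hypothetical model-predicted probability of NTM-PD (0..1)


def count_cases(people, threshold=0.99):
    """Return (coded cases, predicted non-coded cases above the threshold)."""
    coded = sum(1 for p in people if p.coded)
    predicted = sum(1 for p in people if not p.coded and p.risk > threshold)
    return coded, predicted


def rate_per_100k(cases, population):
    """Cases per 100,000 population."""
    return cases * 100_000 / population


# Toy population of 50,000 persons (illustrative values only, not study data).
people = (
    [Person(coded=True, risk=1.0)] * 2          # coded cases
    + [Person(coded=False, risk=0.995)] * 6     # uncoded, risk above 99%
    + [Person(coded=False, risk=0.99)] * 10     # exactly 99%: not counted
    + [Person(coded=False, risk=0.05)] * 49_982 # low-risk remainder
)

coded, predicted = count_cases(people)
coded_rate = rate_per_100k(coded, len(people))
combined_rate = rate_per_100k(coded + predicted, len(people))
fold_increase = combined_rate / coded_rate

print(f"coded only: {coded_rate:.1f} per 100,000")
print(f"coded + predicted: {combined_rate:.1f} per 100,000")
print(f"fold increase: {fold_increase:.1f}")
```

The strict comparison `p.risk > threshold` mirrors the ">99%" wording of the abstract; how the study handled boundary cases is not stated, so this is an assumption of the sketch.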

  • Ringshausen, F. C.
  • Ewen, R.
  • Multmeier, J.
  • Monga, B.
  • Obradovic, M.
  • van der Laan, R.
  • Diel, R.

Keywords

  • Epidemiology
  • Insurance claims analysis
  • Machine learning
  • Nontuberculous mycobacteria
  • Nontuberculous mycobacterium infections
  • Probability learning

Publication details
DOI: 10.1016/j.ijid.2021.01.003
Journal: Int J Infect Dis
Work Type: Original
Location: ARCN, BREATH
Disease Area: PALI
Partner / Member: Ghd, MHH
Access-Number: 33444748
