Science and Research

Provenance for Biomedical Ontologies with RDF and Git

The German Center for Lung Research (DZL) is a research network that investigates respiratory diseases. To enable consortium-wide retrospective research and prospective patient recruitment, we integrate data into a central data warehouse. Enhancing the underlying ontology is an ongoing process, for which we developed the Collaborative Metadata Repository (CoMetaR) tool. Its technical infrastructure is based on the Resource Description Framework (RDF) for ontology representation and the distributed version control system Git for storage and versioning. Ontology development involves a considerable amount of data curation, and recording data provenance makes that curation more feasible and improves its quality. Especially in collaborative metadata development, a comprehensive annotation of "who contributed what, when and why" is essential. Although RDF and Git versioning repositories are commonly used, no existing solution captures metadata provenance information in sufficient detail. We propose an enhanced composition of standardized RDF statements for detailed provenance representation. Additionally, we developed an algorithm that extracts provenance data from the repository and translates it into the proposed RDF statements.
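The general idea of translating Git history into RDF provenance statements can be illustrated with a minimal sketch. This is not the CoMetaR algorithm or the statement composition proposed in the paper. It assumes the W3C PROV-O vocabulary as the target, a hypothetical `http://example.org/provenance/` namespace, and hypothetical function names, and it covers only the "who", "when" and "why" of each commit. The "what" would additionally require comparing the RDF statements before and after the commit.

```python
from urllib.parse import quote

# Real vocabulary namespaces.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
XSD = "http://www.w3.org/2001/XMLSchema#"
PROV = "http://www.w3.org/ns/prov#"
# Hypothetical namespace for commit and agent IRIs (assumption).
BASE = "http://example.org/provenance/"

FIELD_SEP = "\x1f"  # unit separator, emitted by %x1f in the git format string


def parse_git_log(text):
    """Parse the output of:
    git log --pretty=format:%H%x1f%an%x1f%aI%x1f%s
    into a list of dicts (hash, author name, ISO 8601 date, subject)."""
    commits = []
    for line in text.split("\n"):
        if not line.strip():
            continue
        sha, author, date, subject = line.split(FIELD_SEP, 3)
        commits.append(
            {"sha": sha, "author": author, "date": date, "subject": subject}
        )
    return commits


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def commit_to_ntriples(commit):
    """Describe one commit as a PROV activity in N-Triples syntax."""
    activity = f"<{BASE}commit/{commit['sha']}>"
    agent = f"<{BASE}agent/{quote(commit['author'])}>"
    message = escape_literal(commit["subject"])
    return [
        f"{activity} <{RDF}type> <{PROV}Activity> .",
        f"{agent} <{RDF}type> <{PROV}Agent> .",
        # who
        f"{activity} <{PROV}wasAssociatedWith> {agent} .",
        # when
        f'{activity} <{PROV}endedAtTime> "{commit["date"]}"^^<{XSD}dateTime> .',
        # why (the commit message)
        f'{activity} <{RDFS}comment> "{message}" .',
    ]


if __name__ == "__main__":
    # Made-up sample record, for illustration only.
    sample = FIELD_SEP.join(
        ["abc123", "Jane Doe", "2019-03-01T10:15:00+01:00", "Add spirometry concept"]
    )
    for commit in parse_git_log(sample):
        print("\n".join(commit_to_ntriples(commit)))
```

In practice the input text would come from running the `git log` command shown in the docstring inside the ontology repository, and the resulting triples could be loaded into the same triple store as the ontology itself.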

  • Stohr, M. R.
  • Gunther, A.
  • Majeed, R. W.

Keywords

  • Biological Ontologies
  • Data Warehousing
  • Humans
  • Metadata
  • Prospective Studies
  • Retrospective Studies
  • automatic data processing
  • data curation
  • quality improvement

Publication details

DOI: 10.3233/SHTI190832
Journal: Stud Health Technol Inform
Pages: 230-237 
Work Type: Original
Location: UGMLC
Disease Area: PLB
Partner / Member: JLU
Access-Number: 31483277
See publication on PubMed