AstraZeneca: intelligent Study Information Mining

Creating a Knowledge Repository from Historical Data

AstraZeneca’s vision of delivering great medicines through innovative science and excellence in development and commercialization depends on safe and efficient drugs. The company holds archives of historical information describing completed clinical studies. These studies play a key role in the design of improved tailored therapies and form a knowledge repository for interpreting current trial results. The same clinical knowledge is required for tools that support model-based drug development, strengthen the clinical to pre-clinical ‘feedback loop’ and enable better predictive science. To serve these purposes, the repository first had to be analyzed, structured and enriched so that AstraZeneca’s teams could quickly identify the patterns and relationships that matter in the studies and drug therapies.

The objective for the iSIM (intelligent study information mining) system was to semantically integrate large amounts of heterogeneous clinical study knowledge accumulated over a long period of time. State-of-the-art text analysis algorithms from Ontotext were used to segment, structure, classify and extract knowledge from various types of clinical documents, such as Clinical Study Reports (CSR), Clinical Study Protocols (CSP) and Periodic Safety Update Reports (PSUR). The trained and validated semantic model was integrated with internal clinical study systems to enrich the structured data with knowledge about:

  • Study outcomes
  • Primary and secondary objectives
  • The observed adverse events
  • The study type and its design parameters
  • The applied laboratory measurements
  • Monitored biomarkers and clinical observations
  • Patient population definition criteria
  • All drug substances and project related information

By doing this, AstraZeneca was able to take clinical studies in different formats, extract meaning from unstructured text using validated text mining algorithms and enrich the entities extracted from the documents. Relationships between those entities that are important in drug research and patient safety were stored in GraphDB. Today, analysts and other users can search the entire knowledge base efficiently, and search results yield precise sets of information useful in improving tailored therapies.
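
To make the idea of storing and searching entity relationships more concrete, here is a minimal sketch in Python. It is not the iSIM implementation and does not use GraphDB's API: the `TripleStore` class is a tiny in-memory stand-in, and every study identifier, predicate name and value in it is hypothetical, invented purely for illustration.

```python
class TripleStore:
    """Tiny in-memory stand-in for a graph database (NOT GraphDB's API)."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        """Store one (subject, predicate, object) relationship."""
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None is a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


def adverse_events_for_substance(store, substance):
    """Collect adverse events observed in any study of the given substance."""
    studies = [
        s for s, _, _ in store.match(predicate="investigatesSubstance", obj=substance)
    ]
    events = set()
    for study in studies:
        for _, _, event in store.match(subject=study, predicate="observedAdverseEvent"):
            events.add(event)
    return sorted(events)


store = TripleStore()
# Hypothetical entities and relationships, as if extracted from documents.
store.add("study:S001", "hasDocumentType", "Clinical Study Report")
store.add("study:S001", "investigatesSubstance", "substance:X")
store.add("study:S001", "observedAdverseEvent", "event:Headache")
store.add("study:S002", "investigatesSubstance", "substance:X")
store.add("study:S002", "observedAdverseEvent", "event:Nausea")

print(adverse_events_for_substance(store, "substance:X"))
# prints ['event:Headache', 'event:Nausea']
```

Once relationships are stored this way, a question that spans several studies becomes a short pattern match instead of a manual read through each report. A production graph database would typically answer the same kind of question through a query language rather than hand-written loops.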

The iSIM system operates in production and saves AstraZeneca both time and effort in this process. Today, iSIM contributes to increased general awareness of all existing studies and significantly decreases the cost of searching, analyzing and navigating historical clinical information.
