Accurate Patient Cohort Identification from Electronic Health Records

EHR phenotyping identifies patients with specific conditions or characteristics from electronic health records using validated, reproducible algorithms. It enables research, audit and service evaluation at scale across primary and secondary care data.

What We Do
  • Develop and validate computable phenotype algorithms across primary and secondary care
  • Build and curate clinical codelists (SNOMED CT, ICD-10, Read codes, OPCS-4)
  • Work with PhenoFlow and HDR UK Phenotype Library frameworks
  • Identify patient cohorts for research, trials and service evaluation
  • Support rare disease identification and diagnostic pathway analysis
  • Link GP and HES data to build complete, longitudinal patient phenotypes
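
To illustrate the idea behind the items above, here is a minimal sketch of a codelist-based phenotype applied to linked GP and HES events. It is an assumption-laden illustration, not our production method: the `CodedEvent` record, the `identify_cohort` function and the codes shown are hypothetical placeholders, and real phenotypes typically add rules on dates, code combinations and validation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CodedEvent:
    """One coded clinical event (hypothetical structure for illustration)."""
    patient_id: str
    source: str      # illustrative label: "GP" or "HES"
    system: str      # coding system, e.g. "SNOMED CT" or "ICD-10"
    code: str        # placeholder values below, not real clinical codes
    event_date: date


def identify_cohort(events, codelist):
    """Return {patient_id: earliest date of any event matching the codelist}.

    `codelist` is a set of (system, code) pairs, so codes from different
    coding systems can sit in one definition.
    """
    cohort = {}
    for ev in events:
        if (ev.system, ev.code) in codelist:
            first = cohort.get(ev.patient_id)
            if first is None or ev.event_date < first:
                cohort[ev.patient_id] = ev.event_date
    return cohort


# Placeholder codelist: these are NOT real SNOMED CT or ICD-10 codes.
codelist = {("SNOMED CT", "GP-CODE-1"), ("ICD-10", "HES-CODE-1")}

# Linked events for three hypothetical patients.
events = [
    CodedEvent("A", "GP", "SNOMED CT", "GP-CODE-1", date(2020, 3, 1)),
    CodedEvent("A", "HES", "ICD-10", "HES-CODE-1", date(2019, 11, 15)),
    CodedEvent("B", "GP", "SNOMED CT", "OTHER-CODE", date(2021, 1, 10)),
    CodedEvent("C", "HES", "ICD-10", "HES-CODE-1", date(2021, 6, 2)),
]

cohort = identify_cohort(events, codelist)
for patient_id, first_seen in sorted(cohort.items()):
    print(patient_id, first_seen.isoformat())
```

In this sketch, patient A is picked up from both sources and dated from the earlier hospital event, which is the kind of gain that linking primary and secondary care data is meant to provide.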

Data Sources We Phenotype Across
  • GP records: primary care coding from EMIS, SystmOne and Vision
  • Hospital Episode Statistics (HES): secondary care diagnoses, procedures and admissions
  • Linked primary and secondary care data: for complete, longitudinal cohort definitions
  • CPRD Aurum and GOLD: research-grade primary care phenotyping
  • US EHRs: for international cohort identification
  • Disease registries: condition-specific patient identification

Disease Areas We Cover
  • Cardiovascular disease
  • Respiratory conditions
  • Metabolic conditions
  • Rare conditions, including cardiovascular and respiratory conditions
  • Multimorbidity and comorbidity patterns
  • Oncology and cancer pathways

What You Get
  • Validated phenotype algorithms
  • Reproducible clinical codelists
  • Cohort characterisation reports
  • Linked primary and secondary care cohort definitions
  • HDR UK-compatible phenotype documentation