Phenotype Estimation for Patient-Centered Pediatric EHR Research (PEPPER)

Overview

PEPPER is our PCORI-funded study of phenotype estimation and analysis using electronic health record (EHR) data. This project aims to develop new statistical methods that combine the unique set of measures available for each individual to estimate a “latent phenotype.” The latent phenotype is a patient’s underlying, true disease profile, which may be only hinted at by the series of medical tests recorded in the EHR. By efficiently combining all available information for each individual, we will make fuller use of the richness and complexity of EHR data and better characterize patients.
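
To make the idea of combining each patient's own set of measures concrete, here is a minimal sketch. It is not the PEPPER methodology. It assumes a binary latent phenotype, measures that are conditionally independent given the phenotype, and Gaussian measurement distributions. The measure names, the parameter values in PARAMS, the prevalence PRIOR_CASE, and the function posterior_case are all hypothetical and chosen only for illustration. The sketch shows one simple way a patient with fewer recorded tests can still receive a phenotype probability: unrecorded measures contribute nothing to the calculation.

```python
import math
from typing import Dict, Optional

# Hypothetical (mean, sd) of each measure within each latent class.
# Illustrative numbers only; they are not estimates from PEPPER or PEDSnet.
PARAMS = {
    "hba1c": {"case": (7.5, 1.0), "control": (5.3, 0.4)},
    "glucose": {"case": (160.0, 40.0), "control": (90.0, 12.0)},
}
PRIOR_CASE = 0.01  # assumed prevalence of the phenotype


def normal_logpdf(x: float, mean: float, sd: float) -> float:
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def posterior_case(measures: Dict[str, Optional[float]],
                   prior: float = PRIOR_CASE) -> float:
    """Posterior probability of the latent phenotype given the observed measures.

    Measures recorded as None (not available for this patient) are skipped,
    so each patient is scored using only the tests they actually have.
    """
    log_case = math.log(prior)
    log_control = math.log(1.0 - prior)
    for name, value in measures.items():
        if value is None:
            continue  # measure not recorded for this patient
        case_mean, case_sd = PARAMS[name]["case"]
        ctrl_mean, ctrl_sd = PARAMS[name]["control"]
        log_case += normal_logpdf(value, case_mean, case_sd)
        log_control += normal_logpdf(value, ctrl_mean, ctrl_sd)
    # Numerically stable logistic transform of the log posterior odds.
    diff = log_control - log_case
    if diff >= 0:
        e = math.exp(-diff)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(diff))


# Two hypothetical patients with different sets of available measures.
patient_a = {"hba1c": 8.0, "glucose": None}
patient_b = {"hba1c": 5.2, "glucose": 88.0}
print(posterior_case(patient_a), posterior_case(patient_b))
```

With no measures recorded, the function returns the assumed prevalence unchanged. A realistic method would also need to estimate the class-specific parameters from data and account for dependence among measures, which this sketch does not attempt.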

In addition to statistical methods development, we will apply the new methods to identify children and adolescents with type II diabetes using data from the PEDSnet federation. Using EHR data from eight children’s hospital health systems participating in PEDSnet, we will develop a pediatric diabetes latent phenotype. This phenotype can be used in subsequent research to identify patient participants or to assess the risk of other health outcomes that may be more common in children with type II diabetes. We will work with clinician, patient, and parent members of our Research Advisory Board to identify the downstream health consequences that are most important for further study, and we will analyze associations between the newly developed diabetes latent phenotype and these outcomes. These analyses will illustrate the performance of the latent phenotype approach in a real-world context where information on risk factors and outcomes for type II diabetes is urgently needed.

Aims

  1. To develop statistical methods for estimating latent phenotypes
  2. To develop methods for incorporating latent phenotypes into analyses of health outcomes, accounting for uncertainty in phenotypes and other patient covariates
  3. To estimate a type II diabetes phenotype for patients in the PEDSnet federation and its associations with downstream health outcomes

The long-term objective of this research is to provide better statistical methods for combining inconsistently collected measures derived from the EHR.