wiki:DataStandardization

For mailing list and teleconference info, see SoftwareDev.

Milestone 2.4 data standardization and interoperability implemented (RC2)

  1. CDRN HAS SUCCESSFULLY STANDARDIZED DATA WITHIN ITS NETWORK
    Current informatics standards, interoperability between systems, and plans for achieving data standardization and interoperability between systems within network and across CDRNs
    1. Approved approaches to enhance data standardization and interoperability within each system and between systems within the network are implemented (RC2)

Background: Layered Metadata Architecture

The GPC employs an iterative approach to data standardization and interoperability, targeting a layered metadata architecture:

  1. National and international data standards specified by the nationwide health information network (NwHIN) and subsequent guidance provided by the Office of the National Coordinator for Health Information Technology (ONCHIT) and outlined by Stage 2 and Stage 3 Meaningful Use (MU) criteria
  2. PCORI CDM compliant view for SQL
  3. GPC metadata view for network queries with i2b2
  4. Local site terminology
GraphViz image
  • The GPC-STD team has reviewed and discussed the PCORI DSSNI guiding principles for data standardization as the DSSNI task force has provided them
  • We have reviewed and discussed the PCORI CDM v1 (#89)

Background: i2b2 data model and HERON ETL

We use the flexible i2b2 data model to facilitate standardization of terminologies and data. We began by setting up an installation of i2b2 as the GPC term sharing service, babel, and loading terminology from all sites into it (#1). This lets us compare and contrast approaches as we align our terminologies across sites. The i2b2 data model also allows us to consolidate data from:

  1. EHRs,
  2. administrative “billing” data and derived benchmarking datasets such as the University HealthSystem Consortium Clinical Data Base (UHC CDB),
  3. research registries (e.g. NAACCR Tumor Registries) and
  4. patient reported outcome measures for our three target populations
GraphViz image
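
As a rough illustration of the consolidation described above, the sketch below puts records from two of the sources into one fact shape. It is not the HERON ETL code. ObservationFact is a simplified stand-in that borrows a few column names from the i2b2 observation_fact table; the input row layouts, the converter functions, and the concept code prefixes are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ObservationFact:
    """Simplified stand-in for one row of the i2b2 observation_fact table."""
    patient_num: int
    concept_cd: str                   # coded term, prefix format is illustrative
    start_date: date
    sourcesystem_cd: str              # which feed the fact came from
    nval_num: Optional[float] = None  # numeric value, if any


def from_ehr_diagnosis(row: dict) -> ObservationFact:
    # Hypothetical EHR extract layout: {"pat_id", "icd9", "dx_date"}
    return ObservationFact(row["pat_id"], "ICD9:" + row["icd9"],
                           row["dx_date"], "EHR")


def from_registry(row: dict) -> ObservationFact:
    # Hypothetical registry layout: {"patient", "item", "value", "date"}
    code = f"NAACCR|{row['item']}:{row['value']}"
    return ObservationFact(row["patient"], code, row["date"], "TUMOR_REGISTRY")


# Two different feeds end up as rows of the same shape.
facts = [
    from_ehr_diagnosis({"pat_id": 1, "icd9": "174.9",
                        "dx_date": date(2013, 5, 1)}),
    from_registry({"patient": 1, "item": "400", "value": "C509",
                   "date": date(2013, 5, 3)}),
]
```

Because every source lands in the same structure, downstream queries and terminology mappings only need to understand concept codes, not each source's native layout.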

The HERON open source ETL code serves as the GPC reference; it has been adopted at two sites that did not have an existing i2b2 ETL infrastructure when phase 1 began. The HERON code for NAACCR ETL was used at all GPC sites, whether directly or as a design guide (#44), to load tumor registry data for breast cancer research.

Milestone: shared, repeatable GPC CDM ETL

We have developed a shared, repeatable refresh process: once a site has aligned its i2b2 terminology with GPC standards, shared ETL scripts (CDM ETL source, #145) build a data warehouse with CDM tables from the i2b2 data warehouse, using mappings between CDM terms and GPC terms:

GraphViz image

This process is operational at three GPC sites. Other sites have achieved equivalent results using local enterprise data warehouse approaches.
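
The mapping step of the refresh process can be sketched as follows. This is an illustration only, not the shared ETL scripts from #145: the GPC term paths, the GPC_TO_CDM table, and the build_demographic function are hypothetical, and the CDM column names and values shown should be checked against the PCORI CDM specification.

```python
# Hypothetical mapping from GPC-standard i2b2 term paths
# to (CDM column, CDM value) pairs.
GPC_TO_CDM = {
    "\\GPC\\Demographics\\Sex\\Female\\": ("SEX", "F"),
    "\\GPC\\Demographics\\Sex\\Male\\": ("SEX", "M"),
}


def build_demographic(facts, mapping=GPC_TO_CDM):
    """Fold (patient_num, gpc_path) facts into one CDM-style row per patient.

    Facts with no mapping are returned separately so they can be
    reviewed rather than silently dropped.
    """
    rows, unmapped = {}, []
    for patient_num, gpc_path in facts:
        target = mapping.get(gpc_path)
        if target is None:
            unmapped.append((patient_num, gpc_path))
            continue
        column, value = target
        rows.setdefault(patient_num, {"PATID": patient_num})[column] = value
    return list(rows.values()), unmapped


facts = [
    (1, "\\GPC\\Demographics\\Sex\\Female\\"),
    (2, "\\GPC\\Demographics\\Sex\\Male\\"),
    (2, "\\LOCAL\\Unaligned\\Term\\"),
]
demographic, unmapped = build_demographic(facts)
```

The function depends only on its inputs, so rerunning it on the same facts and mapping rebuilds the same rows, which is the property a repeatable refresh needs.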

Our approach to integrating CDM and i2b2 (#109) was informed by collaboration with the SCILHS and PaTH CDRNs.

Initial priority domains were:

  1. Demographics (#67), Diagnoses (#63, #90), Vitals (#23) to support CDM ETL
  2. Cancer Tumor Registry (#185) to support breast cancer research
  3. Medications (#78) based on successful deployment of the HERON ETL code for medications at several sites.

Plan: Data Domains

With our architecture and approach established, we continue to iterate and expand to other domains:

  • Lab results using LOINC (#158)
  • Procedures using CPT, ICD9, HCPCS (#243)
  • Patient Reported Outcomes (#102)

We have reviewed CDM v2 (#157) and we plan to approach the expanded needs of CDM v2 iteratively, driven by PCORI requirements and the needs of our research teams.

Plans: Data Quality and Interoperability Measurement

Ongoing data quality work (#232) aims to measure

  • what portion of each site's observation facts are mapped
    • to GPC standard terms (demographics, diagnoses, meds, ...)
      • from high level categories
      • down to individual leaf terms
    • to the CDM
  • how many observation facts and patients we have across the network
    • for GPC standard terms
    • for CDM terms
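
A minimal sketch of these measurements, under stated assumptions: each fact is a (site, patient_num, term_path) tuple, term_path is a tuple of hierarchy levels (a simplification of i2b2 term paths), and None marks a fact whose local code has no GPC-standard mapping. The function names and sample data are hypothetical, and patient numbers are assumed to be site-local.

```python
from collections import Counter


def mapped_portion_by_site(facts):
    """Fraction of each site's facts that map to a standard term."""
    total, mapped = Counter(), Counter()
    for site, _patient, path in facts:
        total[site] += 1
        if path is not None:
            mapped[site] += 1
    return {site: mapped[site] / total[site] for site in total}


def rollup(facts):
    """Count facts and distinct patients at every hierarchy level,
    from high-level categories down to leaf terms."""
    fact_counts = Counter()
    patients = {}
    for site, patient_num, path in facts:
        if path is None:
            continue
        for depth in range(1, len(path) + 1):
            prefix = path[:depth]
            fact_counts[prefix] += 1
            # Patient numbers are assumed site-local, so key on both.
            patients.setdefault(prefix, set()).add((site, patient_num))
    return {p: (fact_counts[p], len(patients[p])) for p in fact_counts}


facts = [
    ("site_a", 1, ("Diagnoses", "ICD9", "174.9")),
    ("site_a", 1, ("Demographics", "Sex", "Female")),
    ("site_a", 2, None),
    ("site_b", 1, ("Diagnoses", "ICD9", "174.9")),
]
portions = mapped_portion_by_site(facts)
levels = rollup(facts)
```

The same two functions apply to CDM terms if the term paths are replaced by CDM table and value identifiers.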

GPC Interoperable Standardization Measurement Framework

status: #160, Domains by Site

DataStandardization is a central part of the GPC effort. From Review Criterion 2 of the proposal:

The Greater Plains Collaborative (GPC) is poised to contribute evidence that the national investment in electronic health records will be useful for measuring healthcare’s impact on patient outcomes by building an interoperable federated research network adhering to national and international data standards specified by the nationwide health information network (NwHIN) and subsequent guidance provided by the Office of the National Coordinator for Health Information Technology (ONCHIT) and outlined by Stage 2 and Stage 3 Meaningful Use (MU) criteria. Figure 2.1 illustrates our approach.

Figure 2.1

We will use the I2B2DataModel to consolidate data from

  1. EHRs,
  2. administrative “billing” data and derived benchmarking datasets such as the University HealthSystem Consortium Clinical Data Base (UHC CDB),
  3. research registries (e.g., Tumor Registries) and
  4. patient reported outcome measures for our three target populations (e.g., prototyped in REDCap).

We will bind both the internal EHR concept codes and mapped codesets to standard terminologies (upper left in Figure 2.1) into i2b2 so we can quantitatively measure Meaningful Use stage 2 attainment based upon both concept coverage and the amount of observed data. ... This provides a feedback cycle between the research informatics teams and the operational clinical information systems teams. ... This work is possible because of the increasing support of the NwHIN domain model by Epic, Cerner, and other EHR vendors.
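
The distinction between concept coverage and the amount of observed data can be illustrated with a small sketch. The coverage function, the local codes, the standard codes, and the counts below are all hypothetical; this is one possible way to compute the two figures, not the method used in the proposal.

```python
def coverage(local_codes, code_to_standard, fact_counts):
    """Return (concept coverage, data coverage) for one site.

    local_codes:      local EHR concept codes in use
    code_to_standard: dict of local code -> standard terminology code
    fact_counts:      dict of local code -> number of observed facts
    """
    codes = list(local_codes)
    bound = [c for c in codes if c in code_to_standard]
    concept_cov = len(bound) / len(codes) if codes else 0.0
    total = sum(fact_counts.get(c, 0) for c in codes)
    covered = sum(fact_counts.get(c, 0) for c in bound)
    data_cov = covered / total if total else 0.0
    return concept_cov, data_cov


# Placeholder codes: half of the local codes are bound to a standard,
# but those codes account for most of the observed data.
local_codes = ["LOCAL:1", "LOCAL:2", "LOCAL:3", "LOCAL:4"]
code_to_standard = {"LOCAL:1": "STD:A", "LOCAL:2": "STD:B"}
fact_counts = {"LOCAL:1": 900, "LOCAL:2": 50, "LOCAL:3": 30, "LOCAL:4": 20}

concept_cov, data_cov = coverage(local_codes, code_to_standard, fact_counts)
```

Reporting both figures shows whether unmapped codes are a long tail of rarely used terms or a gap that affects a large share of the data.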

Plans

Ticket Index

data standards tickets

Epic Mapping

Most of the GPCSites use Epic. Jim sent several of us the following in an "Epic Data Standardization" message on Friday, September 06, 2013 11:56 AM:

  • Meaningful Use Terminology Mapping Guide
    Last Revised: July 3, 2013
    Based on the Final Rule for Stage 2 of Meaningful Use

HERON: i2b2 at KU Med Center

See HERON.

U.S. National Recommendations

In discussion around the August HERON workshop, Dr. Campbell noted the NCVHS recommendations around SNOMED etc.:

The National Committee on Vital and Health Statistics (NCVHS), a public–private advisory committee established to provide advice to DHHS and Congress on national health information policy, has for many years recommended that the federal government assume a more active role in establishing national data standards (National Committee on Vital and Health Statistics, 2000).

especially chapter 4 and the diagram on page 157.

Terminology Alignment?

As "Data Standardization" is an awfully broad term, perhaps TerminologyAlignment is a better topic name for much of this work. But in the context of this PCORI project, it seems to be called data standardization.

Last modified on Feb 5, 2018 5:06:00 PM
