Greater Plains Collaborative Reusable Observable Unified Study Environment (GROUSE)

GROUSE is our project to obtain health insurance claims from the Centers for Medicare & Medicaid Services (CMS) through the Research Data Assistance Center (ResDAC) at the University of Minnesota.

A summary of our motivation and status at that time was provided to the PCORnet Council in April 2018.

Using GROUSE for Analysis

in progress; see #623 and these slides:

access, regulatory issues...

tools ...

Tracking on submissions for Cancer-RCR-3 sites

Site | i2b2                      | i2b2 meta-data | CDM
IOWA | X                         | X              | Securefile link expired
MCW  | missing data_offset field | X              | X

CMS Data

i2b2 Data

Eventually, we plan to provide one integrated i2b2 with both the CMS data and the site data.


GROUSE query by days supply, dx_source, and other PCORNet modifiers
collect several i2b2 datamarts for GROUSE
collect WISC i2b2 repository copy for GROUSE
collect i2b2 datamarts from non Aim 3 sites for GROUSE


Eventually, we plan to provide one integrated CDM with both the CMS data and the site data.


GROUSE: integrate CDM data from non-RCR Aim 3 GPC sites
Send copy of CDM to KUMC for Cancer RCR / GROUSE integration testing

Data Integration and Record Linkage

from executive summary


We will be organizing our technical milestones on the roadmap (e.g. milestone:grouse-research-1) but also have preliminary thoughts under CompleteData.

Photo by Bob Gress, Birds in Focus.

Development: Data Staging, ETL


Design Sketches, Usage Scenarios, Customers

In a 9 Aug meeting, we (at KUMC) identified 7 usage scenarios. We did some more detailed planning on the first few in a 29 Sep meeting.

Note CancerClaimsPilot

other customers:

  • IU ALS and DVT
  • Anne B. at UMN

1 CMS Files, de-identified


training for GROUSE SQL queries
SAS GUI for GROUSE Analysts


  • Mary S. from UIOWA on
    • CancerRCR Aim 3
    • pilot project(s) (which ones, exactly?)
  • Dr. Peggy Pessig is studying adverse drug events at MCRF.

The de-identified files (tables) have been prepared, using grouse/cms_deid.

Geocoding Integration

scheduled for Jan 2018 in an Oct 4, 2017 KUMC planning meeting

Note consensus in ticket:508 on using obfuscated geographic location codes.

pilot 26, Project 3:

... Merge census-tract-level sociodemographic information derived from the American Community Survey and the 2010 Census Summary File to study cohort. Hypotheses: Differences in chemotherapy delays or discontinuation by race, ethnicity, or other sociodemographic characteristics will not be explained by differences in hospitalizations or other evidence of complications. Most patients will receive chemotherapy and subsequent treatments for complications from a single institution.
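The merge described in this excerpt could be sketched roughly as follows. This is only an illustration under stated assumptions: the column names (`tract_code_obf`, `median_income`), the function name, and the in-memory list-of-dicts representation are all hypothetical, and the tract-level table is assumed to be keyed by the same obfuscated location codes as the cohort.

```python
def merge_tract_attributes(cohort, tract_table, key="tract_code_obf"):
    """Left-join tract-level attributes onto cohort rows.

    cohort: list of dicts, one per patient (hypothetical shape)
    tract_table: list of dicts, one per obfuscated tract code (hypothetical shape)
    Patients with no matching tract are kept and flagged as unmatched.
    """
    by_code = {row[key]: row for row in tract_table}
    merged = []
    for patient in cohort:
        attrs = by_code.get(patient.get(key), {})
        combined = dict(patient)
        for name, value in attrs.items():
            if name != key:
                combined[name] = value
        # Record whether a tract row was found, so unmatched patients are visible.
        combined["tract_matched"] = bool(attrs)
        merged.append(combined)
    return merged


cohort = [
    {"patid": "p1", "tract_code_obf": "T001"},
    {"patid": "p2", "tract_code_obf": "T999"},
]
tracts = [{"tract_code_obf": "T001", "median_income": 52000}]
result = merge_tract_attributes(cohort, tracts)
```

Keeping unmatched patients (a left join) rather than dropping them makes it possible to report how much of the cohort lacks tract-level data.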

2 ETL CMS Files to CDM

CancerRCR#Aim3 is a customer; Jan 1 sync point: test SAS code.

source code: cms_i2p module and surrounding code.

Alternative option:

3 ETL CMS Files to i2b2

source code: grouse/etl_i2b2



KUH pop health

  • ticket:526 finder file
    • and subsequent tickets to get the crosswalk file(s), since the scope of #526 has been narrowed:
    • #581: task: Crosswalk files for the remaining 4 GPC sites back from GDIT (new)
    • #564: task: Crosswalk files from 6 sites back from GDIT (closed: fixed)
  • ticket:??? create offset file: hash, pat_num, site_days_offset; establish master_days_offset
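One way the offset file (hash, pat_num, site_days_offset, master_days_offset) might be put together is sketched below. It is not the project's implementation: the function names, the salted SHA-256 hash, the range of the master offset, and the re-shifting arithmetic are assumptions made for illustration only.

```python
import hashlib
import random
from datetime import date, timedelta


def make_offset_rows(pat_nums, site_offsets, salt, max_days=365, seed=None):
    """Build rows of hash, pat_num, site_days_offset, master_days_offset.

    pat_nums: iterable of site patient numbers (hypothetical input)
    site_offsets: dict pat_num -> days offset the site already applied
    salt: secret string mixed into the hash (assumption: salted SHA-256)
    """
    rng = random.Random(seed)
    rows = []
    for pat_num in pat_nums:
        digest = hashlib.sha256((salt + str(pat_num)).encode("utf-8")).hexdigest()
        # Assumption: one master offset per patient, between -max_days and -1.
        master = -rng.randint(1, max_days)
        rows.append({
            "hash": digest,
            "pat_num": pat_num,
            "site_days_offset": site_offsets[pat_num],
            "master_days_offset": master,
        })
    return rows


def reshift(site_shifted_date, row):
    """Undo the site's shift, then apply the master shift."""
    true_date = site_shifted_date - timedelta(days=row["site_days_offset"])
    return true_date + timedelta(days=row["master_days_offset"])


rows = make_offset_rows([101, 102], {101: -10, 102: -20}, salt="example", seed=1)
```

Under this scheme every date for a patient moves by the same master offset, so intervals between a patient's events are preserved regardless of which site contributed them.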

5 KUMC i2b2 + CMS i2b2

KUH pop health breast cancer: tumor registry (RW, SS)

6 big CDM (CMS + many sites)

  • #619 integrate CDM data from GPC sites

CancerRCR#Aim3 relies on this.

Replace 12 GPC popmednet nodes with big CDM?

7 big i2b2 (CMS + many sites)

  • #595 collect i2b2 datamarts
  • #597: spec for crosswalk to accompany i2b2 datamarts for GROUSE

obesity: Davis @ KU, Larry at WISC, Joan Neuner at MCW

breast cancer: tumor registry (EC and co)

8 Provider Availability by County (potential supplemental data integration)

Check out federal datasets that measure the number of health care providers available at the county level. Also see the materials and videos on ResDAC from Beth Virnig for ideas.

9 death info by cohort

from HackathonFive: RW: let’s make a note of that; e.g. a side benefit of the GROUSE finder file process is that we can give you a table with patid and death date.
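A table of patid and death date for a cohort might be derived along these lines. This is a sketch only: the dictionary shapes for the crosswalk and the death-date extract, and the function name, are hypothetical stand-ins for whatever the finder file process actually produces.

```python
from datetime import date


def death_table(cohort_patids, crosswalk, death_dates):
    """Return sorted (patid, death_date) pairs; death_date is None when unknown.

    crosswalk: dict patid -> beneficiary id (hypothetical shape)
    death_dates: dict beneficiary id -> date (hypothetical shape)
    """
    rows = []
    for patid in sorted(cohort_patids):
        bene_id = crosswalk.get(patid)
        # A patient missing from the crosswalk simply gets no death date.
        rows.append((patid, death_dates.get(bene_id)))
    return rows


table = death_table(
    {"p2", "p1", "p3"},
    {"p1": "b1", "p2": "b2"},
    {"b1": date(2016, 5, 4)},
)
```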

Last modified on Jun 15, 2018 3:10:23 PM
