  • Syllabus linked as an attachment

August 22, 2017

Student Pages:

August 24, 2017

closing class comments

  • Students are formulating ideas and will post them to their wiki pages.
  • We will see whether the computable phenotype stated in class (Female, Diabetic and Obese) can be replicated in HERON.

August 29, 2017

Administrative and Clinical Systems as Sources.

Discussion of our example from last class. Introduction to HERON and how it displays sources.

Review the ONC Materials for next class:

  • Component 3, Terminology in Health Care and Public Health Settings: skim Unit 15, but especially read Unit 16.
  • Component 6, Health Management Information Systems: Units 1, 2, especially 3, and 9.
  • Homework and Class Participation due for next class:
  • On your wiki page, write comments and/or questions from reading those modules, including areas where you could use greater clarity.
  • Also, pick one additional component and subunit(s). Skim them and comment on your wiki page about how applicable this material may be for future students in your degree program.

August 31, 2017

Review of the students' reviews of the ONC training materials wrt understanding clinical and administrative systems.

Review of ONC materials and HERON ontologies for each data domain.

Russ' office hours will mainly be 4:30-6 on Fridays, with the following exceptions:

  • Wednesday October 4, from 4:30-6 due to PCORI data committee meeting in LA.
  • Thursday November 9, from 4:30-6 due to the Veterans Day holiday on Friday November 10th.
  • Thursday November 24th, no office hours that week due to Thanksgiving Holiday.

September 5, 2017

Preliminary hypotheses. Be prepared to give a 5-minute description of your subject of interest, your hypothesis, and your initial sense of whether KUMC/HERON will have the relevant data for your research topic.

  • If you don't have a topic, give 5 minutes of your interests and I will help with a topic.

September 7, 2017

HERON and Querying: review the training manual and videos. Hands-on demo and discussion in class today and Tuesday.

September 12, 2017

HERON and Querying

September 14, 2017

Sakher explains his topic.

HERON and Querying. Discussion of last year's query exam. I will be developing a new exam with a new topic, but it will cover the same general areas. Reference these links for last year's example.

September 19, 2017

Data Management and Data Delivery from HERON using REDCap.

As an introduction to REDCap, please review the built-in videos.

In class we will focus on how we use REDCap to distribute data and will highlight some of the built-in data profiling tools. However, we won't be focusing much on REDCap's core utility, which is electronic data capture for prospective studies or chart abstraction.

Review the REDCap videos and the HERON videos on understanding data receipt.

Also, work on a draft data request for your project similar to the one created by Jasmin for her project (see attached file HERONSponsorshipAndDataAccess2017-09-19_1058.pdf).

September 21, 2017

  • Exam was distributed yesterday. Brief Q&A.
  • Discussion and finalizing your data requests
  • We deferred the introduction of SQL to spend time focused on helping everyone get their project data requests refined for submission.

September 26, 2017

  • Continue to finish your data requests. About half the class has prepared their requests and Russ has submitted them.
  • Introduction to SQL

SQL Tutorial and Exercises: W3Schools has a good interactive website covering the SQL basics, with an online tool that lets you query a tiny relational database.

We will go through some of the exercises together in class, but please become familiar with them on your own, as the data you will receive will be in a SQLite database.

Next class, we will focus on SQL and interacting with data from HERON, specifically the data tables in Jasmin's project.

In preparation, make sure you can connect to mydrive so you can securely manage the dataset that you will then load into SQLite. Instructions are provided below.

Steps to access files (Windows).

  1. Open myComputer folder.
  2. Right click
  3. Click add a network location
  4. Click next
  5. Choose a custom network location. Enter
  6. Enter KUMC login/password

Steps to access files (Mac).

  1. Open Finder.
  2. Select "Go" from the menu
  3. Select "Connect to server"
  4. enter
  5. Enter KUMC login/password. Make sure it's your KUMC login, not your computer userid if it's different.

For either Mac or Windows navigate to the folder for PVRM_868_fall_2017.

September 28, 2017

We will continue the introduction to SQL and especially areas that will be most useful to your clinical data:

  • Joins
  • Creating tables to store the results of operations such as joins and min(), max() with GROUP BY
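
As a rough sketch of those two ideas together, the example below joins two toy tables and stores a per-patient summary in a new table. It uses Python's built-in sqlite3 module only so that it runs against an in-memory database. The tables `patients` and `labs` and all of their columns are made up for illustration and are not the layout of Jasmin's project data.

```python
import sqlite3

# In-memory toy database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patients (patient_num INTEGER PRIMARY KEY, sex TEXT);
CREATE TABLE labs (patient_num INTEGER, lab_date TEXT, a1c REAL);
INSERT INTO patients VALUES (1, 'F'), (2, 'M'), (3, 'F');
INSERT INTO labs VALUES
  (1, '2016-01-10', 7.2), (1, '2016-06-02', 6.8),
  (2, '2016-03-15', 8.1);

-- Join labs to patients, then store one summary row per patient.
CREATE TABLE lab_summary AS
SELECT p.patient_num,
       p.sex,
       MIN(l.lab_date) AS first_lab_date,
       MAX(l.lab_date) AS last_lab_date,
       MAX(l.a1c)      AS max_a1c
FROM patients p
JOIN labs l ON l.patient_num = p.patient_num
GROUP BY p.patient_num, p.sex;
""")

lab_summary = conn.execute(
    "SELECT * FROM lab_summary ORDER BY patient_num").fetchall()
for row in lab_summary:
    print(row)
```

Patient 3 has no labs, so the inner join drops that patient from `lab_summary`. The SQL between the triple quotes can also be pasted directly into the SQLite browser.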

We will look at this first with the w3schools online tutorial, then move to using the data from Jasmin's project and loading it into our own SQLite database.

A tool that will be useful for this is SQLiteBrowser. Please download this tool:

Additionally, during Fall Break (October 14-17), we will be holding our Greater Plains Collaborative Learning Engagement Conference here in KC at the Kauffman Foundation. Students in the past have learned from and enjoyed attending the event. Let Hillary Sandoval know if you are interested in sitting in.

We spent considerable time in class getting everyone to the point where they could load the data, import it into SQLite, and access it from SQLite browser.

October 3, 2017

Everyone has now finished submitting data requests for their projects. However, please coordinate with Maren to polish the final computable phenotype for the data request.

This class, I want to build on last class and have you spend more time using SQL and SQLite using Jasmin's database.

Did everyone watch Maren's videos?

We may also use the prior SQL Exam as a learning exercise.

October 5, 2017

I will be traveling, but Maren and Tamara will continue the instruction now that you have learned how to subselect variables from the data view table and load them into a new table for your analysis.
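
As a reminder of that subselect step, here is a minimal sketch using Python's built-in sqlite3 module and a toy stand-in for data_view. The columns `patient_num`, `start_date` and `nval`, the variable name 'BMI', and the codes are assumptions for illustration; check the actual column names in your own database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Toy stand-in for data_view; column names and values are assumed.
CREATE TABLE data_view (patient_num INTEGER, variable TEXT, code TEXT,
                        start_date TEXT, nval REAL);
INSERT INTO data_view VALUES
  (1, 'BMI',            'VITAL:BMI', '2016-02-01', 31.5),
  (1, 'HEMOGLOBIN A1C', 'LAB:A1C',   '2016-02-03', 7.4),
  (2, 'BMI',            'VITAL:BMI', '2016-04-11', 24.0);

-- Pull one variable out of the long data_view table into its own table.
CREATE TABLE bmi AS
SELECT patient_num, start_date AS bmi_date, nval AS bmi
FROM data_view
WHERE variable = 'BMI';
""")

bmi_rows = conn.execute("SELECT * FROM bmi ORDER BY patient_num").fetchall()
for row in bmi_rows:
    print(row)
```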

They will continue and finish the lectures regarding the other common transformations of clinical temporal data and linking your analysis tables together into a final analytic set.

I have also distributed last year's SQL exam as a guide for common use of SQL databases based on i2b2 data.

October 10, 2017

Checking in that everyone has their query for their database defined.

Continued work on building clinically relevant analytic sets in SQL. Discussion of how to define your analytic set to answer your hypotheses.

October 12, 2017

Using Jasmin's database and the prior exam, we will use temp tables to stitch data into a final analytic set. Note: another good example of data transformation into analytic sets is provided in the HERON training manual and the Receiving HERON data training videos.

Based on feedback from the last class, review of the method to aggregate codes based on the code_info_view table.
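
One possible shape for the temp-table approach is sketched below with Python's built-in sqlite3 module and toy data: build one temp table per measure, each with one row per patient, then LEFT JOIN them onto the cohort. The columns `patient_num`, `start_date` and `nval` and the variable names are assumptions, not the actual contents of Jasmin's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Toy stand-in for data_view; column names and values are assumed.
CREATE TABLE data_view (patient_num INTEGER, variable TEXT,
                        start_date TEXT, nval REAL);
INSERT INTO data_view VALUES
  (1, 'BMI',         '2016-01-05', 30.0),
  (1, 'BMI',         '2016-07-01', 28.5),
  (1, 'SYSTOLIC BP', '2016-01-05', 150.0),
  (1, 'SYSTOLIC BP', '2016-07-01', 138.0),
  (2, 'SYSTOLIC BP', '2016-02-20', 120.0);

-- One row per patient in every intermediate table.
CREATE TEMP TABLE cohort AS
SELECT DISTINCT patient_num FROM data_view;

CREATE TEMP TABLE bmi_summary AS
SELECT patient_num, MAX(nval) AS max_bmi
FROM data_view WHERE variable = 'BMI' GROUP BY patient_num;

CREATE TEMP TABLE sbp_summary AS
SELECT patient_num, MAX(nval) AS max_sbp
FROM data_view WHERE variable = 'SYSTOLIC BP' GROUP BY patient_num;

-- Stitch the pieces together; LEFT JOIN keeps patients with missing measures.
CREATE TABLE analytic_set AS
SELECT c.patient_num, b.max_bmi, s.max_sbp
FROM cohort c
LEFT JOIN bmi_summary b ON b.patient_num = c.patient_num
LEFT JOIN sbp_summary s ON s.patient_num = c.patient_num;
""")

analytic_set = conn.execute(
    "SELECT * FROM analytic_set ORDER BY patient_num").fetchall()
for row in analytic_set:
    print(row)
```

Patient 2 has no BMI recorded, so `max_bmi` comes back as NULL (None in Python) instead of the patient disappearing from the analytic set.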

select * from data_view where exists (select 1 from code_info_view where code_info_view.variable = '401-405.99 HYPERTENSIVE DISEASE' and data_view.code = code_info_view.code)

In class we will then explore ways in which we might split that set based on codes or by referencing the path (but note that you need to use i2b2 to interpret the rather cryptic path names in the code_info_view table).

select * from data_view where exists (select 1 from code_info_view where code_info_view.variable = '401-405.99 HYPERTENSIVE DISEASE' and data_view.code = code_info_view.code) ;

select distinct data_view.code from data_view where data_view.variable = '401-405.99 HYPERTENSIVE DISEASE' ;

/* now, let's see if we can do it on a split of hypertensive disease and not include those with the CKD htn */

select * from code_info_view where code_info_view.code like '%403%' ;

select * from data_view where exists (select 1 from code_info_view where code_info_view.variable = '401-405.99 HYPERTENSIVE DISEASE' and data_view.code = code_info_view.code and code_info_view.code_path not like '\i2b2\Diagnoses\A18090800\A8359006\A8359014\A8359777\A10863166\%') ;
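
To try the same pattern without the project database, here is a small self-contained sketch using Python's built-in sqlite3 module. The toy rows, the 'ICD9:' code prefix, the `patient_num` column, and the readable paths are all assumptions for illustration. Real code_info_view paths look like the cryptic one above and have to be looked up in i2b2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Raw string so the backslashes in the toy paths reach SQLite unchanged.
conn.executescript(r"""
CREATE TABLE code_info_view (variable TEXT, code TEXT, code_path TEXT);
CREATE TABLE data_view (patient_num INTEGER, code TEXT);
INSERT INTO code_info_view VALUES
  ('401-405.99 HYPERTENSIVE DISEASE', 'ICD9:401.9',  '\i2b2\Diagnoses\HTN\ESSENTIAL\'),
  ('401-405.99 HYPERTENSIVE DISEASE', 'ICD9:403.90', '\i2b2\Diagnoses\HTN\CKD\'),
  ('250 DIABETES MELLITUS',           'ICD9:250.00', '\i2b2\Diagnoses\DM\');
INSERT INTO data_view VALUES
  (1, 'ICD9:401.9'), (2, 'ICD9:403.90'), (3, 'ICD9:250.00');
""")

# All facts whose code falls under the hypertensive disease variable.
all_htn = [r[0] for r in conn.execute(r"""
select patient_num from data_view where exists (
  select 1 from code_info_view
  where code_info_view.variable = '401-405.99 HYPERTENSIVE DISEASE'
    and data_view.code = code_info_view.code)
order by patient_num""")]

# Same, but excluding codes that sit under the (toy) CKD hypertension path.
htn_without_ckd = [r[0] for r in conn.execute(r"""
select patient_num from data_view where exists (
  select 1 from code_info_view
  where code_info_view.variable = '401-405.99 HYPERTENSIVE DISEASE'
    and data_view.code = code_info_view.code
    and code_info_view.code_path not like '\i2b2\Diagnoses\HTN\CKD\%')
order by patient_num""")]

print(all_htn, htn_without_ckd)
```

SQLite's LIKE has no default escape character, so the backslashes in the path match literally and only the trailing % acts as a wildcard.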

October 19, 2017

As discussed last class, come prepared to discuss your analytic file(s) and timelines. Please attach these to your wiki page and add some narrative on the wiki about the specific requirements you have. I know some of you have done this but may have placed it on the shared drive. Let's put them on the wiki so we can share our progress with future students and collaborators.

These are likely to include:

  • combining diagnosis codes into a co-morbidity category (e.g. all Diabetes codes in ICD9, ICD10 and IMO DX_IDs)
  • defining the key intervention or observation that anchors other observations included in your study (e.g. the lab result or vital sign prior to the intervention/surgery/diagnosis/drug but within a certain time window)
  • potentially defining a control population
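
A minimal sketch of the first two requirements, using Python's built-in sqlite3 module and toy data: a small mapping table rolls individual codes up into a co-morbidity category, and a correlated subquery picks the most recent lab within 90 days before the index date. Every table, column, code format, and the 90-day window here is an assumption for illustration. Adapt them to your own timeline.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- All tables and values below are toy assumptions.
CREATE TABLE index_events (patient_num INTEGER, index_date TEXT);
CREATE TABLE dx (patient_num INTEGER, code TEXT);
CREATE TABLE labs (patient_num INTEGER, lab_date TEXT, a1c REAL);
CREATE TABLE code_category (code TEXT, category TEXT);

INSERT INTO index_events VALUES (1, '2016-06-01'), (2, '2016-03-01');
INSERT INTO dx VALUES (1, 'ICD10:E11.9'), (2, 'ICD9:401.9');
INSERT INTO code_category VALUES
  ('ICD9:250.00', 'DIABETES'), ('ICD10:E11.9', 'DIABETES');
INSERT INTO labs VALUES
  (1, '2016-01-01', 8.9),   -- too old (more than 90 days before index)
  (1, '2016-04-15', 8.0),
  (1, '2016-05-20', 7.6),   -- most recent lab inside the window
  (1, '2016-06-10', 7.1),   -- after the index date
  (2, '2015-10-01', 5.6);   -- too old

CREATE TABLE baseline AS
SELECT i.patient_num,
       -- 1 if any diagnosis code maps to the DIABETES category, else 0
       EXISTS (SELECT 1 FROM dx d
               JOIN code_category c ON c.code = d.code
               WHERE d.patient_num = i.patient_num
                 AND c.category = 'DIABETES') AS has_diabetes,
       -- latest lab strictly before the index date, within 90 days
       (SELECT l.a1c FROM labs l
        WHERE l.patient_num = i.patient_num
          AND l.lab_date < i.index_date
          AND julianday(i.index_date) - julianday(l.lab_date) <= 90
        ORDER BY l.lab_date DESC LIMIT 1) AS baseline_a1c
FROM index_events i;
""")

baseline = conn.execute(
    "SELECT * FROM baseline ORDER BY patient_num").fetchall()
for row in baseline:
    print(row)
```

The date comparisons rely on dates being stored as ISO text (YYYY-MM-DD). If your dates are in another format they would need converting first.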

We made it through three students' timelines and analytic files: Natalie, Carlee, and Ramzi.

October 24, 2017

The SQL Exam was distributed today. It will be due on November 2nd.

Let's continue this class by reviewing additional students' timelines and analytic sets. Please remember to update your wiki page as you refine your project. You may also comment on other students' presentations.

We made it through three students' timelines and analytic files: Lubna, Abdul, Mohammed.

October 26, 2017

We will finish reviewing students' timelines and analytic sets: Sakher and Kosaku.

I have slipped the exam a week to give you more time but we are also a bit ahead of schedule wrt finalizing projects.

How is it going with your datasets to move from design (timeline and file specification) to actual data?

Based on reviewing your projects the last several weeks, we can focus further discussion in a couple areas to gain the most value for you in the class and for your project:

  • I think it's becoming clear that having a precise timeline and consort diagram for your study may be very beneficial for some of you
  • discussion regarding multiple clinical measures for your condition of interest (e.g. diagnoses, labs, and meds for diabetes).
  • windows of observation for follow up: as you dig into your data, how will it compare against existing knowledge? (Ex: Abdul's 6-month window to see evolution of comorbidities)
    • Are differences due to the irregular sampling problem common to clinical care and documentation?
  • Reinforcing an early observation wrt clinical documentation: it's important to understand the workflow and the quantity of data recorded before you assume it can be used as a reasonably complete covariate in your study (e.g. insomnia severity recorded in the concussion checklist).
    • It may be valuable to have a discussion of how to approach this and how to gain insight through rounding in clinical areas from the different perspectives of medical, nutrition and rehabilitation students.
    • To what degree have you met with people documenting all the data you are depending upon?
  • challenges with "larger" data and query performance.
    • This may or may not be an issue depending on your cohort size and the types of variables you have included.
    • There are trade-offs in data manipulation and analysis: space versus speed versus flexibility. We can discuss strategies such as intermediate tables and possibly determine whether it would be valuable to introduce the concept of indexing for the next class.
    • For future classes, we could introduce a discussion of distributed computing and newer data storage formats.
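
To give a feel for the indexing idea, the sketch below uses Python's built-in sqlite3 module to compare SQLite's query plan for the same lookup before and after adding an index. The toy table, its columns, and the index name `idx_data_view_code` are assumptions. Whether an index actually helps on your data depends on your cohort size and on which columns you filter and join on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy stand-in for data_view with 10,000 generated rows (assumed columns).
conn.execute(
    "CREATE TABLE data_view (patient_num INTEGER, code TEXT, nval REAL)")
conn.executemany(
    "INSERT INTO data_view VALUES (?, ?, ?)",
    [(i % 1000, f"CODE:{i % 50}", float(i)) for i in range(10000)])

query = "SELECT COUNT(*) FROM data_view WHERE code = 'CODE:7'"

def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

plan_before = plan(query)   # expected to be a full table scan
conn.execute("CREATE INDEX idx_data_view_code ON data_view (code)")
plan_after = plan(query)    # expected to search using the new index

count = conn.execute(query).fetchone()[0]
print(plan_before)
print(plan_after)
print(count)
```

The trade-off is the one noted above: the index takes extra space and slows inserts a little, in exchange for faster lookups on the indexed column.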

October 31, 2017

Discussion of schedule and final presentations and abstracts. We will still aim to do all eight final presentations the last week (Dec 5th and 7th) with the prelim reviews the prior week.

We worked through one of the projects (Natalie) from import to the first steps of creating the analytic set.

November 2, 2017

Reviewed analytic set approach for Abdul.

November 7, 2017

Reviewed the SQL exam. I am still finalizing the grading to determine whether I will award partial credit where the SQL syntax is proper but the numeric result is incorrect due to a prior error. I also want to test your extra credit questions.

Next class we will discuss format for final presentations and abstracts:

  • generating basic descriptive statistics
  • though not a clinical trial, your audience may benefit from a consort diagram so they understand your inclusion/exclusion criteria for your studied population relative to the larger patient population in HERON
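
For the descriptive statistics, one simple starting point is a GROUP BY over your final analytic set, sketched here with Python's built-in sqlite3 module. The table `analytic_set` and its columns `sex` and `age` are hypothetical. A statistics package such as SPSS or R would be the usual choice for anything beyond counts and means.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical analytic set with one row per patient.
CREATE TABLE analytic_set (patient_num INTEGER, sex TEXT, age INTEGER);
INSERT INTO analytic_set VALUES (1, 'F', 40), (2, 'F', 60), (3, 'M', 50);
""")

# Count, mean, minimum and maximum age within each sex.
stats = conn.execute("""
SELECT sex, COUNT(*) AS n, AVG(age) AS mean_age,
       MIN(age) AS min_age, MAX(age) AS max_age
FROM analytic_set
GROUP BY sex
ORDER BY sex
""").fetchall()

for row in stats:
    print(row)
```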

November 9, 2017

Based on Maren's suggestion, we will have people pair up to do code review of their SQL to create their analytical sets. Please blog about what you learned. Were there specific things you helped uncover for the person who you were reviewing? Did reviewing their code also help you think about your project and approach to data?

  • Abdul and Ramzi
  • Lubna and Mohammed
  • Natalie and Koh
  • Carlee and Sakher
  • If someone is sick and the other person needs a code review, Song Sing can play "reviewer" for you. (xsong@…)
  • Let's aim to have your code to your reviewer by next class, Tuesday November 14th, and then meet with your reviewer and have the review done by Thursday November 16th.

November 14, 2017

Discussion of SQL code reviews relative to timelines and analytic files previously defined.

November 16, 2017

Review of Abdul and Ramzi's SQL and analytic file generation. Commentary for the group: as I had suspected, some of these physical assessments are not going to be frequently recorded in every clinical population (e.g. mobility assessment for the stroke population versus an orthopedic surgery population). This will definitely be revealed as you create your final analytic set but could have been determined when doing exploratory evaluation with HERON, the cohort, and the frequency with which such measures are charted during the encounter.

Office hours available on Friday.

November 21, 2017

Reviewing analytic files and preliminary analysis methods. We had a discussion and review mainly of Mohammed's analytic file preparation and preliminary analysis. Also reviewed Ramzi's analysis.

For next class we will discuss drafts of abstracts and presentations. If you don't have a conference proceeding already in mind for your work, let's use the AMIA 2-page Podium Presentation abstract as an example, as listed on this page and in the attached file.

November 28, 2017

Abstracts and Preliminary Results Review: we will briefly discuss the common types of presentations at conferences, often the first step for presenting your work. We will continue with further students' preliminary results reviews.


  • if you don't have a conference in mind, use the AMIA 2-page podium abstract guidelines. This is also for a 15-minute talk
  • if you do have one identified, use their format, even if it is 250 words for an abstract. But provide on your wiki page the link to the call for proposals you are targeting. Please make sure, though, that you are targeting a presentation format (not just a poster). As you target your presentation, if it calls for 10 minutes instead of 15, tailor your presentation for next week to take only 10 minutes.

Discussion of presentation schedule. Some might like to have the extra week to prepare their presentation and have the finals week for the presentation. This is ok. Students agreed to attend the finals week presentations.

Here's the schedule then:

  • December 5, Ramzi and Mohammed
  • December 7, Abdul, Kosaku, Sakher
  • December 12: Natalie, Lubna, and Carlee

Consider using the Biostatistics consultation available Fridays from 1-3 in Robinson.

Then reviewed Abdul's project. He did his analysis in SPSS and largely used chi-square tests for comparisons.

November 30, 2017

Note from Carlee: Biostats meets 10-3 on Fridays on the ground floor, with a 12-1 lunch break.

Discussion of claims data analysis in comparison to EHR based analysis.

  • reminder: please include a consort diagram for your study: who you started with, who was lost due to various kinds of attrition, and who's in your final analytic set.

Reviewed Kosaku's preliminary results: lumbar spine surgery and opioid use. Does the dose close to surgery impact pain sensitivity and gait within 24 hours? Looking at a 72-hour frame for assessing pain and gait distance.
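
The counts behind a consort diagram can come straight from SQL by applying your inclusion criteria one step at a time. A rough sketch with Python's built-in sqlite3 module follows. The `patients` table, its flag columns, and the three criteria are invented for illustration and will differ for each project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical starting population with simple 0/1 flags.
CREATE TABLE patients (patient_num INTEGER, age INTEGER,
                       has_dx INTEGER, has_followup INTEGER);
INSERT INTO patients VALUES
  (1, 10, 1, 1), (2, 25, 1, 1), (3, 30, 0, 1),
  (4, 45, 1, 0), (5, 60, 1, 1), (6, 70, 1, 1);
""")

# Each SUM counts the patients who pass every criterion up to that step.
row = conn.execute("""
SELECT COUNT(*),
       SUM(age >= 18),
       SUM(age >= 18 AND has_dx = 1),
       SUM(age >= 18 AND has_dx = 1 AND has_followup = 1)
FROM patients
""").fetchone()

labels = ["started with", "adults", "with diagnosis", "with follow-up (final)"]
remaining = list(row)
# Patients lost at each step: difference between consecutive counts.
excluded = [a - b for a, b in zip(remaining, remaining[1:])]

for label, n in zip(labels, remaining):
    print(f"{label}: {n}")
print("excluded at each step:", excluded)
```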

December 5, 2017

Presentations by Ramzi and Mohammed

December 7, 2017

Presentations by Abdul, Kosaku, Sakher

December 12, 2017

Presentations by Natalie, Lubna and Carlee

