Healthcare Data Sources


Introduction#

Healthcare analytics has gained popularity in recent years. It includes, but is not limited to, biomedical image analysis, sensor data analysis, biomedical signal analysis, clinical text mining, clinical prediction models, Clinical Decision Support Systems (CDSS), and fraud detection.

This post covers healthcare data sources and coding systems, including EHR, ICD, SNOMED CT, UMLS, and DICOM.

1. Electronic Health Record (EHR)#

An EHR, or electronic health record, is where a patient’s medical history is stored. With an EHR, a patient’s medical records are easier for health providers to retrieve. In 2011, 54% of physicians had adopted an EHR system, and about three-quarters of adopters said that using an EHR system resulted in enhanced patient care. The main purpose of an EHR is to support clinical care and billing. In an EHR, data are generated by different components and departments, such as pharmacy, radiology, and physicians’ entries.

2. Coding system#

In an EHR, data are collected and stored following standard coding systems to support accurate data analysis and information exchange. The International Classification of Diseases (ICD) is the official coding standard developed by the WHO. ICD organizes codes into groups, which makes information easy to store and retrieve.

ICD-9#

Here I will introduce ICD-9, the ninth revision of ICD, published by the WHO in 1978. Its clinical modification, ICD-9-CM, is published by the U.S. Public Health Service and has been used to encode diagnoses for healthcare services in the United States. However, ICD-9 and ICD-10 codes are sufficient for billing purposes but NOT for analytics or aggregation. Newer revisions of ICD were published later to include more disease categories and more descriptive terms.
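To make the grouping idea concrete, here is a minimal sketch that rolls ICD-9-CM diagnosis codes up to their category, taken here to be the part before the decimal point. The function name `icd9_category` and the sample code list are my own illustration, not part of any ICD tooling, and real data usually needs more careful cleaning.

```python
from collections import Counter

def icd9_category(code: str) -> str:
    """Return the category part of an ICD-9-CM code (the part before the decimal point)."""
    return code.strip().upper().split(".")[0]

# Illustrative diagnosis codes, as they might appear across a set of encounters.
codes = ["250.00", "250.02", "332.0", "401.9", "250.00"]

# Count encounters per category instead of per full code.
category_counts = Counter(icd9_category(c) for c in codes)
print(category_counts)
```

This kind of roll-up is coarse: it groups codes that share a category, but it cannot express relationships between categories.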

SNOMED CT#

Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) is a comprehensive, multilingual clinical and healthcare terminology. SNOMED CT defines logical and semantic relationships between concepts. It captures information at different levels of detail and is mainly used to identify terms, provide textual descriptions, and represent relationships. The main difference between SNOMED CT and ICD is that SNOMED CT models logic and relationships, while ICD is a classification system. That means ICD is well suited to statistical analysis, while SNOMED CT is designed for clinical purposes.
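The difference between a flat classification and a terminology with relationships can be sketched with a tiny "is-a" graph. The concept labels and edges below are simplified placeholders that I made up for illustration; they are not actual SNOMED CT content, and `IS_A` and `ancestors` are hypothetical names.

```python
# Hypothetical is-a edges (child -> list of parents). A concept may have
# more than one parent, which a single-parent code hierarchy cannot express.
IS_A = {
    "Juvenile Parkinson's disease": ["Parkinson's disease"],
    "Parkinson's disease": ["Movement disorder", "Neurodegenerative disease"],
    "Movement disorder": ["Disease"],
    "Neurodegenerative disease": ["Disease"],
}

def ancestors(concept, is_a=IS_A):
    """Return every concept reachable from `concept` by following is-a edges."""
    seen = set()
    stack = list(is_a.get(concept, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen

print(sorted(ancestors("Juvenile Parkinson's disease")))
```

With relationships stored explicitly, a question such as "is this concept a kind of movement disorder?" becomes a graph traversal rather than a string match on a code.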

3. UMLS#

The Unified Medical Language System (UMLS) is a comprehensive collection of biomedical concepts developed by the U.S. National Library of Medicine (NLM). It is designed for medical information professionals and supports data processing, retrieval, and aggregation. The UMLS contains concepts, attributes, relationships, source vocabularies, etc., which makes it easy to look up relationships, concepts, and semantic information.

For example, suppose I want to find the immediate child concepts of the disease string “Parkinson’s disease” in SNOMEDCT_US. In the UMLS database, I wrote this query to get the result:

-- Immediate SNOMED CT children of "Parkinson's disease" (MySQL syntax)
SELECT DISTINCT child.str
FROM umls2019.mrconso AS child
WHERE child.aui IN (
    SELECT h.aui
    FROM umls2019.mrhier AS h
    JOIN umls2019.mrconso AS parent ON h.paui = parent.aui
    WHERE parent.str = 'Parkinson''s disease'
      AND parent.sab = 'SNOMEDCT_US');
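To show how this parent/child lookup works without a full UMLS installation, here is a small sketch that runs the same kind of query against an in-memory SQLite database. The two tables keep only the columns the query needs, and the AUI values and rows are toy data I invented for illustration; they are not real UMLS identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mrconso (aui TEXT, sab TEXT, str TEXT);
    CREATE TABLE mrhier  (aui TEXT, paui TEXT);
""")

# Toy atoms: made-up AUIs, simplified strings.
conn.executemany("INSERT INTO mrconso VALUES (?, ?, ?)", [
    ("A0001", "SNOMEDCT_US", "Parkinson's disease"),
    ("A0002", "SNOMEDCT_US", "Juvenile Parkinson's disease"),
    ("A0003", "SNOMEDCT_US", "Movement disorder"),
    ("A0004", "MSH", "Parkinson Disease"),
])
# Each row links an atom (aui) to its immediate parent atom (paui).
conn.executemany("INSERT INTO mrhier VALUES (?, ?)", [
    ("A0002", "A0001"),
    ("A0001", "A0003"),
])

query = """
    SELECT DISTINCT child.str
    FROM mrconso AS child
    JOIN mrhier  AS h      ON h.aui = child.aui
    JOIN mrconso AS parent ON parent.aui = h.paui
    WHERE parent.str = ? AND parent.sab = 'SNOMEDCT_US'
    ORDER BY child.str
"""
# Passing the string as a parameter avoids escaping the apostrophe by hand.
children = [row[0] for row in conn.execute(query, ("Parkinson's disease",))]
print(children)
```

The key step is the join on the parent AUI: a row whose parent is the "Parkinson's disease" atom is an immediate child of it.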

4. DICOM#

Digital Imaging and Communications in Medicine (DICOM) is a medical imaging standard. It describes the data exchange protocol, digital image format, and file structure for biomedical images and related information. For example, a DICOM file may include information such as the study date and time, the physician’s name, the procedure code, etc.

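# Read a DICOM file with pydicom
from pydicom import dcmread
from pydicom.data import get_testdata_file

fpath = '/Users/user/Desktop/file.dcm'  # a local DICOM file
# fpath = get_testdata_file('CT_small.dcm')  # or a sample file bundled with pydicom
ds = dcmread(fpath)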
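To give a feel for the file structure itself, here is a rough sketch using only the standard library. A DICOM file starts with a 128-byte preamble followed by the four bytes `DICM`, and the file meta information that follows is encoded as Explicit VR Little Endian. The helper names and the synthetic bytes below are my own illustration; this is not a full parser, and the bytes do not form a complete, valid DICOM file.

```python
import io
import struct

def has_dicom_prefix(stream):
    """Return True if the stream starts with a 128-byte preamble followed by b'DICM'."""
    header = stream.read(132)
    return len(header) == 132 and header[128:132] == b"DICM"

def read_first_element(stream):
    """Read one Explicit VR Little Endian element that uses a 2-byte length field."""
    group, element = struct.unpack("<HH", stream.read(4))
    vr = stream.read(2).decode("ascii")
    (length,) = struct.unpack("<H", stream.read(2))
    value = stream.read(length)
    return (group, element), vr, value

# Synthetic header: preamble, prefix, then tag (0002,0000) with VR "UL" and a 4-byte value.
fake = (
    b"\x00" * 128 + b"DICM"
    + struct.pack("<HH", 0x0002, 0x0000) + b"UL" + struct.pack("<H", 4)
    + struct.pack("<I", 0)
)

stream = io.BytesIO(fake)
print(has_dicom_prefix(stream))
tag, vr, value = read_first_element(stream)
print(tag, vr, struct.unpack("<I", value)[0])
```

In practice a library such as pydicom handles all of this, including the VRs that use a longer length field, so a hand-written reader like this is only useful for understanding the layout.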


Conclusion#

EHR systems have been widely adopted in the United States. Beyond improving patient care and accuracy, they are also valuable for data analytics and research. A future post will cover techniques in healthcare data analytics.

Resources:#

Aggarwal, Charu C., and Chandan K. Reddy. 2015. Healthcare Data Analytics. CRC Press.

Jamoom, Eric, et al. 2012. Physician Adoption of Electronic Health Record Systems: United States, 2011. NCHS Data Brief, no. 98. U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, National Center for Health Statistics.