what is descriptive and predictive analysis in healthcare?
Faster Diagnosis, Better Treatment:#
For a long time, healthcare professionals have been using data science to optimize patient outcomes. The application can include but not limited to:
- predictive analytics of frequent medical event sequence;
- predict healthcare event/costs/readmission rate;
- clustering medical records, or predicting causations between medical events;
- Classifying medical images/videos, etc.
Using data science allows the industry to draw actionable insights in a cost-effective way. For example, Stanford ML Group developed a model which can diagnose irregular heart rhythms from single-lead ECG signals. The model outperforms the cardiologist average on both the Sequence and Set F1 metrics. Given that more than 300 million ECGs are recorded annually, the diagnosis can save professionals considerable time and decrease the number of misdiagnose. Cardiologist-Level Arrhythmia Detection with Convolutional Neural Network
Descriptive Analytics#
Descriptive analytics can be applied when we want to find out what happened in the past through aggregation.
Example 1 - CMS#
I will use the CMS 2008-2010 Data Entrepreneurs' Synthetic Public File (DE-SynPUF) as an example. The DE-SynPUF was created for providing a realistic set of claims data in the public domain while protected Medicare beneficiaries' health information. The DE-SynPUB contains five types of data: Beneficiary Summary, Inpatient Claims, Outpatient Claims, Carrier Claims and Prescription Drug Events.
I loaded the data to MySQL workbench already. In order to find out the number of patients who have both depression and Alzheimer, I will write a SQL query to extract information.
SELECT count(*) as CountNums
FROM beneficiarySummary_sample1
WHERE SP_ALZHDMTA='yes'
AND SP_DEPRESSN='yes'
Example 2 - UMLS#
Take Unified Medical Language System(UMLS) as another example. UMLS is a set of files and software that brings together many health and biomedical vocabularies and standards to enable interoperability between computer systems. Here are more usage in UMLS.
For the example in UMLS, I would like to find out possible treatments for an ALS(Amyotrophic Lateral Sclerosis) patient under same treatment concept. For an example like this, I need to go through the UMLS database query diagrams, which shows all information associated with a particular source concept.
SELECT distinct CUI, STR from umls.mrconso
where str='Amyotrophic Lateral Sclerosis";
[output]:C0002736
SELECT distinct r.cui2, r.rela, r.cui1, min(str) as treatment_name
FROM umls.mrrel r, umps.mrconso c
WHERE cui=r.cui2 and r.cui1='C0002736' and r.rela like "%treat%"
AND c.lat="eng"
group by r.cui2, r.rela, r.cui1;
[output]:
The first step shows me all the concepts under ALS inside umls.mrconso table. The second steps involved searching for relationship, in which I went through the umls.mrrel table and assigned the relationship(rela) to “%treat%”. The result shows the unique concept(cui2) which treats ALS(cui1).
Predictive Analytics#
Unlike descriptive analysis, which informed historical events, predictive analytics allows us to predict what will happen in the future. In predictive analytics, machine learning/artificial intelligence techniques will be used to train models to predict future events.
For example, I would like to predict whether a patient will develop depression. There are multiple models can be used in order solve this, I will demonstrate using Decision Tree:
from sklearn.tree import DecisionTreeClassifier
decision_tree = DecisionTreeClassifier(max_leaf_nodes=50, criterion='entropy', random_state=0)
decision_tree.fit(X_train, y_train)
predict_dt=decision_tree.predict(X_test)
decision_tree_score=decision_tree.score(X_test, y_test) * 100
decision_tree_score
[output]:69.33333333333334
class | Precision | recall | f1-score | support |
---|---|---|---|---|
0 | 0.72 | 0.61 | 0.66 | 440 |
1 | 0.67 | 0.77 | 0.72 | 460 |
Conclusion#
This post quickly introduced two analytics methods. Descriptive analytics is used for analyzing historical events and understand the past. Predictive analytics is applied for predicting future trends and prevention. Both analytics methods are vitally important not only for health professional, but also for insurance companies, medical companies, suppliers, etc.