Next Generation Myelofibrosis Risk Analysis in the Electronic Health Record

English
2018
Blood
Vol 132 (Supplement 1)
DOI: 10.1182/blood-2018-99-113692

Andrew Sochacki, Adrian Bejan, Shilin Zhao, Travis Spaulding, Thomas Stricker, Yaomin Xu, Michael R. Savona

Abstract

Background: Myelofibrosis (MF) is a devastating myeloproliferative neoplasm hallmarked by marrow fibrosis, symptomatic extramedullary hematopoiesis, and risk of leukemic transformation, most commonly driven by Janus kinase 2 (JAK2) pathway mutations. MF risk classification systems guide prognosis and decisions regarding allogeneic stem cell transplantation and disease modifying agents. Key systems include the Dynamic International Prognostic Scoring System (DIPSS) 2009, DIPSS Plus 2010, Genetics-Based Prognostic Scoring System (GPSS) 2014, and Mutation-Enhanced International Prognostic Scoring System (MIPSS) 2014. Their respective contributions include dynamic scoring (DIPSS), cytogenetics (DIPSS Plus), and high risk molecular mutations (GPSS and MIPSS). To power the next generation of MF risk prognostication and ascertain new prognostic factors, large-scale electronic health record (EHR) and genomic data will need to be integrated. As a proof of concept, we leveraged our de-identified research EHR (2.9 million records) and linked genomic biobank (288,000 patients) to develop an all-inclusive phenotype-genotype-prognostic system for MF and recapitulate DIPSS, DIPSS Plus, GPSS, and MIPSS.

Methods: Our previously described methods (Bejan et al. AACR 2018) used natural language processing to algorithmically identify 306 MF patients. A subset (N=125) had DNA available for genotyping. We automatically extracted: age greater than 65, leukocyte count (WBC) greater than 25 × 10⁹/L, hemoglobin (Hgb) less than 10 g/dL, platelets (PLT) less than 100 × 10⁹/L, circulating myeloid blasts ≥ 1%, and 10% weight loss compared to baseline as a proxy for constitutional symptoms. Transfusion data were not included. Karyotype data were manually reviewed. Next generation sequencing (NGS) was performed on biobanked peripheral blood DNA with the TruSight Myeloid Panel (Illumina®). Genotyped samples were restricted to dates after MF diagnosis. Multivariate Cox proportional hazard analysis was performed on all clinical and genomic variables. DIPSS Plus was calculated without adjustment but lacked transfusion data. DIPSS, GPSS, and MIPSS scores were calculated by published methods.

Results: Multivariate Cox proportional hazard regression identified Hgb (HR=6.4; P=0.006), myeloid blasts (HR=3.8; P=0.03), and ASXL1 (HR=5.2; P=0.02) as significant for overall survival (OS) in our cohort. We noted a strong trend for high risk karyotype (HR=5.6; P=0.07). In our DIPSS model (N=120), median survival by subgroup was: low risk (not met), intermediate-1 (108 months), intermediate-2 (47 months), and high risk (6 months); P=0.0002 (Figure 1a). DIPSS Plus (N=122), which integrates karyotype data and PLT count, showed similar survival except in the high risk group (4 months); P=0.00003 (Figure 1b). Driver mutations were found in JAK2V617F (57%), CALR (3%), and MPLW515 (7.2%); 34% of patients were triple negative (JAK2WT, CALRWT, and MPLWT). High molecular risk mutations were ASXL1 (15%), EZH2 (6%), IDH1/2 (7%), and SRSF2 (17%); other variants of interest were TET2 (9.6%), TP53 (29%), and DNMT3A (16.8%). For MIPSS (N=125; 48 months of follow-up), median survival was not met in the low risk, intermediate-1, and intermediate-2 groups and was 32 months in the high risk group; P=0.0001 (Figure 1c). GPSS (N=125; 48 months of follow-up) did not demonstrate statistical separation among groups (Figure 1d).

Discussion: This proof of concept transformed raw EHR records into clinical risk scores for MF. The addition of retrospective DNA analysis via NGS opens the possibility of multi-institutional EHR-biobank studies to most accurately create a system to define MF risk. Our sample size limited the significance of age, PLTs, poor risk mutations, and other variables previously shown to impact OS. Likewise, we lacked the capacity to track transfusion dependence, previously shown to have prognostic relevance. Still, prognostication via the EHR mimics common scoring systems in MF and supports correct MF case selection, accurate laboratory extraction, and reproducible genotyping of biobanked samples. Similar to the original GPSS report, our low risk cohort was small (N=2) and will benefit from the expansion of genotyping now underway. Finally, this phenotype-genotype-prognostic paradigm represents a technical advance and a unique opportunity to deploy patient specific comorbidities from lifetime EHR records to further refine risk across all myeloid disease.

Disclosures: Savona: Boehringer Ingelheim: Consultancy; Celgene: Consultancy, Membership on an entity's Board of Directors or advisory committees; Incyte: Membership on an entity's Board of Directors or advisory committees, Research Funding.
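
To make the extraction step in the Methods concrete, the following is a minimal Python sketch, not the authors' pipeline, of how the listed thresholds could be turned into binary risk flags and a DIPSS category. The `PatientRecord` fields, the function names, and the example patients are hypothetical. The point weights and category cutoffs are not stated in the abstract; they follow the published DIPSS scheme as commonly cited (hemoglobin weighted 2 points, the other factors 1 point each) and should be checked against the original DIPSS publication before any real use. Platelet count is flagged because the abstract extracts it, but it does not enter DIPSS itself.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    """Hypothetical container for values extracted from the EHR."""
    age_years: float
    wbc_e9_per_l: float        # leukocyte count, x10^9/L
    hgb_g_per_dl: float        # hemoglobin, g/dL
    plt_e9_per_l: float        # platelets, x10^9/L
    blasts_pct: float          # circulating myeloid blasts, %
    baseline_weight_kg: float
    current_weight_kg: float


def risk_flags(p: PatientRecord) -> dict:
    """Apply the thresholds listed in the abstract's Methods."""
    weight_loss = (p.baseline_weight_kg - p.current_weight_kg) / p.baseline_weight_kg
    return {
        "age_gt_65": p.age_years > 65,
        "wbc_gt_25": p.wbc_e9_per_l > 25,
        "hgb_lt_10": p.hgb_g_per_dl < 10,
        "plt_lt_100": p.plt_e9_per_l < 100,   # used by DIPSS Plus, not DIPSS
        "blasts_ge_1": p.blasts_pct >= 1,
        # 10% weight loss is the abstract's proxy for constitutional symptoms
        "weight_loss_ge_10pct": weight_loss >= 0.10,
    }


# Assumption: weights from the published DIPSS scheme, not from this abstract.
DIPSS_POINTS = {
    "age_gt_65": 1,
    "wbc_gt_25": 1,
    "hgb_lt_10": 2,
    "blasts_ge_1": 1,
    "weight_loss_ge_10pct": 1,
}


def dipss(flags: dict) -> tuple:
    """Return (score, category) using the assumed DIPSS weights and cutoffs."""
    score = sum(points for name, points in DIPSS_POINTS.items() if flags[name])
    if score == 0:
        category = "low"
    elif score <= 2:
        category = "intermediate-1"
    elif score <= 4:
        category = "intermediate-2"
    else:
        category = "high"
    return score, category


# Made-up example patients, not data from the study.
high = PatientRecord(70, 30.0, 9.0, 80.0, 2.0, 80.0, 70.0)
low = PatientRecord(50, 8.0, 13.0, 250.0, 0.0, 80.0, 80.0)
print(dipss(risk_flags(high)))
print(dipss(risk_flags(low)))
```

Keeping the flag extraction separate from the scoring table means the same flags could feed other scoring systems with different weights, which matches the abstract's goal of recapitulating several systems from one set of extracted variables.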

How to cite this publication

Andrew Sochacki, Adrian Bejan, Shilin Zhao, Travis Spaulding, Thomas Stricker, Yaomin Xu, Michael R. Savona (2018). Next Generation Myelofibrosis Risk Analysis in the Electronic Health Record. Blood, 132(Supplement 1), p. 3038. DOI: 10.1182/blood-2018-99-113692.

Publication Details

Type: Article
Year: 2018
Authors: 7
Language: English
Journal: Blood
DOI: 10.1182/blood-2018-99-113692
