Enabling Hierarchical View of RxNorm with NDF-RT Drug Classes

Matvey B. Palchuk, MD, MS1,2, Michael Klumpenaar1, Tarang Jatkar, MS1, Ralph J. Zottola, PhD3, William G. Adams, MD4, Aaron H. Abend, MBA1
1 Recombinant Data Corp., Newton, MA; 2 Harvard Medical School, Boston, MA; 3 University of Massachusetts Medical School, Worcester, MA; 4 Boston University School of Medicine, Boston, MA

Abstract
NDF-RT is the proposed source of drug classification information. We set out to construct a hierarchy of NDF-RT drug classes and RxNorm medications and evaluate it on medication records data. NDF-RT and RxNorm are distributed in different file formats and require different tools to manipulate, so linking the two into a hierarchy is a non-trivial exercise. Medication data in RxNorm from two institutions was constrained by the hierarchy. Only 37% of records from one and 65% from another institution were accessible. We subsequently enriched the RxNorm mapping in NDF-RT by exploiting relationships between concepts for branded and generic drugs. Coverage improved dramatically to 93% for both institutions. To improve usability of the resulting hierarchy, we grouped clinical drugs by corresponding clinical drug form.

Introduction
The major focus of research and development in contemporary applied medical informatics is interoperability. Strong incentives such as the American Recovery and Reinvestment Act of 2009 and Clinical and Translational Science Awards are promoting active work in the field of interoperability among systems at the point of care as well as for secondary uses of data [1, 2]. Achieving effective and meaningful data sharing across institutions requires aligning institutional interests, deploying technological solutions for data exchange, and establishing robust, scalable and maintainable semantic infrastructure to ensure that the meaning is preserved as data crosses institutional boundaries. Semantic interoperability relies on standards.
One of the more challenging areas on the spectrum of clinical information is data about medications and corresponding standards. Medication data is complex: there are inpatient medication lists, orders, administration records, pharmacy information, billing, reconciliation, outpatient lists, e-prescribing, medication history, etc. Different users of medication data require different views, and clinicians, patients, and pharmacists are unlikely to be satisfied by the same organization and naming convention of a medication list. The uses of medication data are complex too, and go beyond representing medication history or prescribing to scenarios dealing with allergies, adverse event recognition, quality, performance, and business intelligence reporting. Medication standards must be capable of going beyond representing lists of drugs and must support aggregation of medications into classes depending on purpose (therapeutic action, allergy-related, drug interactions, financial utilization, etc.). Hierarchical representations of medications are required to access, use and share meaningful data about medications.

The landscape of medication standards is densely populated with varied offerings from well-established commercial vendors to federal agencies to homegrown terminologies. Efforts on the national level, represented by such organizations as the Federal Medication interagency collaboration, Healthcare Information Technology Standards Panel (HITSP), and Certification Commission for Health Information Technology (CCHIT), promote the use of such standards as the Food and Drug Administration’s Unique Ingredient Identifier (UNII) and National Drug Codes (NDC), the National Library of Medicine’s RxNorm, and the Veterans Health Administration’s National Drug File – Reference Terminology (NDF-RT) [3].

Background
Adopting existing tools or creating new ones for aggregating information from disparate sources and enabling data exchange is a high priority for healthcare delivery and biomedical research organizations [4]. One such tool is i2b2 (Informatics for Integrating Biology and the Bedside), a platform to support a collaborative approach to translational research [5]. i2b2 consists of a research data mart with de-identified patient data and an ad-hoc query tool which enables self-service queries for cohort identification. The i2b2 query tool features a classification hierarchy (visualized as nested folders) of data available for querying; this hierarchy is required by i2b2. A user can browse the hierarchy to find a concept of interest, or search for it.
In constructing a query, concepts from different levels of the ontology may be used. For example, users interact with a hierarchy of diagnoses in ICD-9-CM and can use a single diagnosis or a group of diagnoses organized into a folder in their query.

AMIA 2010 Symposium Proceedings Page - 577

RxNorm is the logical choice when it comes to deciding on a standard way of representing medications in i2b2 or related tools. The standard is endorsed on the national level, is rapidly maturing, and includes a wide selection of existing medication terminologies as its sources; many commercial vendors offer mappings to RxNorm, making efforts to map local terminologies feasible. “RxNorm is organized around normalized names for clinical drugs and drug delivery devices. These names contain information on ingredients, strengths, and dose forms.” [6] Medications in RxNorm are represented by concepts at different levels of abstraction, and relationships between these concepts. There are four types of concepts that represent packaged drugs: Semantic Clinical Drug (SCD), Semantic Branded Drug (SBD), Generic Pack (GPCK), and Brand Name Pack (BPCK). Concepts of these types contain drug name, strength, form and route of administration, the attributes we consider most relevant for representing patient instance data such as drug orders or medication administration records. Unfortunately, RxNorm does not include any notion of drug classes and provides no hierarchy of medications.

NDF-RT is the proposed source of drug classification information [3]. NDF-RT is produced by the Veterans Health Administration and is developed in the Ontylog [7] language using Apelon’s Terminology Development Environment product (Apelon, Inc., Ridgefield, CT). It is a comprehensive terminology that goes beyond medications and represents physiological effect, mechanism of action, pharmacokinetics, and related diseases.
Our interest in NDF-RT, however, is limited to the “Drug Products by VA Class” hierarchy under “Pharmaceutical Preparations.” Drugs are classified mainly by a specific chemical or pharmacological classification (for example, beta-blockers, cephalosporins) or by therapeutic category (for example, antilipemic agents, antiparkinson agents). Drug products with local effects are classified by route of administration (for example, dermatological, ophthalmic) [8].

Methods
We downloaded the RxNorm Full Monthly Release dated 12/07/2009 from the National Library of Medicine [9]. RxNorm is distributed in Rich Release Format, the default relational format used by the National Library of Medicine. To construct our hierarchical view, we required two tables: RXNCONSO and RXNSAT. The release includes scripts for loading RxNorm into Oracle or MySQL, but we used SQL Server Integration Services to load the data into Microsoft SQL Server (Microsoft Corp., Redmond, WA).

We downloaded NDF-RT Public Edition version 2009-10-13.9AA from the official National Cancer Institute repository [8]. The file was a zipped Common Data Format (CDF) full load package intended for importing into the open source Apelon Distributed Terminology System (DTS) database [10]. The Terminology Query Language (TQL) Editor plug-in for the DTS Editor was then used to export the entire Pharmaceutical Preparations hierarchy into a pipe-delimited text file. Each record in the file contained a reference to the parent class, so that the hierarchical relationship was not lost. The contents of the text file were loaded into a Microsoft SQL Server database and converted into an i2b2 concept path structure. The concept paths were constructed using a stored procedure that traversed the hierarchy defined by parent/child relationships.

Mappings to RxNorm are present as attributes of NDF-RT “VA Product” concepts (children of “VA Class” concepts used to construct the hierarchy). We linked the two standards together using these mappings to create a hierarchical view of RxNorm organized by NDF-RT Drug Classes.

We used medication records data from the University of Massachusetts Memorial Health Care (UMMHC) and Boston Medical Center (BMC) to evaluate the use of NDF-RT Drug Classes with RxNorm-coded medications. The data used for this project contained no personal health information of any kind. Both datasets were coded to NDC. NDC is one of the source vocabularies for RxNorm and is available in RxNorm distributions. We performed a crosswalk from NDC to RxNorm to map both datasets to RxNorm.
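The traversal of parent/child records into concept paths can be illustrated with a minimal Python sketch. It is not the stored procedure used in this work: the three-field record layout (code|name|parent code), the codes C1 to C4, and the backslash-delimited path format are assumptions made for illustration only.

```python
# Hypothetical pipe-delimited export: code|name|parent_code (empty parent = root).
# The codes are invented for this example and are not NDF-RT identifiers.
SAMPLE_EXPORT = """\
C1|PHARMACEUTICAL PREPARATIONS|
C2|CARDIOVASCULAR MEDICATIONS|C1
C3|BETA BLOCKERS/RELATED|C2
C4|ANTILIPEMIC AGENTS|C2
"""


def parse_export(text):
    """Return {code: (name, parent_code or None)} from pipe-delimited records."""
    nodes = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        code, name, parent = line.split("|")
        nodes[code] = (name, parent or None)
    return nodes


def build_concept_paths(nodes):
    """Map each code to a backslash-delimited path from the root to itself."""
    paths = {}

    def path_of(code, seen=()):
        if code in paths:
            return paths[code]
        if code in seen:
            raise ValueError(f"cycle detected at {code}")
        name, parent = nodes[code]
        # A root starts the path; any other node extends its parent's path.
        prefix = path_of(parent, seen + (code,)) if parent else "\\"
        paths[code] = f"{prefix}{name}\\"
        return paths[code]

    for code in nodes:
        path_of(code)
    return paths


CONCEPT_PATHS = build_concept_paths(parse_export(SAMPLE_EXPORT))
print(CONCEPT_PATHS["C3"])
```

Caching each computed path means every node is resolved once, regardless of the order in which records appear in the export.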
To improve the linking (increase the number of RxNorm concepts linked to NDF-RT hierarchy), we used “has_tradename” and “tradename_of” relationships in RxNorm to enrich existing NDF-RT mappings for SCD and SBD concept types. For each existing NDF-RT mapping to RxNorm, we identified a corresponding drug form in RxNorm, and then added mappings to all matching generic and branded drugs.

Lastly, we introduced one more level between NDF-RT Drug Classes and RxNorm concepts into the hierarchy to make it more usable. All RxNorm concepts were grouped under Semantic Clinical Drug Form (SCDF), a concept that represents drugs as “ingredient plus dose form” (where dose form includes the route of administration, like “oral tablet” or “injectable solution”), using “isa/inverse_isa” relationships. This additional level of hierarchy organizes all branded and generic drugs that share ingredients, form and route of administration. Specific steps to achieve this grouping are as follows:
1. Replace all SBDs with SCDs
2. Replace all BPCKs with GPCKs
3. Replace all SCDs and BPCKs with SCDFs
4. Add all SCDs and GPCKs under SCDFs using SCDF to SCD and SCDF to GPCK crosswalks
5. Add all SBDs and BPCKs under SCDFs using SCDF to SCD to SBD and SCDF to GPCK to BPCK crosswalks

Results
NDF-RT to RxNorm mapping spans multiple RxNorm concept types (see Table 1), but the great majority of mappings point to concept types we identified above as most relevant: Semantic Clinical Drug (92%), Semantic Branded Drug (5%) and drug packs (1.6% branded and 0.7% generic). It is difficult to quantify the degree of coverage of NDF-RT mappings to RxNorm, but judging by comparing concept types, NDF-RT Drug Class hierarchy points to only a subset of RxNorm. A more meaningful assessment of this mapping can be achieved by using transactional medication datasets from electronic health records systems.

UMMHC and BMC medication datasets were successfully mapped from NDC to RxNorm. Mapping involved the same four concept types (semantic clinical and branded drugs, and packs) representing packaged drugs. 93% of UMMHC data and 83% of BMC data were mapped to RxNorm (see Table 2). We did not investigate the discrepancy in mapping rates at this time, although some degree of imperfection in a mapping is always expected and is likely due to peculiarities of original data coding (invalid or outdated codes, for example), differences in representation and gaps in coverage [11]. In subsequent analysis, we used only records mapped to RxNorm as reference.

We used native RxNorm codes from the NDF-RT Drug Classes hierarchy to link RxNorm-coded data from both institutions into the hierarchy. The outcome was poor: only 37% of records from BMC and 65% from UMMHC ended up in the resulting hierarchy (see Table 3). With such large swaths of records missing from the resulting view, the use of this hierarchy in any system would be unacceptable.

To increase the coverage, we enriched mappings on the NDF-RT side by adding all branded drugs for which NDF-RT only contained the corresponding generic and adding all generics for which NDF-RT only contained the branded drugs. This intervention dramatically improved the linking and increased RxNorm records coverage by NDF-RT Drug Class hierarchy to 93% at both institutions (see Table 4). The resulting hierarchical view featured NDF-RT Drug Classes as upper-level nodes and RxNorm-coded drug names (semantic clinical and branded drugs and packs) as terminal nodes or leaves. For example, traversing the hierarchy of “CARDIOVASCULAR MEDICATIONS” to “BETA BLOCKERS/RELATED,” we saw a very long list of beta blockers, each with all permutations of strength, form and route of administration. The look and feel of this view was not optimal: individual drugs were hard to differentiate and there was no effective way of organizing all variations of a single drug into a group.

To control these long lists of similar medications, we leveraged the SCDF concept type of RxNorm to organize semantic clinical and branded drugs and packs. This resulted in a listing where all drugs that have the same ingredients and dose forms are grouped together. For example, all of the different dosages of Lipitor and Atorvastatin oral tablet are grouped in a single folder.
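The grouping of clinical and branded drugs under Semantic Clinical Drug Forms, described in the Methods, can be sketched in Python. The two crosswalk dictionaries and every identifier in them are made up for illustration and are not real RxNorm identifiers; packs (GPCK and BPCK) are left out, since their crosswalks would follow the same pattern but are not detailed here.

```python
from collections import defaultdict

# Toy crosswalks with invented identifiers (assumptions, not RxNorm content).
SBD_TO_SCD = {  # branded drug -> generic drug ("tradename_of")
    "sbd:lipitor-10mg": "scd:atorvastatin-10mg",
    "sbd:lipitor-20mg": "scd:atorvastatin-20mg",
}
SCD_TO_SCDF = {  # generic drug -> ingredient plus dose form ("isa")
    "scd:atorvastatin-10mg": "scdf:atorvastatin-oral-tablet",
    "scd:atorvastatin-20mg": "scdf:atorvastatin-oral-tablet",
}


def scdf_for(concept):
    """Resolve a branded or generic drug to its SCDF, or None if unmapped."""
    scd = SBD_TO_SCD.get(concept, concept)  # branded drugs go through their generic
    return SCD_TO_SCDF.get(scd)


def group_by_scdf(concepts):
    """Return {scdf: sorted list of drugs filed under that folder}."""
    folders = defaultdict(list)
    for concept in concepts:
        scdf = scdf_for(concept)
        if scdf is not None:
            folders[scdf].append(concept)
    return {scdf: sorted(members) for scdf, members in folders.items()}


FOLDERS = group_by_scdf([
    "scd:atorvastatin-10mg",
    "sbd:lipitor-10mg",
    "scd:atorvastatin-20mg",
    "sbd:lipitor-20mg",
    "scd:unmapped-drug",
])
print(FOLDERS)
```

In this toy input all four atorvastatin products land in a single folder, while the unmapped drug is silently dropped; a production version would more likely report such concepts than discard them.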
Figure 1 illustrates the final hierarchy. The view starts with NDF-RT Drug Classes (shown as folders with labels in uppercase letters), and navigating through nested folders, a user gets to a listing of folders representing Semantic Clinical Drug Forms, and finally to semantic clinical and branded drugs and packs.

RxNorm Concept Type   RxNorm Concepts   NDF-RT to RxNorm Mapping
BN                    13,372            4
BPCK                  349               136
GPCK                  289               59
IN                    4,545             15
PIN                   1,514             4
SBD                   19,318            423
SBDF                  14,201            4
SCD                   30,363            7,895
SCDF                  13,105            26

Table 1. Distribution of RxNorm concepts (selected types shown for reference) and NDF-RT “Drug Class by VA Product” to RxNorm mapping by RxNorm Concept Type. Concept types are Brand Name (BN), Brand Name Pack (BPCK), Generic Pack (GPCK), Ingredient (IN), Precise Ingredient (PIN), Semantic Branded Drug (SBD), Semantic Branded Drug Form (SBDF), Semantic Clinical Drug (SCD), and Semantic Clinical Drug Form (SCDF).
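As a quick arithmetic check, the shares of the NDF-RT to RxNorm mapping quoted in the Results can be recomputed from the mapping counts in Table 1. The Python sketch below assumes that the percentages are taken relative to the total over the nine concept types listed in the table; because the caption notes that only selected types are shown, the true denominator may differ slightly.

```python
# NDF-RT to RxNorm mapping counts by concept type, copied from Table 1.
MAPPING_COUNTS = {
    "BN": 4, "BPCK": 136, "GPCK": 59, "IN": 15, "PIN": 4,
    "SBD": 423, "SBDF": 4, "SCD": 7895, "SCDF": 26,
}


def shares(counts):
    """Return each type's share of the total, in percent."""
    total = sum(counts.values())
    return {tty: 100.0 * n / total for tty, n in counts.items()}


SHARES = shares(MAPPING_COUNTS)
for tty in ("SCD", "SBD", "BPCK", "GPCK"):
    print(f"{tty}: {SHARES[tty]:.1f}%")
```

Under that assumption the total is 8,566 mappings, and the computed shares round to the 92%, 5%, 1.6% and 0.7% reported in the text.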
                  BPCK   GPCK   SBD      SCD      Total Records in NDC   Records Mapped to RxNorm   Records Mapped to RxNorm %
RxNorm Concepts   349    289    19,318   30,363
UMMHC Records     169    9      3,345    2,120    7,722,841              7,204,047                  93
BMC Records       185    9      4,734    2,429    4,531,379              3,766,583                  83

Table 2. Mapping instance medication data from University of Massachusetts Memorial Health Care (UMMHC) and Boston Medical Center (BMC) from NDC to RxNorm. Shown are the breakdown by relevant concept types and the overall mapping summary.

                                           BPCK   GPCK   SBD   SCD     Records Mapped   Records Linked   Records Linked %
NDF-RT to RxNorm Mapping                   136    59     423   7,895
UMMHC Records in RxNorm Linked to NDF-RT   93     7      201   1,989   7,204,047        4,659,274        65
BMC Records in RxNorm Linked to NDF-RT     110    7      252   2,219   3,766,583        1,412,188        37

Table 3. Linking instance medication data mapped to RxNorm from UMMHC and BMC into NDF-RT Drug Class hierarchy.

                                           BPCK   GPCK   SBD      SCD     Records Mapped   Records Linked   Records Linked %
NDF-RT to RxNorm Mapping, Enriched         136    59     12,541   8,794
UMMHC Records in RxNorm Linked to NDF-RT   93     7      3,006    1,995   7,204,047        6,687,842        93
BMC Records in RxNorm Linked to NDF-RT     110    7      4,174    2,228   3,766,583        3,505,974        93
Table 4. Linking instance medication data mapped to RxNorm from UMMHC and BMC into NDF-RT Drug Class hierarchy subsequent to enriching SCDs and SBDs in NDF-RT to RxNorm Mapping using RxNorm “has_tradename” and “tradename_of” relationships.

Discussion
Representing terminologies in hierarchical fashion is important not only for enabling browsing function in graphical user interfaces (for example, navigating to a drug of interest in a hand-held e-prescribing system), but also in dealing with any situations where data needs to be grouped logically for a specific purpose. The i2b2 query tool requires terminologies representing underlying data to be hierarchical to enable querying on groups of concepts. This requirement represents an obstacle on the path to deployment of i2b2, since data in many organizations is represented by domain vocabularies that do not have a corresponding hierarchical view. When it comes to medications, efforts were made to graft hierarchies onto RxNorm but they did not involve NDF-RT [4]. Even though NDF-RT has been promoted as a source of drug class information for several years, updates to NDF-RT became publicly available only recently.

Using NDF-RT’s drug class tree to organize RxNorm into a hierarchy is a difficult and imperfect process. The two standards are made available in very different distribution formats and require very different tools for access and manipulation. NDF-RT maps to only a subset of RxNorm, resulting in gaps of coverage and limiting the use of resulting hierarchies. Furthermore, simply constraining RxNorm by NDF-RT drug classes results in loss of data and a less-than-optimal outcome from the usability standpoint. Enhancements described in this paper go a long way toward improving the final hierarchical view, but it is not ideal. A view of RxNorm with NDF-RT drug classes is probably fine for use in research, but not appropriate for use in point-of-care applications without additional improvements.

We would like to see the NDF-RT Drug Class tree widen its mapping to RxNorm and adopt a more systematic approach to mapping, attempting to restrict mapping to a selected subset of RxNorm concept types (as shown in Table 1, it is close but not perfect). It would be nice for NDF-RT to be made available in a more standard release format. Perhaps the drug classification hierarchy can be spun off and made available separately? It is likely that in the future other classifications of RxNorm will be required (such as for representing drug allergy classes); will NDF-RT be the source of this information?
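The loss of data from simply constraining RxNorm by NDF-RT drug classes, and the effect of the enrichment, can be made concrete with a small Python sketch. It is an illustration on toy data, not the procedure used in this work: the identifiers are invented, and the enrichment shown follows only a single brand-to-generic or generic-to-brand link.

```python
# Toy RxNorm brand/generic links; identifiers are invented for illustration.
TRADENAME_OF = {"sbd:A": "scd:a", "sbd:B": "scd:b"}  # branded -> generic

# Inverse direction ("has_tradename"): generic -> set of branded drugs.
HAS_TRADENAME = {}
for brand, generic in TRADENAME_OF.items():
    HAS_TRADENAME.setdefault(generic, set()).add(brand)


def enrich(mapped):
    """Add the generic of every mapped brand and the brands of every mapped generic."""
    enriched = set(mapped)
    for concept in mapped:
        if concept in TRADENAME_OF:
            enriched.add(TRADENAME_OF[concept])
        enriched.update(HAS_TRADENAME.get(concept, ()))
    return enriched


def coverage(record_codes, mapped):
    """Percentage of records whose code is present in the hierarchy mapping."""
    linked = sum(1 for code in record_codes if code in mapped)
    return 100.0 * linked / len(record_codes)


NDFRT_MAPPED = {"scd:a", "sbd:B"}  # hypothetical original mapping
RECORDS = ["scd:a", "sbd:A", "sbd:A", "scd:b", "sbd:B", "scd:x"]

BEFORE = coverage(RECORDS, NDFRT_MAPPED)
AFTER = coverage(RECORDS, enrich(NDFRT_MAPPED))
print(f"before: {BEFORE:.1f}%  after: {AFTER:.1f}%")
```

The same ratio of linked to mapped records underlies the “Records Linked %” columns of Tables 3 and 4; for example, 4,659,274 of 7,204,047 UMMHC records is about 65%.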
Figure 1. A fragment of the RxNorm-coded drug list grouped by Semantic Clinical Drug Form and organized into NDF-RT Drug Classes. The nested folder view is expanded to illustrate the hierarchy starting from Cardiovascular Medications to Semantic Clinical Drug and Semantic Branded Drug representations of Atenolol Oral Tablet.

NDF-RT is one of the sources in the National Library of Medicine’s Unified Medical Language System (UMLS) Metathesaurus, but the version available as of this writing is dated 3/11/2008, over 2 years old [12]! It is highly desirable to update NDF-RT in UMLS and close this version gap; in addition, doing so will simplify the process of obtaining both standards and perhaps will result in improved mapping coverage. Alternatively, we can envision RxNorm taking on the responsibility of building and making available different hierarchical views of its own data. We see RxNorm evolving toward addressing the requirements of an interface terminology, and as such it will be difficult to avoid hierarchical organization.

NDF-RT has gone through two releases after the project was completed and before the preparation of this manuscript. We used the October 2009 NDF-RT (2009.10.13) release; there have been November 2009 NDF-RT (2009.11.10) and December 2009/January 2010 NDF-RT (2010.01.12) releases. Not evaluating the latest NDF-RT release is a limitation of this work.

Conclusion
NDF-RT can be used as a source of Drug Class information for medication data in RxNorm, but the process is onerous. Applicability should be limited to secondary uses of data until mapping coverage improves. Creating a usable hierarchy requires an additional level of organization. These deficiencies make working with medication data in RxNorm very difficult in cases where browsing is required or drugs need to be organized into groups.

References
1. Kuperman et al. Developing data content specifications for the Nationwide Health Information Network Trial Implementations. JAMIA (2010) 17:6-12.
2. Zerhouni EA. US biomedical research: Basic, translational, and clinical sciences. JAMA (2005) Sep 21;294(11):1352-8.
3. C32-HITSP summary documents using HL7 continuity of care document (CCD) component. Available from: http://wiki.hitsp.org/docs/C32/C32-1.html
4. Lowe HJ, Ferris TA, Hernandez PM, Weber SC. STRIDE – An Integrated Standards-Based Translational Research Informatics Platform. AMIA Annu Symp Proc. (2009) 391-5.
5. Murphy SN, et al. Serving the enterprise and beyond with informatics for integrating biology and the bedside (i2b2). JAMIA (2010) 17:124-130.
6. An Overview to RxNorm. Available from: http://www.nlm.nih.gov/research/umls/rxnorm/overview.html
7. Hartel FW, Coronado Sd, Dionne R, Fragoso G, Golbeck J. Modeling a description logic vocabulary for cancer research. JBI (2005) 38(2):114–129.
8. Veterans Health Administration National Drug File – Reference Terminology. Available from: http://evs.nci.nih.gov/ftp1/NDF-RT/
9. RxNorm Files. U.S. National Library of Medicine. Bethesda, MD. Available from: http://nlm.nih.gov/research/umls/rxnorm/docs/rxnormfiles.html
10. Apelon Distributed Terminology System. Ridgefield, CT. Available from: http://apelon-dts.sourceforge.net/
11. Campbell et al. Phase II evaluation of clinical coding schemes: completeness, taxonomy, mapping, definitions, and clarity. JAMIA (1997) 4(3):238-51.
12. UMLS Source Release Documentation. Available from: http://www.nlm.nih.gov/research/umls/sourcereleasedocs/index.html