User Centred Networked Health Care A. Moen et al. (Eds.) IOS Press, 2011 © 2011 European Federation for Medical Informatics. All rights reserved. doi:10.3233/978-1-60750-806-9-594
Checking Coding Completeness by Mining Discharge Summaries
Stefan SCHULZ a,c,1, Thorsten SEDDIG a, Susanne HANSER a, Albrecht ZAIß a, Philipp DAUMKE b
a University Medical Center Freiburg (UMCF), Germany
b Averbis GmbH, Freiburg, Germany
c Medical University of Graz, Austria
Abstract. Incomplete coding is a known problem in hospital information systems. In order to detect non-coded secondary diseases we developed a text classification system which scans discharge summaries for drug names. Using a drug knowledge base in which drug names are linked to sets of ICD-10 codes, the system selects those documents in which a drug name occurs that is not justified by any ICD-10 code within the corresponding record in the patient database. Treatment episodes with missing codes for diabetes mellitus, Parkinson's disease, and asthma/COPD were investigated in a large German university hospital. The precision of the method was 79%, 14%, and 45%, respectively; roughly estimated recall values amounted to 43%, 70%, and 36%. Based on these data we predict roughly 716 non-coded diabetes cases, 13 non-coded Parkinson cases, and 420 non-coded asthma/COPD cases among 34,865 treatment episodes.
Keywords. Clinical Coding, Diabetes Mellitus, Parkinson's Disease, Obstructive Lung Disease, Natural Language Processing, Electronic Patient Records
1. Background

Health services research, outcome assessment, disease reporting and reimbursement in hospitals require valid and complete data on diagnoses at discharge. It is well known that clinical coding results exhibit extensive weaknesses [1, 2]. Reimbursement systems based on diagnosis related groups (DRGs) tend to increase coding quality [3]. For the clinical controller the main question is whether coding optimization prevents the loss of revenue, whereas the clinical epidemiologist is concerned that coding performed by DRG-savvy coders penalizes the documentation of conditions known to be irrelevant for reimbursement.

We present an approach suited to bringing to light undocumented diagnoses, i.e. conditions that play a role in the clinical process but are not remembered (or are deemed irrelevant) when ICD codes are assigned at discharge. Our hypothesis is that medication at discharge can give an important hint as to which ICD codes may be missing. However, manual review of the patient record with the aim of identifying missing information and re-coding the discharge profile is time consuming and requires in-depth medical knowledge. The most trustworthy source here is the discharge summary, as a comprehensive structured documentation of medication is often missing. In the treatment episodes under scrutiny, summaries mostly end with a "drugs at discharge" list, because this information is important for the follow-up treatment by the patient's general practitioner.

The objective of this study is to employ a simple text mining approach to predict missing codes. We focus on three diseases which were known to be readily omitted in coding, according to the long-standing experience of two of the authors (SH, AZ) in the UMCF medical control department: (i) diabetes mellitus, (ii) Parkinson's disease, (iii) bronchial asthma and chronic obstructive pulmonary disease (COPD).

1 Corresponding author: E-mail: [email protected]
2. Materials and Methods

Documents. We used a corpus of 34,865 in-patient discharge summaries from UMCF, covering all clinical disciplines (except psychiatry) for one year. Each discharge summary represents one treatment episode (one patient may occur more than once). The corpus was split at random into a training corpus (n=17,000) and a test corpus (n=17,865). The summaries vary widely between clinical departments. Information on drugs occurred in several sections (family history, patient history, lab, evolution, medication at discharge) with large disparities in layout and formatting.

Annotations. Via a unique ID each summary is linked to one treatment episode and to a list of one or more ICD-10 codes with which the episode has been manually annotated for reimbursement, based on the German DRG (diagnosis related groups) system.

Rule Bases. For each of the three diseases under scrutiny a rule base was built, relating drug names to their indications. The official indications were acquired from two databases, MMI and RL [4, 5], and complemented by additional drug indications found in the training corpus in order to capture off-label usage. Both commercial drug names and ingredient names were included. Unspecific name parts like "sodium" or "hydrochloride" were ignored. For each drug a rule was encoded as a triple (D, P, N), with D (drug name) being a string of characters, P a list of ICD codes (p1…pn) for the diseases under scrutiny ("positive list"), and N a list of ICD codes (n1…nm) detailing other indications for this drug ("negative list"). Only the first three characters of an ICD code were mandatory. E.g., in the Parkinson rule base, for D = "Madopar", the positive list contains the code fragments P = {G20, G21, G22}, which also cover more specific codes like G21.3. Extensive negative lists had to be built for anti-Parkinsonian and bronchodilator drugs, whereas no negative list was necessary for antidiabetic drugs.

Filter algorithm.
For each target disease the rule base was applied to the entire document corpus using the following algorithm, implemented as a Python script (documents had been made available as plain text, extracted from the original RTF format):

    For each document:
        For each drug name:
            If drug name matches text token:
                If no match between any discharge ICD code and any code in the negative or positive list:
                    Return the document ID
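The rule triples (D, P, N) and the filter loop above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' script: the names `Rule`, `is_justified` and `candidate_ids`, the tokenization by word characters, and the in-memory dictionary of documents are all hypothetical. The positive list for "Madopar" is taken from the text; the negative-list entry `G25` is only a placeholder, since the actual negative lists are not given here.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    drug: str             # D: commercial or ingredient name
    positive: frozenset   # P: ICD-10 code prefixes of the target disease
    negative: frozenset   # N: ICD-10 code prefixes of other indications


def is_justified(rule, icd_codes):
    """True if any discharge code starts with a prefix from P or N."""
    prefixes = tuple(rule.positive | rule.negative)
    return any(code.startswith(prefixes) for code in icd_codes)


def candidate_ids(documents, rules):
    """Return IDs of documents mentioning a drug that no ICD code justifies.

    documents maps a document ID to a (text, list of ICD-10 codes) pair.
    """
    hits = []
    for doc_id, (text, icd_codes) in documents.items():
        tokens = set(re.findall(r"\w+", text.lower()))
        for rule in rules:
            if rule.drug.lower() in tokens and not is_justified(rule, icd_codes):
                hits.append(doc_id)
                break  # one unjustified drug is enough to flag the document
    return hits


# Positive list from the text; the negative list is an illustrative placeholder.
rules = [Rule("Madopar", frozenset({"G20", "G21", "G22"}), frozenset({"G25"}))]

# Invented toy episodes, not data from the study.
documents = {
    "e1": ("Drugs at discharge: Madopar 125 mg", ["I10"]),
    "e2": ("Drugs at discharge: Madopar 125 mg", ["G21.3", "I10"]),
    "e3": ("No medication at discharge.", ["I10"]),
}

flagged = candidate_ids(documents, rules)
```

Matching on code prefixes reflects the statement that only the first three characters are mandatory, so a rule entry such as G21 also covers G21.3.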
Thus, all those discharge summaries are selected that mention a drug for which no ICD annotation justifies its administration. Running this algorithm on the training data revealed some sources of error, e.g. drug names that are homographs of
patient names, others which also occur in laboratory results, as well as treatment episodes in which the drug can theoretically be justified by a code from the negative list, although the summary clearly states that the drug was prescribed to treat a disease from the positive list, mentioned in the record but not coded in the patient management system. Finally, there are cases where antidiabetic and antiparkinsonian substances occur in lab procedures. Such cases were difficult to decide without further information and therefore constitute a source of potential false positive candidate documents.

Evaluation methodology. To calculate the precision, a sample (n = 3 × 50) of the candidate texts retrieved by the above algorithm was analyzed by a domain expert. To roughly approximate the recall, the filter was tested on the set of documents that had already been annotated with an ICD code of interest. To this end we modified the above algorithm in order to obtain a rough estimator:

    For each document:
        If annotated with ICD code from positive list:
            For each drug name:
                If drug name matches text token:
                    If drug name is not justified by any code from the negative list:
                        Return the document ID
The number of documents returned by this procedure divided by the number of documents annotated with a code from the positive list yields the recall estimator for the given rule set (by disease). A low rate of documents retrieved by this procedure indicates either (i) that the medication is missing in the document or (ii) that there is no drug treatment at all. The first option is not very frequent, because all discharge summaries are forwarded to the patient's GP, who generally expects a complete list of drugs at discharge. Note that the set of correctly coded episodes is not representative, so the derived values must be interpreted as very rough estimates.
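The recall estimator described above can be sketched as a ratio of two counts. The following is an illustration under assumptions rather than the original implementation: the function name `recall_estimate`, the tuple layout of a rule, the drug name "metformin" and the positive list E10 to E14 (the ICD-10 block for diabetes mellitus) are chosen here for the example; the empty negative list follows the statement that none was needed for antidiabetic drugs.

```python
import re


def recall_estimate(documents, rules):
    """Share of correctly coded episodes that the drug filter would find.

    documents maps a document ID to a (text, list of ICD-10 codes) pair.
    rules is a list of (drug, positive_prefixes, negative_prefixes) tuples.
    """
    positive_all = tuple(p for _, positive, _ in rules for p in positive)
    coded = 0       # episodes annotated with a code from the positive list
    retrieved = 0   # of those, episodes in which a listed drug is found
    for text, icd_codes in documents.values():
        if not any(code.startswith(positive_all) for code in icd_codes):
            continue
        coded += 1
        tokens = set(re.findall(r"\w+", text.lower()))
        for drug, _, negative in rules:
            explained = any(code.startswith(tuple(negative)) for code in icd_codes)
            if drug.lower() in tokens and not explained:
                retrieved += 1
                break
    return retrieved / coded if coded else float("nan")


# Assumed rule: drug name and positive list are illustrative, not from the paper.
rules = [("metformin", ("E10", "E11", "E12", "E13", "E14"), ())]

# Invented toy episodes, not data from the study.
documents = {
    "a": ("Medication at discharge: Metformin 500 mg", ["E11.9"]),
    "b": ("Diabetes treated by diet only.", ["E11.9"]),
    "c": ("Medication at discharge: Metformin 500 mg", ["I10"]),
}

estimate = recall_estimate(documents, rules)
```

Episode "b" illustrates case (ii) above: a coded disease without drug treatment lowers the estimate even though the filter works as intended.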
3. Results

For the test set (n=17,865) Table 1 shows the output of algorithm 1 and the result of the relevance assessment used to estimate precision. As an overall result, 1.3 percent of all cases lacked an ICD code for either diabetes, asthma/COPD or Parkinson's disease. The differences in precision can be explained by the fact that antidiabetic drugs are very specific to diabetes, while antiparkinsonian drugs are used for a broad range of diseases.

Table 1 Candidates for missing codes as returned by algorithm 1 and estimated precision after rating of 50 treatment episodes per disease.
As introduced above, we estimated the recall by applying the filter to the set of already coded cases (Table 2). All these treatment episodes are annotated with some code for diabetes, Parkinson's or asthma. The high recall for Parkinson's is consistent with the fact that most of these patients take a specific medication. The lower rates for the other two disease groups are consistent with cases of diabetes treated by diet only and with milder obstructive lung diseases, which are treated only in case of exacerbation.

Table 2 Recall estimation based on correctly coded diagnoses.
Taking into account the recall estimates, and extrapolating from the test set to the entire data set, there are approximately 716 non-coded diabetes cases, 13 non-coded Parkinson cases, and 420 non-coded asthma/COPD cases among 34,865 treatment episodes.

Table 3 Analysis of false positive cases
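One plausible reading of the extrapolation to the full corpus is sketched below. The formula is an assumption, as the text does not spell it out: the number of candidates is weighted by the precision, divided by the recall estimate, and scaled from the test set to all treatment episodes. The function name `estimate_missing` and the input values in the example are hypothetical and not taken from the study; the actual candidate counts belong to Table 1.

```python
def estimate_missing(candidates, precision, recall, sample_size, corpus_size):
    """Rough estimate of non-coded cases in the whole corpus.

    candidates   number of documents returned by the filter on the sample
    precision    fraction of candidates judged to be truly non-coded
    recall       estimated fraction of relevant cases the filter can find
    """
    true_hits = candidates * precision       # candidates that are real omissions
    missing_in_sample = true_hits / recall   # correct for cases the filter misses
    return missing_in_sample * corpus_size / sample_size


# Made-up numbers, chosen only so that the arithmetic is easy to follow:
# 100 candidates * 0.5 = 50 true hits; 50 / 0.25 = 200 in the sample;
# 200 * 2000 / 1000 = 400 in the corpus.
example = estimate_missing(100, precision=0.5, recall=0.25,
                           sample_size=1000, corpus_size=2000)
```

Because the recall estimate is described as very rough, the result inherits that uncertainty and is best read as an order of magnitude.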
The analysis of false positive cases (Table 3) reveals insulin administration in cases of intensive care, provocation tests, as well as a reference to the measurement of probably endogenous insulin in blood.

Table 4 Explanation for false negative cases
For Parkinson's, false positives derive from the fact that anti-Parkinsonian drugs are used for a broad range of rare neurologic diseases. Antiasthmatic drugs, finally, are used in a number of severe pulmonary diseases like lung cancer, in which bronchial obstruction is a symptom rather than a disease in its own right. Most false negative cases are due to disease cases which are not treated by drugs and, to a minor extent, to drugs that are missing in the rule base or to misspelt drug names.
Recent studies have applied various text mining approaches to the extraction of drug or substance names from medical texts [6, 7, 8, 9, 10]. Roberts and Hayes [11] emphasize the importance of the drug/disease relationship. These studies show that the extraction of drug names extends beyond the medical record use case on which we focused. Equally important is the mining of literature abstracts in order to extract generic medical knowledge.
4. Conclusions

A computationally simple, high-throughput text mining approach retrieved missing secondary ICD-10 codes of hospitalized patients. For three selected chronic diseases we obtained a combined rate of under 2% undercoded treatment episodes, which demonstrates fairly good coding quality, although the rate is expected to be higher when a broader array of typical secondary diseases is considered. This supports the observation that, although DRG-based reimbursement systems have led to increased coding quality for major diseases, diseases deemed secondary or unrelated to the actual clinical problem tend to be omitted, given that they have no impact on DRG grouping.

Precision and recall of the proposed information extraction system can be increased in two directions. First, the rule base must be improved, as our data clearly demonstrate the questionable quality of the pharmacopeia used. Off-label indications should be added and, ideally, additional knowledge should be acquired from medical experts. This became especially evident when we analysed the potential additional indications for antiparkinsonian drugs. To better investigate the justifications for drugs often used for symptomatic treatment (such as bronchodilators), additional knowledge associating symptoms with the underlying diseases would also be helpful.

Second, the information extraction system can be improved by allowing fuzzy string matching and by better identifying the discourse context in which the text string of interest occurs (thus ignoring, e.g., the occurrence of substance names in the lab result section). The latter will also support the harvesting of additional disease names which occur in the summary but are not coded.
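As an illustration of the fuzzy string matching proposed above, the following sketch uses `difflib.get_close_matches` from the Python standard library to catch misspelt drug names. It is one possible approach, not part of the evaluated system; the function name `fuzzy_drug_hits`, the similarity cutoff of 0.85 and the example text are assumptions made for this example.

```python
import difflib
import re


def fuzzy_drug_hits(text, drug_names, cutoff=0.85):
    """Map each drug name to the text tokens that approximately match it."""
    tokens = sorted(set(re.findall(r"\w+", text.lower())))
    hits = {}
    for drug in drug_names:
        # get_close_matches returns up to n tokens whose similarity ratio
        # to the drug name is at least the cutoff, best match first.
        close = difflib.get_close_matches(drug.lower(), tokens, n=3, cutoff=cutoff)
        if close:
            hits[drug] = close
    return hits


# Invented example with two misspelt drug names.
summary = "Drugs at discharge: Madopa 125 mg, Metformn 500 mg"
matches = fuzzy_drug_hits(summary, ["Madopar", "Metformin"])
```

The cutoff is a trade-off: a lower value finds more misspellings but is likely to increase false positives of the kind already observed with exact matching, such as drug names that resemble patient names.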
References

[1] Sackett DL. Clinical disagreement. How often it occurs and why. Canadian Medical Association Journal, 123:499–536, 1980.
[2] Barnum JF. The misinformation era: the fall of the medical record. Annals of Internal Medicine, 10:482–484, 1989.
[3] Stausberg J. Die Kodierqualität der stationären Versorgung. Bundesgesundheitsblatt, Gesundheitsforschung, Gesundheitsschutz, 20:1039–1046, 2007.
[4] Medizinische Medien Informations GmbH: www.mmi.de, last accessed 5th February, 2011.
[5] ROTE LISTE®: www.roteliste.de, last accessed 5th February, 2011.
[6] Schönbach C, Nagashima T, Konagaya A. Textmining in support of knowledge discovery for vaccine development. Elsevier, Amsterdam (ISSN 1046-2023: 2004, vol. 34).
[7] Jimeno A et al. Assessment of disease named entity recognition on a corpus of annotated sentences. BMC Bioinformatics, 2008; 9 (Suppl 3): S3.
[8] Hauben M, Reich L. Data mining, drug safety, and molecular pharmacology: potential for collaboration. The Annals of Pharmacotherapy, Whitney, Cincinnati (2004).
[9] Garten Y, Altman R. Pharmspresso: a text mining tool for extraction of pharmacogenomic concepts and relationships from full text. BMC Bioinformatics, 2009; 10 (Suppl 2): S6.
[10] Dunkel M, Günther S, Ahmed J, Wittig B, Preissner R. SuperPred: drug classification and target prediction. Nucleic Acids Res. 2008 Jul 1; 36 (Web Server issue): W55-9.
[11] Roberts PM, Hayes WS. Information needs and the role of text mining in drug development. In Pacific Symposium on Biocomputing 2008, 592-603.