ecgML: Tools and Technologies for Multimedia ECG Presentation
Haiying Wang, Benjamin Jung, Francisco Azuaje, Norman Black
Abstract
Electrocardiogram (ECG) data have traditionally been generated by multiple software applications on various platforms. Furthermore, local data storage and distribution use different formats and structures. These data modelling and distribution tasks should be supported by flexible and inexpensive tools that enhance the pattern recognition and visualisation capabilities of humans and machines. There is an increased need to promote the development of standards in order to support the seamless exchange and migration of ECG data as well as their native integration into Electronic Patient Records (EPR) and medical guidelines. Such models should be platform-independent, flexible and open to the scientific community. In the case of ECG data interpretation, an important prerequisite is a comprehensive data description independent of the number of channels, instrumentation platform and type of experiments. Additionally, an ECG record should include annotations relating to the acquisition protocols, patient information and analysis results.
The Food and Drug Administration (FDA), Center for Drug Evaluation and Research, has proposed recommendations for the exchange of time-series data. The projected standard consists of a hierarchical data structure for the representation of signals, including ECG data, which ideally would be encoded in an XML file. Recent advances include I-Med, which is an XML-based format for clinical data bundled with a domain-independent interface for exchanging several types of medical information. Its major goal is to provide a unique platform for clinical transactions. I-Med messages can include ECG records, which may be described by basic features, such as QRS duration (i.e. the time interval necessary for ventricular depolarization) and text-based interpretations.
This paper discusses the minimum set of information needed for a meaningful representation and storage of electrocardiogram signals. It has been synthesized from existing recommendations and compiled into an XML schema (ecgML). The accompanying application supports medical tasks such as pattern recognition and identification of relevant wave markers. More recently, eXtensible Stylesheet Language (XSL) transformations have been developed to convert "raw" ecgML files into data mining formats such as MatLab (for further analysis), Scalable Vector Graphics (SVG) (for graphical visualisation) and audio formats. Thus, ecgML is a useful tool to facilitate the representation, exchange and interpretation of ECG information.
Table of Contents
1. Introduction
1.1 Existing Formats for ECG Management and Exchange
1.2 Current XML Efforts from Standard Bodies
2. Hierarchical Presentation of ecgML
3. Accompanying Tools
4. Applications of ecgML
5. Conclusions
Bibliography
Glossary
Proceedings by deepX Ltd.
Rendered by www.RenderX.com
1. Introduction
Electrocardiography is one of the most important non-invasive diagnostic methods. It can be performed at low cost and allows the early recognition of coronary heart disease. In today's distributed healthcare environment, ECG data are commonly acquired, stored and analysed using different formats and software platforms, and several competing approaches to the management and exchange of ECG data remain in use. Medical informatics will fully realise the benefits of its research only when data can be openly shared and interpreted. There is an increasing need to develop cross-platform solutions to support biomedical training, decision-making and telemedicine applications [1].
1.1 Existing Formats for ECG Management and Exchange
ECG data have traditionally been recorded using flat file formats, such as those of the Massachusetts Institute of Technology and Beth Israel Hospital (MIT-BIH) file library [2]. This type of data format lacks the information necessary to support meaningful analysis, interoperability and integration of multiple resources.
In 1980, a large international project sponsored by the European Commission was launched to develop Common Standards for Quantitative Electrocardiography (CSE). The main findings of the first CSE study include the standardization of computerised definitions of waves and of the references for each beginning and end point of the inter-wave components of the ECG [3].
In 1993, a Comité Européen de Normalisation Technical Committee 251 (CEN/TC251) project team developed the Standard Communications Protocol for Computer-assisted Electrocardiography (SCP-ECG) [4]. The standard is relatively well established for the interchange, encoding and storage of digital ECG data. The data level in the standard includes the ECG signal, patient demographics and ECG administrative data, as well as measurement and interpretation results. Although this standard is supported by many manufacturers of ECG equipment, the use of SCP-ECG has revealed some disadvantages:
• It is a binary file standard for the transmission and storage of ECG data, and is therefore not human-readable.
• It is specific to 12-lead ECGs.
• It makes no provision for multiple time-related ECGs in one record.
• It makes only limited provision for annotation.
In 1987, Health Level Seven (HL7) was founded to develop standards for the electronic interchange of clinical, financial and administrative information among independent healthcare-oriented computer systems. The standard currently addresses the interfaces among various systems that send or receive patient admission/registration, queries, orders, results, clinical observations, billing, and master file update information. The HL7 integration approach focuses on synchronizing the databases of multiple application systems. As a de facto standard for the electronic exchange of clinical and administrative data, an HL7 message is able to represent ECG waveforms, measurements, a computer analysis of the waveform, and demographic data. The main problems of HL7 include [5]:
• The message format is not user-friendly and cannot easily be interpreted by reading the message.
• Changes in HL7 cannot easily be incorporated into clinical information systems using older versions of HL7.
The Digital Imaging and Communications in Medicine (DICOM) standard was originally created as a protocol for image data exchange. In response to demand from the ECG community, for cases where biosignals are collected in connection with a medical imaging procedure, Supplement 30 on waveforms was developed to integrate waveform storage into DICOM [6]. This includes ECG, electrophysiological and hemodynamic curve data. However, implementing this format requires an understanding of the DICOM philosophy, which cannot be acquired by reading Supplement 30 alone.
1.2 Current XML Efforts from Standard Bodies
Since its adoption as a World Wide Web Consortium (W3C) recommendation in 1998, XML, together with a number of related W3C recommendations, has been shaping the future of the web, providing simple, elegant, and scalable interoperability solutions [7]. Developed as a subset of Standard Generalized Markup Language (SGML) in 1996 to "be straightforwardly usable over the Internet", XML soon became a ubiquitous syntax for data and data exchange over the Internet, and it presents new opportunities for the representation and exchange of clinical information. As a result, committees within healthcare standardization organizations such as CEN/TC251, HL7 and the American Society for Testing and Materials (ASTM) are currently working on recommendations for the use of XML in healthcare.
The use of XML syntax for the exchange of electronic patient records was demonstrated in the Synapses [8] and SynEx [9] projects and their implementations [10] [11]. Synapses concentrated on the specification of a Federated Healthcare Record (FHCR) server (including data model and format definition), which provides integrated access to a record's distributed components. The SynEx project concerned integrating a number of components to form an information system from which client applications could access a wide range of data in support of the healthcare business [12]. However, neither project included a detailed description of specific data models (e.g. for ECG data).
The FDA Center for Drug Evaluation and Research has proposed recommendations for the exchange of time-series data. The proposal includes a hierarchical structure for the representation of signals, including ECG data, which may be encoded as an XML file. This protocol focuses on the acquisition of multiple records from different subjects within a single file [13]. The most recent document specifying the XML data format for ECG was issued in the middle of April 2002. However, the data model (specified in a Document Type Definition (DTD) [14]) includes presentation information, such as the elements MinorTickInterval, MajorTickInterval and LogScale, which we believe should be kept outside the data model in order to follow the principle of separating content and presentation information.
Recent advances include I-Med, which is an XML-based format for clinical data [15]. This project consists of a domain-independent interface for exchanging several types of medical information. Its major goal is to provide a unique platform for clinical transactions. These messages can include ECG records, which may be described by basic features, such as QRS duration and text-based interpretations. One major limitation of this solution is that it only partially addresses important ECG data content definitions.
2. Hierarchical Presentation of ecgML
There is a need to harmonise the representation of digital ECG data originating from the full spectrum of devices, along with annotations for events, and to include the necessary associated information, such as patient identification, interpretation and other clinical data. The following hierarchical structure is proposed to address these concerns. In this paper, terms printed in bold italic represent either XML element or attribute names. Element names are made of concatenated words with the first letter of each word capitalised ("upperCamelCase"). Attribute names follow the same rule except for the first word ("lowerCamelCase").
Each patient record starts with a root element ECGRecord, which is uniquely identified by its attribute studyID. The StudyDate and StudyTime elements record the date and time of the most recent study of the ECG recording. Diagnosis contains a text version of the latest diagnostic interpretation of the ECG, while MedicalHistory is a description of the history of a patient's clinical problems and diagnoses. There are two main components for each record: one PatientDemographics element and one-or-more Record elements. The tree diagram of the ECGRecord element is given in Figure 1.
Figure 1. The tree diagram of ecgML element ECGRecord
PatientDemographics contains information of general interest concerning the person from whom the recording is obtained, such as demographic data (e.g. patientID, Name, etc.) and contact information (e.g. Address, etc.). This component is required in each record. Record, shown in Figure 2, represents the physical storage for the basic content of an ECG recording. The AcquisitionDate and AcquisitionTime elements specify the date and time the record was taken. investigatorID and siteID identify the person and institution responsible for the recording. There are three main components: zero-or-one RecordingDevice, zero-or-one ClinicalProtocol, and one-or-more RecordData.
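As a rough illustration of the hierarchy described so far, the following Python sketch assembles a skeletal record with the standard library. The element and attribute names are taken from the text. Everything else is an assumption made for the example: the sample values, the date and time formats, the order of child elements, and the placement of patientID, investigatorID and siteID as attributes (inferred only from their lowerCamelCase spelling). The helper name build_ecg_record is hypothetical and not part of ecgML.

```python
import xml.etree.ElementTree as ET


def build_ecg_record(study_id: str) -> ET.Element:
    """Build a skeletal ECGRecord tree using element names from the text."""
    # Root element, uniquely identified by its studyID attribute.
    root = ET.Element("ECGRecord", {"studyID": study_id})
    ET.SubElement(root, "StudyDate").text = "2002-04-15"  # assumed format
    ET.SubElement(root, "StudyTime").text = "10:30:00"    # assumed format
    ET.SubElement(root, "Diagnosis").text = "Example interpretation text"

    # Exactly one PatientDemographics component per record.
    patient = ET.SubElement(root, "PatientDemographics", {"patientID": "P-0001"})
    ET.SubElement(patient, "Name").text = "Jane Doe"      # made-up sample value

    # One or more Record components.
    record = ET.SubElement(
        root, "Record", {"investigatorID": "INV-01", "siteID": "SITE-01"}
    )
    ET.SubElement(record, "AcquisitionDate").text = "2002-04-15"
    ET.SubElement(record, "AcquisitionTime").text = "10:25:00"
    ET.SubElement(record, "RecordingDevice")   # zero-or-one
    ET.SubElement(record, "ClinicalProtocol")  # zero-or-one
    ET.SubElement(record, "RecordData")        # one-or-more
    return root


record_tree = build_ecg_record("STUDY-42")
print(ET.tostring(record_tree, encoding="unicode"))
```

A real ecgML document would have to be validated against the published schema, which may order or type these components differently.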
Figure 2. The tree diagram of ecgML element Record
RecordingDevice describes the device that generated the data, while ClinicalProtocol may include information relating to a patient's clinical report. RecordData is a key ecgML element. There can be multiple RecordData elements within a file, which are identified by their Channel element names. The DICOM lead labelling format is recommended for this purpose [6]. RecordData includes three main sub-components: Waveforms, Annotations and Measurements. The corresponding tree diagrams are illustrated in Figure 3, Figure 4 and Figure 5.
Figure 3. The tree diagram of ecgML element RecordData
Figure 4. The tree diagram of ecgML element Waveform
Figure 5. The tree diagram of ecgML element Annotations
Based on the FDA-recommended PlotGroup format [14], Waveforms are represented by a series of values along two dimensions, X and Y (XValues and YValues). Annotations would typically be used to describe events specific to the corresponding channel. Each annotation defines a time point or interval, which can be used for performing the measurements. The Annotations element consists of a collection of PointNotation and WaveNotation elements. The Measurements element contains a list of Values (i.e. the measurements of each recorded channel). Each Values element may be associated with a label and a measurement unit.
As mentioned earlier, the FDA, together with a number of other institutions, has developed and published an XML vocabulary [13] [14] to represent collected time-series data. However, there are some significant differences between the FDA proposal and ecgML. The FDA proposal is intended to represent collected biological data, including ECG, electroencephalogram (EEG), and other time-series data such as temperature, pressure and oxygen saturation. Its main goal is to facilitate the submission of biological data and to ensure the accuracy and consistency of the measurements made from them. It is important for the FDA to view the biological data in an appropriate way. Thus, the data model (specified in a DTD) includes some presentation information, including elements such as MinorTickInterval, MajorTickInterval and LogScale.
ecgML, on the other hand, is specific to ECG signals. Some elements relate directly to ECG waveforms, e.g. the elements Pwave, QRSwave and Twave. The purpose of ecgML is to develop an open and transparent way of representing, exchanging and mining ECG data. Therefore, ecgML not only includes important components that may be used to perform knowledge discovery in ECG data (e.g. ClinicalProtocol, Diagnosis and Measurements) but also follows the principle of separating content and presentation information, which offers great advantages when ecgML is used in combination with inter-media transformations (see below).
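To make the two-dimensional waveform representation concrete, the sketch below reads XValues and YValues from a RecordData fragment and pairs them into samples. It assumes that both elements hold whitespace-separated decimal numbers and that Channel is a child element carrying the lead label. The text does not specify the value encoding, so the sample fragment and the helper read_waveform are illustrative only.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the encoding of the values is an assumption.
SAMPLE = """
<RecordData>
  <Channel>II</Channel>
  <Waveforms>
    <XValues>0.000 0.004 0.008 0.012</XValues>
    <YValues>0.05 0.10 0.85 0.20</YValues>
  </Waveforms>
</RecordData>
"""


def read_waveform(record_data: ET.Element) -> list[tuple[float, float]]:
    """Pair each X value (time) with its Y value (amplitude)."""
    waveforms = record_data.find("Waveforms")
    if waveforms is None:
        raise ValueError("RecordData has no Waveforms element")
    xs = [float(v) for v in waveforms.findtext("XValues", default="").split()]
    ys = [float(v) for v in waveforms.findtext("YValues", default="").split()]
    if len(xs) != len(ys):
        raise ValueError("XValues and YValues differ in length")
    return list(zip(xs, ys))


samples = read_waveform(ET.fromstring(SAMPLE.strip()))
# Locate the sample with the largest amplitude.
peak_time, peak_value = max(samples, key=lambda sample: sample[1])
print(f"{len(samples)} samples, peak {peak_value} at x={peak_time}")
```

Keeping the raw samples free of tick intervals or scale hints is what allows the same data to be rendered differently by each consumer.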
3. Accompanying Tools
A series of tools is being developed to assist users in exploiting ecgML-based applications. These include an XML-based ECG record generator, an ECG parser and an ECG viewer. The generator will automatically produce XML-based ECG records from existing ECG databases, e.g. the MIT-BIH database [2]. The ECG parser allows the user to read ECG records and access their contents and structure, whereas the ECG viewer provides an on-screen display of the corresponding waveform data, as shown in Figure 6, together with all annotation information for the individual waveform. The hierarchical structure of the XML-based ECG record, including every element and attribute, is displayed on the left-hand side and can be expanded and collapsed at any level. The right-hand side shows the part of the ECG waveform selected from the ecgML structure. The viewer graphically locates the boundaries (i.e. beginning, peak, and end) of the P, QRS and T waves for each selected QRS complex.
Figure 6. Screenshot of ECG viewer
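The hierarchical panel of such a viewer can be approximated in a few lines. The following sketch is not the authors' viewer; it is a minimal stand-in that walks a parsed record and emits one indented line per element, listing attributes with an "@" prefix. The function name outline and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def outline(element: ET.Element, depth: int = 0) -> list[str]:
    """Return one indented line per element, with its attributes listed."""
    attrs = "".join(f" @{name}={value}" for name, value in element.attrib.items())
    lines = ["  " * depth + element.tag + attrs]
    for child in element:
        # Recurse so that nesting depth becomes indentation.
        lines.extend(outline(child, depth + 1))
    return lines


doc = ET.fromstring(
    '<ECGRecord studyID="STUDY-42">'
    "<PatientDemographics/>"
    "<Record><RecordData><Waveforms/><Annotations/><Measurements/></RecordData></Record>"
    "</ECGRecord>"
)
tree_lines = outline(doc)
print("\n".join(tree_lines))
```

An interactive viewer would attach expand and collapse behaviour to each of these lines; the traversal itself stays the same.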
4. Applications of ecgML
Building on the advantages of XML technologies, ecgML offers a system-, application- and format-independent solution for the representation and exchange of ECG data. Moreover, a distinct separation of content and presentation (among other components such as links and semantics) is a considerable advantage over existing systems, where information is merged and intertwined with its representation format. Figure 7 exemplifies a scenario in which the raw ECG data are kept in an ecgML data file, independent of any presentation information. Various XSL Transformations (XSLT) (stored as XSL files and applied on the fly, transparently to the user) convert the ecgML source into user- and/or application-specific data formats, such as Moving Picture Experts Group (MPEG) (audio), MatLab (text) and SVG/Portable Network Graphics (PNG) (graphics). The centralised storage of the ECG record and the dynamic creation of data representations avoid redundancy.
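The idea of deriving a presentation format from presentation-free data can be sketched as follows. Note that this example uses plain Python rather than XSLT, purely to keep it self-contained; it is a stand-in for the kind of mapping an XSL stylesheet would perform, not the transformation used by the authors. The function waveform_to_svg, the viewport size and the sample values are assumptions.

```python
import xml.etree.ElementTree as ET


def waveform_to_svg(xs, ys, width=400.0, height=100.0) -> str:
    """Map (x, y) samples onto an SVG polyline scaled to the viewport."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    x_span = (x_max - x_min) or 1.0  # avoid division by zero for flat input
    y_span = (y_max - y_min) or 1.0

    points = []
    for x, y in zip(xs, ys):
        px = (x - x_min) / x_span * width
        # The SVG y axis points downwards, so flip the amplitude.
        py = height - (y - y_min) / y_span * height
        points.append(f"{px:.1f},{py:.1f}")

    svg = ET.Element("svg", {
        "xmlns": "http://www.w3.org/2000/svg",
        "width": str(width),
        "height": str(height),
    })
    ET.SubElement(svg, "polyline", {
        "points": " ".join(points),
        "fill": "none",
        "stroke": "black",
    })
    return ET.tostring(svg, encoding="unicode")


svg_text = waveform_to_svg([0.0, 0.004, 0.008, 0.012], [0.05, 0.10, 0.85, 0.20])
print(svg_text)
```

All scaling and styling decisions live in the conversion step, so the stored record never needs to change when a new output format is added.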
Figure 7. Dynamic transformation of ecgML data
The data and metadata contained in an ecgML record may be useful for improving pattern recognition in ECG applications. They would also aid the implementation of automated decision support models such as case-based reasoning. The proposed ecgML may also be significant for problems such as future-proof storage, context-sensitive (textual) search of patterns in ECG data, and the native inclusion of ECG data in medical guidelines. Figure 8 illustrates the use of map files to convert "raw" ecgML files into different tabular data, which can be imported into data mining systems for further analysis.
Figure 8. Converting XML into Tabular data using map files
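A map-driven conversion of this kind might look like the sketch below. The paper does not describe the format of its map files, so the mapping used here (a dictionary from output column name to a path relative to each Record, with "@" marking attributes) is entirely hypothetical, as are the helper names lookup and to_table and the sample data.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical "map": output column name -> path relative to each Record.
COLUMN_MAP = {
    "site": "@siteID",
    "acquired": "AcquisitionDate",
    "channel": "RecordData/Channel",
}

SAMPLE = """
<ECGRecord studyID="STUDY-42">
  <Record siteID="SITE-01">
    <AcquisitionDate>2002-04-15</AcquisitionDate>
    <RecordData><Channel>II</Channel></RecordData>
  </Record>
  <Record siteID="SITE-02">
    <AcquisitionDate>2002-04-16</AcquisitionDate>
    <RecordData><Channel>V1</Channel></RecordData>
  </Record>
</ECGRecord>
"""


def lookup(element: ET.Element, path: str) -> str:
    """Resolve '@name' as an attribute, anything else as a child path."""
    if path.startswith("@"):
        return element.get(path[1:], "")
    return element.findtext(path, default="")


def to_table(root: ET.Element, column_map: dict[str, str]) -> str:
    """Write one CSV row per Record, with columns defined by the map."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(column_map.keys())
    for record in root.iter("Record"):
        writer.writerow(lookup(record, path) for path in column_map.values())
    return buffer.getvalue()


table = to_table(ET.fromstring(SAMPLE.strip()), COLUMN_MAP)
print(table)
```

Swapping in a different map yields a different table from the same source file, which is the property the figure is meant to convey.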
5. Conclusions
ecgML will enable the seamless integration of ECG data into Electronic Patient Records and medical guidelines. The protocol can support data exchange between different ECG acquisition and visualisation devices. The accompanying application supports medical tasks such as pattern recognition and identification of relevant wave markers. Separating content from presentation information has proven very successful: ECG data stored in ecgML can be delivered in customised output formats to suit different devices and applications. Thus, ecgML is a useful tool to facilitate the representation, exchange and interpretation of ECG information. Further research will address the following issues:
• How does ecgML affect storage capacity?
• Does on-the-fly compression (as used by Hypertext Transfer Protocol (HTTP) 1.1) make a difference in terms of transmission speed?
• Is it feasible to use ecgML in applications such as 24-hour monitoring?
• Does ecgML data contain all the significant information required for ECG analysis?
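The first two questions can be explored empirically with very little code. The sketch below builds a synthetic RecordData fragment (random numbers, not real ECG data), serialises it, and compares the raw size with the gzip-compressed size, gzip being one of the content codings available to HTTP 1.1. The element layout, the sample count and the value encoding are assumptions, and the result says nothing about real recordings.

```python
import gzip
import random
import xml.etree.ElementTree as ET

random.seed(0)  # make the synthetic data reproducible
SAMPLE_COUNT = 5000  # arbitrary size chosen for the experiment

record = ET.Element("RecordData")
waveforms = ET.SubElement(record, "Waveforms")
# Synthetic values only: evenly spaced X values and random Y values.
ET.SubElement(waveforms, "XValues").text = " ".join(
    f"{i * 0.004:.3f}" for i in range(SAMPLE_COUNT)
)
ET.SubElement(waveforms, "YValues").text = " ".join(
    f"{random.gauss(0, 0.2):.2f}" for _ in range(SAMPLE_COUNT)
)

raw = ET.tostring(record, encoding="utf-8")
packed = gzip.compress(raw)
print(f"raw: {len(raw)} bytes, gzip: {len(packed)} bytes")
```

Repeating the measurement on records generated from an existing database would give a more meaningful estimate than synthetic noise.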
Bibliography
[1] A. Värri, B. Kemp, T. Penzel, A. Schlögl: Standards for biomedical signal databases. IEEE Engineering in Medicine and Biology, 2001, 20(3): 33-37.
[2] A. L. Goldberger, L. A. N. Amaral, L. Glass, J. M. Hausdorff, P. Ch. Ivanov, R. G. Mark, J. E. Mietus, G. B. Moody, C. K. Peng, H. E. Stanley: PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals. Circulation, 2000 (June 13), 101(23): e215-e220. [http://circ.ahajournals.org/cgi/content/full/101/23/e215].
[3] J. L. Willems, P. Arnaud, J. H. van Bemmel, R. Degani, P. W. Macfarlane and Chr. Zywietz: Common Standards for Quantitative Electrocardiography: Goals and Main Results. Methods of Information in Medicine, 1990, 29: 263-271.
[4] ENV 1064 standard communications protocol for computer-assisted electrocardiography. European Committee for Standardisation (CEN), Brussels, Belgium, 1996.
[5] J. Sable: The HL7 RIM (Reference Information Model) [http://hmi.missouri.edu/Course_Materials/Residential_Informatics/semesters/W2000_Materials/401_hales/hl7_1.ppt].
[6] DICOM Suppl. 30, Waveform interchange, Nat. Elect. Manufacturers Assoc.: ACR-NEMA, Digital Imaging and Communications, NEMA, Washington D.C., 1999.
[7] Extensible Markup Language (XML) [http://www.w3.org/xml/].
[8] Synapses Homepage [http://www.cs.tcd.ie/synapses/public/].
[9] SynEx Homepage [http://www.gesi.it/synex/].
[10] B. Jung, J. Grimson: Synapses/SynEx goes XML. In Proceedings of the Medical Informatics Europe '99 Conference, 1999; Technology and Informatics, 68, IOS Press, Amsterdam, 1999: 906-911.
[11] B. Jung, E. P. Andersen, J. Grimson: Using XML for Seamless Integration of Distributed Electronic Patient Records. In Proceedings of XML Scandinavia 2000 conference, Gothenburg, Sweden, May 2000.
[12] J. Grimson, G. Stephens, B. Jung, W. Grimson, D. Berry, S. Pardon: Sharing Health-Care Records over the Internet. IEEE Internet Computing, May/June 2001, 5(3): 49-58.
[13] FDA application: Proposed Standard for Exchange of Electrocardiographic and Other Time-Series Data [http://www.fda.gov/cder/regulatory/ersr/ECGdata.htm].
[14] FDA XML Data Format Design Specification [http://www.cdisc.org/discussions/EGC/FDA%20_XML_Data_Format_Design_Specification_DRAFT_C.pdf].
[15] I-Med Homepage [http://www.hnbe.com/healthweb/imedpub/].
Glossary
ASTM
American Society for Testing and Materials
CEN/TC251
Comité Européen de Normalisation Technical Committee 251
CSE
Common Standards for Quantitative Electrocardiography
DICOM
Digital Imaging and Communications in Medicine
DTD
Document Type Definition
ECG
Electrocardiogram
EEG
Electroencephalogram
EPR
Electronic Patient Records
ESEM
European Society for Engineering and Medicine
FDA
Food and Drug Administration
FHCR
Federated Healthcare Record
HL7
Health Level Seven
HTTP
Hypertext Transfer Protocol
IEEE
Institute of Electrical and Electronics Engineers
MIT-BIH
Massachusetts Institute of Technology and Beth Israel Hospital
MPEG
Moving Picture Experts Group
NIBEC
Northern Ireland Bioengineering Centre
PNG
Portable Network Graphics
SCP-ECG
Standard Communications Protocol for Computer-assisted Electrocardiography
SGML
Standard Generalized Markup Language
SVG
Scalable Vector Graphics
TCD
Trinity College Dublin
W3C
World Wide Web Consortium
XML
eXtensible Markup Language
XSL
eXtensible Stylesheet Language
XSLT
XSL Transformations
Biography
Haiying Wang
University of Ulster at Jordanstown, Belfast, United Kingdom
Haiying Wang is a PhD student at the School of Computing and Mathematics, Ulster University. His research interests are biomedical informatics, databases, data mining and machine learning.

Benjamin Jung
Trinity College Dublin, Dublin, Ireland
Benjamin has more than 10 years' experience in developing hypertext systems. Prior to the web age, he built CBT systems for the pharmaceutical and medical industry. In 1995, he developed a web-based project management system for the Systems Realization Laboratory at Georgia Institute of Technology, Atlanta. Benjamin holds a degree (Dipl. Inform. Univ.) in Computer Science and Theoretical Medicine from Technische Universität München. From 1998 he worked for two years as a Research Assistant and Technical Team Leader at the Centre for Health Informatics, TCD, Ireland, where he concentrated on the deployment and development of eXtensible Markup Language (XML) based vocabularies for the exchange of electronic patient records. At present, he is a full-time lecturer in the Knowledge and Data Engineering Group (TCD), teaching databases, document architectures and XML technologies. Since 1997, Benjamin has presented papers and chaired sessions at various computer science and medical conferences. He has developed full-day XML tutorials and workshops that were given at conferences in Europe and the US. In 2000, Benjamin co-founded deepX Ltd, where he holds the positions of director and consultant.
Francisco Azuaje
University of Ulster at Jordanstown, Belfast, United Kingdom
Dr. Azuaje received the B.Sc. degree in electronic engineering from Simon Bolivar University, Caracas, Venezuela, in 1995. He was a student of the Master in Policy and Management of Technological Innovation at Central University of Venezuela in 1996 and received the Ph.D. degree from the University of Ulster, Jordanstown, U.K. Before joining the University of Ulster as a Reader in 2002, he was a Lecturer at the Department of Computer Science of Trinity College Dublin (TCD), Ireland. He is currently the coordinator of the AI Research Group at the University of Ulster, which performs research on machine learning and data mining applications in biosciences. Dr. Azuaje has published several refereed publications in journals, books and conference proceedings relating to the areas of bioinformatics, artificial intelligence and data management. He is an editorial board member of the Institute of Electrical and Electronics Engineers (IEEE) Transactions on Nanobioscience and the Online Journal of Bioinformatics. Dr. Azuaje has also edited special issues for the IEEE Engineering in Medicine and Biology Magazine and the AI Review journal.

Norman Black
University of Ulster at Jordanstown, Belfast, United Kingdom
Prof. Norman Black received the B.Sc. degree in Electrical and Electronic Engineering and the Ph.D. degree from the Queen's University of Belfast, U.K., in 1980 and 1984 respectively. He joined the University of Ulster, Jordanstown, U.K., in 1985 as a Lecturer in the Department of Electrical and Electronic Engineering, becoming a Reader in the same department in 1989 and Professor of Digital Communications in 1995. He was the Director of the Northern Ireland Bioengineering Centre (NIBEC) from 1994 to 1999, and became the Head of Medical Informatics at the University of Ulster in 1999. His research interests include biomedical engineering, medical informatics, telemedicine, data fusion and bioengineering education. Prof. Black is currently President of the European Society for Engineering and Medicine (ESEM) and Dean of Informatics at the University of Ulster.