XML security in healthcare web systems - ACM Digital Library

XML Security in Healthcare Web Systems Wasim A Al-Hamdani, Ph.D. Kentucky State University 400 East Main, KY 40601, US

corporate networks and between different Web services. XML enables suitably coded documents to be read and understood without difficulty by both humans and machines. Consequently, XML is increasingly being applied to help solve healthcare industry problems today.

ABSTRACT XML has now opened a totally new approach in digital document handling, processing, and message transmission. XML serves as a strong base for healthcare information systems and HL7 standards for healthcare. Therefore, XML security must be integrated into XML in such a way as to preserve the advantages and abilities of XML while adding necessary security capabilities to maintain the patient and healthcare records as readily available and secure. New techniques are being developed as well as standards based on XML and HL7 health industry standards, which are key for healthcare industry expansion and security in the global environment. This work focuses on XML usage for security implementation in Web-based healthcare. The work presents a general introduction to XML, followed by general issues in XML security, XML security application in healthcare, and finally the future of XML in healthcare, focusing in particular on security issues.

Historically, healthcare was not a practical industry for information management and personal record transformation. Inadequate transfers of information among healthcare providers, applications, clinics, and devices were just some of the issues. The replication of information and errors was introduced when transcribing handwritten notes or rekeying results read from one system into another. In the meantime, the lack of access to a complete medical record often caused different doctors to prescribe drugs with possibly harmful interactions. Thus, fully computerized processes were needed to integrate data and secure features for healthcare by ensuring that they fit global electronic commerce. XML was what simplified such issues for the healthcare market considerably.

Categories and Subject Descriptors C.2.0 [Computer Communications Networks]: General – Security and protection D.1.m [Programming Techniques] Miscellaneous D.4.6 [Security and Protection]: Access controls, Authentication K.6.5 [Management of Computer and Information Systems]: Security and Protection- Authentication, Insurance, Invasive software, Physical security J.3 [Life and Medical Sciences]: Medical information systems

The rise of XML in the healthcare business has been motivated partly by legislation intended to protect patients' information and privacy, including the Health Insurance Portability and Accountability Act (HIPAA). Enacted by the U.S. Congress to protect insurance coverage, HIPAA includes standards for electronic transactions and provisions for the privacy and security of data, and it applies to claim, payment, benefit inquiry, claim status, and other transactions. HIPAA also requires the U.S. Department of Health and Human Services to define rules for the distribution of healthcare information.

General Terms Information Security, Information Assurance, XML Security,

Keywords

Translating those requirements into standards is often the work of standards-development organizations; one of the most prominent in the healthcare industry is Health Level Seven (HL7). HL7 produces standards for operations involving the exchange of administrative and clinical data in healthcare, including imaging, claims processing, and pharmacy.

XML, XML security, XML signature, XML encryption

1. INTRODUCTION Extensible markup language (XML) is a set of rules for encoding documents electronically. XML has become one of the major Web technologies and is used to transfer data between different platforms. XML’s advantage is that it is an open, user-driven standard for exchanging data over

This work provides a basic introduction to XML and its security components related to healthcare as well as how XML security is integrated within healthcare systems. The XML security standards define XML languages and rules for common security necessities. For the most part, these standards include the use of other XML security standards, especially core XML digital signature and XML encryption standards.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. InfoSecCD’10, October 1-2, 2010, Kennesaw, GA, USA. Copyright © 2010 ACM 978-1-60558-661-8/10/10…$10.00.

2. XML New standards in different industries (accounting, court systems, office systems, education records, etc.), including healthcare, have benefited from the development of XML-enabled

XML was first developed to make SGML usable for Web purposes. SGML is very large and complicated. XML is a subset of SGML with more restrictive rules, so that parsers and transformers working on XML are easier to implement. XML is a W3C recommendation to which other recommendations are attached, such as XPath, XPointer, XLink, DOM, XSL, SVG, etc. XML is case sensitive, ignores white spaces, and uses the following reserved symbols: <, >, &, ”, %.

applications such as XML-enabled PHP/Oracle applications. This new wave of technology means that we can build applications – often composite applications using SOA plumbing – to access medical data with a combination of interoperable services and rich database support. Many products currently support XML as an input/output format – that is, they can translate back and forth between their internal data formats and APIs and those of XML. Such "XML-enabled" products clearly have many advantages over their competitors that do not support XML.

There are XML vocabularies for many different domains, such as accounting, advertising, astronomy, building, chemistry, construction, education, finance, food, government, human resources, instruments, insurance, legal, and so on.

IBM introduced the pureXML solution, which permits storage, indexing, and querying of documents in their native XML format. Several leading institutions have taken advantage of the native XML capabilities of IBM DB2 to build systems that not only exploit healthcare industry standards, but also improve data access and performance.

XML Simplifies Data Transport With XML, data can easily be exchanged between incompatible systems. XML Simplifies Platform Changes XML data is stored in text format. This makes it easier to expand or upgrade to new operating systems, new applications, or new browsers without losing data.

XML security is “security functionality for XML data”. This includes cryptography, authentication, authorization, access control, digital signature, key management, and other issues. XML stands for extensible markup language. XML is a “markup” language used to create other XML-based markup languages or documents. XML is used for many things, including books and other documents, so the term “document” is used to describe whatever you create in XML, including automatic location identification (ALI) “records”. A close "cousin" to XML is hyper text markup language, or HTML. The syntax of XML and HTML is similar, but XML was designed to describe data and focus on what data is, while HTML was designed to display data and focus on how data looks. Many Web site pages are created using HTML. XML is not a replacement for HTML, but rather a complement. XML is a subset of standard generalized markup language (SGML), and HTML was originally defined as an application of SGML. Metadata is data that describes other data. Similarly, SGML and XML are considered metalanguages, meaning they are used to describe other markup languages.

XML Makes Data Available Different applications can access the same data, not only in HTML pages, but also from XML data sources. With XML, data can be available to all kinds of "reading machines" (handheld computers, voice machines, newsfeeds, etc.), which also makes it more accessible for blind people or people with other disabilities. Well-formed XML Although XML allows users to invent as many different elements and attributes as they need, these elements and attributes, as well as their contents and the documents that contain them, must all follow certain rules in order to be well-formed. If a document is not well-formed, any attempt to read or render it will fail. The XML specification strictly prohibits XML parsers from trying to fix and understand malformed documents. All a conforming parser is allowed to do is report the error. There are usually six kinds of markup in an XML document: elements, entity references, comments, processing instructions, CDATA sections, and document type declarations.
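The well-formedness rule can be illustrated with a minimal sketch in Python, assuming the standard-library parser xml.etree.ElementTree. The helper name is_well_formed and the sample patient documents are hypothetical and are not taken from the paper. The parser reports the error and makes no attempt to repair the input.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # A conforming parser only reports the error; it does not repair it.
        return False


good = "<patient><name>Jane Doe</name></patient>"
bad = "<patient><name>Jane Doe</patient>"  # <name> is never closed

print(is_well_formed(good))
print(is_well_formed(bad))
```

Note that this checks well-formedness only; validity against a DTD or schema is a separate step that this parser does not perform.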

Since XML only describes data, it does not do anything. Just like the old fixed length data record layouts, XML does not contain software so it does not perform any functions. There are various software tools that are designed to read and process XML. These are called XML applications. Most XML applications fall into two categories – complex document creation and data exchange. We’ll focus on XML used for data exchange rather than creating “book type” documents.

Elements Elements are the basic components of an XML document and represent pieces of information. Entity references Entity references are markup that the parser can replace with character data; the entities represent some special characters, and every entity has a unique name. Entity references provide an alternative way to insert some specific characters into the document. For example, the text 9 is > 7 & 5 is < 6 can be written in a document as: 9 is &gt; 7 &amp; 5 is &lt; 6.
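As a small illustration of entity references, the following sketch uses the escape and unescape helpers from Python's standard xml.sax.saxutils module. The variable names are illustrative only.

```python
from xml.sax.saxutils import escape, unescape

text = "9 is > 7 & 5 is < 6"

# escape() replaces &, < and > with the predefined entity references.
encoded = escape(text)
print(encoded)

# unescape() performs the reverse substitution, as a parser would.
decoded = unescape(encoded)
print(decoded == text)
```

Escaping is what allows characters that would otherwise be read as markup to appear as ordinary character data.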

XML documents consist of several layers of code. The first, the document type definition (DTD), defines the rules and relationships of the metatags. The second layer is the application of the XML tags one has defined. These two sections can be embedded into one file, or they can be two separate files that are linked together. The presentation is the last layer of an XML document. Style sheets (either CSS or XSL) allow developers to alter the appearance and function of their XML. This instruction sheet should explain what one needs to do to construct the first two layers of an XML document.

CDATA sections A CDATA section begins with the string <![CDATA[ and ends with the string ]]>.


Learning XML Erik T. Ray 2003 39.95

Name Spaces Name spaces are just prefixes that point to appropriate source files so that two elements that would otherwise have the same name can be distinguished without altering their source. Element Content The content model defines what an element may contain. An element type declaration takes the following form: <!ELEMENT name content-model>

Displaying XML on the Web Generic XML documents do not carry information about how to display the data. Without CSS or XSLT, a generic XML document is rendered as raw XML text by most Web browsers. Some display it with “handles” (e.g., + and - signs in the margin) that allow parts of the structure to be expanded or collapsed with mouse clicks.

PCDATA is short for parsed character data and means any text that is not markup; the characters < and & and the sequence ]]> may not appear in it literally. Another example for element declaration:
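How a content model constrains the children of an element can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the declaration example from the paper: the function names and the sample model are hypothetical, only sequence (,), choice (|) and the occurrence indicators ?, * and + are handled, and #PCDATA, EMPTY, ANY and the & connector are not supported. The idea is to translate the content model into a regular expression over the sequence of child element names.

```python
import re


def content_model_to_regex(model: str) -> re.Pattern:
    """Translate a simple DTD content model into a compiled regex."""
    # Each element name becomes a group that matches "name;".
    pattern = re.sub(
        r"[A-Za-z_][\w.-]*",
        lambda m: "(?:" + re.escape(m.group(0)) + ";)",
        model,
    )
    # Whitespace is insignificant, and "," (sequence) is plain
    # concatenation in a regex; "|", "?", "*" and "+" carry over as-is.
    pattern = re.sub(r"[\s,]", "", pattern)
    return re.compile(pattern)


def children_match(model: str, children: list) -> bool:
    """Check a list of child element names against a content model."""
    encoded = "".join(name + ";" for name in children)
    return content_model_to_regex(model).fullmatch(encoded) is not None


model = "(title, author+, year?, price)"
print(children_match(model, ["title", "author", "author", "price"]))
print(children_match(model, ["title", "price"]))  # author+ needs at least one
```

A validating parser performs a comparable check for every element against its declaration; the sketch only shows the principle for a single element.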

The XML ADVANTAGES [1,2] XML has a number of advantages compared to all these approaches.
• Most importantly, XML is faster because it moves a lot of the work to the client.
• XML is a standard and based on existing international standards.
• The advantage of XML in Java is that XML uses the common component architecture (CCA) and the common object request broker architecture (CORBA). It also allows RMI, or remote method invocation, in Java and invokes another Java object. It also allows the clients to connect to the program using the remote procedure calling (RPC).
• An XML advantage with tags is that it is fully extensible and has no tag limitations.
• Business applications and advantages of XML
• XML is internationalized and based on Unicode.
• XML supports validation and editorial control.
• It can model any kind of data.
• XML allows for automatic generation of links and navigational aids.
• XML has an increased speed of user access to data.
• XML allows for easy repurposing of documents.
• With XML, one can have print and online versions from the same source.
• XML offers dynamically user-configurable views.
• XML offers simpler system administration of Web sites.
• XML has next-generation hypertext capabilities.

Rules for combining content:
A|B      A or B, but not both
A,B,C    A and B and C, in order
A&B      A and B only, in any order
A?       A occurs 0 or 1 time
A*       A occurs 0 or n times
A+       A occurs 1 or n times
Element Attributes Naming rules for attributes are the same as for DOCTYPE and ELEMENT. XML Tree XML documents form a tree structure that starts at "the root" and branches to "the leaves”. For example: ..... And [4] Everyday Italian Giada De Laurentiis 2005 30.00 Harry Potter J K. Rowling 2005 29.99
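The tree structure can be made concrete with a short Python sketch that rebuilds the book records quoted in this section. The markup of the original example did not survive, so the element names used here (bookstore, book, title, author, year, price) are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Values as quoted in the text; the element names are assumed.
books = [
    ("Everyday Italian", "Giada De Laurentiis", "2005", "30.00"),
    ("Harry Potter", "J K. Rowling", "2005", "29.99"),
    ("Learning XML", "Erik T. Ray", "2003", "39.95"),
]

root = ET.Element("bookstore")                  # the root of the tree
for record in books:
    book = ET.SubElement(root, "book")          # a branch under the root
    for tag, value in zip(("title", "author", "year", "price"), record):
        ET.SubElement(book, tag).text = value   # the leaves carry the text

print(ET.tostring(root, encoding="unicode"))

# Walking the tree from the root down to the leaves:
titles = [book.findtext("title") for book in root.iter("book")]
print(titles)
```

Every element except the root has exactly one parent, which is what makes the document a tree rather than a general graph.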

XML Disadvantages [3]
• Lack of applications processing – There are no browsers yet that can read XML. In the case of HTML, anyone can write up a program and that can be read using any browser. To be able to be read in a browser, XML still depends on HTML, and is not independent of it.
• General weaknesses of XML – It is totally dependent on who is writing it.
• XML documents can be difficult and also expensive to set up.


• When XML is tied closely to Unicode, the Unicode changes XML's attributes, which might result in something that is totally different from the original. The XML parsers when used along with the RSS and the component called next cannot disable the external entities. Instead, they recognize them as their own, which can prove to be a major disadvantage.

of use of databases like dBASE and Access, there seems little reason to use an XML document as a database even in these cases. The only real advantage of XML is that the data is portable, and this is less of an advantage than it seems due to the widespread availability of tools for serializing databases as XML” [5]. A common question when transferring data between XML documents and a database is how to generate XML schemas from database schema and vice versa. Before explaining how to do this, it is worth noting that this is a design-time operation. The reason for this is that most data-centric applications, and virtually all vertical applications, work with a known set of XML schemas and database schemas. Thus, they don't need to generate schemas at run time. Furthermore, as will be seen below, the procedures for generating schemas are less than perfect. Applications that need to store random XML documents in a database should probably use a native XML database instead of generating schemas at run time.

The use of XML [4] XML is used in many aspects of Web services, Web programming, and development, often to simplify data storage and sharing. In addition, it simplifies data transport, platform changes, and data availability. XML is Used to Create New Internet Languages A lot of new Internet languages are created with XML, for example:
• XHTML, the latest version of HTML
• WSDL for describing available Web services
• WAP and WML as markup languages for handheld devices
• RSS languages for newsfeeds
• RDF and OWL for describing resources and ontology
• SMIL for describing multimedia for the Web
• XProc: An XML Pipeline Language
• XQuery for the systems analyst or architect
• SXML is an XML infoset in the form of S-expressions

To generate a relational schema from an XML schema:
• For each complex element type, create a table and a primary key column.
• For each element type with mixed content, create a separate table in which to store the PCDATA, linked to the parent table through the parent table's primary key.
• For each single-valued attribute of that element type, and for each singly occurring simple child element, create a column in that table. If the XML schema has data type information, then set the data type of the column to the corresponding type. Otherwise, set the data type to a pre-determined type, such as CLOB or VARCHAR(255). If the child element type or attribute is optional, make the column nullable.
• For each multi-valued attribute and for each multiply occurring simple child element, create a separate table to store values, linked to the parent table through the parent table's primary key.
• For each complex child element, link the parent element type's table to the child element type's table with the parent table's primary key.
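The mapping rules above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the ElementType class, the to_ddl function, and the patient/visit example are hypothetical, the fallback column type is VARCHAR(255), and the rule for mixed content is not covered. The generated statements are run against an in-memory SQLite database only to confirm that they are syntactically valid SQL.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class ElementType:
    """Hypothetical, simplified description of one complex element type."""
    name: str
    # column name -> (SQL type, or None if the schema gives no type; optional?)
    single_valued: dict = field(default_factory=dict)
    multi_valued: list = field(default_factory=list)      # repeating simple children
    complex_children: list = field(default_factory=list)  # nested ElementType objects


def to_ddl(elem, parent=None, out=None):
    """Return CREATE TABLE statements for `elem` and its descendants."""
    out = [] if out is None else out
    # One table and one primary key column per complex element type.
    cols = [f"{elem.name}_id INTEGER PRIMARY KEY"]
    if parent is not None:
        # Link the child table to the parent through the parent's primary key.
        cols.append(
            f"{parent.name}_id INTEGER REFERENCES {parent.name}({parent.name}_id)"
        )
    for col, (sql_type, optional) in elem.single_valued.items():
        null = "" if optional else " NOT NULL"   # optional -> nullable column
        cols.append(f"{col} {sql_type or 'VARCHAR(255)'}{null}")
    out.append(f"CREATE TABLE {elem.name} ({', '.join(cols)});")
    # Multi-valued attributes and children get a separate value table.
    for name in elem.multi_valued:
        out.append(
            f"CREATE TABLE {elem.name}_{name} "
            f"({elem.name}_id INTEGER REFERENCES {elem.name}({elem.name}_id), "
            f"value VARCHAR(255));"
        )
    for child in elem.complex_children:
        to_ddl(child, elem, out)
    return out


visit = ElementType("visit", {"visit_date": ("DATE", False)})
patient = ElementType(
    "patient",
    {"name": (None, False), "birth_date": ("DATE", True)},
    ["phone"],
    [visit],
)

ddl = to_ddl(patient)
print("\n".join(ddl))
sqlite3.connect(":memory:").executescript("\n".join(ddl))  # syntax check only
```

Because this is a design-time operation, a sketch like this would be run once and its output reviewed by hand rather than executed at run time.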

3. XML AND DATABASE XML is a data structure representation used to describe data; it can aid information systems in sharing structured data, especially via the Internet, in encoding documents, and in serializing data. In this context, XML can carry the content of classical databases from a server, possibly a legacy system, for display on the Web. “An XML document is a database only in the strictest sense of the term. That is, it is a collection of data. In many ways, this makes it no different from any other file -- after all, all files contain data of some sort. As a ‘database’ format, XML has some advantages. For example, it is self-describing (the markup describes the structure and type names of the data, although not the semantics), it is portable (Unicode), and it can describe data in tree or graph structures. It also has some disadvantages. For example, it is verbose and access to the data is slow due to parsing and text conversion” [4].

To generate an XML schema from a relational schema:
• For each table, create an element type.
• For each data (non-key) column in that table, as well as for the primary key column(s), add an attribute to the element type or a PCDATA-only child element to its content model.
• For each table to which the primary key is exported, add a child element to the content model and process the table recursively.
• For each foreign key, add a child element to the content model and process the foreign key table recursively.
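The reverse direction can be sketched in the same spirit. The code below assumes a hypothetical dictionary description of a relational schema and emits DTD-style element declarations; it follows foreign keys only (tables to which the primary key is exported are not traversed), maps every column to a PCDATA-only child element, and assumes column names are unique across tables.

```python
def table_to_dtd(table, schema, seen=None):
    """Return DTD element declarations for `table` and the tables it references."""
    seen = set() if seen is None else seen
    if table in seen:        # guard against cyclic foreign keys
        return []
    seen.add(table)
    spec = schema[table]
    referenced = list(spec.get("foreign_keys", {}).values())
    # One element type per table; columns and referenced tables become children.
    children = list(spec["columns"]) + referenced
    decls = [f"<!ELEMENT {table} ({', '.join(children)})>"]
    # Each data or primary key column becomes a PCDATA-only child element.
    decls += [f"<!ELEMENT {col} (#PCDATA)>" for col in spec["columns"]]
    # Each foreign key adds a child element; process that table recursively.
    for ref in referenced:
        decls += table_to_dtd(ref, schema, seen)
    return decls


# Hypothetical two-table schema: each visit row references one patient row.
schema = {
    "visit": {
        "columns": ["visit_id", "visit_date"],
        "foreign_keys": {"patient_id": "patient"},
    },
    "patient": {"columns": ["patient_id", "name"]},
}

decls = table_to_dtd("visit", schema)
print("\n".join(decls))
```

The output nests the patient element inside visit, which shows why the choice of starting table matters: a different root table gives a differently shaped document for the same data.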

XML provides many of the elements found in databases: programming interfaces (SAX, DOM, JDOM), schemas (DTDs, XML Schemas, RELAX NG, and so on), storage (XML documents), and query languages (XQuery, XPath, XQL, XMLQL, QUILT, etc.). On the other hand, it lacks many of the elements found in real databases: efficient storage, indexes, security, transactions and data integrity, multi-user access, triggers, queries across multiple documents, and so on. It may be possible to use an XML document as a database in environments with small amounts of data, few users, and modest performance requirements; however, this will fail in most production environments, which have many users, strict data integrity requirements, and the need for good performance. “More sophisticated data sets for which an XML document might be suitable as a database are personal contact lists (names, phone numbers, addresses, etc.). However, given the low price and ease

Storing and retrieving documents could be done through:
• Storing documents in the file system - The easiest way to store documents is in the file system, especially for a simple set of documents.
• Storing documents in binary large objects (BLOBs) – A more sophisticated option is to store documents as BLOBs in a relational database. This provides a number of the advantages of databases, including transactional control, security, and multi-user access.
• Native XML databases – These are databases designed especially to store XML documents. Like other databases, they support features like transactions, security, multi-user access, programmatic APIs, query languages, and so on. The only difference from other databases is that their internal model is based on XML and not something else, such as the relational model.
Native XML databases are most commonly used to store document-centric documents. The main reason for this is their support of XML query languages. Another reason is that native XML databases preserve things like document order, processing instructions, and comments, and often preserve CDATA sections, entity usage, and so on, while XML-enabled databases do not. Native XML databases are also commonly used to integrate data. The third major use case for native XML databases is semi-structured data, such as is found in the fields of finance and biology, which change so frequently that definitive schemas are often not possible. The final major use of native XML databases is in handling schema evolution. While native XML databases do not provide complete solutions by any means, they do provide more flexibility than relational databases.
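The BLOB option listed above can be illustrated with a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for a relational database. The table name, column names, and sample record are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (doc_id TEXT PRIMARY KEY, body BLOB NOT NULL)"
)

doc = b"<record><!-- intake note --><patient>Jane Doe</patient></record>"

# The connection as a context manager commits on success and rolls back
# on error, which is the transactional control a plain file lacks.
with conn:
    conn.execute("INSERT INTO documents VALUES (?, ?)", ("rec-1", doc))

(stored,) = conn.execute(
    "SELECT body FROM documents WHERE doc_id = ?", ("rec-1",)
).fetchone()

# The document comes back byte for byte, comment included.
print(stored == doc)

# The database treats the BLOB as opaque, so any query on its content
# has to parse the document after retrieval.
root = ET.fromstring(stored)
print(root.findtext("patient"))
```

The design trade-off is visible here: the database adds transactions and access control, but it cannot index or query inside the document, which is the gap that XML-enabled and native XML databases aim to fill.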

Native XML Database [7,8] The internal model of such databases depends on XML and uses XML documents as the fundamental unit of storage, although these are not necessarily stored in the form of text files. One possible definition (developed by members of the XML:DB mailing list) is that “a native XML database is one that: Defines a (logical) model for an XML document -- as opposed to the data in that document -- and stores and retrieves documents according to that model” [6]. At a minimum, the model must include elements, attributes, PCDATA, and document order. Examples of such models are the XPath data model, the XML Infoset, and the models implied by the DOM and the events in SAX 1.0. Such a database has an XML document as its fundamental unit of (logical) storage, just as a relational database has a row in a table as its fundamental unit of (logical) storage. It is not required to have any particular underlying physical storage model; for example, it can be built on a relational, hierarchical, or object-oriented database, or use a proprietary storage format such as indexed, compressed files. A native XML database (NXDB) system usually has the following three main characteristics: a) The storage object of NXDBs is XML data. b) XML documents are regarded as the basic unit of storage. c) NXDBs store XML in a "native" form, but may not be a standalone database. There are three main differences between a native XML database and an XML-enabled database:
• The native XML database preserves physical structure (e.g., CDATA sections, comments, PIs, DTDs, etc.). Theoretically, an XML-enabled database can do so, too, but practice tells a different story so far.
• Schemaless data can be stored in native XML databases. Attempts to learn schema have been made to try and alleviate this problem for XML-enabled databases.
• An XML-related API (XPath, DOM, etc.) is a must for accessing data in a native XML database. On the other hand, XML-enabled systems can offer direct access to the data, such as through ODBC.

One can classify XML documents into two categories – data centric and document centric. Sales orders, scientific data, employee records, etc., are examples of data-centric XML documents. As in traditional databases, the physical structure, especially the order of sibling elements, is unimportant in such documents. On the other hand, document-centric documents are highly unstructured, have mixed content, and their physical structure is important to ascertain the information contained in them. Web pages and marketing brochures are examples of such documents. A relational or object-oriented database is suitable for storing and retrieving data-centric XML documents. Native XML databases or content management systems are better suited for storing and retrieving document-oriented XML data. Current XML management systems can be classified into XML-enabled databases and native XML databases.

Some add a third type of XML database [6], called a hybrid XML database (HXD). Such a database can be treated as either a native XML database or as an XML-enabled database depending on the requirements of the application. An example of this would be Ozone.

XML-enabled database systems [6,7,8] These map all XML to a traditional database (such as a relational database), accepting XML as input and rendering XML as output. This term implies that the database does the conversion itself (as opposed to relying on middleware). XML-enabled databases first break down XML documents into pieces of data elements, and then store these data elements as basic data objects in the relational tables. A form of interfacing software or middleware is obviously needed to transfer data between XML documents and XML-enabled databases. Through this middleware, the database systems can efficiently map the whole XML document into relational tables. XML enabled databases (usually relational) contain extensions (either model or template driven) for transferring data between XML documents and themselves and are generally designed to store and retrieve data-centric documents based on storage techniques used.

4. XML BENCHMARKS A benchmark is a standard against which something can be measured or assessed [9]. XML database benchmarks serve as vehicles for fair and meaningful performance evaluation of XML database systems. The benchmarks should be application-oriented and relevant to database users, database and hardware vendors, researchers, and the XML community. The benchmarks should stress all key components of a database system, and should be scalable from gigabytes to petabytes and usable on all major computing platforms including UNIX, Windows, and Linux. “The purpose of benchmarks is to push XML database systems and the underlying hardware to their limits. Benchmarks are

intended to aid in the investigation of performance enhancements and evaluation of design alternatives. The overall goal is to drive technological advancements in hardware and software to support XML database workloads efficiently. The benchmarks will also help in the performance comparison of alternative technologies and database products” [10].

different domains. For example, the transaction cost is crucial in a network system, while the processing time and storage space are critical in a database system. Hence, it is important that a benchmark captures the characteristics of the system to be measured. In addition, systems vary in hardware and software support. Some run on Windows, while others run on Linux. As a consequence, it is important that a benchmark be portable and scalable. Finally, it is obvious that a benchmark should be easy to understand, and the results analyzable.

XML benchmarks provide a framework that allows a user to analyze the performance of various XML processing engines, such as the Intel XML Software Suite, and compare the results. XML benchmarks allow performance testing of XML parsing (SAX and DOM), XSLT, XML Schema validation, and XPath operations. They are also extendable, allowing a user to write their own drivers to test other XML processing engines.
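The kind of comparison such a framework performs can be sketched with a tiny micro-benchmark in Python. It is an illustration only and is unrelated to any of the benchmark suites named here: the synthetic document, the helper names, and the choice of the standard-library SAX and DOM (minidom) parsers are all assumptions.

```python
import time
import xml.dom.minidom
import xml.sax


def make_document(n: int) -> bytes:
    """Build a synthetic XML document with n <item> elements."""
    rows = "".join(f"<item id='{i}'><value>{i}</value></item>" for i in range(n))
    return f"<items>{rows}</items>".encode("utf-8")


def best_time(func, repeat: int = 3) -> float:
    """Return the fastest of `repeat` runs, to dampen timing noise."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best


doc = make_document(5000)

# SAX streams events to a handler; the default handler ignores them.
sax_time = best_time(lambda: xml.sax.parseString(doc, xml.sax.ContentHandler()))
# DOM builds the whole tree in memory before returning.
dom_time = best_time(lambda: xml.dom.minidom.parseString(doc))

print(f"SAX: {sax_time:.4f} s   DOM: {dom_time:.4f} s")
```

A real benchmark would additionally vary document size and shape and would cover XSLT, schema validation, and XPath, as the text describes; the figures from a sketch like this depend heavily on the machine and interpreter.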

5. SECURITY REQUIREMENTS FOR WEB ENVIRONMENT Security is very important to online business [14, 17, 18, 19]. Technologies designed to meet security requirements have grown, but the requirements have remained relatively constant. These requirements include authentication, authorization, integrity, signature, confidentiality, privacy, and digital rights management.

“In the absence of standardization among standards, a benchmark across which various XML query processing systems can be tested and compared could greatly help the user. This benchmark should be useful to not just assess the power of query languages but rather the XML management systems, which use these query languages and related optimization techniques. The complexity in developing a generic benchmark comes from the myriad and yet unknown domains and scenarios in which an XML query processing system could be present. Standalone XML databases, ecommerce systems, Web catalogs, and data integration systems are but some of the applications where such query processing could be useful” [11]

These requirements are briefly summarized herein. Confidentiality Confidentiality refers to not revealing or exposing critical or sensitive information. Data must be stored and transmitted securely. Over the Internet and in wide area network (WAN) environments, both public carriers and private network owners often route portions of their network through insecure lines, enormously vulnerable microwave and satellite links, or a number of servers, thereby leaving valuable data open to scrutiny by any interested party. However, communications known to be sensitive (e.g., credit card numbers) are routinely encrypted, so that—even if observed—they cannot be read or used.

To cover all performance-critical aspects of XML processing, Schmidt [12,13] suggests that a benchmark for XML databases needs to address these 10 points:
1. Bulk loading of XML documents into the database
2. Reconstruction of the original documents
3. Path traversal to retrieve parts of the document
4. Casting of text to data types
5. Handling of missing elements
6. Ordering of elements
7. References between elements
8. Join operations on values
9. Construction of large results
10. Containment and full text search

Authentication Authentication is the process of determining that a user’s claimed identity is genuine. The most common form of authentication is passwords. Authentication is based on one or more of the following factors:
• Something the user knows, such as the password, personal identification number (PIN), or lock combination;
• Something the user has, such as a smart card, automatic teller machine (ATM) card, or key; and/or
• Something the user is, or a physical characteristic, such as a fingerprint or retinal pattern or a facial characteristic.
Authentication is normally stronger if two or more factors are used.

Others look at benchmarks for system views [14] and refer to the Benchmark Handbook by Jim Gray [15]. The four key criteria for a domain-specific benchmark are:
• Relevance: The benchmark must capture the characteristics of the system to be measured.

Authorization Authorization is the mechanism by which a system determines what level of access a particular authenticated user should have to secure resources controlled by the system. Authorization guards against misuse of systems, applications, or data after access has already been granted; it controls what objects and actions can be used. Authorization generally refers to the process that determines what a user can access or maintains a record thereof. The enforcement of that authorization is referred to as access control, which can require an additional validation of a request for resources against lists of approved users or permissible activities.
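The distinction between authorization (the record of what is permitted) and access control (its enforcement on each request) can be sketched as follows. The roles, resources, and function names are hypothetical and are chosen only to fit a healthcare setting.

```python
# Authorization data: which (resource, action) pairs each role may use.
PERMISSIONS = {
    "physician": {("record", "read"), ("record", "write")},
    "nurse": {("record", "read")},
    "billing_clerk": {("invoice", "read"), ("invoice", "write")},
}


def is_authorized(role: str, resource: str, action: str) -> bool:
    """Look up whether the role is permitted the action on the resource."""
    return (resource, action) in PERMISSIONS.get(role, set())


def access(role: str, resource: str, action: str) -> str:
    """Access control: validate every request against the permission list."""
    if not is_authorized(role, resource, action):
        raise PermissionError(f"{role} may not {action} {resource}")
    return f"{action} on {resource} granted to {role}"


print(access("nurse", "record", "read"))
print(is_authorized("nurse", "record", "write"))
print(is_authorized("billing_clerk", "record", "read"))
```

The check assumes the caller has already been authenticated; an unknown role falls through to an empty permission set and is denied by default.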

• Portability: The benchmark should be able to be implemented in different systems.
• Scalability: The benchmark should be able to test various databases in different computer systems.
• Simplicity: The benchmark must be understandable; otherwise it will not be credible.
Basically, a benchmark is used to test the peak performance of a system. Different aspects of a system have varying importance in

Non-Repudiation


interoperability with a wide range of existing infrastructures and across deployments. Furthermore, XML security reduces barriers to adoption by defining the minimum modular mechanisms to obtain powerful results. By employing existing technologies and enabling the use of XML paradigms and tools, XML security minimizes the need to modify applications to meet security requirements.

The intent of non-repudiation is to preserve accountability and prevent misrepresentation. Non-repudiation means that, when someone actually sends a message, the sender cannot later disclaim responsibility for sending it. To ensure against false claims, there must be a digital “signature”, usable only by the true sender, that any recipient can verify. A digital signature also solves the parallel problem of someone else sending a message that falsely claims to be from a third party.

Meeting security requirements for confidentiality, integrity, and privacy is essential in order to move business online. The XML security standards define XML vocabularies and processing rules in order to meet security requirements. These standards use cryptographic and security technologies as well as emerging XML technologies to provide a flexible, extensible, and practical solution for meeting security requirements.

Signature

An electronic signature is intended to be the equivalent of a handwritten signature. Such a signature may be used for different purposes, such as approval, confirmation of receipt, acceptance, or agreement.

Integrity

Integrity refers to the guarantee that information has not been changed (by deletion or addition), whether through malicious intent or by accident. It applies to information transmitted over a network, such as from a Web browser to a Web server; information stored in a database or file system; and information passed in a Web services message and processed by intermediaries.

XML is being widely adopted for a growing variety of applications. It also forms the basis for distributed system protocols that integrate applications across the Internet, such as Web service protocols. XML languages are text-based and designed to be extended and combined. It should be natural to provide integrity, confidentiality, and other security benefits to entire XML documents or portions of these documents in a way that does not prevent further processing by standard XML tools. Therefore, XML security must be integrated into XML in such a way as to maintain the advantages and capabilities of XML while incorporating the necessary security capabilities. This is especially important in XML-based protocols, such as XML protocol (XMLProt) and Simple Object Access Protocol (SOAP), which are explicitly designed to allow the intermediary processing and modification of messages.

Privacy

Privacy is “freedom from the observation, intrusion, or attention of others” [9] and “the ability of an individual or group to seclude themselves or information about themselves and thereby reveal themselves selectively” [20]. An example is a doctor’s office, which requires medical records to track a patient’s health. Privacy relates to control over what is done with this information and whether it is redistributed to others without the individual’s knowledge or consent. Privacy may be managed through a combination of technical and legal means. Confidentiality technology may be used to protect privacy, but it cannot prevent inappropriate sharing of information.

Older security technologies provide a set of core security algorithms and technologies that can be used in XML security, but the actual formats used to implement security requirements are inappropriate for most XML security applications. One reason is that these standards use binary formats that require specialized software for interpretation and use, even for extracting portions of the security information. A second reason is that these standards are not designed for use with XML and do not support common XML technical approaches for managing content, such as specifying content with uniform resource identifier strings (URIs) or using other XML standard definitions for locating portions of XML content (like XPath).

Digital Rights Management

Digital rights management ensures that content is used according to license agreements. Generally speaking, access rules are incorporated into the content, while enforcement controls are integrated into the clients needed to use the content.

“Traditionally, security technologies have required applications to be security or Public Key Infrastructure (PKI) ‘enabled,’ which often involves integrating specialized security code with the application in order to meet security requirements. This resulted in a slow, cumbersome, and inflexible customization process. An alternative is to create generic XML tools and generic XML security and then allow them to be used with a variety of XML applications. This allows generic XML security filters to be applied to arbitrary content without requiring extensive customization for each application, thereby reducing costs and delay” [21].

6. XML AND HEALTHCARE

XML, a formatting language based on open standards, is the key to the fluid exchange of information and transactions as healthcare gradually gets wired into a dynamic system. XML lets different kinds of information “talk” to one another, which is exactly what a national or global network connecting patients, doctors, hospitals, and every other imaginable player will need: a seamless and automated way to exchange data and orchestrate processes. XML is the core technology underlying the HL7 standard under development for use in electronic health infrastructure [27]. As a matter of fact, an electronic health record is not a static document stored in one place; it is really an XML file that pulls together and integrates data from a variety of sources (pharmacy, doctor, and hospital personnel). Such a composite will also have special rules and logic associated with it

General XML Security

XML security defines a common framework and processing rules that can be shared across applications using common tools, thereby avoiding the need for extensive customization of applications to add security [22, 23, 24, 16, 15, 26]. XML security reuses the concepts, algorithms, and core technologies of legacy security systems while introducing the changes necessary to support extensible integration into XML, thereby enabling

An XML signature can sign more than one type of resource. For example, a single XML signature might cover character-encoded data (HTML), binary-encoded data (a JPG), XML-encoded data, and a specific section of an XML file.

to control who can see it, which parts of it, and under what circumstances. In this sense, an electronic health record is really a complex data object, meaning it is important both to accelerate the processing of XML networking and to keep it secure.

Signature validation requires that the data object that was signed be accessible. The XML signature itself will generally indicate the location of the original signed object. The signed object can:
• be referenced by a URI within the XML signature;
• reside within the same resource as the XML signature (the signature is a sibling);
• be embedded within the XML signature (the signature is the parent); and/or
• have its XML signature embedded within itself (the signature is the child) [29].
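The placement options listed above can be pictured with a small Python sketch. It is only a structural illustration: the helper `signature_skeleton`, the sample `record` and `observation` elements, and the example URL are hypothetical; the skeleton omits the canonicalization, signature-method, digest, and transform elements that a real XML signature requires, and no cryptography is performed.

```python
import xml.etree.ElementTree as ET

DS = "http://www.w3.org/2000/09/xmldsig#"  # XML Signature namespace
ET.register_namespace("ds", DS)

def signature_skeleton(reference_uri: str) -> ET.Element:
    """Build an unsigned ds:Signature shell whose Reference points at reference_uri.

    Digest and signature values are left out; this shows structure only.
    """
    sig = ET.Element(f"{{{DS}}}Signature")
    signed_info = ET.SubElement(sig, f"{{{DS}}}SignedInfo")
    ET.SubElement(signed_info, f"{{{DS}}}Reference", URI=reference_uri)
    ET.SubElement(sig, f"{{{DS}}}SignatureValue")
    return sig

# Referenced by URI: the signed data lives elsewhere (hypothetical URL).
detached = signature_skeleton("http://example.org/records/scan.jpg")

# Signature is the parent: the signed data sits inside a ds:Object child.
enveloping = signature_skeleton("#payload")
obj = ET.SubElement(enveloping, f"{{{DS}}}Object", Id="payload")
ET.SubElement(obj, "observation").text = "BP 120/80"

# Signature is the child: it is embedded in the document it signs.
# URI="" refers to the enclosing document; a real signature would also
# declare the enveloped-signature transform so it does not sign itself.
record = ET.Element("record")
ET.SubElement(record, "observation").text = "BP 120/80"
record.append(signature_skeleton(""))

print(ET.tostring(enveloping, encoding="unicode"))
print(ET.tostring(record, encoding="unicode"))
```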

Security has always been vitally important in the healthcare world to ensure the integrity of content and transactions, to maintain privacy and confidentiality, and to ensure that information is used appropriately. Thus, XML security must be integrated into XML in such a way as to maintain the advantages and capabilities of XML while adding necessary security capabilities.

Core XML Security Standards for Healthcare

The core XML security standards [24] that are also used in the healthcare industry are:
• XML digital signature (integrity and signing)
• Digital rights management (eXtensible Rights Markup Language 2.0, XrML)
• XML encryption (confidentiality)
• XML key management (XKMS) (public key)
• Security Assertion Markup Language (SAML) (authentication, authorization, and attribute assertions)

Digital signatures are useful for two purposes: to provide persistent content integrity and to create and verify portable electronic signatures. Persistent integrity enables the user of content to detect unexpected changes to the content—whether malicious or accidental. Unlike a simple checksum, a digital signature associates a digest of the content with the signer of the content using a cryptographic technique. A digest is a digital “fingerprint,” or a short fixed-length value that is unique to the content and impractical to determine without the content. Using a cryptographic technique with the digest makes it hard for anyone other than the original signer to change the content without detection. Persistent integrity offers the benefit that content is protected not only in transit, but also when stored and processed.
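As a small illustration of the digest idea described above, the following Python sketch computes a SHA-256 digest of two nearly identical pieces of content. The choice of SHA-256 and the sample `dose` element are assumptions made for the example; the signing step that binds the digest to a signer is deliberately left out.

```python
import hashlib

def digest(content: bytes) -> str:
    """Return the SHA-256 digest of the content as a hex string."""
    return hashlib.sha256(content).hexdigest()

# Hypothetical content: the second value differs by a single character.
original = b"<dose unit='mg'>50</dose>"
tampered = b"<dose unit='mg'>500</dose>"

d1 = digest(original)
d2 = digest(tampered)

print(len(d1), len(d2))  # fixed length, independent of the input size
print(d1 == d2)          # False: even a small change alters the digest
```

A digest alone only detects change; it becomes a signature when it is combined with a key held by the signer, as the paragraph above explains.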

• eXtensible Access Control Markup Language (XACML), for defining access control rules
• Platform for Privacy Preferences (P3P), for defining privacy policies and preferences

Major use cases include securing Web services (WS-Security), digital rights management, and integrity and signatures (XML digital signature).

Major XML security applications include:
• Web services security: the Web services security roadmap and WS-Security
• Privacy: Platform for Privacy Preferences (P3P)
• Digital rights management: eXtensible Rights Markup Language 2.0 (XrML)

Electronic signatures offer the digital equivalent of handwritten signatures and may be used for a variety of purposes, such as content approval, receipt confirmation, and contract agreement. Using digital signatures makes it possible to move business workflows online without requiring manual approval steps, thereby reducing the delays, costs, and inconveniences caused by geographic separation and time zone differences. Digital signatures use cryptographic techniques to construct signatures that are stronger and more portable than other techniques for creating electronic signatures.

XML Signatures for Integrity

XML signatures are digital signatures designed for use in XML transactions. The standard defines a schema for capturing the result of a digital signature operation applied to arbitrary (but often XML) data. Like non-XML-aware digital signatures (e.g., PKCS), XML signatures add authentication, data integrity, and support for non-repudiation to the data that they sign. However, unlike non-XML digital signature standards, an XML signature has been designed to both account for and take advantage of the Internet and XML.

The XML digital signature recommendation defines mechanisms to support the full range of digital signature creation and verification, including the ability to sign and verify:
• Entire XML documents as well as element and element content portions of XML documents;
• Arbitrary documents, including binary documents;
• Compound documents, including multiple documents and/or XML elements and element contents;
• Properties to be included with a signature; and
• Counter-signatures (signatures that include other signatures).

A fundamental feature of an XML signature is the ability to sign only specific portions of the XML tree rather than the complete document. This is relevant when a single XML document has a long history in which different components are authored at different times by different parties, each signing only those elements relevant to it. This flexibility is also critical in situations where it is important to ensure the integrity of certain portions of an XML document while leaving open the possibility for other portions of the document to change. Consider, for example, a signed XML form delivered to a user for completion; if the signature covered the full XML form, any change by the user to the default form values would invalidate the original signature.
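The form example above can be sketched in Python. The sketch only computes digests over a canonicalized subtree rather than producing a real XML signature, and the element names `form`, `instructions`, and `patientInput` are hypothetical. It relies on `xml.etree.ElementTree.canonicalize`, available in Python 3.8 and later.

```python
import hashlib
import xml.etree.ElementTree as ET

def subtree_digest(element: ET.Element) -> str:
    """Canonicalize one element (with its descendants) and hash the result."""
    canonical = ET.canonicalize(ET.tostring(element, encoding="unicode"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical form: one part is fixed by the issuer, one is filled in later.
form = ET.fromstring(
    "<form>"
    "<instructions>Take one tablet daily.</instructions>"
    "<patientInput>default</patientInput>"
    "</form>"
)

part_before = subtree_digest(form.find("instructions"))
whole_before = subtree_digest(form)

# The user completes the form, changing only the editable portion.
form.find("patientInput").text = "I have a penicillin allergy"

part_after = subtree_digest(form.find("instructions"))
whole_after = subtree_digest(form)

print(part_before == part_after)    # True: the protected portion still matches
print(whole_before == whole_after)  # False: a whole-document digest would break
```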

In addition, the XML signature recommendation supports the application of multiple XML signatures to an XML document or to different sections of a document, supporting a variety of use cases. The XML digital signature specification and related specifications also define techniques so that signature verification can be robust even with the variations allowed in XML, such as

for one room will not be visible to other rooms.

XML encryption can handle both XML and non-XML (e.g., binary) data. Using XML encryption, the sender may send content confidentially using the following steps:
• Encrypt the content using a symmetric key.
• Encrypt the symmetric key using the recipient’s public key.
• Package the encrypted content, encrypted key, and necessary algorithm information together.
• Send the package to the recipient.

white space. The reason for such concern is that cryptographic algorithms are concerned with exact text, whereas XML allows some flexibility. Canonicalization is used to reduce variations so that all XML security applications can interoperate.

Signature Data Model and Syntax

• XML-signature data structures must be predicated on the RDF data model [RDF] but need not use the RDF serialization syntax.
• XML signatures apply to any resource addressable by a locator, including non-XML content. XML-signature referents are identified with XML locators (URIs or fragments) within the manifest that refer to external or internal resources (e.g., network accessible or within the same XML document/package).
• XML signatures must be able to apply to a part or the totality of an XML document.
• Multiple XML signatures must be able to exist over the static content of a Web resource given varied keys, content transformations, and algorithm specifications (signature, hash, canonicalization, etc.).
• XML signatures are first-class objects themselves and consequently must be able to be referenced and signed [29, 30].
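The role of canonicalization noted above can be illustrated with a short Python sketch. It uses `xml.etree.ElementTree.canonicalize` (Python 3.8 and later, implementing Canonical XML 2.0) as a stand-in for whichever canonicalization algorithm a given signature declares; the sample `dose` element is hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET

# Two serializations of the same element: different attribute order,
# different quote characters, and extra white space inside the start tag.
variant_a = '<dose  unit="mg" route="oral">50</dose>'
variant_b = "<dose route='oral' unit='mg'>50</dose>"

raw_equal = hashlib.sha256(variant_a.encode()).digest() == \
            hashlib.sha256(variant_b.encode()).digest()

canon_a = ET.canonicalize(variant_a)
canon_b = ET.canonicalize(variant_b)
canon_equal = hashlib.sha256(canon_a.encode()).digest() == \
              hashlib.sha256(canon_b.encode()).digest()

print(raw_equal)    # False: the exact text differs, so the digests differ
print(canon_equal)  # True: both variants reduce to one canonical form
print(canon_a)
```

Signing the canonical form rather than the raw text is what lets two conforming tools agree on a digest even when they serialize the same XML differently.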

The recipient may obtain the original content using the following steps:
• Unpack the package to obtain the algorithm information, the encrypted symmetric key, and the encrypted content.
• Decrypt the symmetric key with the private key.
• Decrypt the content with the symmetric key.

The XML encryption recommendation defines the framework and processing rules for XML encryption and decryption. It defines an XML vocabulary for packaging all the information needed to process encrypted content, such as encryption algorithm and parameters, information about keys, and encrypted content. The XML encryption recommendation supports the following features:
• XML and non-XML content may be encrypted, giving broad applicability to the recommendation.
• Confidentiality may be applied at a fine level of granularity to XML content; it may be applied to XML elements and XML element content as well as entire XML documents, which is valuable for securing portions of XML protocol messages to be routed through intermediary processors.
• XML encryption produces well-formed XML from well-formed XML. This allows portions of XML content to be encrypted yet subsequently processed by XML tools.
• XML encryption is compatible with and may be used in conjunction with XML digital signatures.
• XML encryption allows for the encryption of a symmetric key that may be packaged with encrypted content.
• XML encryption supports a variety of encryption algorithms and techniques.
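The sender and recipient steps can be walked through with a toy Python sketch. Everything cryptographic here is an insecure stand-in chosen so the example runs with the standard library alone: a textbook RSA key built from two small known primes replaces a real public-key algorithm, and a SHA-256-derived XOR keystream replaces a real symmetric cipher. The function names, the dictionary "package", and the sample message are hypothetical, and the package is a plain dictionary, not the XML encryption vocabulary.

```python
import hashlib
import secrets

# Toy RSA key pair from two known Mersenne primes. INSECURE: illustration only.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256-derived keystream.

    Applying it twice with the same key restores the input.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def send(content: bytes, public_key: tuple) -> dict:
    n, e = public_key
    sym_key = secrets.token_bytes(16)                    # fresh symmetric key
    encrypted_content = keystream_xor(sym_key, content)  # encrypt the content
    encrypted_key = pow(int.from_bytes(sym_key, "big"), e, n)  # encrypt the key
    return {                                             # package everything
        "algorithm": "toy-xor-sha256 + toy-rsa",
        "encrypted_key": encrypted_key,
        "encrypted_content": encrypted_content,
    }

def receive(package: dict, private_exponent: int, n: int) -> bytes:
    # Recover the symmetric key with the private key, then the content.
    key_int = pow(package["encrypted_key"], private_exponent, n)
    sym_key = key_int.to_bytes(16, "big")
    return keystream_xor(sym_key, package["encrypted_content"])

message = b"<diagnosis>hypertension</diagnosis>"
package = send(message, (N, E))
restored = receive(package, D, N)
print(restored == message)  # True
```

The design point is that only the short symmetric key goes through the slower public-key operation, while the content itself, which may be large, is encrypted symmetrically.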

Core Signature Syntax

The syntax is defined via DTDs and [XML-Schema] with the following XML preamble, declaration, and internal entity.

Schema Definition

The following defines the schema:

XML Encryption for Confidentiality

XML encryption is not intended to replace or supersede SSL/TLS. Rather, it provides a mechanism for security requirements that are not covered by SSL. Two important areas not addressed by SSL are encrypting part of the data being exchanged and securing sessions between more than two parties. With XML encryption, each party can maintain secure or insecure states with any of the communicating parties. Both secure and non-secure data can be exchanged in the same document. An example is a secure chat application containing a number of chat rooms with several people in each room. XML-encrypted files can be exchanged between chatting partners so that data intended

“XML encryption is capable of encrypting user data such as complete XML documents, single elements (and all their descendants) inside an XML document, the contents of an element (some or usually all child nodes and all their descendants) inside an XML document, and arbitrary binary contents outside an XML document” [31]. XML encryption allows for the direct inclusion of the encrypted contents into the container or to dereference the encrypted contents via the URI/transforms mechanism already known from the XML signature. XML encryption offers key management facilities for

wise encryption can be likened to carrying out semantic operations on sections of an XML document.”

the symmetric wrapping of secret keys (secret key needed to retrieve secret key), key transport of secret keys (private key needed to retrieve secret key), and key agreement using Diffie-Hellman.

Another method for confidentiality is pool encryption [31]. The idea behind pool encryption is to remove confidential nodes from the document tree and to encrypt each confidential node individually. These encrypted nodes are stored in a container. After decryption, each node can “find its way back” to its appropriate position in the document, so that it can be restored correctly.

Key Management: XML Key Management Specification (XKMS)

“The XML Key Management Specification (XKMS) defines protocols for public key management services. Public key management includes the creation of a public and private key pair, the binding of this key pair with identity and other attributes, and the representation of this key pair in different formats (e.g., by key name, digital certificate, or key parameters)” [35]. Public key technology is an essential part of XML digital signatures, XML encryption, and other security usages. When signing, the private key is used to sign and the public key is used to verify signatures. When encrypting, the public key is used to encrypt and the private key is used to decrypt. In either case, the private key must remain under the control of its owner, while the public key may be shared with others. XKMS is designed to help manage the sharing and distribution of the public key to enable signature verification and encryption for recipients.

Encrypting Arbitrary Data and XML Documents[33] Arbitrary data including XML documents.