Journal of Database Management, 14(2), 42-51, Apr-June 2003

Integrating Digital Signatures with Relational Databases: Issues and Organizational Implications

Randal Reid
University of Alabama in Huntsville, USA

Gurpreet Dhillon
Virginia Commonwealth University, USA

ABSTRACT

This paper explores the nature and scope of the integration of digital signatures with relational databases. Such an integration is essential if Internet commerce is to succeed. While evaluating the pros and cons of the integration and the related technologies, this paper identifies challenges, both technological and organizational. The paper proposes that any implementation needs to consider organizational policies and the related rules. Careful consideration of these aspects will ensure a successful integration and implementation.

Copyright © 2003, Idea Group Inc. Copying or distributing in print or electronic forms without written permission of Idea Group Inc. is prohibited.

INTRODUCTION

Relational database technologies have become the core of the information technology systems group in most, if not all, large corporations. Any new technology that is proposed for adoption within a business environment must be examined in terms of its relationship to, and impact on, this core technology. At the same time, a majority of organizations have begun relying on the Internet to conduct their business. The technology that forms the basis of Internet commerce is not new. Earlier forms of Internet commerce were prevalent in the 1980s through Electronic Data Interchange (EDI) links. Over the years these evolved into Inter-organizational Systems, which finally took the form of Business-to-Business and Business-to-Customer networks. Although the nature of the technology per se has not changed, developing the understanding needed to employ these technologies to better manage supply chains, and thereby reduce overall costs, has become a challenge.

Internet commerce, in its various guises, has always faced one significant barrier: the legal admissibility of electronic evidence. At least in the US, the passage of the E-Sign Act by Congress has helped ease this hurdle. The act suggests that an electronic signature attached to a document has the same legal status as a handwritten signature. While the E-Sign Act is technologically neutral, digital signature technology is the only technology that is currently mature enough to meet the act’s requirements.

This paper looks at the intersection of digital signature technology and relational database technology, and at the organizational impacts that merging these two technologies may have. The paper begins with background on encryption, followed by a discussion of digital signatures and the normalization of relational data structures. Two alternative models for the integration are then presented and their organizational impacts are examined.

ENCRYPTION BASICS

A prerequisite to understanding digital signatures is an understanding of the encryption process. The goal of the encryption process is to protect the contents of a message and ensure its confidentiality. The process starts with a plain text document, which is any document in its native format. Examples are .doc (Microsoft Word), .xls (Microsoft Excel), and .txt (ASCII text) files. Once a document has been encrypted it is referred to as cipher text. This is the form that allows the document to be transmitted over insecure communications links or stored on an insecure device without compromising the confidentiality requirement.

Once the plain text document has been selected it is sent through an encryption algorithm. The encryption algorithm is designed to produce a cipher text document that cannot be returned to its plain text form without the use of the algorithm and the associated key(s). The key is a string of bits that is used to initialize the encryption algorithm.

There are two types of encryption, symmetric and asymmetric. In symmetric encryption, a single key is used to both encrypt and decrypt a document. Asymmetric encryption uses two keys that are generated as a pair. One of the keys (the public key) is used to encrypt and the other member of the pair (the private key) is used to decrypt. In almost all cases, the private key is protected by a password or pass phrase, while the public key is shared widely. The asymmetric encryption process is one-way in the sense that a key cannot decrypt what it has itself encrypted; either member of the key pair can decrypt what the other key encrypts. Figure 1 shows an asymmetric encryption process. A more thorough discussion of the encryption process, with the supporting mathematics, can be found in Boncella (2000), Gollmann (1999), Singh (1999), and Pfleeger (1997).
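The key-pair relationship described above can be sketched with a deliberately tiny, textbook RSA key pair (Rivest et al., 1978). This is an illustration under stated assumptions rather than a usable implementation: the primes are far too small to be secure, the message is a single small integer, and the function names `make_toy_keypair` and `apply_key` are hypothetical. It assumes Python 3.8 or later for the modular inverse computed by `pow`.

```python
# Toy illustration of an asymmetric key pair (textbook RSA, insecure key size).
# Assumption: tiny fixed primes chosen for readability, never for real use.

def make_toy_keypair(p=61, q=53, e=17):
    """Return (public_key, private_key), each an (exponent, modulus) tuple."""
    n = p * q                    # modulus shared by both keys
    phi = (p - 1) * (q - 1)      # Euler's totient of n
    d = pow(e, -1, phi)          # private exponent: inverse of e modulo phi
    return (e, n), (d, n)

def apply_key(key, value):
    """Raise value to the key's exponent modulo n (same operation both ways)."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

public_key, private_key = make_toy_keypair()
message = 65                     # a message encoded as an integer smaller than n

# Confidentiality: encrypt with the public key, decrypt with the private key.
cipher = apply_key(public_key, message)
recovered = apply_key(private_key, cipher)

# The reverse direction: transform with the private key, undo with the public key.
signed = apply_key(private_key, message)
checked = apply_key(public_key, signed)

# Applying the public key a second time does not undo its own encryption.
not_recovered = apply_key(public_key, cipher)
```

Either key undoes the work of the other, while the public key cannot reverse its own encryption, which is why it can be shared widely.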

Figure 1: Asymmetric Encryption

DIGITAL SIGNATURES

Asymmetric encryption was proposed by Diffie and Hellman (1976), who observed at the time that the process could be used in reverse to produce a digital signature. The primary goal of a digital signature is not the confidentiality of the message but the authentication of the sender and a guarantee of the integrity of the message. The contents of the message, the plain text portion, remain in plain text format. The digital signature portion of the message is a mathematical digest of the message that has been encrypted using the sender’s private key. The relationship observed by Diffie and Hellman was that anything encrypted using the public key can be decrypted using the private key, and anything encrypted using the private key can be decrypted using the public key. Since the private key and its associated password are under the control of only one individual, this allows for authentication that that person, and only that person, could have originated the message. Figure 2 shows a digitally signed plain text message.

Figure 2: Digitally Signed Message

INTEGRITY OF THE MESSAGE

The integrity, or inalterability, of the contents of a digitally signed message comes about through the “hashing” process. The hashing process as it relates to digital signatures is quite different from the hashing process used to convert a key field to a storage address in the database environment.

A cryptographic hash function such as SHA-1 or MD4/MD5 is a one-way process that produces a fixed-length digest of the original plain text document. One of the most important features of a cryptographic hash function is its resistance to collisions (Atreya et al., 2002). Since the digest has a fixed length, 128 bits for MD5 and 160 bits for SHA-1, more than one message can map to the same digest. The larger the digest of the hash function, the lower the probability of a collision occurring. A hash function is described as weakly collision resistant if, given one message, it is infeasible to find a second message with the same hash, and as strongly collision resistant if it is infeasible to find any two messages with the same hash. Hash functions operate on blocks of contiguous bits and are exceptionally sensitive to any change in the ordering or the value of the bits. This sensitivity is where the integrity feature is derived.

The digital signing process starts with a plain text file. Using one of the cryptographic hash functions, a hash of this file is calculated. The hash, or message digest, is then encrypted using the sender’s private key. The plain text file and the encrypted hash (the digital signature) are then concatenated and transmitted to the receiver. Upon receipt, the two parts of the message, the plain text file and the digital signature, are separated. The recipient runs the same hash algorithm against the plain text file and decrypts the encrypted hash using the sender’s public key. The two hashes are then compared. If they match, the recipient knows that the file has not been altered and the sender has been authenticated. Figure 3 graphically depicts the digital signature process.

Figure 3: Digital Signature Process
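The hash, sign, and verify steps described above can be sketched as follows. This is a minimal illustration, not a production scheme: it uses textbook RSA without padding, the key pair is built from two well-known Mersenne primes purely so that the example is self-contained, and the names `digest_int`, `sign`, and `verify` are hypothetical. It assumes Python 3.8 or later.

```python
import hashlib

# Toy key pair built from two Mersenne primes (assumption for illustration
# only; real keys use random primes and a padding scheme).
P, Q = 2**89 - 1, 2**107 - 1
N = P * Q                                  # modulus, roughly 196 bits
E = 65537                                  # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))          # private exponent

def digest_int(document: bytes) -> int:
    """160-bit SHA-1 digest of the document as an integer (smaller than N)."""
    return int.from_bytes(hashlib.sha1(document).digest(), "big")

def sign(document: bytes) -> int:
    """Sender: hash the plain text, then encrypt the hash with the private key."""
    return pow(digest_int(document), D, N)

def verify(document: bytes, signature: int) -> bool:
    """Recipient: decrypt the signature with the public key and compare the
    result with a freshly computed hash of the received plain text."""
    return pow(signature, E, N) == digest_int(document)

invoice = b"Invoice 1001: 3 widgets at 4.50 each"
signature = sign(invoice)

# Digest lengths are fixed regardless of the size of the input.
md5_bits = hashlib.md5(invoice).digest_size * 8
sha1_bits = hashlib.sha1(invoice).digest_size * 8

# Changing a single character of the plain text invalidates the signature.
altered = b"Invoice 1001: 8 widgets at 4.50 each"
original_ok = verify(invoice, signature)
altered_ok = verify(altered, signature)
```

The plain text travels unencrypted; only the digest is transformed with the private key, so anyone holding the public key can check both the origin and the integrity of the message.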

AUTHENTICATION OF THE SENDER

Authenticating the sender requires verification of the sender’s identity by a third party. The requirement comes about because any individual can generate a key pair with any name associated with the keys. Authentication is the process of associating a key with an individual. This is accomplished through one of two approaches. The PGP (Pretty Good Privacy) model provides for authentication through a process known as a web-of-trust. The business world uses a hierarchical structure that has been standardized as X.509.

The web-of-trust authentication model is based on a decentralized, transitive trust principle. If an individual, I1, knows for a fact that a second individual, I2, has generated a key pair K2, then I1 signs the key K2 with their own key K1. When a third individual, I3, receives the key K2, recognizes I1 through their signature, and trusts I1, trust is transferred to I2, and I3 can safely assume that the key K2 belongs to the individual who is claiming ownership. The number of signatures, or vouchers for the identity of the key and its creator, may continue to increase until every possible recipient knows one of the individuals who have vouched for the authenticity of the key. This model works very well in small communities or in environments where a central authority is impractical or inadvisable. Further discussion of this model can be found in Zimmermann (1994).

The X.509 structure is based on a hierarchical model in which there is one ultimately trusted endorser, the root certificate authority. The “root” transfers its endorsement through a series of sub-endorsers that finally authenticate the key as belonging to the stated individual. Figure 4 shows this hierarchical structure. In the business environment, the root certificate is held either by the corporate headquarters or by a third-party organization that specializes in this area, such as VeriSign (www.verisign.com). For two individuals from different organizations to conduct business they would need to arrive at a common certification scheme. This could be accomplished by exchanging root certificates or by having both root certificates signed by a trusted third party.

Figure 4: Hierarchical X.509 Certificate Structure (Root Certificate Authority; Subordinate Certificate Authorities; Individual Certificates)

In the X.509 environment each key has a single endorsement, that of the authority immediately superior to it. The root signs its own key. The certificate chain consists of the certificate of the individual who signed a document and all of the certificates that signed that individual’s certificate, through the subordinate certificates back to the root certificate. This chain establishes the authenticity of the individual. Extensive discussion of the X.509 format and certification schemes can be found in Atreya et al. (2002).
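The idea of a certificate chain can be sketched as a walk along issuer links from an individual's certificate up to the self-signed root. The `Certificate` class, the `build_chain` function, and the sample names below are hypothetical, and the sketch deliberately omits the cryptographic check of each endorsement as well as real X.509 fields such as validity dates.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass(frozen=True)
class Certificate:
    subject: str   # the party the key belongs to
    issuer: str    # the single authority that endorsed this certificate

def build_chain(subject: str, store: Dict[str, Certificate]) -> List[Certificate]:
    """Follow issuer links from a certificate up to the self-signed root.
    Signature verification of each link is omitted in this sketch."""
    chain: List[Certificate] = []
    seen = set()
    current = store[subject]
    while True:
        if current.subject in seen:
            raise ValueError("issuer loop detected")
        seen.add(current.subject)
        chain.append(current)
        if current.issuer == current.subject:    # the root signs its own key
            return chain
        current = store[current.issuer]          # KeyError if an endorser is missing

# Hypothetical three-level hierarchy: root, one subordinate authority, one individual.
store = {c.subject: c for c in [
    Certificate("Root CA", "Root CA"),
    Certificate("Sales CA", "Root CA"),
    Certificate("alice@example.com", "Sales CA"),
]}
chain = build_chain("alice@example.com", store)
chain_subjects = [c.subject for c in chain]
```

A chain that cannot be followed back to a trusted root, because an endorser is missing from the store, leaves the individual unauthenticated.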

NORMALIZATION

Normalization is a process used in a relational database environment to establish which attributes should be placed in which relations. The purpose of normalization is to prevent anomalies that could result in inconsistent data when data is modified, deleted, or added. There are six levels of normalization, ranging from first normal form (1NF), the least restrictive and poorest structure, to fifth normal form (5NF), the most restrictive and best data structure. Boyce-Codd normal form (BCNF) is placed between 3NF and 4NF. Table 1 identifies the ordering and requirements of the different normal forms. Each successive normal form has the prerequisite that all of the requirements of the previous normal forms must be met.

Table 1: Normal Forms and Their Requirements

Normal form   Requirement
1NF           All attributes must be single valued
2NF           All non-key attributes must be dependent on the key attribute
3NF           There are no dependencies between the attributes
BCNF          A portion of a multivalued key cannot be a fact about a non-key attribute
4NF           A tuple cannot contain 2 or more independent multivalued pieces of information about the key
5NF           Requires that only candidate keys be involved in the JOIN operation between relations

Information that arrives in the corporate environment as a single unit, such as an invoice, is decomposed into its elements, which are then stored in a series of relations defined by the normalization process. Figure 5 is an example of this process. Advanced discussion of the normalization process and its requirements can be found in Watson (1999).
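The decomposition of an invoice into normalized relations can be sketched with an in-memory SQLite database. The table names, column names, and sample invoice are assumptions made for illustration and are not taken from Figure 5; amounts are held as integer cents, and keeping the unit price on the product relation is a simplification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL);
    CREATE TABLE product (
        product_id       INTEGER PRIMARY KEY,
        description      TEXT NOT NULL,
        unit_price_cents INTEGER NOT NULL);
    CREATE TABLE invoice (
        invoice_no   INTEGER PRIMARY KEY,
        invoice_date TEXT NOT NULL,
        customer_id  INTEGER NOT NULL REFERENCES customer(customer_id));
    CREATE TABLE invoice_line (
        invoice_no INTEGER NOT NULL REFERENCES invoice(invoice_no),
        product_id INTEGER NOT NULL REFERENCES product(product_id),
        quantity   INTEGER NOT NULL,
        PRIMARY KEY (invoice_no, product_id));
""")

def store_invoice(doc: dict) -> None:
    """Split one invoice document into rows of the four relations."""
    cust = doc["customer"]
    # Customer and product facts are stored once, however many invoices use them.
    conn.execute("INSERT OR IGNORE INTO customer VALUES (?, ?)",
                 (cust["id"], cust["name"]))
    conn.execute("INSERT INTO invoice VALUES (?, ?, ?)",
                 (doc["invoice_no"], doc["date"], cust["id"]))
    for line in doc["lines"]:
        conn.execute("INSERT OR IGNORE INTO product VALUES (?, ?, ?)",
                     (line["product_id"], line["description"], line["unit_price_cents"]))
        conn.execute("INSERT INTO invoice_line VALUES (?, ?, ?)",
                     (doc["invoice_no"], line["product_id"], line["quantity"]))

def invoice_total_cents(invoice_no: int) -> int:
    """Recombine the relations with a JOIN to compute the invoice total."""
    row = conn.execute(
        "SELECT SUM(l.quantity * p.unit_price_cents) "
        "FROM invoice_line l JOIN product p ON p.product_id = l.product_id "
        "WHERE l.invoice_no = ?", (invoice_no,)).fetchone()
    return row[0]

invoice_doc = {
    "invoice_no": 1001,
    "date": "2003-04-01",
    "customer": {"id": 7, "name": "Acme Corp"},
    "lines": [
        {"product_id": 1, "description": "Widget", "unit_price_cents": 450, "quantity": 3},
        {"product_id": 2, "description": "Gadget", "unit_price_cents": 1200, "quantity": 1},
    ],
}
store_invoice(invoice_doc)
total = invoice_total_cents(1001)
```

Once stored this way, the invoice no longer exists as a single contiguous unit, which is the root of the integration difficulty discussed later in the paper.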

INTEGRATION OF DIGITAL SIGNATURES AND RELATIONAL DATABASES

The combination of a digital signature and a relational database environment gives the firm the ability to authenticate the sender, verify the integrity of the message, and store and process information in accordance with normal business practices. Currently there are two models of how the digital signature and the relational database can be integrated. One model separates the signed document from the relational database; the other integrates the two.

In the separated model, the signed document is sent electronically to the recipient. The recipient then manually transfers the data from the signed document into the relational database, and the signed document is stored electronically for later retrieval. Once the data has entered the relational database system it is processed according to normal business procedures. Figure 6 shows this process.

In the integrated model, the signed document is also transmitted electronically to the recipient. Upon receipt, the signed document is decomposed into its elements, which are then placed into the relational data structure. This includes the digital signature and the certificate chain portions of the document. To verify the transaction at a later point in time, the entire document is retrieved from the relational data structures and reassembled into its original form. This model is shown in Figure 7.
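The separated model can be sketched as two stores that hold the same facts independently: an archive of the signed documents and a working relation filled by keying. Everything in this sketch is an assumption made for illustration, including the table layout, the use of a SHA-1 digest as the link between the two stores, the `key=value` document format, and the function names `receive` and `audit`. Signature verification itself is omitted; the archived bytes simply stand in for the signed document.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Archive of signed documents, kept apart from the working data.
    CREATE TABLE signed_archive (
        doc_digest   TEXT PRIMARY KEY,
        signed_bytes BLOB NOT NULL);
    -- Working relation, filled by keying values taken from the document.
    CREATE TABLE invoice (
        invoice_no   INTEGER PRIMARY KEY,
        amount_cents INTEGER NOT NULL,
        doc_digest   TEXT REFERENCES signed_archive(doc_digest));
""")

def receive(signed_bytes: bytes, invoice_no: int, keyed_amount_cents: int) -> str:
    """Archive the signed document and store the separately keyed values."""
    digest = hashlib.sha1(signed_bytes).hexdigest()
    conn.execute("INSERT INTO signed_archive VALUES (?, ?)", (digest, signed_bytes))
    conn.execute("INSERT INTO invoice VALUES (?, ?, ?)",
                 (invoice_no, keyed_amount_cents, digest))
    return digest

def audit(invoice_no: int) -> bool:
    """Check whether the working amount still agrees with the archived document."""
    amount, signed_bytes = conn.execute(
        "SELECT i.amount_cents, a.signed_bytes "
        "FROM invoice i JOIN signed_archive a USING (doc_digest) "
        "WHERE i.invoice_no = ?", (invoice_no,)).fetchone()
    fields = dict(part.split("=") for part in signed_bytes.decode().split(";"))
    return int(fields["amount_cents"]) == amount

# Correctly keyed invoice.
receive(b"invoice_no=1001;amount_cents=2550", 1001, 2550)
# Keying error: 990 in the signed document was entered as 909.
receive(b"invoice_no=1002;amount_cents=990", 1002, 909)

first_consistent = audit(1001)
second_consistent = audit(1002)
```

Because the two copies are maintained independently, nothing in the working relation is protected by the signature; a disagreement is only found if someone goes back to the archive and compares.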

Figure 5: Invoice with Data Stored in Normalized Relational Database

Figure 6: Separate Storage of Signed Documents (E-Signed Original Invoice; Document Receiving & Processing; Stored Signed Invoice; Corporate Data Structure)

Figure 7: Integrated Storage of Signed Documents (E-Signed Original Invoice; Document Receiving & Processing; Corporate Data Structure; Retrieved Signed Invoice)

COMPARING SEPARATE AND INTEGRATED STORAGE OF SIGNED DOCUMENTS

Storing the digitally signed document separately from the data used for processing has the advantage that relatively inexpensive commercial software is available to produce, sign, and validate the document. This allows an organization to quickly establish an authenticated electronic commerce system at a reasonable expenditure level. This approach does, however, have significant limitations and disadvantages. The first is the separation of the digitally signed document from the data. This produces a redundancy and can allow a breakdown in the integrity of the system if the data in the data structure is manipulated or altered, regardless of whether the change was accidental or intentional. The manual keying of the data from the digitally signed document into the data system is another significant drawback. Manual keying has historically had high error rates and can become a bottleneck in the workflow process.

Integrated systems have the advantages associated with source document automation, such as better response time and significantly reduced data entry error rates. They have the further advantage of eliminating redundancy in the data structure, which helps maintain the integrity of the data. There are two primary disadvantages of integrated systems. The first is the relatively high cost of software that can accomplish the integration. The second is the difficulty of the integration process itself. The integrity of the data is established through the hashing process, which requires the bits to exist in a contiguous format. Hash algorithms such as SHA-1 or MD5 are designed to be extremely sensitive to the alteration of a single bit within the data structure. Given that the document’s contents will be normalized to fit the structure of the relational database, the disassembly/reassembly process becomes critical. Even though the contents may be unaltered, any change in the spacing, formatting, or ordering will cause the recomputed hash, when compared to the original hash, to indicate an integrity violation.

An XML digital signature specification is currently under development to address the issue of digital signatures in a relational database environment. The specification is designed to identify the structure of the data when it was signed so that it can be recreated when the data is retrieved from the data structure. The format for an XML digital signature is shown in Figure 8.

Figure 8: XML Digital Signature Format (Simon et al., 2001)

<Signature>
  <SignedInfo>
    (CanonicalizationMethod)
    (SignatureMethod)
    (<Reference>
      (Transforms)
      (DigestMethod)
      (DigestValue)
    </Reference>)
  </SignedInfo>
  (SignatureValue)
  (KeyInfo)
  (Object)
</Signature>
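The sensitivity of the hash to spacing and ordering, and the reason a signature format records how the data was canonicalized, can be illustrated with a short sketch. JSON is used here only as a convenient stand-in for XML canonicalization, which is an assumption of this example and not part of the XML digital signature specification; the function names `raw_digest` and `canonical_digest` are hypothetical.

```python
import hashlib
import json

def raw_digest(text: str) -> str:
    """Digest of the bytes exactly as they are presented."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

def canonical_digest(record: dict) -> str:
    """Digest of a canonical serialization: sorted keys, fixed separators."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()

# The document as it was signed, and the same content after a round trip
# through the database, reassembled with different ordering and spacing.
as_signed = '{"invoice_no": 1001, "amount_cents": 2550}'
as_rebuilt = '{"amount_cents":2550,"invoice_no":1001}'

# Hashing the raw bytes reports a violation although no value has changed.
naive_match = raw_digest(as_signed) == raw_digest(as_rebuilt)

# Hashing an agreed canonical form yields the same digest for both.
canonical_match = (canonical_digest(json.loads(as_signed))
                   == canonical_digest(json.loads(as_rebuilt)))
```

If signer and verifier agree on the canonical form, the reassembled document only has to carry the same content, not the same byte layout, for the digests to match.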

The XML digital signature format is designed so that all or part of a document can be signed. Descriptions of the components and the XML digital signature process can be found in Simon et al. (2001) and Mactaggart (2001). Information on the XML digital signature development process and its status can be found at http://www.w3.org/Signature/.

Another area that requires a modification to the data structure is the certificate chain. The recipient of a digitally signed document will want to store the certificate chain associated with that digital signature. In an active electronic commerce application environment it should only be necessary to store a single copy of each member of the certificate chain, even though multiple transactions will reference that certificate. Without this chain it would be impossible to verify the authenticity of the signature or to audit the transaction at some point in the future.
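Storing a single copy of each certificate while many transactions reference it can be sketched as follows. The schema, the use of a SHA-1 fingerprint as the certificate key, the placeholder certificate bytes, and the function names are all assumptions made for illustration; real certificates would be stored in their encoded X.509 form.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE certificate (
        fingerprint        TEXT PRIMARY KEY,
        issuer_fingerprint TEXT NOT NULL REFERENCES certificate(fingerprint),
        cert_bytes         BLOB NOT NULL);
    CREATE TABLE signed_transaction (
        txn_id             INTEGER PRIMARY KEY,
        signer_fingerprint TEXT NOT NULL REFERENCES certificate(fingerprint),
        signature_value    BLOB NOT NULL);
""")

def store_chain(chain):
    """Store each certificate once. chain is ordered signer first, root last."""
    prints = [hashlib.sha1(cert).hexdigest() for cert in chain]
    for i, (fp, cert) in enumerate(zip(prints, chain)):
        issuer = prints[i + 1] if i + 1 < len(prints) else fp   # root endorses itself
        conn.execute("INSERT OR IGNORE INTO certificate VALUES (?, ?, ?)",
                     (fp, issuer, cert))
    return prints[0]

def record_transaction(chain, signature_value: bytes) -> int:
    """Record a signed transaction that references the signer's certificate."""
    signer = store_chain(chain)
    cur = conn.execute(
        "INSERT INTO signed_transaction (signer_fingerprint, signature_value) "
        "VALUES (?, ?)", (signer, signature_value))
    return cur.lastrowid

def load_chain(txn_id: int):
    """Rebuild the full chain for a transaction by following issuer links."""
    (fp,) = conn.execute(
        "SELECT signer_fingerprint FROM signed_transaction WHERE txn_id = ?",
        (txn_id,)).fetchone()
    chain = []
    while True:
        issuer, cert = conn.execute(
            "SELECT issuer_fingerprint, cert_bytes FROM certificate "
            "WHERE fingerprint = ?", (fp,)).fetchone()
        chain.append(cert)
        if issuer == fp:                     # reached the self-signed root
            return chain
        fp = issuer

# Placeholder certificate contents (hypothetical).
alice_chain = [b"cert:alice", b"cert:sales-ca", b"cert:root-ca"]
first = record_transaction(alice_chain, b"signature-1")
second = record_transaction(alice_chain, b"signature-2")

cert_count = conn.execute("SELECT COUNT(*) FROM certificate").fetchone()[0]
txn_count = conn.execute("SELECT COUNT(*) FROM signed_transaction").fetchone()[0]
rebuilt = load_chain(second)
```

Two transactions signed by the same individual leave three certificate rows rather than six, and the full chain can still be rebuilt for either transaction when it is audited.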

DISCUSSION

Although advances in digital signatures have helped in securing electronic communications and in maintaining the integrity of Internet commerce business transactions, there are still a number of issues that need to be considered. Clearly the separated model is a low-cost solution. In most cases fast software already exists, and the model has minimal hardware and training requirements. There is also minimal impact on current business practices. Moreover, proprietary key systems can be supported. However, the lack of integration with a database, data redundancy, and the inability to scale are some of its inherent problems.

The integrated model, on the other hand, is a more expensive solution, although it is fast and software is available. Its advantages include the ability to use XML to develop one’s own advanced integrated applications, higher data integrity because only a single instance of the data exists, and direct support for the standard X.509 digital key structure. There is, however, difficulty in decomposing and reconstructing the digital signature. A conscious effort also needs to be made to integrate the model into current business systems.

There are also challenges in implementing digital signatures from an organizational standpoint. In situations where the concerned parties know each other, there is already a level of trust and hence sharing keys is less challenging. However, when businesses have stronger external coalitions, the level of trust between parties decreases. Advances in XML digital signatures do incorporate principles of confidentiality, authenticity, data integrity, and non-repudiation. It goes without saying, however, that creating a trustworthy organizational environment that focuses on establishing adequate business processes and authority structures is essential for an adequate implementation of XML digital signatures. Unless the use and adoption of digital signatures takes formal rules and procedures into account, successful implementation and adoption will remain in doubt. It is important to consider the integrity of rules and procedures because of the evolving nature of organizations. As Dhillon and Backhouse (2000) point out, the emphasis for security has to shift from maintaining the confidentiality and integrity of communications to establishing responsibility structures and trusting environments. Therefore any implementation of digital signatures and related communication security has to be undertaken in light of organizational policies, procedures, and existing normative structures.

CONCLUSION

This paper has presented a range of issues related to digital signatures. There have indeed been a number of technological advances in the development of XML digital signatures; however, there has been a lack of emphasis on the processes, policies, and standards that govern the issuance, maintenance, and revocation of certificates, including public and private keys. This was indeed the remit of the Public Key Infrastructure (PKI), but PKI has not become popular, and not many organizations have implemented it. Perhaps with the ability to interoperate with database structures, the much clearer legal status (E-Sign), and the high degree of low-cost communications provided by the Internet, it may be time for PKI to become a part of mainstream business processes. Moreover, with the advent of Internet commerce, the supply chain has been streamlined and the need for built-in security has been heightened. Perhaps this change will be a contributing growth factor. Nevertheless, any implementation has to consider the range of organizational policy considerations. Ultimately, success will depend on considering the rules and policies prior to a technological implementation, rather than the other way around.

REFERENCES

Atreya, M., Hammond, B., Paine, S., Starrett, P., & Wu, S. (2002). Digital Signatures. New York: RSA Press.

Boncella, R. J. (2000, November). Web Security for E-Commerce. Communications of the Association for Information Systems, 4(11).

Diffie, W. & Hellman, M. (1976, November). New Directions in Cryptography. IEEE Transactions on Information Theory, 22(6), 644-654.

Dhillon, G. & Backhouse, J. (2000). Information System Security Management in the New Millennium. Communications of the ACM, 43(7), 125-128.

Gollmann, D. (1999). Computer Security. New York: John Wiley & Sons.

Mactaggart, M. (2001). An Introduction to XML Encryption and XML Signature. http://www-106.ibm.com/developerworks/xml/library/sxmlsec.html/index.html (Current July 15, 2002).

Pfleeger, C. A. (1997). Security in Computing (2nd ed.). Upper Saddle River, NJ: Prentice Hall.

Rivest, R., Shamir, A. & Adleman, L. (1978, February). A Method for Obtaining Digital Signatures and Public-Key Cryptosystems. Communications of the ACM, 21(2), 120-126.

Simon, E., Madsen, P. & Adams, C. (2001). An Introduction to XML Digital Signatures. http://www.xml.com/pub/a/2001/08/08/xmldsig.html (Current July 15, 2002).

Singh, S. (1999). The Code Book. New York: Doubleday.

Watson, R. T. (1999). Data Management: Databases and Organizations (2nd ed.). New York: John Wiley & Sons.

Zimmermann, P. (1994, October). PGP 2.6.2 User’s Guide, Volume II: Special Topics.

Randall C. Reid received his PhD from the University of South Carolina. Currently he is an Assistant Professor at the University of Alabama in Huntsville, USA. His primary teaching and research interests are in networking and security.

Gurpreet Dhillon, PhD, is an Associate Professor of IS at the School of Business, Virginia Commonwealth University, USA. He is a graduate of the London School of Economics and has previously held faculty positions in the UK and Hong Kong. He is an author of four books, and his research has been published in journals such as Information Systems Research, Communications of the ACM, European Journal of Information Systems, Information Systems Journal, and International Journal of Information Management, among others.
