Chapter 1

Introduction

The identification of an individual is defined by the person's uniqueness. Biometrics can be used to measure this uniqueness and to analyze a person's physiological and behavioral characteristics. The biometric features of the human body include the eyes, fingerprint, face, signature, palm, retina, and iris. Biometric systems provide an efficient and secure alternative to traditional authentication systems, which rely on something the user knows (a password) or possesses (a token), either of which can be forgotten, stolen, or shared. Biometric systems are not affected by these security concerns, making them an attractive method for personal identification and verification. In identification mode, a person's identity is retrieved from the database; in verification mode, a person's identity is authenticated against the identity he or she claims. Biometrics are of two types: physiological and behavioral. Physiological biometrics measure biological traits of users, such as the fingerprint, retina, iris, and face; behavioral biometrics measure traits such as the signature and voice. A system that processes such traits for recognition is called a biometric system.
1.1 Biometrics

Security is one of the rising concerns in the real world as well as in online systems. Human identification and authentication is an important aspect of surveillance systems as well as online security systems, and biometrics has been widely accepted as a common answer to these problems. Positive identification of individuals is a very basic societal requirement, and user authentication is nowadays a very significant part of the web world. The consequences of an insecure authentication system in a corporate or enterprise environment can be huge, and may include loss of private information, denial of service, and compromised data integrity. The value of reliable user authentication is not restricted to computer or network access. Many other systems in everyday life, such as banking and e-commerce, also require user authentication and could benefit from enhanced security.
In fact, as more interactions take place electronically, it becomes still more important to have electronic verification of a person's identity. Until recently, electronic verification took one
of two kinds. It was based on something the person had in their possession, like a magnetic swipe card, or something they knew, like a password. The problem is that these forms of electronic identification are not very secure: they can be given away, taken away, or lost, and motivated people have found ways to forge or circumvent these credentials. The ultimate form of electronic verification of a person's identity is biometrics. In biometrics, a person is identified based on his/her physiological or behavioral characteristics, such as a finger scan, retina, iris, voice scan, or signature scan. With this technique, the physiological characteristics of a person can be turned into electronic processes that are inexpensive and convenient to apply. These characteristics can uniquely identify a person, replacing traditional security methods with two major improvements: a biometric belonging to a person cannot easily be stolen, and no password memorization is needed. As biometrics has proved better at solving problems such as access control, fraud, and theft, more and more organizations and institutes are considering biometrics as a solution to their security problems. A biometric system is a technical scheme that uses data about a person's unique biological traits to recognize an individual; it needs this specific type of data in order to function effectively, passing the captured data through algorithms to reach a particular outcome, such as a positive recognition of a user. A biometric system is basically a pattern recognition system that works by acquiring biometric data from an individual, extracting a feature set from the acquired data, and comparing this feature set against the template set in the database. Consider any one biometric trait, for example the signature. Before using the biometric system, the user has to enroll his or her biometric traits in the system.
Here the signature is considered as the biometric trait. For the enrollment operation, the signature information from an individual is captured on a signature pad, processed as shown in Figure 1.1, and stored in the database. Once the signature information is stored, the user can start using the biometric system for either verification or identification, as described in the sections below.
Figure 1.1: Block diagram of Enrollment
Depending on the user's requirements, a biometric system may work either in verification mode or in identification mode.

• Verification Mode: In verification mode, as illustrated in Figure 1.2, the system verifies a person's identity by comparing the captured signature data with his/her own signature template stored in the signature database. A person who wants to be verified claims an identity by signing on the signature pad and submitting the signature to the system, and the system uses a comparison method to determine whether the claim is valid. Identity verification is typically used for positive recognition, where the objective is to prevent impersonation of identity.

• Identification Mode: In identification mode, as illustrated in Figure 1.3, the system identifies a person by comparing the person's signature template with the signature templates of all users in the database for a match. Thus the system performs a one-to-many comparison to establish a person's identity, without the subject having to claim one. Identification is a critical component in negative recognition applications, where the system establishes whether the person is who she (implicitly or explicitly) denies to be.
Figure 1.2: Block diagram of Verification
Figure 1.3: Block diagram of Identification Mode
The purpose of identification mode is to prevent a single person from creating multiple signature templates in the database. With traditional methods of personal recognition, such as passwords, PINs, keys, and tokens, only positive recognition can be performed; negative recognition cannot. With biometrics, both positive and negative recognition can be performed.
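The two modes differ only in how many stored templates a fresh sample is compared against: one for verification, all of them for identification. The Python sketch below illustrates this; the feature vectors, Euclidean distance measure, and threshold value are illustrative assumptions, not the matching method used later in this work.

```python
import math

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def verify(sample, claimed_id, templates, threshold):
    """Verification mode: one-to-one comparison against the
    claimed identity's stored template."""
    return distance(sample, templates[claimed_id]) <= threshold

def identify(sample, templates, threshold):
    """Identification mode: one-to-many comparison against every
    enrolled template; returns the best match, or None."""
    best_id = min(templates, key=lambda uid: distance(sample, templates[uid]))
    if distance(sample, templates[best_id]) <= threshold:
        return best_id
    return None

# Hypothetical enrolled templates and a freshly captured sample.
templates = {"alice": [0.9, 0.1, 0.4], "bob": [0.2, 0.8, 0.6]}
sample = [0.85, 0.15, 0.45]
```

A genuine sample close to "alice" passes verification against her template and is returned by identification, while verifying it against "bob" fails.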
1.2 Biometric Technologies A number of biometric characteristics exist and are in use in various applications. Each biometric has its strengths and weaknesses, and the choice depends on the application. No single biometric is expected to effectively meet the requirements of all the applications. In other words, no biometric is “optimal”. The match between a specific Biometric and an application is determined depending upon the operational mode of the application and the properties of the biometric characteristic.
1.2.1 Face Recognition

Face recognition is a process in which a portion of the person's face is photographed and the resulting picture is reduced to a digital code. It is based on the location and shape of facial attributes such as the eyes, eyebrows, nose, and lips. Face recognition analyzes facial characteristics and requires a digital camera to capture a facial image of the user for authentication. Because facial scanning needs an extra peripheral not customarily included with basic PCs, it is more of a niche market for network authentication [1]. With security cameras present in a variety of public places, facial recognition is a viable option for biometric identification.

Advantages: 1) An individual's photo can be taken from a distance without the person knowing that the system has captured it. Moreover, if a thermogram is used, the person can be identified even while wearing a mask.

Disadvantages: 1) Individuals can alter their facial expressions or change their hairstyle to fool the system, and some systems have difficulty maintaining high levels of accuracy as the database size increases. 2) Different pose angles affect the entire face recognition process.
1.2.2 Fingerprint Recognition

Fingerprint identification remains one of the most widely used and reliable biometric identification methods [2]. Fingerprint verification and identification algorithms can be classified into two categories: image-based and minutiae-based. Image-based methods include those involving optical correlation and transform-based features; other aspects of fingerprint identification are orientation, segmentation, and core point detection. Today such systems are known as AFRS (Automated Fingerprint Recognition Systems). As the prices of these devices and their processing costs fall, using fingerprints for user verification is gaining acceptance despite the common-criminal stigma.

Advantages: 1) It is very easy to use. 2) The device is cheap and portable. 3) The device's power consumption is very low.

Disadvantages: 1) The system needs a large amount of computational resources, especially when operating in identification mode. 2) Fingerprint samples covering only a small fraction of the finger may be unsuitable for identification.
1.2.3 Iris Scanning

The iris can be used as a biometric cue for person recognition. The richness and variability observed in the iris texture are due to the agglomeration of the multiple anatomical entities composing its structure. An iris-based biometric [3] involves analyzing features found in the colored ring of tissue that surrounds the pupil. Iris scanning, undoubtedly the least intrusive of the eye-related biometrics, uses a fairly conventional camera element and requires no close contact between the user and the reader.
Advantages:
1) It easily detects artificial irises. 2) The pattern of the iris is not affected by glasses, contact lenses, or even surgery. 3) Even twins have different iris patterns.
Disadvantages:
1) The devices are very costly. 2) The person must remain still during the enrollment or recognition process.
1.2.4 Retina Scanning

In retinal scanning, an electronic scan of the retina, the innermost layer of the wall of the eyeball, is performed. The blood vessels at the back of a person's eye form unique patterns, and a person's two eyes have two different patterns; retina scanning uses this pattern to identify an individual. The retina scanner emits a beam of light into the person's eye, which bounces off the retina and returns to the scanner; the device then quickly maps the eye's blood vessel pattern and records it in a database.
Advantages:
1) High accuracy. 2) Retina patterns do not change over a person's lifetime.
Disadvantages:
1) The retina scanning process is slow. 2) Retinal scanning can disclose certain medical conditions of the person.

1.2.5 Signature Verification

Signature recognition uses a digitizer to record the patterns of an individual's signature, such as pen/stylus speed, pressure, direction, and other characteristics. There are two key types of signature verification: static and dynamic. Static verification is most often a visual comparison between one scanned signature and another scanned signature, or between a scanned signature and an ink signature. In dynamic signature recognition, the signature is captured from the digitizer and compared with the signature templates in the database. People are used to signatures as a means of transaction-related identity verification, and most would see nothing unusual in extending this to encompass biometrics. Signature verification devices are reasonably accurate in operation and obviously lend themselves to applications where a signature is an accepted identifier. The main application domains of the handwritten signature are banking, e-commerce, and document authentication.
Advantages:
1) It is easy and cheap. 2) It does not consume much time; only the time required for signing is needed for verification.
Disadvantages:
1) Signatures are a behavioral biometric that changes over time and is influenced by the physical and emotional conditions of the signatories. 2) Successive impressions of some people's signatures are significantly different.
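To make the dynamic characteristics mentioned above concrete, the sketch below derives pen speed, mean pressure, and signing duration from digitizer samples. The sample format (x, y, pressure, timestamp) and this particular feature set are illustrative assumptions, not the exact features used by any specific system.

```python
import math

def dynamic_features(samples):
    """Derive simple dynamic features from digitizer samples.
    Each sample is a tuple (x, y, pressure, t) with t in seconds."""
    speeds = []
    for (x0, y0, _, t0), (x1, y1, _, t1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt > 0:
            # Pen-tip speed between consecutive samples.
            speeds.append(math.hypot(x1 - x0, y1 - y0) / dt)
    return {
        "mean_speed": sum(speeds) / len(speeds),
        "mean_pressure": sum(s[2] for s in samples) / len(samples),
        "duration": samples[-1][3] - samples[0][3],
    }
```

A dynamic verifier would compare such features (rather than the signature's image) against the enrolled templates.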
1.3 Cloud Computing

Cloud computing is defined as a type of computing that relies on sharing computing resources rather than having local servers or personal devices manage applications. It is a model for enabling ubiquitous, convenient, on-demand network access to shared, configurable computing resources that can be swiftly provisioned and released with minimal management effort or service provider interaction [4]. Some of the significant characteristics of cloud computing are:
• On-demand self-service
• Broad network access
• Resource pooling
• Rapid elasticity
• Measured service
Business applications are moving to the cloud. It is not a fad: the switch from traditional software models to the Internet has steadily gathered momentum over the last ten years. Looking forward, the next decade of cloud computing promises new ways to collaborate everywhere, through mobile devices. Traditional business applications have always been very complicated and expensive: the quantity and variety of hardware and software needed to run them are daunting, and a whole team of experts is needed to install, configure, test, run, secure, and update them. Multiply this effort across dozens or hundreds of apps, and it is easy to see why even the biggest companies with the best IT departments are not getting the apps they need; small and mid-sized businesses do not stand a chance. With cloud computing, the developer eliminates those headaches because he is not managing hardware and software: that is the responsibility of an experienced vendor like salesforce.com, AWS, or Azure. The shared infrastructure means it acts like a utility: the developer pays only for what he needs, upgrades are automatic, and scaling up or down is easy. Cloud-based apps can be up and running in days or weeks, and they cost less. With a cloud
app, the developer just opens a browser, logs in, customizes the app, and starts using it. Commercial enterprises are running all sorts of apps in the cloud, such as customer relationship management (CRM), HR, accounting, and much more. Some of the world's largest companies moved their applications to the cloud with salesforce.com, AWS, or Azure only after rigorously testing the security and dependability of the providers' infrastructure. As cloud computing grows in popularity, thousands of companies are simply rebranding their non-cloud products and services as "cloud computing." Always dig deeper when evaluating cloud offerings, and keep in mind that if the developer has to buy and manage hardware and software, what he is looking at is not really cloud computing but a false cloud. Cloud applications are offered under one of three main service models:
Software as a Service (SaaS): Software as a service (SaaS) is a means of delivering applications over the Internet as a service, as shown in Figure 1.4. Instead of installing and maintaining software, the developer simply accesses it via the Internet, freeing himself from complex software and hardware management. The SaaS provider manages access to the application, including security, availability, and performance. SaaS customers have no hardware or software to purchase, set up, maintain, or update; accessing the applications is easy, as the developer just needs an Internet connection.
Figure 1.4: Software as a Service Model
Platform as a Service (PaaS): Building and running on-premises applications has always been complex, expensive, and slow. Each application required hardware, an operating system, a database, middleware, Web servers, and other software. Once the stack was assembled, a team of developers had to navigate frameworks like J2EE, .NET, etc., and a team of network, database, and system management experts was needed to keep everything up and running. PaaS provides all the infrastructure needed to develop and run applications over the Internet, as shown in Figure 1.5. Users can access custom apps built in the cloud, just like their SaaS apps, while IT departments and ISVs can focus on innovation instead of complex infrastructure. PaaS is driving a new era of mass innovation and business agility: for the first time, developers can focus on application expertise for their business, not on managing complex hardware and software infrastructure.
Figure 1.5: Platform as a Service Model
Infrastructure as a Service (IaaS): Infrastructure as a service provides companies with computing resources, including servers, networking, storage, and data center space, on a pay-per-use basis. IaaS is a type of cloud computing in which a third-party provider hosts virtualized computing resources over the Internet, as shown in Figure 1.6. IaaS platforms offer highly scalable resources that can be adjusted on demand. IaaS customers pay on a per-use basis, typically by the hour, week, or month; this pay-as-you-go model eliminates the capital expense of deploying in-house hardware and software.
Figure 1.6: Infrastructure as a Service Model
Cloud setups are deployed in four different deployment models, of which the three most common are described below.
Public Cloud: Public clouds are owned and operated by companies that use them to offer rapid access to affordable computing resources to other organizations or individuals, as shown in Figure 1.7. With public cloud services, users don't need to buy hardware, software, or supporting infrastructure, which is owned and managed by the providers.
Figure 1.7: Public Cloud
Private Cloud: A private cloud is owned and operated by a single company that controls the way virtualized resources and automated services are customized and used by its various lines of business and constituent groups, as shown in Figure 1.8. Private clouds exist to take advantage of many of the cloud's efficiencies while providing more control of resources and steering clear of multi-tenancy.
Figure 1.8: Private Cloud
Hybrid Cloud: A hybrid cloud uses a private cloud foundation combined with the strategic use of public cloud services. The reality is that a private cloud cannot exist in isolation from the rest of a company's IT resources and the public cloud. Most companies with private clouds will evolve to manage workloads across data centers, private clouds, and public clouds, thereby creating hybrid clouds, as shown in Figure 1.9.
Figure 1.9: Hybrid Cloud
In a nutshell, cloud infrastructure has five essential characteristics, three service models, and four deployment models [4] [5] [6] [7].
1.4 Problem Statement

Online signature recognition is one of the popular traits for identifying an individual based on behavioral characteristics. This project implements a dynamic signature recognition system using the successive geometric centers of depth 2 for parameters such as pressure, azimuth, altitude, and timestamp, together with soft biometric features. The soft biometric features were added to improve accuracy; after adding them, the results improved over the earlier technique based on geometric centers of depth 2 [8]. To obtain more optimal results and better performance, the dynamic signature recognition system was implemented on a public cloud architecture using the Software as a Service (SaaS) model, built on the Microsoft Windows Azure cloud computing platform. The proposed architecture ensures appropriate scalability of the technology, sufficient amounts of storage, and parallel processing capabilities, and, with the widespread availability of mobile devices, it also provides an accessible entry point for applications and services that rely on mobile clients. The proposed architecture is capable of addressing issues related to the next generation of biometric technology, while at the same time offering new application possibilities for the existing generation of biometric systems. The architecture was intensively tested on a highly compute-intensive online signature recognition system using the CALSAL feature vector extraction mechanism [9]. The time taken by a traditional standalone system to extract the feature vector with the CALSAL technique was about one hour forty-five minutes, whereas the proposed architecture extracted the feature vector for the same technique in less than six minutes.
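As a rough picture of the successive-geometric-center idea, the sketch below recursively splits a point sequence and collects the geometric center of each segment down to a given depth. The splitting rule (halving the sequence) and the returned feature layout are illustrative assumptions; the exact algorithm of [8] may differ.

```python
def center(points):
    """Geometric center (mean x, mean y) of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def successive_centers(points, depth):
    """Recursively split the point sequence in half and collect the
    geometric center of each segment, down to the given depth."""
    if depth == 0 or len(points) < 2:
        return [center(points)]
    mid = len(points) // 2
    return ([center(points)]
            + successive_centers(points[:mid], depth - 1)
            + successive_centers(points[mid:], depth - 1))
```

At depth 2 this yields seven center points per parameter sequence, which can serve as a compact feature vector for matching.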
Moreover, it provided scalability, pluggability, and a faster online signature recognition system. To add more versatility to the verification process of the online signature recognition system, a classifier was designed. The classifier is built by taking a set of training signatures and determining the thresholds for classification based on the features of those training signatures. The threshold is evaluated from the training set, and the decision is taken by comparing the coefficient values calculated by the Extended Regression coefficient method against the thresholds.
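The threshold-learning step can be pictured as follows. This sketch uses a simple distance-to-mean rule with a tunable margin k, standing in for the Extended Regression coefficient method itself; the reference vector, distance measure, and k are illustrative assumptions.

```python
import math
import statistics

def train_threshold(training_vectors, k=2.0):
    """Learn a reference vector and an acceptance threshold from a set
    of genuine training signatures. The threshold is the mean plus k
    standard deviations of the training distances (k is a tunable
    assumption, not a value from the source)."""
    n = len(training_vectors[0])
    m = len(training_vectors)
    ref = [sum(v[i] for v in training_vectors) / m for i in range(n)]
    dists = [math.dist(v, ref) for v in training_vectors]
    return ref, statistics.mean(dists) + k * statistics.pstdev(dists)

def classify(sample, ref, threshold):
    """Accept the sample as genuine if it lies within the threshold."""
    return math.dist(sample, ref) <= threshold
```

Samples near the training cluster are accepted; distant ones, such as likely forgeries, are rejected.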
The final step is to analyze the performance of the proposed architecture for the online signature recognition system. TAR-TRR (Performance Index, PI) analysis will be performed on intra-class as well as inter-class testing. Performance metrics such as the Performance Index (PI) and the Security Performance Index (SPI) will be used for evaluation, and final validation will be performed by comparing the results with the existing systems [8] [9].
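TAR and TRR at a given decision threshold can be computed from genuine and forgery matching scores, as in this sketch. The score convention (lower distance = better match) is an assumption, and the exact PI/SPI formulas of [8] [9] are not reproduced here.

```python
def tar_trr(genuine_scores, forgery_scores, threshold):
    """True Acceptance Rate over genuine samples and True Rejection
    Rate over forgeries, for a distance threshold: a sample is
    accepted when its score is <= threshold."""
    tar = sum(s <= threshold for s in genuine_scores) / len(genuine_scores)
    trr = sum(s > threshold for s in forgery_scores) / len(forgery_scores)
    return tar, trr
```

Sweeping the threshold over intra-class (genuine) and inter-class (forgery) scores traces out the trade-off that a combined index such as PI summarizes.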
1.5 Summary

This chapter described the various means of person identification and verification available in the market, their security concerns, and how the use of biometric technology for identification and verification overcomes those concerns. It explained popular biometric technologies such as face recognition, fingerprint recognition, iris scanning, retina scanning, and signature verification, with their advantages and disadvantages. Cloud computing was discussed, along with its available service and deployment models. Finally, the problem definition was formulated, giving a brief overview of the work carried out in this research. In the next chapter, the theory related to this research will be discussed in detail.
Chapter 2

Theory

2.1 Windows Azure Technologies

Azure is Microsoft's cloud-based platform, a growing collection of integrated services, such as compute, storage, data, networking, and app services, that help developers move faster, do more, and save money.
2.1.1 Cloud Services

Cloud Services is a Platform-as-a-Service (PaaS) model. Like Websites, the Cloud Services technology is designed to support applications that are reliable, scalable, and cheap to operate. Like Websites, Cloud Services is based on VMs, but it gives more control over the VMs than Websites does: the developer can install his own software on Cloud Services VMs and can remote into them. Figure 2.1 illustrates the idea. More control also means less ease of use; unless the developer needs the additional control options, it is typically quicker and easier to get a web application up and running with Websites than with Cloud Services. The technology provides two slightly different VM options: instances of web roles run a variant of Windows Server with IIS integrated, while instances of worker roles run the same Windows Server variant without IIS. A Cloud Services application relies on some combination of these two alternatives. For instance, a simple application might employ only a web role, while a more complex application might employ a web role to handle incoming requests from users and a worker role to process those requests. As Figure 2.1 suggests, all of the VMs in a single application run in the same cloud service; because of this, users access the application through a single public IP address, with requests automatically load-balanced across the application's VMs. Even though applications run in virtual machines, it is important to see that Cloud Services provides PaaS, not IaaS. Here is one way to think about it: with IaaS, such as Azure Virtual Machines, the developer first creates and configures the environment the application will run in, then deploys the application into this environment. The developer is responsible for managing much of this world, doing things such as deploying new patched versions of the operating system in each VM. With PaaS, by contrast, it is as if the environment already exists: all the developer has to do is deploy the application. The platform handles operations, such as deploying new versions of the operating system, for the developer.
Figure 2.1: Azure Cloud Services provides Platform as a Service
With Cloud Services, the developer does not create virtual machines. Instead, the developer provides a configuration file that tells Azure how many web role and worker role instances are needed, and the platform creates them. The developer still has to choose what size those VMs should be (the options are the same as with Azure VMs) but does not explicitly create them. If the application needs to handle a greater load, the developer can ask for more VMs, and Azure will create those instances; if the load decreases, the developer can shut those instances down and stop paying for them. A Cloud Services application is normally made available to users in a two-step process: first, the developer uploads the application to the platform's staging area; then, when the developer is ready to make the application go live, he uses the Management Portal to request that it be swapped into the production stage. This switch between staging and production can be executed with no downtime, which permits a running application to be upgraded to a new version without disturbing its users.
The PaaS nature of Cloud Services has other implications as well. One of the most significant is that applications built on this technology should be designed to run correctly when any web or worker role instance fails. To accomplish this, a Cloud Services application should not maintain state in the file system of its own VMs: writes made to Cloud Services VMs are not persistent, and there is nothing like a Virtual Machine data disk. Instead, a Cloud Services application should explicitly write all state to blobs, tables, or some other external storage. Building applications this way makes them easier to scale and more resistant to failure, both important goals of Cloud Services [10] [11] [12].
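The web-role/worker-role division and the rule that state lives in external storage can be pictured with this local, stdlib-only sketch. The queue and dictionary merely stand in for Azure queues and blob/table storage; this is not the Azure SDK, and the role functions are hypothetical names.

```python
import queue

# Local stand-ins for Azure queue storage and blob/table storage.
work_queue = queue.Queue()
external_store = {}

def web_role(request_id, payload):
    """Web role: accept a request and hand it off for asynchronous
    processing instead of doing the work inline."""
    work_queue.put((request_id, payload))

def worker_role():
    """Worker role: drain the queue and write every result to the
    external store, keeping no state on the role instance itself."""
    while not work_queue.empty():
        request_id, payload = work_queue.get()
        external_store[request_id] = payload.upper()  # the "work"
        work_queue.task_done()
```

Because results live only in the external store, a failed worker instance can be replaced and simply resume draining the queue.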
2.1.2 Azure Storage Services

Cloud computing enables new scenarios for applications requiring reliable, scalable, and highly available storage for their data. Azure Storage is highly scalable, so the developer can store and process the terabytes of data required by financial analysis, business, and media applications, or store the smaller amounts of data needed for a small business website; in either case the developer pays only for the data being stored. Azure Storage currently stores tens of trillions of unique customer objects and handles millions of requests per second on average. Because Azure Storage is elastic, the developer can design applications for a large global audience and scale those applications as needed, both in the amount of data stored and in the number of requests made against it. The developer pays only for what he uses, and only when he uses it [13] [14].
Figure 2.2: Azure Storage service
Azure Storage uses an auto-partitioning feature that automatically load-balances the developer's data based on traffic. This means that as the demands on the application grow, Azure Storage automatically allocates the appropriate resources to meet them.
Azure Storage is accessible from anywhere in the world, from any type of application, whether it is running in the cloud, on the desktop, on an on-premises server, or on a mobile or tablet device. For mobile application scenarios, the developer can use Azure Storage to keep a portion of the data on the mobile device while critical data, such as login details, is stored in the cloud.

2.1.2.1 Blob Storage

For users with large quantities of unstructured data to store in the cloud, Azure blob storage offers a cost-effective and scalable solution. The developer can use blob storage to store content such as:
• Documents
• Photos, videos, music, and blogs
• Backups of files, computers, databases, and devices
• Images and text for web applications
• Configuration data for cloud applications
• Logs and other large datasets
Figure 2.3 shows how an Azure blob storage account is organized within the URL of the storage account.
Figure 2.3: Blob Storage Account
Every blob is organized into a container. Containers also provide a useful way to assign security policies to groups of objects. A storage account can contain any number of containers, and a container can contain any number of blobs, up to the 500 TB capacity limit of the storage account. Blob storage offers two types of blobs: block blobs and page blobs. Block blobs are optimized for streaming and storing cloud objects, and are a good choice for storing documents, media files, backups, and so on; a block blob can be up to 200 GB in size. Page blobs are optimized for representing IaaS disks and supporting random writes, and can be up to 1 TB in size [13] [14] [15].

2.1.2.2 Table Storage

Modern applications often demand data stores with greater scalability and flexibility than previous generations of software required. Table storage offers highly scalable storage, so that an application can automatically scale to meet user demand. Table storage is Microsoft's NoSQL key/attribute store: it has a schemaless design, making it different from traditional relational databases. With a schemaless data store, it is easy to adapt the data as the needs of the application evolve. Table storage is easy to use, so developers can create applications quickly and efficiently; access to data is fast, and the store is less expensive than traditional relational databases for many kinds of applications. Figure 2.4 illustrates the table storage structure. Table storage is a key/attribute store, which means that every value in a table is stored with a property name. The property name is helpful in performing operations such as filtering and specifying selection criteria. A collection of properties and their values comprises an entity. Since Table storage is schemaless, two entities in the same table can contain different collections of properties, and those properties can be of different types.
Figure 2.4: Table Storage
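The schemaless key/attribute model can be pictured with plain Python dictionaries. This is a local illustration only, not the Azure Table storage API; the entity names, property names, and the query helper are made up for the example.

```python
# Two entities in the same logical table with different property
# sets; only PartitionKey and RowKey are common to both.
table = [
    {"PartitionKey": "users", "RowKey": "alice", "Email": "a@example.com"},
    {"PartitionKey": "users", "RowKey": "bob", "Age": 34, "City": "Pune"},
]

def query(table, partition_key, predicate=lambda e: True):
    """Filter entities by partition and an optional property test,
    tolerating properties that some entities lack."""
    return [e for e in table
            if e["PartitionKey"] == partition_key and predicate(e)]
```

Note how a query can filter on a property ("Age") that only some entities carry; in a relational schema every row would need that column.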
The developer can use Table storage to store flexible datasets, such as user data for web applications, address books, device information, and other metadata the service requires. The developer can store any number of entities in a table, and a storage account may contain any number of tables, up to the capacity limit of the storage account.

2.1.2.3 Queue Storage

Queue storage provides a reliable messaging solution for asynchronous communication between application components, whether they are running in the public cloud, in a private cloud, on the desktop, or on a mobile device. Queue storage also supports managing asynchronous tasks and building process workflows. A queue storage account can contain any number of queues, and a queue can hold any number of messages, up to the capacity limit of the storage account. Each message may be up to 64 KB in size.
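A toy in-memory queue can illustrate the enqueue/dequeue behaviour and the 64 KB per-message limit described above. This is not the real Azure Queue storage API; the class and method names are assumptions for the sketch.

```python
from collections import deque

MAX_MESSAGE_BYTES = 64 * 1024  # the 64 KB per-message limit

class StorageQueue:
    """In-memory stand-in for a storage queue: FIFO delivery with a
    per-message size check."""
    def __init__(self):
        self._messages = deque()

    def enqueue(self, message: str):
        if len(message.encode("utf-8")) > MAX_MESSAGE_BYTES:
            raise ValueError("message exceeds the 64 KB limit")
        self._messages.append(message)

    def dequeue(self):
        """Return the oldest message, or None if the queue is empty."""
        return self._messages.popleft() if self._messages else None
```

A real queue client would additionally make messages invisible for a lease period rather than removing them outright, so a crashed consumer does not lose work.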
2.1.3 Azure Messaging Services
Whether an application or service runs in the cloud or on premises, it often needs to interact with other applications or services. To provide a broadly useful way to do this, Azure offers messaging services [16].
2.1.3.1 Azure Queue Messaging
Azure Queue storage is a service for storing large numbers of messages that can be accessed from anywhere in the world via authenticated calls over HTTP or HTTPS. A single queue message can be up to 64 KB in size, and a queue can contain millions of messages, up to the total capacity limit of a storage account. A storage account can hold up to 500 TB of blob, queue, and table data. Common uses of Queue storage include creating a backlog of work to process asynchronously and passing messages from an Azure Web role to an Azure Worker role.
2.1.3.2 Service Bus Messaging
Microsoft Azure Service Bus messaging is a reliable message delivery service whose purpose is to make communication easier. When two or more parties want to exchange information, they need a communication mechanism. Service Bus messaging is a brokered, or third-party, communication mechanism, similar to a postal service in the physical
world. Postal services make it easy to deliver different kinds of letters and packages with a variety of delivery guarantees, anywhere in the world. Similar to a postal service delivering letters, Azure Service Bus messaging offers highly flexible information delivery, decoupling the sender and the recipient as illustrated in Figure 2.5. The Azure Service Bus messaging service ensures that a message is delivered to the receiver even if the receiver is not online at the time the message is sent.
Figure 2.5: Windows Azure Service Bus
The message sender can also request a range of delivery characteristics, including transactions, duplicate detection, time-based expiration, and batching. Service Bus messaging has two distinct features: queues and topics. 1. Service Bus Queue: Service Bus queues provide a brokered messaging communication model. When using queues, components of a distributed application do not communicate directly with each other; instead they exchange messages via a queue, which acts as an intermediary as shown in Figure 2.6. A message sender hands a message off to the queue and then continues its processing. Asynchronously, a message receiver pulls the message from the queue and processes it. The sender does not have to wait for a reply from the receiver in order to continue processing and sending further messages. Queues offer First In, First Out (FIFO) message delivery to one
or more competing clients. That is, messages are typically received and processed by the receivers in the order in which they were added to the queue by the sender, and each message is received and processed by only one message receiver.
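The brokered FIFO semantics described above (ordered delivery, each message consumed by exactly one receiver) can be sketched with a small in-memory queue. This is illustrative only and is not the Service Bus client API; all names are hypothetical.

```python
# Minimal sketch of brokered FIFO queue semantics (illustrative only;
# NOT the Azure Service Bus API -- names are hypothetical).
from collections import deque

class BrokeredQueue:
    def __init__(self):
        self._messages = deque()

    def send(self, message):
        # The sender hands the message off and continues its own processing.
        self._messages.append(message)

    def receive(self):
        # FIFO: oldest message first; each message goes to exactly one receiver.
        return self._messages.popleft() if self._messages else None

q = BrokeredQueue()
q.send("order-1")
q.send("order-2")
assert q.receive() == "order-1"   # delivered in the order sent
assert q.receive() == "order-2"
assert q.receive() is None        # queue drained; each message consumed once
```

The intermediary decouples sender and receiver in time: the sender never blocks waiting for the receiver, which is the core of the asynchronous model described above.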
Figure 2.6: Service Bus Queue
2. Service Bus Topics and Subscriptions: Azure Service Bus topics and subscriptions provide a publish/subscribe messaging communication model. When using topics and subscriptions, components of a distributed application do not communicate directly with each other; instead they exchange messages via a topic, which acts as an intermediary as shown in Figure 2.7. In contrast with Service Bus queues, in which each message is processed by a single receiver, topics and subscriptions support a "one-to-many" form of communication, using a publish/subscribe pattern.
Figure 2.7: Service Bus Topics and Subscription
It is possible to register multiple subscriptions to a topic. When a message is sent to a topic, it is made available for each subscription to process independently. A subscription to a topic resembles a virtual queue that receives copies of the messages sent to the topic. The developer can optionally impose filter rules for a topic on a per-subscription basis, which makes it possible to restrict which of the messages sent to a topic are received by which topic subscriptions. Both messaging entities support all of the concepts presented above, and more. The principal difference is that topics support publish/subscribe capabilities that can be used for sophisticated content-based routing and delivery logic, including delivery to multiple receivers.
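The one-to-many topic model with per-subscription filter rules can be sketched as follows. This is an illustrative in-memory model, not the Service Bus API; the class and rule names are hypothetical.

```python
# Sketch of publish/subscribe topics with per-subscription filter rules
# (illustrative only; hypothetical names, NOT the Azure Service Bus API).

class Topic:
    def __init__(self):
        self.subscriptions = {}

    def subscribe(self, name, rule=lambda msg: True):
        # Each subscription behaves like a virtual queue that receives
        # copies of the topic's messages that match its filter rule.
        self.subscriptions[name] = {"rule": rule, "queue": []}

    def publish(self, message):
        # One-to-many delivery: every matching subscription gets a copy.
        for sub in self.subscriptions.values():
            if sub["rule"](message):
                sub["queue"].append(message)

orders = Topic()
orders.subscribe("audit")                                    # receives everything
orders.subscribe("high_value", rule=lambda m: m["amount"] > 100)

orders.publish({"id": 1, "amount": 50})
orders.publish({"id": 2, "amount": 500})

print(len(orders.subscriptions["audit"]["queue"]))       # both messages
print(len(orders.subscriptions["high_value"]["queue"]))  # only the filtered one
```

Note the contrast with the queue sketch: here a single published message may be delivered to several subscriptions, each processing its own copy independently.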
2.2 Summary
This chapter explains various Azure technologies: Cloud Services, Storage services and Messaging services. Azure Cloud Services follows a Platform as a Service (PaaS) model. The cloud service technology is designed to support applications that are reliable, scalable and cheap to operate; moreover, it is stateless and does not maintain any user state. The Azure Storage services consist of Blobs, Tables and Queues. Blobs are simple named files along with metadata for the file. Tables provide structured storage: a table is a set of entities, and an entity is a set of properties. Queues provide reliable storage and delivery of messages for an application. Lastly, the Azure Messaging services consist of the Azure queue, which is used for communication between the Web role and the Worker role, and Service Bus messaging, which ensures that a message is delivered to the receiver even if the receiver is not online at the time the message is sent. In the next chapter, the literature in the current context of the research is studied extensively.
Chapter 3 Review of Literature
A broad diversity of systems requires reliable personal recognition schemes to either confirm or determine the identity of an individual requesting their services. The purpose of such schemes is to ensure that the provided services are accessed only by a legitimate user and no one else. A biometric system is one way to resolve this problem.
3.1 Biometric
Biometrics are related to human characteristics and traits. Biometric identification or biometric authentication is used in computer science as a form of identification and access control; it is also used to identify individuals within groups of many people. Biometrics are categorized into physiological and behavioural characteristics [17]. Physiological characteristics are related to the shape of the body and include fingerprint, face, DNA, palm print, hand geometry, iris and retina. Behavioural characteristics are related to a person's pattern of behaviour and include signature, gait and voice. In [18], the authors discuss biometrics in five parts. The discussion opens with an overview of authentication, explaining the concept of identity assurance, how authentication mechanisms work, their common traits, and the different types of authentication mechanisms. The next section discusses the different types of biometrics, followed by current issues involving biometrics from a technical perspective. Lastly, the authors provide a detailed treatment of the privacy, policy, and legal concerns raised by biometrics. In [19], the authors describe how these systems work, their strengths and weaknesses, and where they can be effectively deployed. The work helps in understanding how biometrics are associated with technologies such as public key infrastructure (PKI) and smart cards, and provides guidelines for the successful deployment of biometrics in today's enterprise environment. The book is organized into four parts: Part I discusses biometric fundamentals, including the reasons the technology is deployed, how it operates and how accurate the systems are. Part II discusses leading biometric technologies, drawing on real-world experience in deploying and testing
systems in operational environments. Part III discusses biometric applications and markets, and Part IV discusses privacy and standards in biometric system design. The paper [20] also describes the major pros and cons of each technology, with a clear indication of some suitable technologies. People are identified by three basic means: by something they know (a password or PIN code), something they have (a door key, a physical ticket to a concert) and something they are (biometrics). Biometric systems are not only limited by the technology; such systems also have a problem with respect to storage. Emin Martinian, Sergey Yekhanin and Jonathan S. Yedidia [21] address the storage problem of secure biometrics and develop a solution using syndrome codes. Specifically, biometrics such as fingerprints, irises, and faces are often used for authentication, access control, and encryption instead of passwords. In a typical (and insecure) biometric-based encryption architecture, an encryption key is derived from the user's biometric and used either to encrypt data stored on the device or to control access; to allow decryption or authorized access, the original biometric is stored on the device. Addressing the security concerns of biometrics, Taekyoung Kwon and Hyeonjoon Moon [22] have proposed an authentication methodology that combines multimodal biometrics and cryptographic mechanisms for border control applications. The paper [23] presents an evolutionary approach to the biometric security system that improves robustness. Multiple biometrics are fused at the decision level to support a system that can meet more challenging and varying accuracy requirements as well as fulfilling user needs. Such a system, named adaptive multimodal biometric management (AMBM), provides more security and accuracy.
3.2 Signature Recognition
Signature verification has been extensively studied and implemented. Its many applications include banking, credit card validation, security systems, etc. In general, handwritten signature verification can be categorized into two kinds: on-line verification and off-line verification. Andrzej Pacut and Adam Czajka [55] note that for on-line signature verification the signature is collected on a digitizer, with five quantities recorded, namely horizontal and vertical pen-tip position, pen-tip pressure, and pen azimuth
and altitude angles. In this paper they use neural networks to perform classification and signature verification. Piotr Porwik and Tomasz Para [24] propose a new method for signature verification. They analyse the off-line signature based on weighted features. Initially the signature is pre-processed. In the proposed approach the Hough transform is applied, then the centre of gravity of the signature is determined, and horizontal and vertical signature histograms are computed; this leads to a good signature recognition level, and the method can be used in many areas. The combination of static and dynamic signature recognition through fusion has been investigated by F. Alonso-Fernandez, J. Fierrez, M. Martinez-Diaz and J. Ortega-Garcia [25]. Two off-line and two on-line recognition approaches exploiting information at the global and local levels are used, and fusion experiments are carried out using a trained fusion approach based on linear logistic regression. The individual methods did not give the best overall result, so fusion was applied to improve it; performance improved when combining the two on-line systems, which was not the case with the off-line systems. The best performance is obtained when all the systems are fused together. Rafal Doroz and Krzysztof Wrobel [26] present a new method of recognizing handwritten signatures based on an appropriately modified mean-differences measure. The signature is divided into windows and the similarities between windows are calculated; within each window the signature is described by a set of feature values, so that writing speed and pen pressure can be examined, and in the last stage the signature is analysed with reference to these features.
3.2.1 Static Signature Recognition
To detect the line strokes in a signature image, Kaewkongka, Chamnongthai and Thipakom [27] used the Hough transform to extract the parameterized Hough space from the signature skeleton as a unique characteristic feature of signatures. Armand, Blumenstein and Muthukkumarasamy [28] used the Modified Direction Feature (MDF) in conjunction with additional distinctive features to train and test two neural-network-based classifiers. A resilient back propagation neural network and a Radial Basis Function neural network were compared on a publicly available database of 2106 signatures containing 936 genuine signatures and 1170 forgeries; they obtained a verification rate of 91.12%.
Sabourin [29] used granulometric size distributions for the description of local shape descriptors in an attempt to characterize the amount of signal activity exciting each retina on the focus of a superimposed grid; nearest-neighbour and threshold-based classifiers were then applied to detect random forgeries. Total error rates of 0.02% and 1.0% were reported for the respective classifiers. Abbas [30] used a back propagation neural network prototype for off-line signature recognition; along with feed-forward neural networks, different training algorithms (batch, enhanced and vanilla back propagation) were used. Zhang [31] proposed a Kernel Principal Component Self-Regression (KPCSR) model for off-line signature verification and recognition problems, reporting FRR 92% and FAR 5%. A neuro-fuzzy system was proposed by Hanmandlu [32]: the angles made by the signature pixels are computed with respect to reference points, and the angle distribution is then clustered with the fuzzy c-means algorithm. The system reported FRR in the range of 5-16% with varying threshold. S. Audet, P. Bansal, and S. Baskaran [33] designed an off-line signature verification and recognition system in which a Support Vector Machine (SVM) was used to classify and verify the signatures. Justino [34] used a discrete observation HMM to detect random, casual, and skilled forgeries; an FRR of 2.83% and FARs of 1.44%, 2.50%, and 22.67% are reported for random, casual, and skilled forgeries respectively. Dr H B Kekre and V A Bharadi propose a system that extracts different features which are later combined to improve the accuracy of the final system; a morphological approach is also discussed, which evaluates the variation in signature pixels by calculating their locations [35]. Fang [36] developed a system based on the assumption that the cursive segments of forged signatures are generally less smooth than those of genuine ones.
Two approaches are proposed to extract the smoothness feature: a crossing method and a fractal dimension method. Majhi, Reddy and Prasanna [37] proposed a morphological approach for signature recognition based on the centre of mass of signature segments: the signature is split repeatedly at its centre of mass to obtain a series of points in the horizontal as well as the vertical mode. The point sequence is then used as the discriminating feature, with thresholds selected separately for each person. They achieved FRR 14.58% and FAR 2.08%.
3.2.2 Dynamic Signature Recognition
Online signatures are acquired using a digitizing tablet which captures both dynamic and spatial information about the writing. The signature is the hand-written document of the digital world, and is considered an acceptable and trustworthy means of authenticating written documents. Dynamic signature verification authenticates the identity of individuals by measuring their handwritten signatures. The signature is treated as a series of movements that contain unique biometric data, such as personal rhythm, acceleration and pressure. Unlike the electronic signature captures that are often used today, dynamic signature verification does not treat the signature as a graphic image. Online signature recognition is also referred to as dynamic signature recognition. Rhee and Cho [38] perform on-line signature recognition using a model-guided segmentation approach for segment-to-segment comparison to obtain consistent segmentation. They used discriminative feature selection for skilled as well as random forgeries and reported an EER of 3.4%. Using image-invariant and dynamic features for on-line signature recognition, Abdullah and Shoshan [39] proposed Fourier descriptors for invariance, with writing speed used as the dynamic feature; a multilayer perceptron neural network was used for classification. Jain and Ross [40] used critical points, speed and curvature angle as features, with both common and writer-dependent thresholds; it was observed that writer-dependent thresholds give better accuracy. They reported FRR 2.8% and FAR 1.6%. H B Kekre and V A Bharadi [41] used Gabor filters to extract feature vectors of the dynamic signature; verification and identification are performed based on the feature vector. Along with the feature vector they also derived the timing information of the signature.
Gabor filters have been widely used for image and texture analysis. With the time stamp, the equal-error point of the TAR vs TRR plot is 95% and that of the FAR vs FRR plot is 5%; without the time stamp, the analysis shows that the Gabor-filter feature-vector classification gives 90% for the TAR vs TRR plot and 10% for the FAR vs FRR plot. Considering another approach, Lei, Palla and Govindarajalu [42] proposed a technique for finding the correlation between two signature sequences for online recognition. They mapped the occurrence of different critical points of the signature onto the time scale, and the correlation between these sequences was evaluated using a new parameter called the Extended Regression
Square (ER2) coefficient; the results were compared with an existing technique based on Dynamic Time Warping (DTW). They reported an Equal Error Rate (EER) of 7.2%, where the EER reported for DTW was 20.9% with user-dependent thresholds. A robust automatic on-line signature verification system is proposed by S A Daramola and Prof. T S Ibiyemi [43]. The effectiveness of the system depends on the robustness of the dynamic features it uses; verification is based on the average of all the distances obtained from the cross-alignment of the features. The proposed system was tested with quality signature samples; it has a 0.5% error in rejecting skilled forgeries while rejecting only 0.25% of genuine signatures, results that compare favourably with those obtained from previous systems. In [8], a new set of features is proposed for online, dynamic signature recognition. The feature was initially proposed for static systems; the successive geometric centers are modified for dynamic-signature-based systems, and the feature-vector extraction algorithm is applied to the pressure-distribution templates created for the captured dynamic signature. V A Bharadi [9] has proposed fusion-based variants generated with column/row-based mean, density with DC, and sequence values of the last column/row; the effect of a soft-biometrics feature set along with the above feature vectors is also studied. The soft-biometric features are combined to obtain better results in unimodal and multi-algorithmic settings. Variable-length segmentation with a Hidden Markov Model (HMM) has been proposed by Shafiei and Rabiee [44]. Vector quantization of feature vectors has been proposed for signature recognition, with VQ algorithms such as KFCG and KMCG used to generate the codebook for the scanned signature [45]. J. Hasna [46] proposed a neural-network-based prototype for dynamic signature recognition; for verification a Conjugate Gradient Neural Network was used, and the FRR achieved was 1.6%.
H B Kekre and V A Bharadi [47] proposed a pre-processing method based on a modified Digital Differential Analyzer (DDA).
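The Dynamic Time Warping (DTW) baseline that [42] compares against can be sketched in its standard textbook form. This is a generic illustration of the DTW distance between two one-dimensional feature sequences, not the code of any cited paper.

```python
# Textbook dynamic-time-warping distance between two 1-D feature sequences
# (a generic sketch of the DTW baseline mentioned above, not any paper's code).

def dtw_distance(a, b):
    n, m = len(a), len(b)
    INF = float("inf")
    # d[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    d = [[INF] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # skip a point in a
                                 d[i][j - 1],      # skip a point in b
                                 d[i - 1][j - 1])  # match both points
    return d[n][m]

# Identical sequences align at zero cost; warping absorbs a repeated sample,
# which is why DTW tolerates local variations in signing speed.
print(dtw_distance([1, 2, 3], [1, 2, 3]))      # 0.0
print(dtw_distance([1, 2, 3], [1, 1, 2, 3]))   # 0.0
```

In signature verification the sequences would be per-point features such as pen-tip position or pressure, and the resulting distance is thresholded to accept or reject the claimed identity.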
3.3 Biometrics on Cloud
Biometric systems offer a solution to ensure that the provided services are accessed solely by a legitimate user and no one else. Biometric systems identify users based on behavioral or physiological characteristics. The advantages of such systems over traditional authentication methods, such as passwords and IDs, are well known; hence, biometric systems are gradually gaining ground in terms of deployment. As security is a primary concern in cloud computing, a fused biometric authentication technique can be used for single sign-on so that
the services become more secure and reliable, with biometric authentication offered as a service by a cloud provider. In [48] the authors describe how computationally intensive biometric recognition can be performed on a mobile device by offloading the actual recognition process to the cloud. The authors propose a systematic approach for dividing a recognition operation and a bulk enrollment operation into multiple tasks, which can be executed in parallel on a set of servers in the cloud, and show how the results of each task can be combined and post-processed for individual recognition. Peter Peer and Jernej Bule [49] have proposed a face recognition system on the cloud. Their paper elaborates on issues such as the most common challenges and obstacles encountered when moving the technology to a cloud platform, standards and recommendations pertaining to both cloud-based services and biometrics, and existing solutions. It describes the most common pitfalls encountered in the development work and provides some directions for avoiding them. In [50] a face recognition system (FRS) is proposed by Akshay A. Pawle and Vrushsen P. Pawar. The researchers propose a new system based on the biometric characteristics of the user for proper authentication in cloud computing. The proposed FRS overcomes the drawbacks of traditional and other biometric authentication techniques and enables only authorized users to access data or services from a cloud server. In [51] the authors Dr. Vinayak Bharadi and Mr. Godson D'silva have proposed an architecture for implementing an online signature recognition system on a public cloud such as Windows Azure. They discuss the pitfalls of existing signature recognition systems, such as the need for a high-configuration machine to perform the operations of feature-vector extraction, enrollment and verification.
The authors propose a highly scalable, pluggable and faster cloud-based online signature recognition system, capable of operating on enormous amounts of data, which in turn induces the need for sufficient storage capacity and significant processing power. In [52], the authors C. N. Hoefer and G. Karagianni describe the available cloud computing services and propose a tree-structured taxonomy based on their characteristics, so that cloud computing services can be classified easily and compared more readily. E. Kohlwey, A. Sussman, J. Trost, and A. Maurer [53] present a prototype system for generalized searching of cloud-scale biometric data, as well as an application of this system to the task of matching a collection of synthetic human iris images.
3.4 Research Gaps
From the above extensive literature review, it was found that existing signature recognition systems need a high-configuration machine to perform the operations of feature-vector extraction, enrollment and verification. These implementations are generally standalone and built on a single-server architecture, in which case even a single point of failure can bring the system down. Standalone applications are not scalable: with an increasing number of users, a biometric implementation has to be scalable and capable of handling large datasets for a large population. The common problems faced by existing signature recognition systems are listed below.
Single point of failure: Biometric system implementations are generally standalone and deployed on a single-server architecture. Hence, if any problem occurs on the machine on which the biometric system is deployed, such as a hard disk crash or OS corruption, the whole biometric system is affected and becomes non-functional.
Scalability: With an increasing number of users, the biometric system implementation has to be scalable and capable of handling large data sets for a large population, which is normally not the case.
Pluggability: Testing new releases or updates of the biometric system impacts the existing implementation, increasing the chances of system downtime.
Speed: The existing biometric system is slow in providing results, and its speed depends on the hardware configuration of the machine on which it is deployed. It becomes too slow when feature extraction techniques are combined to obtain a better result.
Accuracy: The existing biometric system lags in accuracy and consistency. When it has to be deployed in a highly sensitive biometric-based organization with a huge number of users, the existing biometric system fails to meet the required standards.
Inconsistent verification process: The verification process of the existing biometric system shows inconsistent behavior for the same user: sometimes it accepts the user based on his/her biometric trait and sometimes it rejects the same person based on the same biometric trait.
3.5 Summary
This chapter presents the review of the research in three parts: biometric systems, signature recognition and, lastly, biometrics on the cloud. Different papers are reviewed in all three sections, giving an understanding of the different approaches and methodologies followed by authors working in this area. The pros and cons of the different approaches are understood, and the research gap is derived from them. In this research, dynamic signature recognition is used as it gives higher accuracy than static signature recognition. Dynamic signature recognition implementations are generally standalone and built on a single-server architecture, in which case even a single point of failure can bring the system down, and standalone applications are not scalable; with an increasing number of users, a biometric implementation has to be scalable and capable of handling large data sets for a large population. Taking into consideration all the merits and demerits of existing platforms, the standalone approach to dynamic signature recognition [38] [39] [40] [41] [42] [43] [44] [45] [46] [47] is combined with the studied cloud-based architectures [48] [49] [50] [51] to build a cloud-based online signature recognition system aimed at e-commerce transactions. In the next chapter the design methodology of the proposed system is discussed in detail.
Chapter 4 Design Methodology
This chapter explains the online signature recognition system with its various operations: capturing the signature from the digitizer, pre-processing the data, and feature extraction using successive geometric centers at depth 1 and depth 2. Soft-biometric features and the enrollment and training process are also discussed. Further, the chapter elaborates the proposed architecture in two forms, a public and a hybrid cloud-based architecture; the proposed system is based on the public cloud architecture. The general architecture and the working of its major sections are explained. Core operations such as the enrollment operation, the verification operation with its two parts (the live signature operation and the match operation), the blob storage operation and, lastly, the matching operation are elaborated in depth, and the working of the detection engine with its different layers is explained.
4.1 Online Signature Recognition System
Signature recognition is one of the most important research areas in the field of identity recognition based on biometrics. The technology is also regarded as a frontier subject in fields such as pattern recognition and signal processing. An important advantage of signature recognition compared with other biometric attributes is its long tradition in many common commercial fields. Generally, there are two main types of signature recognition, namely off-line and on-line signature recognition. Off-line signature recognition deals with the analysis of the signature image alone. The major drawback of this type is that the signature image alone constitutes a limited source of data for analysis, making it difficult to determine the validity of the signature effectively. On-line signature recognition, in turn, consists of digitizing the signature as it is being produced. With this method, the information obtained contains not only the signature image but also time-domain information, such as signing speed and acceleration; in addition, signing pressure can be recorded through the handwriting pad. All this information can be combined to determine the validity of a signature much more effectively than an off-line recognition system is capable of.
In this research work, online signature recognition is used because it acquires more information about the signature, including the dynamic properties of signatures. It can extract information about writing speed, pressure points, strokes and acceleration, as well as the static characteristics of signatures. This leads to better accuracy, as dynamic characteristics are very difficult to imitate. For online signature recognition, a digitizer tablet or pressure-sensitive pad is used to capture the signature dynamically.
4.1.1 Capturing Data from the Digitizer Device
In the context of this project, the system needs a digitizing tablet for capturing the signature in real time. After researching the requirements for signature recognition across the various digitizing tablets, such as the Bamboo Fun, Cintiq and Bamboo, it was found that the features and cost of the Wacom tablet were most suitable for the proposed model's requirements. The Wacom Intuos 4 digitizer specifications are listed below:
Active area (W x D): 157.5 x 98.4 mm
Connectivity: USB
Pressure levels: 2048
Pen: battery-free sensor pen
Report rate: 200 points/sec
Resolution: 5080 lines per inch (lpi)
Minimum ON weight (minimum weight sensed by the pen tip): 1 gram
Figure 4.1: Digitizer tablet
Along with the conventional parameters, the Wacom Intuos 4 digitizer also gives the Z-coordinate of the pen tip while signing; this enables capturing the X, Y, Z co-ordinates of the signature in a three-dimensional space. Microsoft Visual C# (.NET Framework 4.5) is used for interfacing with this device. Developing such an interfacing application requires programming against the Component Object Model (COM) as well as .NET assembly programming [24]. Wacom provides a driver to install the device on the operating system. The designed interface captures all of the above-mentioned features in the application for signature recognition. The data arrives from the device in the form of data packets. A captured data packet consists of the following features: 1. X, Y, Z co-ordinates of the pen tip. 2. Pressure: pressure applied at the point. 3. Tangent pressure: tangent pressure of the tip. 4. Azimuth: pen-tip azimuth (corresponding to the tip angle). 5. Altitude: pen-tip altitude. 6. Packet serial: packet serial number. 7. Packet timing: timestamp. Some of the captured features are displayed in Figure 4.2(a); captured pen strokes at different pressure levels are shown in Figure 4.2(b).
Figure 4.2: (a) Captured Packed Data from Wacom Intuos 4 (b) Captured Pen Strokes & Signature
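The per-point packet structure listed above can be sketched as a simple record type. The field names here are illustrative; the actual Wintab/COM packet layout used by the C# interface differs, and the sample values are hypothetical.

```python
# Sketch of the per-point data packet captured from the digitizer
# (field names are illustrative; the real Wintab/COM packet layout differs).
from dataclasses import dataclass

@dataclass
class SignaturePacket:
    x: int                 # pen-tip X co-ordinate
    y: int                 # pen-tip Y co-ordinate
    z: int                 # pen-tip height above the tablet surface
    pressure: int          # pressure applied at the point (0..2047 on Intuos 4)
    tangent_pressure: int  # tangent pressure of the tip
    azimuth: int           # pen-tip azimuth angle
    altitude: int          # pen-tip altitude angle
    serial: int            # packet serial number
    timestamp: int         # packet timing (ms)

# A captured signature is simply the ordered sequence of such packets.
stroke = [SignaturePacket(10, 20, 0, 900, 0, 45, 60, serial=1, timestamp=5),
          SignaturePacket(12, 21, 0, 950, 0, 46, 61, serial=2, timestamp=10)]
print(len(stroke))  # 2
```

Keeping the packets ordered by serial number preserves the time-domain information (speed, acceleration) that distinguishes dynamic from static signature recognition.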
Figure 4.3 shows plots of the data captured for a given signature.
Figure 4.3: Signature Feature Plot for Multidimensional Features: X, Y, Z Co-ordinates, Pressure, Azimuth & Altitude Parameters
4.1.2 Pre-processing Signature Data When the data comes from the hardware it is raw, and the system has to pre-process it to normalize errors due to sampling, quantization, hardware speed, signing position, etc. Doroz and Wrobel [25] have discussed this issue and proposed a technique of sampling the points uniformly so that there is an equal number of points per unit time. As the digitizer has a finite sampling and data-transfer rate, it cannot capture all the points on a curve, but only a finite set of points determined by the sampling rate. This gives the results shown in Figure 4.4: there is a loss of continuity in the captured points. A static scanned signature is shown in Figure 4.4(a); the same person's dynamic signatures are shown in Figures 4.4(b), (c), and (d); different colors indicate different pressure levels in Figure 4.4(e). The sampled points can clearly be observed. If the signing speed is high, fewer points are captured; one such situation is shown in Figure 4.5. This causes a loss of precision in the input data and may decrease the accuracy of the matching algorithm.
Figure 4.4: Signature Samples of a Person (a) Static Scanned Signature, (b), (c), (d) Dynamic Signatures Scanned by Wacom Intuos 4, (e) Pressure Levels for the Dynamic Signatures Shown
Figure 4.5: Poorly Sampled Signature Due to High Signing Speed
To solve this problem, a method to calculate the missing points' information and reduce the sampling error has been proposed. This method interpolates the captured points and calculates the missing information without loss of consistency. The proposed method is not the same as ordinary interpolation: the missing points lie on a curve, and accuracy must be preserved in the calculation because these points belong to a biometric signature of a human being that may be used for authentication or authorization.
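The uniform resampling step can be sketched as follows. This is a minimal illustration assuming simple linear interpolation between consecutive captured points; the authors' actual method is curve-aware and more accurate:

```python
def resample_uniform(points, times, step):
    """Linearly interpolate (x, y) points so samples are spaced `step` apart in time.

    points: list of (x, y) tuples; times: matching list of timestamps.
    """
    out = []
    t = times[0]
    i = 0
    while t <= times[-1]:
        # advance to the segment containing time t
        while i + 1 < len(times) and times[i + 1] < t:
            i += 1
        if i + 1 >= len(times):
            out.append(points[-1])
            break
        t0, t1 = times[i], times[i + 1]
        f = 0.0 if t1 == t0 else (t - t0) / (t1 - t0)
        x = points[i][0] + f * (points[i + 1][0] - points[i][0])
        y = points[i][1] + f * (points[i + 1][1] - points[i][1])
        out.append((x, y))
        t += step
    return out
```

For example, two points 10 ms apart resampled at a 5 ms step yield three uniformly spaced points.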
4.1.3 Successive Geometric Center based Feature Extraction The successive geometric centers of a signature are used as a global parameter in the development of the signature verification system. This parameter is derived from the center of mass of an image segment. The term 'successive' describes the recursive nature of the feature extraction process, which generates a set of points called geometric centers: the center of mass of the given signature template is found, the template is divided into two parts at that center, and the process is repeated until the specified number of points is returned. The algorithm splits the image four times and extracts 24 points in each mode; this is discussed in detail in the coming part. The process requires a signature template of sufficient size to capture the details properly; after testing various sizes, this feature has been implemented using a template of 320*240 pixels. After pre-processing, scaling is performed on the signature to obtain the required size. 4.1.3.1 Successive Geometric Center Depth 1 The signature template to be segmented is first pre-processed into a binary image as described in the previous sections. The pre-processing stage removes the noise in the template and produces a normalized binary template.
Geometric center:
The geometric center gives an idea of the distribution of pixels. Physically, it is the point at which the center of mass of the object is located. For a binary image b[x, y], the geometric center is defined by (Cx, Cy), where
C_x = \frac{\sum_{x=1}^{x_{max}} \sum_{y=1}^{y_{max}} x \, b[x,y]}{\sum_{x=1}^{x_{max}} \sum_{y=1}^{y_{max}} b[x,y]}    (4.1)

C_y = \frac{\sum_{x=1}^{x_{max}} \sum_{y=1}^{y_{max}} y \, b[x,y]}{\sum_{x=1}^{x_{max}} \sum_{y=1}^{y_{max}} b[x,y]}    (4.2)
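Equations 4.1 and 4.2 can be computed directly from a binary template. A short illustrative Python sketch (the thesis implementation is in C#; here `b` is assumed to be a 2-D list of 0/1 pixel values indexed b[x][y]):

```python
def geometric_center(b):
    """Return (Cx, Cy) of a binary image b[x][y] per equations 4.1 and 4.2."""
    sx = sy = total = 0
    for x, column in enumerate(b, start=1):
        for y, pixel in enumerate(column, start=1):
            if pixel:
                sx += x
                sy += y
                total += 1
    if total == 0:
        raise ValueError("empty template")
    return sx / total, sy / total
```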
For the current case, consider a normalized signature template of 320*240 pixels. To develop the new set of feature points, a method of splitting the template at the geometric centers is adopted, finding the center of mass of the two segments obtained after each split. The system performs the splitting in two manners, once horizontally and once vertically, and the algorithm generates two sets of points based on the vertical splitting mechanism and the horizontal splitting mechanism.
Feature points based on vertical splitting
Six feature points are retrieved based on vertical splitting; here the feature points are simply geometric centers. The procedure for finding the feature points by vertical splitting is given in the algorithm below.
Algorithm: Generating feature points based on vertical splitting.
Input: Static signature image after moving the signature to the center of the image.
Output: v1, v2, v3, v4, v5, v6 (feature points).
(a) Split the image with a vertical line at the center of the image, giving the left and right parts of the image.
(b) Find the geometric centers v1 and v2 of the left and right parts, respectively.
(c) Split the left part with a horizontal line at v1 and find the geometric centers v3 and v4 of the top and bottom parts of the left part, respectively.
(d) Split the right part with a horizontal line at v2 and find the geometric centers v5 and v6 of the top and bottom parts of the right part, respectively.
Figure 4.6 shows the feature points retrieved from a signature image. These features are calculated for every signature image in both training and testing.
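The steps (a)–(d) can be sketched in Python as follows (an illustrative sketch only; the real system is in C# and operates on 320*240 templates — the region-restricted `geometric_center` helper is an assumption of this sketch):

```python
def geometric_center(b, x0, x1, y0, y1):
    """Center of mass of binary image b[x][y] restricted to [x0, x1) x [y0, y1)."""
    sx = sy = n = 0
    for x in range(x0, x1):
        for y in range(y0, y1):
            if b[x][y]:
                sx += x
                sy += y
                n += 1
    return (sx // n, sy // n) if n else ((x0 + x1) // 2, (y0 + y1) // 2)

def vertical_split_features(b, width, height):
    """Feature points v1..v6 by vertical splitting (algorithm steps a-d)."""
    mid = width // 2
    v1 = geometric_center(b, 0, mid, 0, height)      # (a), (b): left part
    v2 = geometric_center(b, mid, width, 0, height)  # (a), (b): right part
    # (c): split the left part horizontally at v1
    v3 = geometric_center(b, 0, mid, 0, v1[1])
    v4 = geometric_center(b, 0, mid, v1[1], height)
    # (d): split the right part horizontally at v2
    v5 = geometric_center(b, mid, width, 0, v2[1])
    v6 = geometric_center(b, mid, width, v2[1], height)
    return [v1, v2, v3, v4, v5, v6]
```

Horizontal splitting is symmetric: split at a horizontal line first, then split each half vertically at its center.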
Figure 4.6: Feature points retrieved from signature image by vertical splitting
Feature points based on horizontal splitting
Six feature points are retrieved based on horizontal splitting; here too the feature points are simply geometric centers. The procedure for finding the feature points by horizontal splitting is given in the algorithm below.
Algorithm: Generating feature points based on horizontal splitting.
Input: Static signature image after moving the signature to the center of the image.
Output: h1, h2, h3, h4, h5, h6 (feature points).
(a) Split the image with a horizontal line at the center of the image, giving the top and bottom parts of the image.
(b) Find the geometric centers h1 and h2 of the top and bottom parts, respectively.
(c) Split the top part with a vertical line at h1 and find the geometric centers h3 and h4 of the left and right parts of the top part, respectively.
(d) Split the bottom part with a vertical line at h2 and find the geometric centers h5 and h6 of the left and right parts of the bottom part, respectively.
Figure 4.7 shows the feature points retrieved from a signature image. These features are calculated for every signature image in both training and testing. In total, twelve feature points (v1, ..., v6 and h1, ..., h6) are calculated by vertical and horizontal splitting.
Figure 4.7: Feature points retrieved from signature image by horizontal splitting
4.1.3.2 Successive Geometric Center Depth 2 The previous concept was described for a splitting depth of one, giving a set of six points. The system extends this concept to a splitting depth of two, so that a total of 24 points is generated per mode; for this, the splitting algorithm is applied to the four subparts obtained at depth 1. This operation is illustrated in Figure 4.8 for vertical splitting and in Figure 4.9 for horizontal splitting. By this procedure a total of 48 feature points (v1, ..., v24 and h1, ..., h24) is obtained. This set of points can be used for comparing two signatures. For the comparison the system uses the coefficient of Extended Regression Square (ER2). R-squared, also called the coefficient of determination, can be interpreted as the fraction of the variation in Y that is explained by X. The extended form is:
ER^2 = \frac{\left[ \sum_{j=1}^{M} \sum_{i=1}^{n} (x_{ji} - \bar{X}_j)(y_{ji} - \bar{Y}_j) \right]^2}{\sum_{j=1}^{M} \sum_{i=1}^{n} (x_{ji} - \bar{X}_j)^2 \; \sum_{j=1}^{M} \sum_{i=1}^{n} (y_{ji} - \bar{Y}_j)^2}    (4.3)

where n = number of dimensions (for the current scenario there are two dimensions), x_{ji} = points of the first sequence, y_{ji} = points of the second sequence, and X, Y are the two sequences to be correlated, each of two dimensions.
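Equation 4.3 can be sketched for two equal-length two-dimensional point sequences as follows (an illustrative Python sketch, assuming the means X̄_j, Ȳ_j are taken per dimension j):

```python
def er_squared(X, Y):
    """Extended Regression Square (ER^2) between two equal-length 2-D point
    sequences, per equation 4.3. X and Y are lists of (x, y) points; dimensions
    are indexed by j, points by i."""
    dims = 2
    n = len(X)
    num = den_x = den_y = 0.0
    for j in range(dims):
        xm = sum(p[j] for p in X) / n  # mean of X in dimension j
        ym = sum(p[j] for p in Y) / n  # mean of Y in dimension j
        for i in range(n):
            dx = X[i][j] - xm
            dy = Y[i][j] - ym
            num += dx * dy
            den_x += dx * dx
            den_y += dy * dy
    return num * num / (den_x * den_y)
```

Two identical sequences give ER^2 = 1, matching the observation that signatures from the same person should score close to 1.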
For a perfect regression the value of ER2 is 1; hence for signatures from the same person the value should be close to 1. As discussed above, the system finds the 48 feature points for each signature in the comparison and computes the extended regression coefficient. Figures 4.8 and 4.9 show the results of vertical and horizontal splitting at depth 2. In the case of a dynamic signature, this concept is extended to the pressure distribution of the signature.
Figure 4.8: Feature points retrieved from signature by vertical splitting of depth 2
Figure 4.9: Feature points retrieved from signature by horizontal splitting of depth 2
Figure 4.10: Feature points retrieved from Dynamic Signature Pressure Distribution by Horizontal (Green Points) & Vertical splitting (Orange Points) of Depth 2
While calculating the center of mass, the pressure points are considered as a system of particles. For a system of particles P_i, i = 1, ..., n, each with pressure p_i, located in space at coordinates r_i, i = 1, ..., n, the coordinates R of the center of mass satisfy the condition

\sum_{i=1}^{n} p_i (r_i - R) = 0    (4.4)

Solving this equation for R gives the formula

R = \frac{1}{P} \sum_{i=1}^{n} p_i r_i    (4.5)

where P is the sum of the pressures of all the particles. This method is combined with the static-signature geometric center calculation, and the points are determined similarly. The resulting plot is shown in Figure 4.10. In this research work, the geometric center depth 2 method is used for feature extraction.
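Equation 4.5 amounts to a pressure-weighted average of the point coordinates. An illustrative sketch:

```python
def pressure_center(points, pressures):
    """Pressure-weighted center of mass R = (1/P) * sum(p_i * r_i), equation 4.5.

    points: list of (x, y) coordinates; pressures: matching list of pressures p_i.
    """
    P = sum(pressures)
    rx = sum(p * x for p, (x, _) in zip(pressures, points)) / P
    ry = sum(p * y for p, (_, y) in zip(pressures, points)) / P
    return rx, ry
```

A point pressed three times harder pulls the center three times as strongly toward itself.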
4.1.4 Soft Biometric Features Soft biometric traits are physical, behavioural, or adhered human characteristics, classifiable into pre-defined, human-compliant categories. These categories, unlike in the classical biometric case, are established and time-proven by humans with the aim of differentiating individuals. Soft biometric traits are defined as characteristics that provide some information about the individual but lack the distinctiveness and permanence to sufficiently differentiate any two individuals [26]. In other words, soft biometric traits are created in a natural way and used by humans to distinguish their peers. In the online signature recognition system, apart from using the successive geometric centers for feature extraction, soft biometric features such as the number of pixels, arc length, signature length, and baseline shift are used. Soft biometrics include the physical and behavioral traits of humans, listed below: 1. Physical: skin color, eye color, hair color, presence of beard, height, and weight. 2. Behavioral: gait, keystroke. 3. Adhered human characteristics: clothes color, tattoos, accessories. This system uses soft biometric features of signatures to improve the performance of the proposed feature vector generation method.
4.1.5 Enrollment & Training Process Handwritten signatures are a biometric attribute of a human and have a certain amount of intra-class variation. While recognizing or verifying a signature, this must be taken into account. In the current signature recognition system, the developed enrollment process tries to create a reference record based on the intra-class variation of a specific person's signature.
Figure 4.11: Data flow diagram for training
The signature features are based on geometric properties, so the system uses a Euclidean distance model for classification. This is the simple distance between a pair of vectors of size n; here the vectors are feature points, so the vector size is 2. These distances are used in threshold calculation.
Euclidean distance model
Let A(a1, a2, ..., an) and B(b1, b2, ..., bn) be two vectors of size n. The distance d is calculated using equation 4.6.

distance(d) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2}    (4.6)
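Equation 4.6 in code, for reference:

```python
import math

def distance(a, b):
    """Euclidean distance between two n-dimensional vectors (equation 4.6)."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
```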
In our application the vectors are points in the plane, so d is the simple distance between two points. 4.1.5.1 Successive Geometric Centers Threshold & Training The successive geometric centers are obtained by splitting the signature horizontally and vertically at the gravitational centers. This process yields two sets of 24 points each. Each point is a pixel and hence has an X-coordinate and a Y-coordinate. The method discussed earlier is applicable to a single feature vector, but here the feature vector has X and Y components, so the system applies the method to the X-vector and the Y-vector separately. Hence for horizontal and vertical splitting there are in total four vectors with 24 points each: for horizontal splitting the system has GCHX0 to GCHX23 and GCHY0 to GCHY23, and similarly for vertical splitting it has GCVX0 to GCVX23 and GCVY0 to GCVY23, and it finds individual thresholds for vertical and horizontal splitting. The system has 7 such training signatures; the successive geometric centers are calculated for these training signatures and the extracted data is used for threshold calculation. Let n signatures be taken for training from each person. There are 48 feature points from each original signature, 24 from vertical splitting and 24 from horizontal splitting. Individual thresholds and patterns are calculated for vertical splitting and horizontal splitting. The pattern points based on vertical splitting are shown below.

vpattern0 = median(v1,0, v2,0, ..., vn,0)
vpattern1 = median(v1,1, v2,1, ..., vn,1)
vpattern2 = median(v1,2, v2,2, ..., vn,2)
vpattern3 = median(v1,3, v2,3, ..., vn,3)
...
vpattern23 = median(v1,23, v2,23, ..., vn,23)    (4.7)

where vi,0, vi,1, ..., vi,23 are the vertical splitting features of the i-th training signature sample. The threshold based on vertical splitting is shown below.
vthreshold = \sum_{i=0}^{23} \left( vd_{avg,i} + 2\sigma_{v,i} \right)    (4.8)
In equation 4.8, vd_{avg,i} is the average distance and σ_{v,i} is the standard deviation. The pattern points based on horizontal splitting are shown below.
hpattern0 = median(h1,0, h2,0, ..., hn,0)
hpattern1 = median(h1,1, h2,1, ..., hn,1)
hpattern2 = median(h1,2, h2,2, ..., hn,2)
hpattern3 = median(h1,3, h2,3, ..., hn,3)
...
hpattern23 = median(h1,23, h2,23, ..., hn,23)    (4.9)

where hi,0, hi,1, ..., hi,23 are the horizontal splitting features of the i-th training signature sample. The threshold based on horizontal splitting is shown below.
hthreshold = \sum_{i=0}^{23} \left( hd_{avg,i} + 2\sigma_{h,i} \right)    (4.10)
The system stores the pattern points and thresholds of both horizontal splitting and vertical splitting. These values are used in testing.
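The training computation of equations 4.7–4.10 can be sketched as follows. This is an illustrative Python sketch, assuming that davg_i and σ_i are the mean and standard deviation of the per-point distances from the median pattern across the training samples (the thesis code itself is in C#):

```python
import math
from statistics import median, mean, pstdev

def euclid(a, b):
    return math.sqrt((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2)

def train(signatures):
    """signatures: list of n training samples, each a list of 24 (x, y) feature points.

    Returns (pattern, threshold): the per-index median points (eq. 4.7/4.9) and
    the scalar threshold sum(davg_i + 2*sigma_i) (eq. 4.8/4.10).
    """
    n_points = len(signatures[0])
    pattern = []
    threshold = 0.0
    for i in range(n_points):
        xs = [s[i][0] for s in signatures]
        ys = [s[i][1] for s in signatures]
        p = (median(xs), median(ys))                   # pattern_i
        pattern.append(p)
        dists = [euclid(s[i], p) for s in signatures]  # distance of each sample to pattern_i
        threshold += mean(dists) + 2 * pstdev(dists)   # davg_i + 2*sigma_i
    return pattern, threshold
```

Identical training samples yield a threshold of zero, since every distance to the median pattern vanishes.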
4.1.6 Classification Whenever a new signature comes in for testing, the system calculates the features of vertical splitting and horizontal splitting. The feature points based on vertical splitting are vnew0, vnew1, vnew2, ..., vnew23. The distances between the new signature features and the pattern feature points based on vertical splitting are shown below.

vdnew0 = distance(vpattern0, vnew0)
vdnew1 = distance(vpattern1, vnew1)
vdnew2 = distance(vpattern2, vnew2)
vdnew3 = distance(vpattern3, vnew3)
vdnew4 = distance(vpattern4, vnew4)
...
vdnew23 = distance(vpattern23, vnew23)    (4.11)
For classification of the new signature, the system calculates vdistance and compares it with vthreshold. If vdistance is less than or equal to vthreshold, the new signature is accepted by vertical splitting.

vdistance = \sqrt{\sum_{i=0}^{23} vd_{new,i}^2}    (4.12)
The feature points based on horizontal splitting are hnew0, hnew1, hnew2, ..., hnew23. The distances between the new signature features and the pattern feature points based on horizontal splitting are shown below.

hdnew0 = distance(hpattern0, hnew0)
hdnew1 = distance(hpattern1, hnew1)
hdnew2 = distance(hpattern2, hnew2)
hdnew3 = distance(hpattern3, hnew3)
hdnew4 = distance(hpattern4, hnew4)
...
hdnew23 = distance(hpattern23, hnew23)    (4.13)
For classification of the new signature, the system calculates hdistance and compares it with hthreshold. If hdistance is less than or equal to hthreshold, the new signature is accepted by horizontal splitting.

hdistance = \sqrt{\sum_{i=0}^{23} hd_{new,i}^2}    (4.14)
The new signature features have to satisfy both the vertical splitting and horizontal splitting thresholds. In total there are 24*2 different feature points for vertical and horizontal splitting, with thresholds based on the average distance (davg) and the standard deviation (σ). Equation 4.15 shows the general formula for the threshold.

threshold(t) = \sum_{i=0}^{23} \left( d_{avg,i} + 2\sigma_i \right)    (4.15)
Using the formula shown in equation 4.15, thresholds are calculated for the X and Y components of the geometric centres, which gives the four thresholds shown below.
Geometric center horizontal threshold – X: GCHXTH
Geometric center horizontal threshold – Y: GCHYTH
Geometric center vertical threshold – X: GCVXTH
Geometric center vertical threshold – Y: GCVYTH
The system uses these four thresholds together with the four sets of median vectors (Vpatternx, Hpatternx, Vpatterny, Hpatterny) as the decision parameters of the detection engine.
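The final accept/reject decision combines the four thresholds. A sketch, assuming per-component distances as in equations 4.11–4.15 (the function and parameter names here are illustrative, not the thesis code):

```python
import math

def component_distance(new_points, pattern_points, axis):
    """sqrt of the sum of squared per-index differences for one component (X or Y)."""
    return math.sqrt(sum((n[axis] - p[axis]) ** 2
                         for n, p in zip(new_points, pattern_points)))

def accept(v_new, h_new, v_pattern, h_pattern, thresholds):
    """Accept only if all four component distances fall within their thresholds.

    thresholds = (GCVXTH, GCVYTH, GCHXTH, GCHYTH).
    """
    gcvxth, gcvyth, gchxth, gchyth = thresholds
    return (component_distance(v_new, v_pattern, 0) <= gcvxth and
            component_distance(v_new, v_pattern, 1) <= gcvyth and
            component_distance(h_new, h_pattern, 0) <= gchxth and
            component_distance(h_new, h_pattern, 1) <= gchyth)
```

A signature must pass all four checks; failing any one of them rejects it.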
4.2 Proposed Architecture The proposed architecture comes in two types: a public cloud based architecture and a hybrid cloud based architecture [51]. In this project the public cloud based architecture is discussed in particular. Figures 4.12 and 4.13 show the proposed architectures; as biometric data is sensitive, extreme care must be taken in storing the biometric traits as well as the feature vectors. The first model, shown in Figure 4.12 and proposed for the online SRS, is based on a public cloud, where the Biometric Authentication SaaS runs on a public cloud framework such as Microsoft Azure or Amazon Web Services; both the service and the data storage are on the cloud. The various clients (mobile applications, desktop applications, bank servers, etc.) call the biometric authentication web service running on the cloud for enrollment or verification of biometric data, and the web service performs the required computation for each request received. If the request is an enrollment request, it calculates the feature vector of the biometric data and stores the resulting feature vector in the cloud database; if the request is a verification request, it calculates the feature vector of the biometric data and compares it with the feature vector stored in the cloud database. If a match is found it returns a success response, otherwise it returns a failure response.
Figure 4.12: Proposed Biometric Authentication Model Based on a Public Cloud [51]
The public cloud based model is low cost and flexible to implement, but as the data is stored on a public cloud there is risk involved. Many companies hesitate to place their mission-critical information in a cloud belonging to a third-party supplier. To address this data-security issue for mission-critical data, a hybrid cloud based model is proposed, as illustrated in Figure 4.13, which depicts a more secure model based on a hybrid cloud architecture: the data is stored on a private cloud while the Biometric Authentication SaaS runs on a public cloud. This model is secure and flexible, but its complexity and implementation cost are high; the hybrid model is not suited for small-scale applications.
Figure 4.13: Proposed Biometric Authentication Model Based on a Hybrid Cloud [51]
The proposed model based on the public cloud will improve the flexibility and scalability of dynamic signature recognition systems, as discussed in [38]-[47]. The usability of dynamic signature recognition systems improves when they are hosted as a cloud service, since many other applications can consume the web service over the internet. The scheme is also capable of managing a high computational load as well as growing storage requirements.
4.3 Proposed System Over the next few years the amount of biometric information at the disposal of the assorted agencies and authentication service providers is expected to rise significantly. Such quantities of data require not only enormous amounts of storage but unprecedented processing power as well. To be able to face these future challenges, more and more people are looking towards cloud computing, which can address them quite effectively with its seemingly limitless storage capacity, rapid data distribution, and parallel processing capabilities.
4.3.1 General Architecture The General architecture as shown in Figure 4.14, is split into four major sections which are identified as follows:
Signature Capturing: The user signs on the WACOM digitizer tablet, which records all the pressure levels of the signature. These pressure levels are sent to the Windows Forms application, where the signed signature is recorded. The captured signature data is composed into a .txt file.
Blob Storage: Three blob storages are used in the proposed architecture, named the enroll blob storage, the verify blob storage, and the live blob storage. The enroll blob storage stores the signatures during the enrollment process of the user. The verify blob storage stores the extracted feature vectors of the enrolled signatures. The live blob storage stores the live signature during the live verification process.
Service Bus Queues: Service bus queues are the communication mechanism between the Windows Forms application and the different worker roles. There are three service bus queues, named the enroll service bus, the live service bus, and the verify service bus. The enroll service bus passes the enroll container name as a message from the Windows Forms application to the enroll worker role. The live service bus passes the live user id and the type of verification as a message from the Windows Forms application to the verify worker role. The verify service bus passes the confidence of the signature match as a message from the verify worker role to the Windows Forms application.
Worker Roles: A worker role performs the computationally heavy processing: calculating the feature vector, matching feature vectors for verification of the user, generating the resulting threshold factor, determining whether it satisfies the minimum requirement, and sending the appropriate response. There are two worker roles, the enroll worker role and the verify worker role.
The enroll worker role pulls messages from the enroll service bus queue, downloads the signature files from the enroll blob storage, performs feature vector extraction on the downloaded signatures, and stores the extracted feature vector files in the verify blob storage.
Figure 4.14: General Architecture
For live signature verification, the verify worker role pulls the message from the live service bus queue, downloads the signature file from the live blob storage, performs feature vector extraction on the downloaded signature, downloads the feature vector files of the user id specified in the message, performs a matching operation between the downloaded feature vector files and the extracted feature vector file, and sends the appropriate confidence as a message to the Windows Forms application via the verify service bus. For match signature verification, the verify worker role pulls the message from the live service bus queue, downloads the feature vector files of both user ids specified in the message from the verify blob storage, performs a matching operation between the two feature vector files, and sends the appropriate confidence as a message to the Windows Forms application via the verify service bus.
4.3.2 Enrollment Operation The enrollment operation, as shown in Figure 4.15, consists of capturing the signatures from the WACOM digitizer tablet and uploading them to the enroll blob storage in the cloud. It also includes the background processing of feature extraction done by the worker role on the uploaded signatures; after feature extraction, the calculated feature vector files are stored in the verify blob storage.
Figure 4.15: Enrollment Operation
The enrollment operation consists of the following steps:
1. Capture the signatures from the WACOM digitizer tablet.
2. Upload the signatures to the enroll blob storage.
3. After storing the signatures in the blob storage, the container name is sent as a message to the enroll service bus queue.
4. The enroll worker role pulls the message from the enroll service bus queue.
5. The enroll worker role points to the blob storage container specified in the enroll service bus queue message.
6. The signature files are downloaded from that blob storage container.
7. Feature vector extraction is performed on each of these signatures.
8. The resulting feature vector file of each signature is uploaded to the verify blob storage.
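The enrollment flow above can be simulated end to end with in-memory stand-ins for the blob storages and the service bus queue. This is purely an illustrative sketch of the message and data flow (the real system uses Azure Blob Storage and Service Bus from a C# Windows Forms client and worker role; the dictionaries and queue below are assumptions standing in for those services):

```python
from collections import deque

# In-memory stand-ins for cloud resources (illustrative only)
enroll_blob = {}        # container name -> list of signature files
verify_blob = {}        # container name -> list of feature vector files
enroll_queue = deque()  # stands in for the enroll service bus queue

def extract_features(signature):
    """Placeholder for the successive-geometric-center feature extraction."""
    return {"features_of": signature}

def client_enroll(container, signatures):
    """Steps 1-3: upload the signatures, then queue the container name."""
    enroll_blob[container] = list(signatures)
    enroll_queue.append(container)

def enroll_worker_role():
    """Steps 4-8: pull the message, download signatures, extract and store features."""
    container = enroll_queue.popleft()
    signatures = enroll_blob[container]
    verify_blob[container] = [extract_features(s) for s in signatures]

client_enroll("user42", ["sig1.txt", "sig2.txt"])
enroll_worker_role()
```

After the worker role runs, the verify blob stand-in holds one feature vector file per uploaded signature.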
4.3.3 Verification Operation The verification operation verifies whether a signature is valid. This is done by calculating the feature vector of the signature being verified, or downloading the calculated feature vector of the specified user id, and then comparing it with the feature vectors stored in the blob storage for the user id in question. The matching percentage of the two feature vectors is sent as a confidence to the Windows Forms application. The verification operation, shown in Figures 4.16 and 4.17, consists of two types: 1. Live signature verification operation. 2. Stored (match) signature verification operation. 4.3.3.1 Live Signature Verification Operation In the live signature verification operation, the signature to be verified is captured in the Windows Forms application through the Wacom digitizer tablet. Once the signature is captured, the user id of the person being verified is entered in the form and the live signature button is clicked. Upon the click event the signature data is uploaded to the live blob storage, and once the upload is done, a message consisting of the user id and the operation type is sent to the live service bus queue.
Figure 4.16: Live Verification Operation
The verify worker role pulls the message from the live service bus queue, points to the respective live blob storage container, and downloads the signature file. It then calculates the feature vector of the uploaded signature file and matches it with the feature vector file in question, downloaded from the verify blob storage. The matching confidence level is sent to the Windows Forms application through the verify service bus queue and displayed to the user. This verification operation is illustrated in Figure 4.16. The live verification operation consists of the following steps:
1. Capture the signature from the WACOM digitizer tablet.
2. Upload the signature to the live blob storage.
3. After storing the signature in the blob storage, the user id and the type of verification operation are sent as a message to the live service bus queue.
4. The verify worker role pulls the message from the live service bus queue.
5. The verify worker role checks the user id and the type of verification operation to be performed and calls the appropriate method.
6. The verify worker role then points to the blob storage specified in the live service bus queue message.
7. The signature file is downloaded from that blob storage container and its feature vector is calculated.
8. The worker role then points to the blob storage container specified by the user id in the message.
9. The respective feature vector files are downloaded from the verify blob storage.
10. A matching operation is performed between the calculated feature vector and the downloaded feature vectors, and a threshold or confidence percentage is generated.
11. This confidence percentage is sent to the verify service bus queue.
12. The verify service bus queue delivers the confidence to the Windows Forms application, where it is displayed on the form.
4.3.3.2 Stored Signature Verification Operation In the stored signature verification operation, the user ids of the two users whose signatures are to be verified against each other are entered in the textboxes. Once the user ids are entered and the match signature button is clicked, both user ids are inserted in a message and sent to the live service bus queue.
Figure 4.17: Stored Signature Verification Operation
The verify worker role pulls the message from the live service bus queue, points to the respective blob storage containers, and downloads the feature vector files. The matching operation is then performed, which gives the resulting threshold or confidence level. This confidence level is sent to the Windows Forms application through the verify service bus queue and displayed to the user. The stored signature verification operation, illustrated in Figure 4.17 above, consists of the following steps:
1. Enter the user ids of both users in the textboxes provided in the Windows Forms application.
2. After clicking the match signature button, both user ids are sent as a message to the live service bus queue.
3. The verify worker role pulls the message from the live service bus queue.
4. The verify worker role checks the user id and the type of verification operation to be performed and calls the appropriate method.
5. The verify worker role then points to the verify blob storage containers specified in the live service bus queue message.
6. The feature vector files are downloaded from those blob storage containers.
7. A matching operation is performed on the two downloaded feature vector files, and a threshold or confidence percentage is generated.
8. This confidence percentage is sent to the verify service bus queue.
9. The verify service bus queue delivers the confidence to the Windows Forms application, where it is displayed on the form.
4.3.4 Blob Storage Operation In the blob storage, two storage accounts are created: one that stores the signatures at the time of the enrollment operation, named the enroll blob storage account, and another that stores the calculated feature vectors of the signatures for the verification operation, addressed as the verify blob storage account. There are as many containers created in both the enroll and verify storage accounts as there are enrolled users.
Figure 4.18: Blob storage account
During the enrollment operation, ten signatures of the user are taken on the digitizer tablet; these ten signatures are sent to the Windows Forms application, which uploads them to the enroll blob storage as shown in Figure 4.18. Feature vector extraction is then performed on these signatures and the resulting feature vector files are stored in the verify blob storage, also shown in Figure 4.18. During the verification operation the user signs on the digitizer tablet; this signature is sent to the Windows Forms application, which uploads the signature to be verified to the verify blob storage.
4.3.5 Matching Operation The matching operation is performed by the detection engine running inside the verify worker role. The detection engine takes its input from the verify blob storage and the live blob storage and performs the required computation to give the output, which is sent to the client via the verify service bus queue. The detailed working of the detection engine is explained below. The detection engine is actually a group of functions arranged in layers for comparison and decision purposes. It is this block that actually compares the signatures and decides their authenticity. The system implements user-level thresholds and cluster based features for deciding the authenticity of the signatures.
Figure 4.19: Block Diagram of Detection Engine
This engine works in signature verification mode: it compares the test signature against a known person's signatures to check the authenticity of the signature for that specific person. This is like the banking operation of verifying the authenticity of a signature on a document such as a cheque. The block diagram of this module is shown in Figure 4.19. The module consists of a comparator function, weighted sum evaluation, and a decision stage arranged in layers as shown. The module takes its input from the pre-processing block and the blob storages. The test signature is first pre-processed to find the feature vector as discussed above. The system extracts a feature vector consisting of 20 feature points; the geometric centers consist of a total of 48 points, 24 each from horizontal and vertical splitting. This feature vector is then compared with the median vector for the person.
4.3.5.1 Feature Vector Generation The system classifies the signature based on the Euclidean distance between the feature vector of the test signature, as described in the above section, and the median vector of the person's signature set used for training. The online signature system takes 10 training signatures, of which 7 are used for global feature extraction and the 8th is used to find the intra-group distance for the cluster features. Using this mechanism the system obtains distance points for the cluster features, from which it calculates the intra-group distance medians and the intra-group distance thresholds. These signify the variation of the cluster-based parameters within the training samples. Hence, for the test signature, the system compares the distance rather than the feature vector itself for the cluster-based features. Keeping this point in mind, it calculates the distance for the cluster-based features.

The system stores all the pre-processed training samples and the extracted features per sample in the verify blob storage. This feature set is fetched from the verify blob storage; thus, for each cluster feature, there are ten sets of feature points. The engine extracts the cluster features of the test signature and finds the distance of each cluster feature vector from the seven reference signature cluster feature vectors. This process gives seven distance values for the seven signatures (dCi). These values are stored in an array. Each of these distances is then compared with the median of the corresponding cluster feature, dcmi. This operation gives a set of distances for the cluster feature; this set has ten elements, each element being a distance from the median, and each can be compared with the threshold for the corresponding cluster feature. For a feature to qualify, this distance should be less than the distance threshold for that specific cluster feature.
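The distance-based qualification test for a cluster feature can be sketched as follows. This is a hedged illustration: the function name `cluster_feature_qualifies` and the exact qualification rule (deviation of each reference distance from the intra-group median staying under the threshold) are assumptions based on the description above.

```python
# Sketch of the cluster-feature check: the test signature's cluster
# feature is compared by *distance* to the reference set, not by value.
# Names and the exact rule are assumptions for illustration.
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cluster_feature_qualifies(test_vec, reference_vecs,
                              median_dist, threshold):
    """A cluster feature qualifies when its distances to the reference
    cluster feature vectors (dCi) stay within `threshold` of the
    intra-group median distance computed during training."""
    distances = [euclidean(test_vec, ref) for ref in reference_vecs]  # dCi
    return all(abs(d - median_dist) < threshold for d in distances)
```

This indirection through distances is what lets the system handle cluster features that are collections of coefficients or matrices: only the scalar distances need to be compared with the trained medians and thresholds.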
Using this method the system can compare the cluster-based features, which are actually collections of coefficients or matrices. This forms a feature vector array for the cluster feature. This is implemented in layer 1 of the detection engine. The algorithm is as follows.
Algorithm to generate the feature vector for a test signature
1. Load the feature vector of the test signature from the feature extraction module.
2. Load the reference medians (RMD) and distance thresholds (dth) for the current person from the blob storage.
3. Load the feature vectors of the training signatures. Read a total of ten vectors for the ten training samples.
4. Extract the global features, which results in a total of 10 extracted feature points.
5. For each cluster feature (Ci), find the distance between the test signature cluster feature (Ci-test) and the training signature cluster feature vectors (Ctrain-ij); there are ten such signatures, hence 1
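The per-cluster distance computation in step 5 can be sketched as follows. This is a minimal illustration assuming list-based cluster feature vectors; the name `cluster_distances` is hypothetical, and storage access is replaced by in-memory dictionaries keyed by cluster feature.

```python
# Sketch of step 5: for each cluster feature Ci, compute the distance
# between the test signature's cluster vector (Ci-test) and each
# training signature's cluster vector (Ctrain-ij). Names are illustrative.
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cluster_distances(test_clusters, training_clusters):
    """Map each cluster feature name to its list of distances dCi,
    one distance per training signature."""
    return {ci: [euclidean(test_vec, train_vec)
                 for train_vec in training_clusters[ci]]
            for ci, test_vec in test_clusters.items()}
```

The resulting per-cluster distance arrays are what the subsequent layers compare against the intra-group distance medians and thresholds loaded in step 2.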