The Service to Businesses Project: Improving Government-to-Business Relationships in Italy

Marco Bertoletti¹, Paolo Missier², Monica Scannapieco³, Pietro Aimetti¹, and Carlo Batini²

¹ Gruppo Clas, {m.bertoletti,p.aimetti}@gruppoclas.it
² Università di Milano “Bicocca”, [email protected] [email protected]
³ Università di Roma “La Sapienza”, [email protected]

Abstract. The paper describes the main ideas and results of a project that improves government-to-business relationships in Italy by means of a cooperative architecture based on a Publish & Subscribe communication paradigm.

Keywords: e-Government, data quality, electronic service delivery

1 Introduction

In 1999, the Italian Public Administration started a pilot project, called Services to Businesses, which involves extensive data reconciliation and cleaning, as well as business process re-engineering, aimed at improving the relationship between citizens, businesses, and government agencies. The initial focus was on simplifying the large number of transactions required for a business to register itself with various agencies and to update its existing registry entries.

The project is complicated by the fact that similar information about one business is likely to appear in multiple databases, each autonomously managed by a different agency, and that these agencies have historically never been able to share their data about businesses. Furthermore, those databases contained many errors, causing mismatches among different records that refer to the same business.

Because of these complications, the project followed a comprehensive approach based on two main strategies, aimed at improving the state of existing business data and at maintaining correct record alignment for all future data:

1. A “one stop shop” approach was followed to simplify the life of a business and to ensure the correct propagation of its data. In this approach, one single agency is selected as a front end for all communication with businesses. Once the information received from a business is certified, it is made available to other interested agencies through a Publish & Subscribe (P&S) mechanism.

2. Extensive record matching and data cleaning were performed on the existing business information, resulting in the reconciliation of a large number of business registry entries.

R. Traunmüller (Ed.): EGOV 2003, LNCS 2739, pp. 468–471, 2003. © Springer-Verlag Berlin Heidelberg 2003

2 The Services to Businesses Project: A Cooperative Architecture

The Services to Businesses project, currently under development as part of the Italian e-Government initiative, aims at offering businesses a way to interact efficiently with central and local agencies of the Italian Public Administration (PA). Three central agencies are involved: the social security agency (INPS in the following), the accident insurance agency (INAIL), and the chambers of commerce (CoC). The agencies operate autonomously, and they usually need to access each other’s data and services in order to fulfill business goals. Although this calls for inter-agency cooperation, the complexity of their organization and of their legacy information systems makes migration to new and open systems impractical.

The approach to cooperation among agencies followed in Italy to address this problem is based on the concept of Cooperative Information Systems (CIS), i.e., systems capable of interacting by exchanging services with each other. The general cooperative architecture for the nationwide CIS network of the Italian PA is specialized to the Services to Businesses project in Figure 1. Besides transport and basic services, a cooperative services layer is shown, including application protocols, repositories, gateways, etc. Each administration defines the set of cooperative interfaces that include data and application services available to other systems. The general architecture supports cross-administration applications that are assembled using those interfaces.

In addition, for the purpose of the project, two specific features have been implemented. First, a central database (DB) has been created to manage all the records resulting from linking the identifiers of independent business records that pertain to individual agencies. This new repository provides agencies with a unified view of Italian businesses. The process of creating this new repository from the existing databases is sketched in Section 2.1.
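As a rough illustration of the central DB of linked identifiers, the following Python sketch stores, for each business, one unified identifier together with the identifier each agency uses internally. The class names, method names, and identifier formats are hypothetical; the paper does not describe the actual schema of the repository.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class LinkedBusiness:
    """One entry of the central DB: a unified ID plus each agency's own ID."""
    unified_id: str
    agency_ids: Dict[str, str] = field(default_factory=dict)  # agency -> local ID


class LinkedIdentifierDB:
    """Hypothetical in-memory stand-in for the DB of linked internal identifiers."""

    def __init__(self) -> None:
        self._rows: Dict[str, LinkedBusiness] = {}
        # Reverse index: (agency, local ID) -> unified ID
        self._reverse: Dict[Tuple[str, str], str] = {}

    def link(self, unified_id: str, agency: str, local_id: str) -> None:
        """Record that `agency` knows this business under `local_id`."""
        row = self._rows.setdefault(unified_id, LinkedBusiness(unified_id))
        row.agency_ids[agency] = local_id
        self._reverse[(agency, local_id)] = unified_id

    def lookup(self, agency: str, local_id: str) -> Optional[LinkedBusiness]:
        """Return the unified view of a business given one agency's identifier."""
        unified_id = self._reverse.get((agency, local_id))
        return self._rows.get(unified_id) if unified_id is not None else None


db = LinkedIdentifierDB()
db.link("B-001", "CoC", "coc-17")      # illustrative identifiers only
db.link("B-001", "INPS", "inps-42")
view = db.lookup("INPS", "inps-42")
```

Under these assumptions, an agency that knows a business only by its own identifier can retrieve the identifiers used by the other agencies, which is the unified view the repository is meant to provide.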
Second, an event notification service has been deployed in order to guarantee synchronization between the new unified view and the independent views that each administration still maintains. The event notification service has been implemented according to a Publish & Subscribe communication paradigm. The combination of these two extensions results in a clean base of business data whose high quality can be sustained over time.

A number of administrative processes were re-engineered in order to take advantage of this architecture, following the main criterion of moving away from multiple front-end transactions (business-to-agency) and towards a single front-end transaction plus a number of back-end transactions (agency-to-agency) supported by the cooperative architecture. In particular, specific agencies have been selected as front-end entry points for businesses. Once these selected agencies accept and certify the quality of the incoming information, the messaging service (e.g., notification) is used to propagate it to other interested domains. Results have been achieved in terms of a reduction in per-transaction cost, a reduction in the total number of transactions, and increased quality of the information acquired by the agencies.

Fig. 1. A cooperative architecture for the Services to Business project.

2.1 Building the DB of Linked Internal Identifiers: Data Integration and Cleaning

In this section we provide an overview of the steps followed to produce the DB of linked internal identifiers. As in data warehousing, the project includes both schema and data integration, the latter requiring a complex record linkage activity. Six databases were included in the project. INPS provides four different databases, each representing a different view on the collection of all the owners and employees that are subject to Social Security regulations. INAIL contributes its single database of employees that are subject to work insurance obligations. Finally, the CoC database is the official registry of all Italian businesses.

In order to determine the project scope and to identify the useful database pairs, the following key assumptions were used. The main assumption is that all entities (and the corresponding extensions) in the CoC are included in the integrated schema, since the CoC database is considered by law the official registry of businesses. Second, it was decided that an entity (and the corresponding extension) is within the scope of the project if it appears in at least two of the schemas.

The general strategy for record linkage is the following. For each record, our goal is to determine which entity in the integrated schema it belongs to, by linking it to records in the other DBs. When linking is not feasible, it may still be possible to explain the record, i.e., to determine that it refers to a valid business. Thus, unexplained records are those that cannot be traced at all to a real business. The overall methodology consists of the following steps:

1. Build the conceptual schemas of INPS, INAIL, and CoC;
2. Build the integrated conceptual schema which, due to the above assumptions, results from the integration of the CoC conceptual schema and the common part of the INPS and INAIL conceptual schemas;
3. Perform the record linkage activities, in order to identify and link all records that potentially refer to the extension of the integrated conceptual schema.

Figure 2 reports the final results of the record linkage process. The first column shows the total number of records, which are then broken down into deleted (i.e., no longer active) and active Economic Agents. The fourth column gives the number of matched records, the fifth reports the unmatched but explained residuals, and the last reports the unexplained residual records. It may be noted that the percentage of unexplained residuals is fairly low. The CoC database contains no unexplained records because it is considered the benchmark. For the other databases, the number of unexplained residuals ranges from 0.15 million to 0.2 million in both the INPS and INAIL databases.
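The three outcomes of the linkage strategy (matched, explained, unexplained) can be illustrated with a minimal Python sketch. It is not the project's actual procedure, which the paper does not detail: exact matching on a normalized key is a deliberately simple stand-in for record linkage, and `normalize`, `classify`, the `known_valid` set, and the sample keys are all hypothetical.

```python
from typing import Dict, Iterable, Set


def normalize(key: str) -> str:
    """Hypothetical cleaning step: keep alphanumerics only and upper-case them."""
    return "".join(ch for ch in key if ch.isalnum()).upper()


def classify(
    databases: Dict[str, Iterable[str]],
    known_valid: Set[str],
) -> Dict[str, Dict[str, str]]:
    """Label every record key as 'matched', 'explained', or 'unexplained'.

    matched     : the cleaned key also appears in at least one other database
    explained   : not linked, but known to refer to a valid business
    unexplained : cannot be traced to a real business at all
    """
    cleaned = {name: {normalize(k) for k in keys} for name, keys in databases.items()}
    valid = {normalize(k) for k in known_valid}
    result: Dict[str, Dict[str, str]] = {}
    for name, keys in cleaned.items():
        # Union of the keys held by all the *other* databases.
        others = set().union(*(v for n, v in cleaned.items() if n != name))
        labels: Dict[str, str] = {}
        for key in keys:
            if key in others:
                labels[key] = "matched"
            elif key in valid:
                labels[key] = "explained"
            else:
                labels[key] = "unexplained"
        result[name] = labels
    return result


# Illustrative keys only; they do not come from the project's data.
sample = {
    "CoC": ["AB 123", "CD456"],
    "INPS": ["ab-123", "ZZ999"],
    "INAIL": ["EF789"],
}
labels = classify(sample, known_valid={"CD456", "EF789"})
```

In this toy run, "AB 123" and "ab-123" are linked after cleaning, "CD456" and "EF789" are explained, and "ZZ999" remains unexplained. Treating the CoC as the benchmark would correspond to placing all of its keys in `known_valid`, so that none of its records can end up unexplained.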

Values are in millions of records.

                    # of      # of records  # of active  # of matched  Residual records  Residual records
                    records   deleted       records      records       explained         not explained
CoC Business DB     6.6       0.7           5.9          3.2           2.7               0.0
INPS Business DB    2.5       0.2           2.3          1.95          0.2               0.15
INPS Owners DB      4.0       0.0           4.0          3.5           0.3               0.2
INAIL Business DB   7.1       3.7           3.4          3.1           0.15              0.15
Total               20.2      4.6           15.6         11.75         3.35              0.5

Fig. 2. Final results from the linkage process.
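The figures in Fig. 2 can be cross-checked with a short Python sketch. It verifies that, for each database, total = deleted + active and active = matched + explained + unexplained, and that the columns add up to the reported totals. Measuring the unexplained share against the active records is our choice of denominator, not one stated in the paper.

```python
import math

# Fig. 2 values, in millions of records:
# (total, deleted, active, matched, explained residual, unexplained residual)
rows = {
    "CoC Business DB":   (6.6, 0.7, 5.9, 3.2, 2.7, 0.0),
    "INPS Business DB":  (2.5, 0.2, 2.3, 1.95, 0.2, 0.15),
    "INPS Owners DB":    (4.0, 0.0, 4.0, 3.5, 0.3, 0.2),
    "INAIL Business DB": (7.1, 3.7, 3.4, 3.1, 0.15, 0.15),
}
reported_total = (20.2, 4.6, 15.6, 11.75, 3.35, 0.5)


def close(a: float, b: float) -> bool:
    """Compare with a small tolerance to absorb floating-point rounding."""
    return math.isclose(a, b, abs_tol=1e-9)


# Row-wise consistency of the breakdown.
for name, (total, deleted, active, matched, explained, unexplained) in rows.items():
    assert close(total, deleted + active), name
    assert close(active, matched + explained + unexplained), name

# Column sums must reproduce the "Total" row.
column_sums = [sum(column) for column in zip(*rows.values())]
assert all(close(s, r) for s, r in zip(column_sums, reported_total))

# Share of active records left unexplained (denominator is an assumption).
unexplained_share = reported_total[5] / reported_total[2]
print(f"Unexplained share of active records: {unexplained_share:.1%}")
```

All checks pass on the published values, and the unexplained residuals amount to about 3.2% of the active records, which is consistent with the observation that the percentage is fairly low.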

3 Conclusions

The major contribution of the paper is a solution for improving G2B relationships, which has been adopted in a real project and implemented in the context of an e-Government initiative. The idea of putting specific agencies in charge of the quality of specific information, and then having them propagate that information through a P&S system, can be generalized and adopted in similar contexts.
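To make the general pattern concrete, the following minimal Python sketch shows a front-end agency certifying incoming information and then propagating it to subscribed agencies. It is an in-process illustration under our own assumptions, not the project's event notification service: the class `NotificationService`, the functions `certify` and `front_end_submit`, the topic name, and the event fields are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Event = Dict[str, str]
Handler = Callable[[Event], None]


class NotificationService:
    """Minimal in-process publish/subscribe broker (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register an agency's handler for one kind of event."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Event) -> int:
        """Deliver the event to every subscriber; return how many were notified."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(dict(event))  # each subscriber receives its own copy
        return len(handlers)


def certify(event: Event) -> bool:
    """Hypothetical quality check performed by the front-end agency."""
    return bool(event.get("business_id")) and bool(event.get("address"))


def front_end_submit(service: NotificationService, topic: str, event: Event) -> bool:
    """Single front-end transaction: certify first, then propagate."""
    if not certify(event):
        return False  # rejected data is never propagated
    service.publish(topic, event)
    return True


service = NotificationService()
inps_inbox: List[Event] = []
inail_inbox: List[Event] = []
service.subscribe("address_change", inps_inbox.append)
service.subscribe("address_change", inail_inbox.append)

accepted = front_end_submit(
    service, "address_change", {"business_id": "B-001", "address": "Via Roma 1"}
)
rejected = front_end_submit(service, "address_change", {"business_id": "B-002"})
```

In this sketch the business interacts with one agency only, and uncertified data never reaches the subscribers, which mirrors the idea of making a specific agency responsible for the quality of the information it propagates.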