DEVELOPMENT OF A MODEL FOR IMPLEMENTING RISK MANAGEMENT DATA WAREHOUSE FOR SCANNING COMPANIES IN NIGERIA BY WILSON NWANKWO (STU38538)
SUBMITTED TO THE SCHOOL OF SCIENCE AND TECHNOLOGY ANGLIA RUSKIN UNIVERSITY CHELMSFORD IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE AWARD OF BACHELOR OF SCIENCE IN BUSINESS COMPUTING
3809 words
Abstract
Risk management in the area of imports is a critical element in developing countries whose economies depend greatly on imports. As a result, scanning and risk management providers in Nigeria have been engaged over the years to assist the local customs service in detecting and managing risks associated with imports. This project was conceived to address the analytical and data consolidation problems arising from the limited automation usually witnessed at the operational ports where cargo scanning and risk management activities are prevalent. It studies the scanning processes at two popular ports in Lagos, Nigeria, and develops a model for implementing a risk management data warehouse that could be used to consolidate the data from legacy systems at the ports into a central data warehouse. When implemented, the data warehouse would provide the needed solution for intelligence reporting and analysis for decision-making.
Key words: Data warehousing, cargo scanning, risk management system, Analytics
Table of contents
Title page
Abstract
Table of contents
1. Introduction
1.1 Problem definition
1.2 Objectives
1.3 Scope of the project
1.4 Significance of the project
1.5 Assumptions
1.6 Risk and mitigation strategies
2. Background
2.1 Data warehouses
2.1.1 Benefits of data warehousing
2.1.2 Decision support systems
2.1.3 Developing a data warehouse
2.1.4 Alternatives to the data warehouse
2.1.5 Conclusion
2.2 Cargo scanning in Nigeria
3. Methodology
3.1 Introduction
3.2 Materials and Methods
3.2.1 Site visits/Documentation
3.2.2 Development tools
3.3 Analysis of the present system
3.3.1 Use case analysis
3.3.2 Domain analysis
3.4 Problems in the existing system
3.5 Model Validation
4. Design
4.1 Data warehouse design
4.1.1 Architecture
4.1.2 Conceptual model design
4.1.3 Logical design
4.1.4 Physical design
4.1.5 Extract-Transform-Load (ETL) design
4.2 Data quality assessment
5. Implementation
5.1 Database requirements
5.2 Data warehouse implementation
5.2.1 Loading the Staging and the data warehouse tables
5.3 Data access
5.4 Security
6. Conclusion
6.1 Conclusion

LIST OF FIGURES
1: Thierauf's model of data warehousing
2: Business process model of the system
3: Use case diagram representing the scanning process
4: The Conceptual model
5: The Class specification of the system
6: The design traceability matrix
7: The Scanning system schema
8: Dimensions and fact in the schema
9: Logical model of the Data warehouse
10: Physical schema of the data warehouse
11: Physical schema of the staging database
12: Physical design of the data warehouse
13: Extract-Transform-Load process design
14: Single-phase table ETL logic
LIST OF ABBREVIATIONS
AWB: AIR WAY BILL
BI: BUSINESS INTELLIGENCE
BOL: BILL OF LADING
CBN: CENTRAL BANK OF NIGERIA
CCVO: COMBINED CERTIFICATE OF VALUE AND ORIGIN
DB: DATABASE
DWH: DATA WAREHOUSE
ETL: EXTRACT TRANSFORM AND LOAD
FK: FOREIGN KEY
INV: INVOICE
MOF: MINISTRY OF FINANCE
NAFDAC: NATIONAL AGENCY FOR FOOD AND DRUG ADMINISTRATION AND CONTROL
NCS: NIGERIA CUSTOMS SERVICE
NESREA: NATIONAL ENVIRONMENTAL STANDARDS AND REGULATIONS ENFORCEMENT AGENCY
NPF: NIGERIA POLICE FORCE
PI: PROFORMA INVOICE
PK: PRIMARY KEY
PL: PACKING LIST
RAM: RISK ASSESSMENT MODEL
SCD: SLOWLY CHANGING DIMENSION
SGD: SINGLE GOODS DECLARATION
SON: STANDARDS ORGANIZATION OF NIGERIA
SP: SERVICE PROVIDER
OOADM: OBJECT-ORIENTED ANALYSIS AND DESIGN METHODOLOGY
PDI: PENTAHO DATA INTEGRATION
RMDWH: RISK MANAGEMENT DATA WAREHOUSE
WB: WAY BILL
REFERENCES APPENDICES
A MODEL FOR IMPLEMENTING RISK MANAGEMENT DATA WAREHOUSE FOR SCANNING COMPANIES IN NIGERIA
1. Introduction
Business process analytics provides process participants, decision makers, and related stakeholders with insight about the efficiency and effectiveness of organizational processes (zur Muehlen and Shapiro, 2009). Data warehousing provides a revolutionary means for the analysis and monitoring of critical processes in risk management, as in any other business area, and as a result constitutes a key component of modern enterprise business intelligence and analytic systems.
1.1 Problem definition
Scanning companies in Nigeria depend on the large volume of data generated at the various ports for intelligent decision making. Most data at scanning sites are disparate; as a result, filtering out the relevant details is a recurring problem whenever the supervising authorities require periodic intelligence reports. Data warehousing may be a cost-effective approach to establishing a centralized data infrastructure that would support the integration of data from the legacy/single-user programs used at the scanning sites, thereby providing a platform for information distribution and decision support.
1.2 Objectives
The objectives of this project are:
i. To study the processes in a scanning/risk management company in order to identify the challenges posed by data integration;
ii. To identify the sources of data, the relevant reports needed and the decision support requirements, and to establish whether the existing processes provide the solution needed for data analysis;
iii. To present a model for implementing a risk management data warehouse in scanning companies.
1.3 Scope of the Project
As high capacity cargo scanning is a major risk detection activity undertaken by the various scanning companies appointed by the government in Nigeria, this project will study the challenges posed in harnessing the large volume of data generated from the scanning process for use in decision support, with a view to designing a data warehouse model that, when fully implemented, will be suitable for eliminating these challenges.
1.4 Significance of the project
This project is important in that it will present a model for harmonizing the data generated from the legacy systems during scanning operations into a central repository (data warehouse) that would serve as a source for data analysts and decision makers. For organizations involved in trade facilitation and risk management, having a single data stream from which risk analysis and reporting are drawn would reduce the time taken to gather data for analysis and the cost/resources expended in decision support processes.
1.5 Assumptions
We assume that the scanning companies have a sufficient technology platform to drive the implementation of a data warehouse.
1.6 Risk and Mitigation strategies
The anticipated risks and mitigation strategies are presented in Appendix A.
2. Background
2.1 Data Warehouses (DWH)
Data warehouses contain information ranging from measurements of performance to competitive intelligence (Tanler, 1997). Thierauf (1999) describes the process of data warehousing as consisting of the extraction of operational production data and the loading of the extracted data to the warehouse database, located on a server that also hosts a decision support application. Users can then extract useful data from the data warehouse through some form of software. Thierauf's model is shown in figure 1.
Figure 1: Thierauf's model of data warehousing
Hoffer et al. (2005) define a data warehouse as an integrated decision support database whose content is derived from various operational databases.
2.1.1 Benefits of data warehousing
The benefits of data warehousing as stated by Senn (2004) are:
i. Access to a large amount of information for solving problems like forecasting, process control, planning, etc.;
ii. Data quality and consistency;
iii. Maintenance of a large pool of data from different locations and sources;
iv. Enhanced business and historical intelligence;
v. Data interchange between operational systems;
vi. Potentially huge returns on investment;
vii. Competitive advantage;
viii. Increased productivity of corporate decision-makers.
Petrenko et al. (2012) have emphasized that a good data warehouse design is the key to maximizing and speeding the return on investment from a data warehouse implementation, and that a good design leads to a data warehouse that is scalable, balanced, and flexible enough to meet existing and future needs.
2.1.2 Decision support systems
Rob et al. (2012) see a decision support system (DSS) as an arrangement of computerized tools used to assist managerial decision making within a business. They describe a data warehouse as a read-only database containing integrated and transformed data extracted from other sources and optimized for data analysis and query processing; in their opinion, the data warehouse provides support for the DSS. Users access the data warehouse through front-end tools and/or end-user application software.
2.1.3 Developing a data warehouse
As Petrenko et al. put it, the development of a data warehouse can be summarized in two major stages, namely:
i. Designing the logical data model that defines the various logical entities and the relationships between them;
ii. Designing the physical data model that will speed up the performance of various database activities, balance data across multiple database partitions in a clustered warehouse environment, and provide for fast data recovery.
2.1.4 Alternatives to the data warehouse
Many experts argue that the emergence of cloud computing and big data technologies is likely to displace data warehousing systems. Some argue that a free and open source distributed computing technology like Hadoop can handle the job of a data warehouse at a fraction of the time and cost. However, Scott Gnau, President of Teradata Labs, a global leader in analytic data solutions with a focus on integrated data warehousing, big data analytics, and business applications, argues otherwise. According to Gnau, the basis of the argument has been that data warehouses are too rigid and inflexible (Henschen, 2012). He submitted that the rigid schemas attributed to enterprise data warehouses stem from rigid IT policies and sometimes inadequate data warehouse architecture. His emphasis remains that the major factor is what the business or the situation needs and not what the data warehouse can do.
2.1.5 Conclusion
The data warehouse remains invaluable in supporting decision making at both tactical and strategic management levels. However, implementing a data warehouse is a daunting task, and this accounts for our approach in this project, that is, developing a realistic model that would shorten the usually lengthy period required for designing and implementing a data warehouse.
2.2 Cargo scanning in Nigeria
In Nigeria, high capacity cargo scanning was introduced in 2006 under the destination inspection scheme. Three scanning service providers (SPs), namely SGS, COTECNA and Global Scan, were contracted by the Federal government to provide scanning/risk management services to augment the services of the Nigeria Customs Service (NCS). While the contract lasted, each SP conducted non-intrusive cargo scanning at its assigned ports with the goal of managing risks associated with imported goods. At the end of the contract in late 2013, the NCS took over the scanning infrastructure and appointed one of the scanning companies as the sole SP to man all the scanning activities at the ports.
3. Methodology
3.1 Introduction
A methodology is a framework used to structure, plan, and control the process of developing an information system, and consists of the steps, methods, techniques and procedures that govern the collection, analysis and design of a particular project (Eze, 2008). The common methods employed to develop data warehouses include top-down, bottom-up, agile, and object-based methods. The object-based method was employed in this project due to its support for formal analysis, modeling of dynamic complex systems, and rapid development and maintenance.
3.2 Materials and Methods
The methods and materials used in this project are described in the following subsections.
3.2.1 Site Visits/Documentation
The scanning processes at two key ports in Lagos (Apapa and Tincan) were carefully observed and documented. The following areas were covered:
i. The scanning process;
ii. Databases (if any) and other programs used;
iii. The outputs generated by the scanners;
iv. Challenges posed by data storage and reporting.
3.2.2 Development tools
The tools used for analysis and development are: trial versions of Erwin Data Modeler and Microsoft Visual Studio, Pentaho Data Integration (Kettle), and MySQL database server. These tools are free or available as trial versions, which reduces development costs.
3.3 Analysis of the present system
To detect whether or not problems exist in the present system, we specified the following:
i. The actors (who play a role in the system);
ii. The business process model (what happens in the system/what is going to happen in the new system) using an activity diagram;
iii. External users (individuals/organizations outside the logical boundary of the business area) who also use the system;
iv. Use cases (what the participants are doing in the system/what the users will do with the new system);
v. The interaction among two or more classes or objects using sequence/collaboration diagrams;
vi. Classes of objects/entities, their attributes, relationships and methods using class diagrams.
Actors: check-in/verification officers, port operators, image analysts, radiation safety officers, scanning/maintenance officers, importers, clearing agents, documentation officers, and report officers.
External users: these include agencies such as NCS, MOF, CBN, NAFDAC, SON and NESREA.
Business process model: this is illustrated in the activity diagram in figure 2.
Figure 2: Business process model of the system
The process model is summarized as follows:
i. The importer/clearing agent presents import documents to the check-in/verification officer;
ii. The check-in officer reviews the documents and approves or disapproves them;
iii. Where approval is given, the cargo is scheduled for scanning;
iv. The cargo is scanned, and a report is generated and processed as spreadsheets by image analysts;
v. Report officers use the scanning data to prepare weekly, monthly, quarterly and on-demand intelligence reports for government agencies such as MOF, NCS, etc.
3.3.1 Use case analysis
A use case is a piece of functionality that users need from the system. In object-based analysis, use cases are also used to depict the requirements analysis process. The functionalities defined by use cases are represented using a use case diagram. Figure 3 is the use case diagram representing all the use cases associated with the present system. The interrelationships between the use cases are also established. Because scanning is a risk management tool, the results derived from the cargo scanning process are examined by analysts, who produce intelligence reports over a period based on parameters such as severity of risks detected, risk-free scans, value of SGD, port of discharge, etc.
Figure 3: Use case diagram representing the scanning process
3.3.2 Domain analysis
In this phase, we identified and defined the objects/concepts inherently present in the use cases using a conceptual model. The model describes what data/information is (or would be) managed by the scanning process chain, and what data flow between users and the system. We used class diagrams and the Unified Modeling Language for the conceptual modeling. The data represented by a class is broken into concepts and associations. A concept (the condensed form of an object/class) is the representation of complex information that has a coherent meaning in the scanning business domain. Concepts aggregate attributes and may be associated with each other. The concepts identified in the existing system are presented in the model in figure 4. The conceptual model consists of condensed classes with associated relationships. The arrows show the relationships: a dotted arrow shows a dependency relationship, a solid arrow shows an association, and an arrow with a triangular pointer shows inheritance (for instance, the Analyst class inherits from the Person class).
Figure 4: The Conceptual model
Identifying classes and their relationships is a necessary step prior to implementing the requirements of any system. The object schema represented by the class diagram in figure 5 shows the various classes that form the foundation of the new system. Each object/class has static (attributes) and dynamic (behaviour/methods) characteristics. We are concerned with the attributes alone, as shown in the model in figure 5. In the model, the primary key (PK) and foreign key (FK) attributes have been identified.
Figure 5: The Class specification of the system
3.4 Problems in the existing system
Our findings from the visits to the cargo scanning sites and from documentation revealed the following challenges in the current system:
i. Substantive risk management data for analytic purposes are synthesized from spreadsheet files generated during scanning operations; this process is prone to mistakes and discrepancies due to the large volume of data;
ii. Report generation from scanning data files requires a lot of filtering and sorting by analysts;
iii. Generating intelligence reports is often slow and inefficient, since the related import documents and the various scanning data files within a period must be retrieved.
These challenges were further validated with the use case analysis discussed above.
3.5 Model Validation
We employed forward tracing in validating the design specifications. The user requirements consist of pre-defined activities, which were reflected in figure 3 above. In tracing, we compared the user requirements to the design specifications using the traceability matrix shown in figure 6. An 'X' in the validation status column indicates that the requirement has been validated. Verification is a post-implementation process, carried out after the model is implemented, to test whether the system has achieved all user requirements.
Figure 6: The design traceability matrix
4. Design
Object-based design has two objectives:
i. To design the various objects/classes identified during the analysis stage;
ii. To design the human interfaces that will be used to capture/process/retrieve data.
The design of the data warehouse may not include all the considerations employed during the object-oriented design of a typical software system.
4.1 Data warehouse design
4.1.1 Architecture
The architecture consists of the totality of all components that make up the data warehouse. The warehouse synthesizes data from data sources, particularly spreadsheet and database files, whenever a user/analyst submits requests.
4.1.2 Conceptual model design
What is important in data warehouse design is not all the data in a transactional data store, but those elements that are the essential drivers of the decision-making process. These essential drivers are the grains, which are grouped into a single package called the 'scanning schema'. The schema is divided into two parts: fact and dimension classes. A fact represents measures and context data. A dimension is a set of data that describes one business dimension. Dimensions determine the contextual background for the facts. Both parts are derived from the classes in figure 5. Figure 7 shows the conceptual star schema.
Figure 7: The Scanning system schema
The package (schema) is defined using association relationships between the fact and the dimensions. The diagram in figure 8 shows the schema represented with fact and dimension stereotypes.
Figure 8: Dimensions and fact in the schema
4.1.3 Logical design
The logical schema is an extension of the conceptual schema represented earlier using a package stereotype. We established the attributes of each class, and the relationships (using association) between the fact class and the dimension classes. Thus, the fact class is associated with the dimension classes. Each class (fact/dimension) is identified by a unique object identifier {OID} attribute. Figure 9 shows the logical data design of the data warehouse. The data dictionary of the classes is provided in Appendix B.
Figure 9: Logical model of the Data warehouse
4.1.4 Physical design
The most important activity in this phase is the conversion of the logical design into a physical model by using database system structures such as tables, tablespaces, etc. The various classes in the logical model in figure 9 are mapped to tables, relationships to foreign key constraints, attributes to fields, and object identifiers to primary key constraints. Further denormalization is also necessary, since the emphasis in a data warehouse is on query performance rather than storage optimization. This is shown in figure 10. The data type, field length, and primary and foreign key constraints were all defined. Because some attributes of one or more dimension tables may be updated in the course of business, we used slowly changing dimensions (SCD). Though many types of SCD exist, we identified only three that are relevant to this project, namely:
i. SCD Type 1, in which the old data in the record is overwritten with new data during updates;
ii. SCD Type 2, which creates an additional record with the new data at the time of change and can track the history of changes to the data, but requires a generalized key to record all the iterations of the original record;
iii. SCD Type 3, which creates new fields in the record for the new data and the time of the change, and tracks original and current values only, losing intermediate values.
SCD Type 2 is widely used in this project. Thus, for the five dimension tables (DIM_IMPORTER, DIM_SCAN_RESULT, DIM_SHIPPING_DOC, DIM_SHIPMENT and DIM_SGD) we added the fields VERSION, DATE_TO and DATE_FROM to enable the tracking and storing of modifications to records in the tables.
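As an illustration only, the Type 2 behaviour described above can be sketched in Python against an in-memory list standing in for DIM_IMPORTER. The column names come from figure 10; the function name `apply_scd2`, the choice of tracked columns, and the convention that the current version of a record has DATE_TO left empty are assumptions made for this sketch, not details taken from the project.

```python
from datetime import date

# Columns whose change triggers a new version (an assumed subset of DIM_IMPORTER).
TRACKED = ("IMPORTER_NAME", "ADDRESS", "CITY")


def apply_scd2(dim_rows, incoming, change_date):
    """Apply one source record to a Type 2 dimension held as a list of dicts."""
    # The current version is assumed to be the row whose DATE_TO is still empty.
    current = next(
        (r for r in dim_rows
         if r["IMPORTER_ID"] == incoming["IMPORTER_ID"] and r["DATE_TO"] is None),
        None,
    )
    if current is not None and all(current[c] == incoming[c] for c in TRACKED):
        return dim_rows  # nothing changed, so no new version is needed

    version = 1
    if current is not None:
        current["DATE_TO"] = change_date  # close the old version
        version = current["VERSION"] + 1

    new_row = {
        "IMPORTER_DIM_ID": len(dim_rows) + 1,  # surrogate key, one per version
        "IMPORTER_ID": incoming["IMPORTER_ID"],  # business key, shared by versions
        "VERSION": version,
        "DATE_FROM": change_date,
        "DATE_TO": None,
    }
    new_row.update({c: incoming[c] for c in TRACKED})
    dim_rows.append(new_row)
    return dim_rows
```

Calling the function twice with an unchanged importer leaves one row; a later call with a new ADDRESS closes the first row (DATE_TO set) and appends a second row with VERSION 2, so the full history remains queryable.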
DIM_SHIPPING_DOC
  SHIP_DIM_ID: INTEGER IDENTITY
  SHIP_DOC_NO: VARCHAR(20) NOT NULL
  FLIGHT_VOYAGE_NO: VARCHAR(30) NULL
  IMPORTER_NAME: VARCHAR(100) NULL
  SHIPPING_DATE: DATE NULL
  CARRIER_NAME: VARCHAR(70) NULL
  CUSTOMS_CODE: VARCHAR(4) NULL
  DISCHARGE_PORT: VARCHAR(15) NULL
  FORM_M_NO: VARCHAR(17) NULL
  LOAD_PORT: VARCHAR(30) NULL
  PARTIAL_NO: CHAR(3) NULL
  INVOICE_NO: VARCHAR(30) NOT NULL
  TRANSPORT_MODE: VARCHAR(7) NULL
  DATE_TO: DATE NULL
  DATE_FROM: DATE NULL
  VERSION: INTEGER NULL

DIM_IMPORTER
  IMPORTER_DIM_ID: INT IDENTITY
  IMPORTER_ID: INT NOT NULL
  IMPORTER_NAME: VARCHAR(100) NOT NULL
  ADDRESS: VARCHAR(100) NOT NULL
  CITY: VARCHAR(50) NOT NULL
  TELEPHONE: VARCHAR(20) NULL
  POSTCODE: VARCHAR(10) NULL
  EMAIL: VARCHAR(50) NULL
  CONTACT_PERSON: VARCHAR(50) NULL
  STATE: VARCHAR(30) NULL
  DATE_FROM: DATE NULL
  DATE_TO: DATE NULL
  VERSION: INTEGER NOT NULL

FACT_SCAN
  FACT_SCAN_ID: INTEGER IDENTITY
  SHIP_DIM_ID: INTEGER NULL (FK)
  SCAN_DIM_ID: INTEGER NULL (FK)
  SHIPMENT_DIM_ID: INTEGER NULL (FK)
  CFR: DECIMAL(10,2) NOT NULL
  CIF: DECIMAL(10,2) NOT NULL
  NUM_CONTAINERS: INTEGER NULL
  PORT: VARCHAR(20) NOT NULL
  RISK_LEVEL: VARCHAR(5) NOT NULL
  IMPORTER_DIM_ID: INT NULL (FK)
  DATE_ID: INT NULL (FK)
  SGD_DIM_ID: INT NULL (FK)

DIM_DATE
  DATE_ID: INT IDENTITY
  CAL_DATE: DATE NULL
  CAL_DAY: INT NOT NULL
  CAL_MONTH: INT NOT NULL
  CAL_YEAR: INT NOT NULL
  CAL_WEEK: INTEGER NULL
  LEAP_YEAR: INTEGER NOT NULL
  MONTH_NAME: VARCHAR(20) NULL
  QUARTER: INT NULL
  QUARTER_NAME: VARCHAR(3) NULL

DIM_SCAN_RESULT
  SCAN_DIM_ID: INTEGER IDENTITY
  SCAN_SEQ_NO: VARCHAR(17) NOT NULL
  SCAN_DATE: DATE NOT NULL
  CONTAINER_NO: VARCHAR(100) NULL
  CONTAINER_SIZE: VARCHAR(20) NULL
  CUSTOMS_OFFICE: VARCHAR(4) NOT NULL
  IMPORTER_NAME: VARCHAR(100) NOT NULL
  PAR_NUMBER: VARCHAR(17) NOT NULL
  SGD_REG_NO: VARCHAR(17) NOT NULL
  RISK_DETAILS: VARCHAR(200) NULL
  RISK_FOUND: VARCHAR(2) NULL
  SCAN_COMMENTS: TEXT NULL
  SEVERITY: CHAR(1) NULL
  SHIPMENT_TYPE: VARCHAR(20) NULL
  ANALYST_ID: VARCHAR(50) NOT NULL
  DATE_FROM: DATE NULL
  DATE_TO: DATE NULL
  VERSION: INTEGER NULL

DIM_SHIPMENT
  SHIPMENT_DIM_ID: INTEGER IDENTITY
  INVOICE_NO: VARCHAR(40) NOT NULL
  PARTIAL_NO: VARCHAR(5) NOT NULL
  COUNTRY_OF_ORIGIN: VARCHAR(40) NOT NULL
  COUNTRY_OF_SUPPLY: VARCHAR(40) NOT NULL
  EXPORTER_NAME: VARCHAR(100) NOT NULL
  IMPORTER_NAME: VARCHAR(100) NOT NULL
  FOB_VALUE: DECIMAL(10,2) NOT NULL
  INSURANCE: DECIMAL(10,2) NOT NULL
  FREIGHT: DECIMAL(10,2) NOT NULL
  GOODS_DESCRIPTION: VARCHAR(2000) NULL
  GROSS_WEIGHT: VARCHAR(15) NULL
  INVOICE_DATE: DATE NOT NULL
  DATE_TO: DATE NULL
  DATE_FROM: DATE NULL
  VERSION: INTEGER NULL

DIM_SGD
  SGD_DIM_ID: INT IDENTITY
  SGD_REG_NO: VARCHAR(17) NOT NULL
  SGD_REG_DATE: DATE NOT NULL
  FORM_M_NO: VARCHAR(17) NULL
  FORM_M_YEAR: INT NULL
  PAR_NUMBER: VARCHAR(17) NULL
  PAR_ISSUE_DATE: DATE NULL
  SHIP_DOC_NO: VARCHAR(20) NOT NULL
  PARTIAL_NO: VARCHAR(3) NOT NULL
  IMPORTER_NAME: VARCHAR(100) NOT NULL
  DATE_TO: DATE NULL
  DATE_FROM: DATE NULL
  VERSION: INTEGER NOT NULL
Figure 10: Physical schema of the data warehouse
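To indicate how a schema of this kind might serve the periodic intelligence reports, the following sketch builds a reduced FACT_SCAN and DIM_DATE and summarizes scans by month, port and risk level. It is a sketch under stated assumptions: Python's built-in SQLite module stands in for the MySQL server used in the project, only a few columns from figure 10 are kept, and the sample rows, port names and RISK_LEVEL values are invented for illustration.

```python
import sqlite3


def build_demo_warehouse():
    """Create a reduced star schema in memory and load a few invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE DIM_DATE (
            DATE_ID   INTEGER PRIMARY KEY,
            CAL_DATE  DATE,
            CAL_MONTH INT NOT NULL,
            CAL_YEAR  INT NOT NULL
        );
        CREATE TABLE FACT_SCAN (
            FACT_SCAN_ID INTEGER PRIMARY KEY,
            DATE_ID      INT REFERENCES DIM_DATE (DATE_ID),
            CIF          DECIMAL(10,2) NOT NULL,
            PORT         VARCHAR(20) NOT NULL,
            RISK_LEVEL   VARCHAR(5) NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO DIM_DATE VALUES (?, ?, ?, ?)",
        [(1, "2014-01-15", 1, 2014), (2, "2014-02-03", 2, 2014)],
    )
    conn.executemany(
        "INSERT INTO FACT_SCAN (DATE_ID, CIF, PORT, RISK_LEVEL) VALUES (?, ?, ?, ?)",
        [
            (1, 1000.5, "APAPA", "HIGH"),
            (1, 2000.25, "APAPA", "HIGH"),
            (1, 500.0, "TINCAN", "LOW"),
            (2, 750.0, "APAPA", "LOW"),
        ],
    )
    conn.commit()
    return conn


def monthly_risk_report(conn, year):
    """Join the fact table to the date dimension and aggregate per month."""
    return conn.execute(
        """
        SELECT d.CAL_MONTH, f.PORT, f.RISK_LEVEL,
               COUNT(*) AS SCANS, SUM(f.CIF) AS TOTAL_CIF
        FROM FACT_SCAN AS f
        JOIN DIM_DATE AS d ON d.DATE_ID = f.DATE_ID
        WHERE d.CAL_YEAR = ?
        GROUP BY d.CAL_MONTH, f.PORT, f.RISK_LEVEL
        ORDER BY d.CAL_MONTH, f.PORT, f.RISK_LEVEL
        """,
        (year,),
    ).fetchall()
```

The point of the design is visible in the query: the analyst filters and groups on dimension attributes (year, month) while the measures (scan count, CIF value) come from the single fact table, so no spreadsheet filtering or sorting is required.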
Figure 11 shows the physical schema of the staging database. The staging database contains data extracted from the flat files. We employed the component diagram in figure 12 to show the overall physical design of the data warehouse. The flat files, stored in a designated network directory, are fed to the ETL program, which loads them into the scanning staging database tables. The ETL program also loads the data warehouse tables. The fact and dimension tables are stored in different tablespaces for optimal performance.
IMPORTER
  IMPORTER_ID: INT NOT NULL
  IMPORTER_NAME: VARCHAR(100) NOT NULL
  ADDRESS: VARCHAR(100) NOT NULL
  CITY: VARCHAR(50) NOT NULL
  TELEPHONE: VARCHAR(20) NULL
  POSTCODE: VARCHAR(10) NULL
  EMAIL: VARCHAR(50) NULL
  CONTACT_PERSON: VARCHAR(50) NULL
  STATE: VARCHAR(30) NULL
  LAST_UPDATED: DATE NULL

SHIPPING_DOC
  SHIP_DOC_NO: VARCHAR(30) NOT NULL
  FLIGHT_VOYAGE_NO: VARCHAR(30) NOT NULL
  IMPORTER_NAME: VARCHAR(100) NOT NULL (FK)
  SHIPPING_DATE: DATE NOT NULL
  CARRIER_NAME: VARCHAR(100) NOT NULL
  CUSTOMS_CODE: VARCHAR(4) NOT NULL
  DISCHARGE_PORT: VARCHAR(15) NOT NULL
  FORM_M_NO: VARCHAR(17) NOT NULL
  LOAD_PORT: VARCHAR(30) NOT NULL
  INVOICE_NO: VARCHAR(70) NOT NULL (FK)
  TRANSPORT_MODE: VARCHAR(7) NOT NULL
  GROSSWEIGHT: VARCHAR(15) NULL
  LAST_UPDATED: DATE NULL

SHIPMENT
  INVOICE_NO: VARCHAR(20) NOT NULL
  COUNTRY_OF_ORIGIN: VARCHAR(40) NOT NULL
  COUNTRY_OF_SUPPLY: VARCHAR(40) NOT NULL
  FOB_VALUE: DECIMAL(10,2) NOT NULL
  INSURANCE: DECIMAL(10,2) NOT NULL
  FREIGHT: DECIMAL(10,2) NOT NULL
  GOODS_DESCRIPTION: VARCHAR(2000) NULL
  INVOICE_DATE: DATE NOT NULL
  PARTIAL_NO: VARCHAR(5) NOT NULL
  IMPORTER_NAME: VARCHAR(100) NOT NULL
  EXPORTER_NAME: VARCHAR(100) NOT NULL (FK)
  LAST_UPDATED: DATE NULL

SCAN_RESULT
  SCAN_SEQ_NO: VARCHAR(17) NOT NULL
  SCAN_DATE: DATE NOT NULL
  CONTAINER_NO: VARCHAR(40) NULL
  CONTAINER_SIZE: VARCHAR(20) NULL
  CUSTOMS_OFFICE: VARCHAR(4) NOT NULL
  PAR_NUMBER: VARCHAR(17) NOT NULL
  SGD_REG_NO: VARCHAR(17) NOT NULL (FK)
  RISK_DETAILS: VARCHAR(200) NULL
  RISK_FOUND: CHAR(1) NOT NULL
  SCAN_COMMENTS: TEXT NULL
  SEVERITY: CHAR(1) NOT NULL
  SHIPMENT_TYPE: VARCHAR(20) NOT NULL
  IMPORTER_ID: INTEGER NOT NULL
  ANALYST_ID: INTEGER NULL (FK)
  LAST_UPDATED: DATE NULL

SGD
  SGD_REG_NO: VARCHAR(17) NOT NULL
  SGD_REG_DATE: DATE NOT NULL
  FORM_M_NO: VARCHAR(17) NOT NULL
  FORM_M_YEAR: INTEGER NOT NULL
  PAR_NUMBER: VARCHAR(17) NOT NULL
  PAR_ISSUE_DATE: DATE NOT NULL
  SHIP_DOC_NO: VARCHAR(20) NOT NULL
  PARTIAL_NO: VARCHAR(5) NOT NULL
  IMPORTER_NAME: VARCHAR(100) NOT NULL
  LAST_UPDATE: DATE NULL

EXPORTER
  EXPORTER_ID: INTEGER NOT NULL
  EXPORTER_NAME: VARCHAR(100) NOT NULL
  ADDRESS: VARCHAR(100) NOT NULL
  CITY: VARCHAR(50) NULL
  TELEPHONE: VARCHAR(20) NULL
  POSTCODE: VARCHAR(10) NULL
  EMAIL: VARCHAR(50) NULL
  CONTACT_PERSON: VARCHAR(50) NULL
  STATE: VARCHAR(30) NULL
  COUNTRY: VARCHAR(40) NOT NULL
  LAST_UPDATED: DATE NULL

ANALYST
  ANALYST_ID: INTEGER NOT NULL
  PERSON_ID: INTEGER NULL
  FIRSTNAME: VARCHAR(30) NOT NULL
  LASTNAME: VARCHAR(30) NOT NULL
  OTHERNAMES: VARCHAR(30) NULL
  GENDER: CHAR(1) NOT NULL
  DATE_OF_BIRTH: DATE NULL
  MARITAL_STATUS: VARCHAR(20) NULL
  EMAIL: VARCHAR(50) NULL
  TELEPHONE: VARCHAR(20) NULL
  ADDRESS: VARCHAR(100) NULL
  CITY: VARCHAR(20) NULL
  STATE: VARCHAR(20) NULL
  ORGANIZATION: VARCHAR(100) NULL
  LAST_UPDATED: DATE NULL

AGENCY
  AGENCY_ID: INTEGER NOT NULL
  AGENCY_NAME: VARCHAR(100) NOT NULL
  ADDRESS: VARCHAR(50) NOT NULL
  CITY: VARCHAR(50) NOT NULL
  CONTACT_PERSON: VARCHAR(50) NULL
  EMAIL: VARCHAR(50) NOT NULL
  TELEPHONE: VARCHAR(20) NOT NULL

REPORT
  REPORT_ID: INTEGER NOT NULL
  AGENCY_ID: INTEGER NOT NULL (FK)
  REPORT_NAME: VARCHAR(20) NOT NULL
  CODE: VARCHAR(20) NULL
  REQUEST_DATE: DATE NOT NULL
  ISSUED_DATE: DATE NOT NULL
  ANALYST_ID: INTEGER NULL (FK)
Figure 11: Physical schema of the staging database
Figure 12: Physical design of the data warehouse
4.1.5 Extract-Transform-Load (ETL) design
The ETL process is responsible for extracting data from the sources, transforming it into a consistent form, and loading it into the data warehouse. The process is composed of six tasks:
i. Selecting the sources for extraction;
ii. Transforming the sources, in which the data is transformed into new data by filtering data, converting codes, performing table lookups, calculating derived values, transforming between different data formats, automatically generating sequence numbers (surrogate keys), etc.;
iii. Joining the data sources;
iv. Selecting the target to load;
v. Mapping source attributes to target attributes;
vi. Loading the data from the sources.
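The transformation task in particular can be made concrete with a small sketch. The Python below is illustrative only and does not reproduce the Pentaho transformation built in the project: the function name `build_fact_rows`, the severity-to-risk-level code table and the shape of the staged rows are hypothetical, and the derivation of CFR (FOB plus freight) and CIF (FOB plus insurance plus freight) follows the usual trade definitions rather than anything stated in the text.

```python
from decimal import Decimal
from itertools import count

# Hypothetical code conversion from a one-letter severity to a RISK_LEVEL value.
RISK_LEVELS = {"H": "HIGH", "M": "MED", "L": "LOW"}


def build_fact_rows(staged_rows, importer_keys, start_id=1):
    """Turn joined staging rows into FACT_SCAN-shaped rows.

    staged_rows: dicts holding monetary values as strings, e.g. "1000.00".
    importer_keys: lookup from importer name to IMPORTER_DIM_ID.
    """
    next_id = count(start_id)  # automatic sequence numbers (surrogate keys)
    facts = []
    for row in staged_rows:
        # Filtering: a row without an invoice number cannot be tied to a shipment.
        if not row.get("INVOICE_NO"):
            continue
        fob = Decimal(row["FOB_VALUE"])
        insurance = Decimal(row["INSURANCE"])
        freight = Decimal(row["FREIGHT"])
        facts.append({
            "FACT_SCAN_ID": next(next_id),
            # Table lookup: resolve the dimension key for the importer.
            "IMPORTER_DIM_ID": importer_keys.get(row["IMPORTER_NAME"]),
            # Derived values (assumed definitions, see the note above).
            "CFR": fob + freight,
            "CIF": fob + insurance + freight,
            # Code conversion and format clean-up.
            "RISK_LEVEL": RISK_LEVELS.get(row["SEVERITY"], "NONE"),
            "PORT": row["DISCHARGE_PORT"].strip().upper(),
        })
    return facts
```

Decimal arithmetic is used so that the derived monetary values match the DECIMAL(10,2) columns exactly instead of accumulating floating-point error.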
The ETL process runs periodically (via a script) to automatically extract, transform and load data into the data warehouse database. The extract-transform-load process is shown in figure 13. Data is extracted from the flat files (Microsoft Excel or compatible files) into a staging database (a temporary data store), from where it is ultimately loaded into the data warehouse tables.
[Figure 13 diagram labels: relational data sources; non-relational data sources; extraction function (adapters: OLEDB, ODBC, JDBC, BDE, GDS23); transformation function, system or user-defined (column mapping, aggregation, column splitting, relational operations, format conversion); source table; entity/attribute/field mapping; loading routine; load metadata; fact table; dimension table]
Figure 13: Extract-Transform-Load process design
4.2 Data quality assessment

Data quality assessment is a continuous process that starts during system documentation. The methods employed to ensure quality are:
i. Documenting the field/column types, fields with null values, counts, ranges, averages, column relationships with tables, and foreign keys;
ii. Basing the ETL rules on the relationships amongst the documented attributes;
iii. Checking data type matches and values in the ETL process during loading;
iv. Consolidating all data sources into a staging database to provide an intermediate data store where data consistency checks are done prior to and during the loading of data into the staging tables;
v. Performing integrity checks and column checks, such as data type matching and numeric and date value checks, while loading data into the dimension tables from the staging tables.
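The column checks described above could be sketched as follows. This is an illustrative example only: the `RULES` table, the ISO date format and the function names are assumptions, and in the project itself such checks would be configured in the ETL tool rather than hand-coded.

```python
from datetime import datetime

# Hypothetical rule set for a few SGD staging columns: (expected type, nullable).
RULES = {
    "SGD_REG_NO": ("string", False),
    "FORM_M_YEAR": ("integer", False),
    "SGD_REG_DATE": ("date", False),
    "LAST_UPDATE": ("date", True),
}


def _is_valid(value, kind):
    """Return True if the value matches the expected column type."""
    if kind == "string":
        return isinstance(value, str) and value.strip() != ""
    if kind == "integer":
        try:
            int(value)
            return True
        except (TypeError, ValueError):
            return False
    if kind == "date":
        try:
            datetime.strptime(value, "%Y-%m-%d")  # assumed date format
            return True
        except (TypeError, ValueError):
            return False
    raise ValueError(f"unknown rule type: {kind}")


def check_row(row, rules=RULES):
    """Return a list of human-readable problems found in one staging row."""
    problems = []
    for column, (kind, nullable) in rules.items():
        value = row.get(column)
        if value is None or value == "":
            if not nullable:
                problems.append(f"{column}: missing value")
        elif not _is_valid(value, kind):
            problems.append(f"{column}: not a valid {kind}")
    return problems
```

A row that returns an empty list would pass on to the dimension load; any other row could be logged and held back in staging for correction.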
5. Implementation

5.1 Database requirements

It is assumed that adequate hardware is available to drive the implementation. MySQL is the preferred database server for implementing this risk management data warehouse. It is an open-source platform and would therefore greatly reduce the cost of implementation.

5.2 Data warehouse implementation

The implementation is divided into six steps:
i. Configuration of a network directory in the local network where the source files (spreadsheet files) are located;
ii. Installation and configuration of MySQL database server 5.5 (or higher) and Pentaho Data Integration (used to design the extract-transform-load program);
iii. Creation of the ETL data transformation schema and associated jobs;
iv. Creation of the physical staging database tables into which data is extracted and loaded by the ETL program;
v. Creation of the data warehouse tables (dimensions/fact);
vi. Loading of the staging tables and data warehouse tables.
The database creation scripts are enclosed in Appendix C.

5.2.1 Loading the staging and the data warehouse tables

The loading process is illustrated in figure 14. The staging tables are loaded from the transaction file sources (spreadsheets, particularly Microsoft Excel workbooks comprising many worksheets). This project used the community edition of Pentaho Data Integration (PDI) and the MySQL database software for testing the model. Both tools are open source and do not require expensive hardware to run. The PDI software is used to develop an ETL model which maps each worksheet in the Excel workbook file (located on a network file system) to a specific table in the staging database. Transformation and loading to the dimension tables of the data warehouse are made from the staging tables using scripts generated by the ETL model.
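The worksheet-to-table mapping described above can be sketched in a few lines. This is not the PDI model itself: the workbook is represented by a plain dictionary rather than a real Excel file, sqlite3 stands in for the MySQL staging database, and the sheet names, sample rows and trimmed-down table definitions are all hypothetical.

```python
import sqlite3

# Hypothetical worksheet-to-staging-table mapping; the sheet names are assumed.
SHEET_TO_TABLE = {"Agencies": "AGENCY", "Reports": "REPORT"}

# Cut-down versions of two staging tables from Appendix C.
STAGING_DDL = """
CREATE TABLE AGENCY (AGENCY_ID INTEGER PRIMARY KEY, AGENCY_NAME TEXT NOT NULL);
CREATE TABLE REPORT (
    REPORT_ID INTEGER PRIMARY KEY,
    AGENCY_ID INTEGER NOT NULL REFERENCES AGENCY (AGENCY_ID),
    REPORT_NAME TEXT NOT NULL
);
"""


def load_staging(conn, workbook):
    """Load each worksheet (a list of dict rows) into its staging table."""
    loaded = {}
    for sheet, table in SHEET_TO_TABLE.items():
        rows = workbook.get(sheet, [])
        for row in rows:
            # Table and column names come from the fixed mapping and the
            # worksheet headers; the cell values are bound as parameters.
            columns = ", ".join(row)
            marks = ", ".join("?" for _ in row)
            conn.execute(
                f"INSERT INTO {table} ({columns}) VALUES ({marks})",
                tuple(row.values()),
            )
        loaded[table] = len(rows)
    conn.commit()
    return loaded


# Illustrative workbook content (not real port data).
workbook = {
    "Agencies": [
        {"AGENCY_ID": 1, "AGENCY_NAME": "Agency A"},
        {"AGENCY_ID": 2, "AGENCY_NAME": "Agency B"},
    ],
    "Reports": [
        {"REPORT_ID": 10, "AGENCY_ID": 1, "REPORT_NAME": "Monthly scans"},
    ],
}

conn = sqlite3.connect(":memory:")
conn.executescript(STAGING_DDL)
counts = load_staging(conn, workbook)
```

Parent sheets are listed before the sheets that reference them, so that the staging rows arrive in an order compatible with the foreign keys.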
Figure 14: Single-phase table ETL logic

5.3 Data access

There are many tools available for accessing a data warehouse; while some are sophisticated (Business Objects, Cognos, Crystal, etc.), many simple but effective programs are also available. Simple client tools like MySQL Workbench could be used to run queries against the data warehouse. Data warehouse and database administrators can write simple SQL scripts which can be used by data analysts. Sophisticated web-based interfaces could also be used to access the data warehouse.

5.4 Security

Every enterprise needs a security infrastructure that implements the three basic security functions, i.e. authentication, authorization and accounting, which ensure that every person on the network can access only the data that he/she is authorized to access. The security of an enterprise data warehouse is a top organizational objective. It may be viewed at three levels:
i. Organizational;
ii. Network;
iii. Database.
At the organizational level, management would need to review or create the requisite information security policies that would protect its information assets. At the network level, the corporate LAN/WAN should be secured using firewalls, VPNs and layer 7 gateways/proxies.
At the database level, user roles, system privileges and object privileges should be used to restrict user access to data as well as to control the SQL statements that users can execute. In addition, a virtual private database can be used to enforce security directly on tables, views and synonyms.
6. Conclusion

6.1 Conclusion

This report presents a concise model for establishing a risk management data warehouse for scanning companies in Nigeria. The project commenced with site visits to the ports in Lagos where cargo scanning of imports is largely carried out by scanning service providers and the NCS. The existing technologies and infrastructure were reviewed to identify practical solutions to the areas that pose challenges. Challenges in the areas of data consolidation, reporting and decision support were prevalent at the scanning sites (ports). At these ports, no databases were in use; the common tool for consolidating data was Microsoft Excel. Considering the volume of data in the system, the difficulties experienced by analysts in generating accurate data for analysis and reporting, and the various agencies that need the data, we designed a model for implementing a data warehouse that would support reporting and analysis. We have also discussed the steps and technologies that could be used to turn the model into a functional data warehouse, which would ultimately support efficient and effective data retrieval for analysis.
Appendices

A. RISK MANAGEMENT STRATEGY

Table A.1 presents the risks identified during the project and how they were managed.

Table A.1 Project risk management strategy

| # | Category of risk | Risk | Effect | Impact | Mitigation strategy |
|---|---|---|---|---|---|
| 1 | Design | Researcher fails to understand the business domain | Project fails to support full business goals/objectives; project failure | Very high | Devote more time to study and document the processes using alternate means |
| 2 | Design | Inadequate requirements definition | Project fails to achieve complete goals as user requirements not captured may jam realization of project implementation | High | Perform a comprehensive survey and documentation using a bottom-up approach. A mix of different requirements definition techniques may help |
| 3 | External | Access restriction to historical/test data | Design may not be certifiable for future implementation | Very high | Discuss with relevant personnel the importance of using historical data for testing purposes; suggest a non-disclosure agreement if necessary |
| 4 | Design | Loose objectives | Poor project coordination; project failure | Very high | Define objectives; make defined objectives coherent |
| 5 | External | User uncertain about business processes | Missing requirements identification | High | Identify and interact with users who are familiar with the business processes; review operational procedures manual |
| 6 | External | User does not know or see need for a data warehouse | Information given by users may not be reliable | Very high | Educate users on the need for a data warehouse and how it would make their productivity better and worthwhile |
| 7 | Design | Ill-structured assumptions | Project delay and execution problems | High | Review assumptions; do a detailed requirements analysis; perform a comprehensive documentation of project activities; review work breakdown structure; review design framework |
| 8 | Design | Failure to reconcile project benefits with organizational/business goals | Project failure | Very high | Perform needs/requirements analysis; identify major business process challenges as regards decision/strategic support; identify importance of decision support to the organization |
| 9 | Technical | Data structure differences amongst different user application programs | More resources needed to reconcile/map various data structures to suit technical design | High | Document dominant application programs used and their file specifications; identify areas of convergence; convert data to match technical requirements |
| 10 | Technical | Poor data quality | Poor/unreliable decisions may arise from implemented solution | Very high | Proper data cleansing |
| 10 | External | Tight project time | Component coherence issues | Very high | Review scope of project and exclude redundancies |
| 11 | External | Project funding | Delay in project execution | Very high | Define the scope to match available resources |
| 12 | External | User's resistance to disclose roles in process flows | Delay in project documentation; flaws in design | Very high | Educate the user on importance of user role in the project and benefits of the project |
B. DATA DICTIONARY OF THE ENTITIES/CLASSES

In this section, we present all the concepts/classes identified in this project, their attributes and specifications.

Table B.1: Object classes

| Object class | Representation |
|---|---|
| Importer | Represents an importer who makes a preliminary declaration through the primary import document, the 'Form M', as required by the Central Bank of Nigeria. The Form M is the key import document which an intending importer must establish through his/her bank. It contains all the details about an import and is opened by an approved bank prior to the import process |
| ShippingDocument | The shipping document, also called the transport document, records what is shipped and the associated details |
| Agency | Represents any government agency that has the capacity to request a report from the scanning and risk management operations at the ports |
| Analyst | Represents an analyst at the port. Analyst may refer to a scanner officer, image analyst or a documentation analyst |
| Person | Represents a person. The person object class is a superclass from which other object classes, like the analyst class, are derived |
| SGD | Represents the single goods declaration document, which is used to document imported goods during clearance at the port |
| ScanningResult | Represents the results of cargo scanning and the subsequent image analysis |
| Report | Represents a report requested by an agency |
| Exporter | Represents an exporter of a product |
| Shipment | Represents the goods declared by a sales invoice or the combined certificate of value and origin |
Table B.2: Person specification

| Attributes | Description | Domain type | Null | Index | Referential integrity |
|---|---|---|---|---|---|
| PERSON_ID | The ID of every person who is involved in scanning operations | Numeric | NO | YES | |
| FIRSTNAME | First name of the person | String | NO | NO | |
| LASTNAME | Last name of the person | String | NO | NO | |
| OTHERNAMES | Names other than first name and surname of the person | String | YES | NO | |
| GENDER | Gender of the person | String | NO | NO | |
| TELEPHONE | Telephone number of the person | String | NO | NO | |
| EMAIL | E-mail of the person | String | NO | NO | |
| DATE_OF_BIRTH | Indicates the date of birth of the person | String | NO | NO | |
| ADDRESS | Address of the person | String | NO | NO | |
| MARITAL_STATUS | Indicates the marital status of the officer | String | NO | NO | |
| NATIONALITY | Indicates the person's country of birth | String | NO | NO | |
| ORGANIZATION | Represents the organization the person works with | String | YES | NO | |
| STATE_OF_ORIGIN | Represents the state of origin of the person | String | | | |
Table B.3: Analyst specification

| Attributes | Description | Domain type | Null | Index | Referential integrity |
|---|---|---|---|---|---|
| ANALYST_ID | The ID of the analyst | Numeric | NO | YES | |
| PERSON_ID | The ID of every person who is involved in scanning operations | String | NO | NO | References PERSON.PERSON_ID |
Table B.4: Agency specification

| Attributes | Description | Domain type | Null | Index | Referential integrity |
|---|---|---|---|---|---|
| AGENCY_ID | The ID of the agency | Numeric | NO | YES | |
| AGENCY_NAME | Name of the agency | String | NO | NO | |
| ADDRESS | Physical address of the agency | String | NO | NO | |
| CITY | City where agency is located | String | YES | NO | |
| CONTACT_PERSON | Contact person/representative of the agency to be contacted | String | NO | NO | |
| TELEPHONE | Telephone number of the agency | String | NO | NO | |
| EMAIL | E-mail of the agency | String | NO | NO | |
Table B.5: Report specification

| Attributes | Description | Domain type | Null | Index | Referential integrity |
|---|---|---|---|---|---|
| REPORT_ID | The ID of the report | Numeric | NO | YES | |
| REPORT_NAME | Title of the report | String | NO | NO | |
| CODE | Official code of the report | String | YES | NO | |
| REQUEST_DATE | Date when report was requested by agency | Date | NO | NO | |
| ISSUED_DATE | Date of issuance of the report | String | NO | NO | |
| AGENCY_ID | Agency ID of the requesting agency | Numeric | NO | NO | References AGENCY.AGENCY_ID |
| ANALYST_ID | ID of analyst who prepares the report | Numeric | NO | NO | References ANALYST.ANALYST_ID |
Table B.6: Importer specification

| Attributes | Description | Domain type | Null | Index | Referential integrity |
|---|---|---|---|---|---|
| IMPORTER_ID | The ID of the importer, usually the company registration number | Numeric | NO | YES | |
| IMPORTER_NAME | Importer's registered name | String | NO | NO | |
| TELEPHONE | Telephone number of the importer | String | NO | NO | |
| EMAIL | E-mail of the importer | String | YES | NO | |
| CITY | Indicates the city where the importer is based | String | NO | NO | |
| ADDRESS | Address of the importer | String | NO | NO | |
| CONTACT_PERSON | Represents the representative of the importer | String | YES | NO | |
| STATES | Represents the state where the importer is based | String | YES | NO | |
| POSTCODE | Represents the postal code of the importer | String | NO | NO | |

Table B.7: Exporter specification

| Attributes | Description | Domain type | Null | Index | Referential integrity |
|---|---|---|---|---|---|
| EXPORTER_ID | The ID of the exporter, may be the company registration number | Numeric | NO | YES | |
| EXPORTER_NAME | Exporter's registered name | String | NO | NO | |
| TELEPHONE | Telephone number of the exporter | String | YES | NO | |
| EMAIL | E-mail of the exporter | String | YES | NO | |
| CITY | Indicates the city where the exporter is based | String | NO | NO | |
| ADDRESS | Address of the exporter | String | NO | NO | |
| CONTACT_PERSON | Represents the representative of the exporter | String | YES | NO | |
| STATES | Represents the state where the exporter is based | String | YES | | |
| POSTCODE | Postal code of the exporter | String | YES | NO | |
| COUNTRY | Exporter's country | | NO | NO | |
Table B.8: Shipment specification

| Attributes | Description | Domain type | Null | Index | Referential integrity |
|---|---|---|---|---|---|
| INVOICE_NO | The invoice number of the combined certificate of value and origin of the shipment | String | NO | YES | |
| INVOICE_DATE | Invoice date | Date | NO | NO | |
| COUNTRY_OF_ORIGIN | Country where goods are manufactured | String | NO | NO | |
| COUNTRY_OF_SUPPLY | Country to which goods are to be shipped | String | NO | NO | |
| IMPORTER_ID | ID of importer | Numeric | NO | NO | References IMPORTER.IMPORTER_ID |
| EXPORTER_ID | ID of exporter | Numeric | NO | NO | References EXPORTER.EXPORTER_ID |
| PARTIAL_NO | Partial number of the shipment. Goods declared in a single import document such as Form_M could be shipped in many batches or partials | String | NO | NO | |
| GOODS_DESCRIPTION | Represents the description of the goods | String | | | |
| GROSS_WEIGHT | Gross weight of the shipment | String | NO | NO | |
| FOB_VALUE | Free on board value of the shipment expressed as a currency value | Numeric | NO | NO | |
| INSURANCE | Insured value of the shipment | Numeric | NO | NO | |
| FREIGHT | Freight costs of the shipment | Numeric | NO | NO | |
| FORM_M_NO | Application number of the Form M | String | NO | NO | |
| SHIPMENT_TYPE | Type of packing used for shipment; it may be containerized or non-containerized | String | NO | NO | |
Table B.9: ShippingDocument specification

| Attributes | Description | Domain type | Null | Index | Referential integrity |
|---|---|---|---|---|---|
| SHIP_DOC_NO | The airway bill number (for air shipment), the road way bill number (for inter-border shipments) or the bill of lading (for sea freight) | String | NO | YES | |
| INVOICE_NO | Invoice number | String | NO | NO | References SHIPMENT.INVOICE_NO |
| ARRIVAL_DATE | Date of arrival of carrier to destination country | String | NO | NO | |
| CARRIER_NAME | Name of shipping line/air line/transport company handling the shipment | String | NO | NO | |
| CUSTOMS_CODE | The standard local customs office code | String | NO | NO | |
| DISCHARGE_PORT | Discharge port at destination | String | NO | NO | |
| LOADING_PORT | Port of loading at country of shipping | String | NO | NO | |
| FLIGHT_VOYAGE_NO | Flight number (air shipment), vessel number (sea freight) or truck number (across-the-border shipment) | String | | | |
| SHIPPING_DATE | Take-off date for airline/vessel | String | NO | NO | |
| FORM_M_NO | Application form number | String | NO | NO | |
| SHIPPING_DOC_DATE | Date of issuance of shipping document | Date | NO | NO | |
| TRANSPORT_MODE | Mode of shipping (air/sea/road) | String | NO | NO | |

Table B.10: Single goods declaration (SGD) specification

| Attributes | Description | Domain type | Null | Index | Referential integrity |
|---|---|---|---|---|---|
| SGD_REG_NO | SGD registration number | String | NO | YES | |
| IMPORTER_ID | ID of importer | Numeric | NO | NO | References IMPORTER.IMPORTER_ID |
| SHIP_DOC_NO | Shipping document number | String | NO | NO | References SHIPPINGDOCUMENT.SHIP_DOC_NO |
| FORM_M_YEAR | Year in which the Form M is issued | Numeric | | | |
| SGD_REG_DATE | Registration date of the SGD | String | NO | NO | |
| PAR_NUMBER | Pre-arrival risk assessment report (PAR) number | String | NO | NO | |
| PAR_ISSUE_DATE | Date of issuance of the PAR | Date | NO | NO | |
| FORM_M_NO | Application number of the Form M | String | NO | NO | |
| SHIPMENT_TYPE | Type of packing used for shipment; it may be containerized or non-containerized | String | NO | NO | |
Table B.11: ScanningResult specification

| Attributes | Description | Domain type | Null | Index | Referential integrity |
|---|---|---|---|---|---|
| SGD_REG_NO | SGD registration number | String | NO | NO | |
| IMPORTER_ID | ID of importer | Numeric | NO | NO | References IMPORTER.IMPORTER_ID |
| ANALYST_ID | The ID of the analyst | Numeric | NO | NO | References ANALYST.ANALYST_ID |
| SHIPMENT_TYPE | Type of shipment; may be containerized or not | String | NO | NO | |
| CONTAINER_NO | Container number | String | YES | NO | |
| PAR_NUMBER | Pre-arrival risk assessment report (PAR) number | String | NO | NO | |
| CONTAINER_SIZE | Container size | String | YES | NO | |
| CUSTOMS_CODE | Local customs office code | String | NO | NO | |
| RISK_DETAILS | Details of the risk found | String | NO | NO | |
| RISK_FOUND | Indicates whether risk is found or not | Boolean | NO | NO | |
| SCAN_COMMENTS | Post-scanning comments on the shipment | String | YES | NO | |
| SCAN_DATE | Date of scan | Date | NO | NO | |
| SCAN_SEQ_NO | Sequence number of the scan record | Numeric | NO | YES | |
| SEVERITY | Indicates severity of the risk found | Boolean | NO | NO | |
C. DATABASE/DATA WAREHOUSE CREATION

C.1: STAGING DATABASE

CREATE DATABASE STAGING;
USE STAGING;

CREATE TABLE AGENCY (
    AGENCY_ID       INTEGER NOT NULL,
    AGENCY_NAME     VARCHAR(100) NOT NULL,
    ADDRESS         VARCHAR(50) NOT NULL,
    CITY            VARCHAR(50) NOT NULL,
    CONTACT_PERSON  VARCHAR(50) NULL,
    EMAIL           VARCHAR(50) NOT NULL,
    TELEPHONE       VARCHAR(20) NOT NULL
);
ALTER TABLE AGENCY ADD PRIMARY KEY (AGENCY_ID);

CREATE TABLE ANALYST (
    PERSON_ID       INTEGER NULL,
    FIRSTNAME       VARCHAR(30) NOT NULL,
    LASTNAME        VARCHAR(30) NOT NULL,
    OTHERNAMES      VARCHAR(30) NULL,
    GENDER          CHAR(1) NOT NULL,
    DATE_OF_BIRTH   DATE NULL,
    MARITAL_STATUS  VARCHAR(20) NULL,
    EMAIL           VARCHAR(50) NULL,
    TELEPHONE       VARCHAR(20) NULL,
    ADDRESS         VARCHAR(100) NULL,
    CITY            VARCHAR(20) NULL,
    STATE           VARCHAR(20) NULL,
    ORGANIZATION    VARCHAR(100) NULL,
    ANALYST_ID      INTEGER NOT NULL,
    LAST_UPDATED    DATE NULL
);
ALTER TABLE ANALYST ADD PRIMARY KEY (ANALYST_ID);

CREATE TABLE EXPORTER (
    EXPORTER_ID     INTEGER NOT NULL,
    EXPORTER_NAME   VARCHAR(100) NOT NULL,
    ADDRESS         VARCHAR(100) NOT NULL,
    CITY            VARCHAR(50) NULL,
    TELEPHONE       VARCHAR(20) NULL,
    POSTCODE        VARCHAR(10) NULL,
    EMAIL           VARCHAR(50) NULL,
    CONTACT_PERSON  VARCHAR(50) NULL,
    STATE           VARCHAR(30) NULL,
    LAST_UPDATED    DATE NULL,
    COUNTRY         VARCHAR(40) NOT NULL
);
ALTER TABLE EXPORTER ADD PRIMARY KEY (EXPORTER_ID);

CREATE TABLE IMPORTER (
    IMPORTER_ID     INT NOT NULL,
    IMPORTER_NAME   VARCHAR(100) NOT NULL,
    ADDRESS         VARCHAR(100) NOT NULL,
    CITY            VARCHAR(50) NOT NULL,
    TELEPHONE       VARCHAR(20) NULL,
    POSTCODE        VARCHAR(10) NULL,
    EMAIL           VARCHAR(50) NULL,
    CONTACT_PERSON  VARCHAR(50) NULL,
    STATE           VARCHAR(30) NULL,
    LAST_UPDATED    DATE NULL
);
ALTER TABLE IMPORTER ADD PRIMARY KEY (IMPORTER_ID);

CREATE TABLE REPORT (
    REPORT_ID       INTEGER NOT NULL,
    AGENCY_ID       INTEGER NOT NULL,
    REPORT_NAME     VARCHAR(20) NOT NULL,
    CODE            VARCHAR(20) NULL,
    REQUEST_DATE    DATE NOT NULL,
    ISSUED_DATE     DATE NOT NULL,
    ANALYST_ID      INTEGER NULL
);
ALTER TABLE REPORT ADD PRIMARY KEY (REPORT_ID);

CREATE TABLE SCAN_RESULT (
    CONTAINER_NO    VARCHAR(40) NULL,
    CONTAINER_SIZE  VARCHAR(20) NULL,
    CUSTOMS_OFFICE  VARCHAR(4) NOT NULL,
    PAR_NUMBER      VARCHAR(17) NOT NULL,
    RISK_DETAILS    VARCHAR(200) NULL,
    RISK_FOUND      CHAR(1) NOT NULL,
    SCAN_COMMENTS   TEXT NULL,
    SCAN_DATE       DATE NOT NULL,
    SCAN_SEQ_NO     VARCHAR(17) NOT NULL,
    SEVERITY        CHAR(1) NOT NULL,
    SGD_REG_NO      VARCHAR(17) NOT NULL,
    SHIPMENT_TYPE   VARCHAR(20) NOT NULL,
    IMPORTER_ID     INTEGER NOT NULL,
    ANALYST_ID      INTEGER NULL,
    LAST_UPDATED    DATE NULL
);
ALTER TABLE SCAN_RESULT ADD PRIMARY KEY (SCAN_SEQ_NO);

CREATE TABLE SGD (
    SGD_REG_NO      VARCHAR(17) NOT NULL,
    SGD_REG_DATE    DATE NOT NULL,
    FORM_M_NO       VARCHAR(17) NOT NULL,
    PAR_NUMBER      VARCHAR(17) NOT NULL,
    FORM_M_YEAR     INTEGER NOT NULL,
    PAR_ISSUE_DATE  DATE NOT NULL,
    SHIP_DOC_NO     VARCHAR(20) NOT NULL,
    PARTIAL_NO      VARCHAR(5) NOT NULL,
    IMPORTER_NAME   VARCHAR(100) NOT NULL,
    LAST_UPDATE     DATE NULL
);
ALTER TABLE SGD ADD PRIMARY KEY (SGD_REG_NO);

CREATE TABLE SHIPMENT (
    COUNTRY_OF_ORIGIN  VARCHAR(40) NOT NULL,
    COUNTRY_OF_SUPPLY  VARCHAR(40) NOT NULL,
    FOB_VALUE          DECIMAL(10,2) NOT NULL,
    INSURANCE          DECIMAL(10,2) NOT NULL,
    FREIGHT            DECIMAL(10,2) NOT NULL,
    GOODS_DESCRIPTION  VARCHAR(2000) NULL,
    INVOICE_NO         VARCHAR(20) NOT NULL,
    INVOICE_DATE       DATE NOT NULL,
    PARTIAL_NO         VARCHAR(5) NOT NULL,
    IMPORTER_NAME      VARCHAR(100) NOT NULL,
    EXPORTER_NAME      VARCHAR(100) NOT NULL,
    LAST_UPDATED       DATE NULL
);
ALTER TABLE SHIPMENT ADD PRIMARY KEY (INVOICE_NO);

CREATE TABLE SHIPPING_DOC (
    SHIPPING_DATE     DATE NOT NULL,
    CARRIER_NAME      VARCHAR(100) NOT NULL,
    CUSTOMS_CODE      VARCHAR(4) NOT NULL,
    DISCHARGE_PORT    VARCHAR(15) NOT NULL,
    FLIGHT_VOYAGE_NO  VARCHAR(30) NOT NULL,
    FORM_M_NO         VARCHAR(17) NOT NULL,
    IMPORTER_NAME     VARCHAR(100) NOT NULL,
    LOAD_PORT         VARCHAR(30) NOT NULL,
    SHIP_DOC_NO       VARCHAR(30) NOT NULL,
    INVOICE_NO        VARCHAR(70) NOT NULL,
    TRANSPORT_MODE    VARCHAR(7) NOT NULL,
    GROSSWEIGHT       VARCHAR(15) NULL,
    LAST_UPDATED      DATE NULL
);
ALTER TABLE SHIPPING_DOC ADD PRIMARY KEY (SHIP_DOC_NO);
ALTER TABLE REPORT ADD FOREIGN KEY agency_report (AGENCY_ID) REFERENCES AGENCY (AGENCY_ID);
ALTER TABLE REPORT ADD FOREIGN KEY analyst_report (ANALYST_ID) REFERENCES ANALYST (ANALYST_ID);
ALTER TABLE SCAN_RESULT ADD FOREIGN KEY sgd_scan (SGD_REG_NO) REFERENCES SGD (SGD_REG_NO);
ALTER TABLE SCAN_RESULT ADD FOREIGN KEY analyst_scan (ANALYST_ID) REFERENCES ANALYST (ANALYST_ID);
-- SHIPMENT and SHIPPING_DOC hold importer/exporter names (VARCHAR), not the integer IDs,
-- so these keys reference the name columns; this assumes names are unique in staging.
ALTER TABLE EXPORTER ADD UNIQUE KEY uq_exporter_name (EXPORTER_NAME);
ALTER TABLE IMPORTER ADD UNIQUE KEY uq_importer_name (IMPORTER_NAME);
ALTER TABLE SHIPMENT ADD FOREIGN KEY shipment_exporter (EXPORTER_NAME) REFERENCES EXPORTER (EXPORTER_NAME);
ALTER TABLE SHIPPING_DOC ADD FOREIGN KEY imp_shipping (IMPORTER_NAME) REFERENCES IMPORTER (IMPORTER_NAME);
ALTER TABLE SHIPPING_DOC ADD FOREIGN KEY shipping_shipment (INVOICE_NO) REFERENCES SHIPMENT (INVOICE_NO);
C.2: DATA WAREHOUSE DATABASE

CREATE DATABASE SCANNING_WAREHOUSE;
USE SCANNING_WAREHOUSE;
SET FOREIGN_KEY_CHECKS=0;

DROP TABLE IF EXISTS `dim_date`;
CREATE TABLE `dim_date` (
    `DATE_ID` int(11) NOT NULL AUTO_INCREMENT,
    `CAL_DAY` int(11) DEFAULT NULL,
    `CAL_MONTH` int(11) DEFAULT NULL,
    `MONTH_NAME` varchar(20) DEFAULT NULL,
    `CAL_YEAR` int(4) DEFAULT NULL,
    `QUARTER` int(4) DEFAULT NULL,
    `QUARTER_NAME` varchar(2) DEFAULT NULL,
    `LEAP_YEAR` int(11) DEFAULT '0',
    `CAL_DATE` date DEFAULT NULL,
    `CAL_WEEK` int(11) DEFAULT NULL,
    PRIMARY KEY (`DATE_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `dim_importer`;
CREATE TABLE `dim_importer` (
    `IMPORTER_ID` int(11) NOT NULL,
    `IMPORTER_NAME` varchar(100) NOT NULL,
    `ADDRESS` varchar(100) NOT NULL,
    `CITY` varchar(50) NOT NULL,
    `TELEPHONE` varchar(20) DEFAULT NULL,
    `POSTCODE` varchar(10) DEFAULT NULL,
    `EMAIL` varchar(50) DEFAULT NULL,
    `IMPORTER_DIM_ID` int(11) NOT NULL AUTO_INCREMENT,
    `CONTACT_PERSON` varchar(50) DEFAULT NULL,
    `STATES` varchar(30) DEFAULT NULL,
    `version` bigint(11) NOT NULL DEFAULT '0',
    `DATE_FROM` date DEFAULT NULL,
    `DATE_TO` date DEFAULT NULL,
    PRIMARY KEY (`IMPORTER_DIM_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `dim_scan_result`;
CREATE TABLE `dim_scan_result` (
    `SCAN_REQUEST_DATE` date DEFAULT NULL,
    `SCAN_SEQ_NO` varchar(17) DEFAULT NULL,
    `SCAN_DATE` date DEFAULT NULL,
    `ANALYST_ID` varchar(50) NOT NULL DEFAULT '0',
    `SCAN_DIM_ID` int(11) NOT NULL AUTO_INCREMENT,
    `CONTAINER_NO` varchar(40) DEFAULT NULL,
    `CONTAINER_SIZE` varchar(20) DEFAULT NULL,
    `CUSTOMS_OFFICE` varchar(4) DEFAULT NULL,
    `IMPORTER_NAME` varchar(100) DEFAULT NULL,
    `PAR_NUMBER` varchar(17) DEFAULT NULL,
    `RISK_DETAILS` varchar(2000) DEFAULT NULL,
    `RISK_FOUND` varchar(5) DEFAULT NULL,
    `SCAN_COMMENTS` text,
    `SEVERITY` varchar(4) DEFAULT NULL,
    `SGD_REG_NO` varchar(17) DEFAULT NULL,
    `SHIPMENT_TYPE` varchar(20) DEFAULT NULL,
    `VERSION` int(11) DEFAULT '0',
    `DATE_FROM` datetime DEFAULT NULL,
    `DATE_TO` datetime DEFAULT NULL,
    PRIMARY KEY (`SCAN_DIM_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `dim_sgd`;
CREATE TABLE `dim_sgd` (
    `SGD_DIM_ID` int(11) NOT NULL AUTO_INCREMENT,
    `SGD_REG_NO` varchar(17) DEFAULT NULL,
    `SGD_REG_DATE` date DEFAULT NULL,
    `FORM_M_NO` varchar(17) DEFAULT NULL,
    `PAR_NUMBER` varchar(17) DEFAULT NULL,
    `FORM_M_YEAR` int(11) DEFAULT NULL,
    `PAR_ISSUE_DATE` date DEFAULT NULL,
    `SHIP_DOC_NO` varchar(20) DEFAULT NULL,
    `PARTIAL_NO` varchar(15) DEFAULT NULL,
    `IMPORTER_NAME` varchar(100) DEFAULT NULL,
    `VERSION` int(11) NOT NULL DEFAULT '0',
    `DATE_TO` datetime DEFAULT NULL,
    `DATE_FROM` datetime DEFAULT NULL,
    PRIMARY KEY (`SGD_DIM_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `dim_shipment`;
CREATE TABLE `dim_shipment` (
    `SHIPMENT_DIM_ID` int(11) NOT NULL AUTO_INCREMENT,
    `COUNTRY_OF_ORIGIN` varchar(40) DEFAULT NULL,
    `COUNTRY_OF_SUPPLY` varchar(40) DEFAULT NULL,
    `EXPORTER_NAME` varchar(100) DEFAULT NULL,
    `FOB_VALUE` float(10,2) DEFAULT '0.00',
    `INSURANCE` float(10,2) DEFAULT '0.00',
    `FREIGHT` float(10,2) DEFAULT '0.00',
    `GOODS_DESCRIPTION` varchar(2000) DEFAULT NULL,
    `GROSS_WEIGHT` varchar(15) DEFAULT NULL,
    `INVOICE_NO` varchar(50) DEFAULT NULL,
    `INVOICE_DATE` date DEFAULT NULL,
    `IMPORTER_NAME` varchar(100) DEFAULT NULL,
    `PARTIAL_NO` varchar(15) DEFAULT NULL,
    `VERSION` int(11) NOT NULL DEFAULT '0',
    `DATE_FROM` datetime DEFAULT NULL,
    `DATE_TO` datetime DEFAULT NULL,
    PRIMARY KEY (`SHIPMENT_DIM_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `dim_shipping_doc`;
CREATE TABLE `dim_shipping_doc` (
    `SHIP_DIM_ID` int(11) NOT NULL AUTO_INCREMENT,
    `ARRIVAL_DATE` date DEFAULT NULL,
    `SHIPPING_DATE` date DEFAULT NULL,
    `CARRIER_NAME` varchar(70) DEFAULT NULL,
    `CUSTOMS_CODE` varchar(4) DEFAULT NULL,
    `DISCHARGE_PORT` varchar(15) DEFAULT NULL,
    `FLIGHT_VOYAGE_NO` varchar(30) DEFAULT NULL,
    `SHIP_DOC_DATE` date DEFAULT NULL,
    `FORM_M_NO` varchar(17) DEFAULT NULL,
    `IMPORTER_NAME` varchar(100) DEFAULT NULL,
    `LOAD_PORT` varchar(30) DEFAULT NULL,
    `PARTIAL_NO` varchar(15) DEFAULT NULL,
    `SHIP_DOC_NO` varchar(20) DEFAULT NULL,
    `INVOICE_NO` varchar(30) DEFAULT NULL,
    `TRANSPORT_MODE` varchar(7) DEFAULT NULL,
    `GROSSWEIGHT` varchar(20) DEFAULT NULL,
    `VERSION` int(11) NOT NULL DEFAULT '0',
    `DATE_FROM` datetime DEFAULT NULL,
    `DATE_TO` datetime DEFAULT NULL,
    PRIMARY KEY (`SHIP_DIM_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `fact_scan` (
    `FACT_SCAN_ID` int(11) NOT NULL AUTO_INCREMENT,
    `CFR` float(10,2) DEFAULT NULL,
    `CIF` float(10,2) DEFAULT NULL,
    `RISK_LEVEL` varchar(11) DEFAULT NULL,
    `NUM_CONTAINERS` int(11) DEFAULT 0,
    `PORT` varchar(50) DEFAULT NULL,
    `SCAN_DIM_ID` int(11) DEFAULT NULL,
    `DATE_ID` int(11) DEFAULT NULL,
    `SHIPMENT_DIM_ID` int(11) DEFAULT NULL,
    `SGD_DIM_ID` int(11) DEFAULT NULL,
    `IMPORTER_DIM_ID` int(11) DEFAULT NULL,
    `SHIP_DIM_ID` int(11) DEFAULT NULL,
    PRIMARY KEY (`FACT_SCAN_ID`),
    KEY `fact_scan` (`SCAN_DIM_ID`),
    KEY `fact_date` (`DATE_ID`),
    KEY `fact_shipment` (`SHIPMENT_DIM_ID`),
    KEY `fact_sgd` (`SGD_DIM_ID`),
    KEY `fact_importer` (`IMPORTER_DIM_ID`),
    KEY `fact_shipping` (`SHIP_DIM_ID`),
    CONSTRAINT `fact_date` FOREIGN KEY (`DATE_ID`) REFERENCES `dim_date` (`DATE_ID`) ON DELETE NO ACTION,
    CONSTRAINT `fact_importer` FOREIGN KEY (`IMPORTER_DIM_ID`) REFERENCES `dim_importer` (`IMPORTER_DIM_ID`) ON DELETE NO ACTION,
    CONSTRAINT `fact_scan` FOREIGN KEY (`SCAN_DIM_ID`) REFERENCES `dim_scan_result` (`SCAN_DIM_ID`) ON DELETE NO ACTION,
    CONSTRAINT `fact_sgd` FOREIGN KEY (`SGD_DIM_ID`) REFERENCES `dim_sgd` (`SGD_DIM_ID`) ON DELETE NO ACTION,
    CONSTRAINT `fact_shipment` FOREIGN KEY (`SHIPMENT_DIM_ID`) REFERENCES `dim_shipment` (`SHIPMENT_DIM_ID`) ON DELETE NO ACTION,
    CONSTRAINT `fact_shipping` FOREIGN KEY (`SHIP_DIM_ID`) REFERENCES `dim_shipping_doc` (`SHIP_DIM_ID`) ON DELETE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

SET FOREIGN_KEY_CHECKS=1;
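
As an illustration of the kind of simple analyst query mentioned in Section 5.3, the sketch below joins the fact table to two of its dimensions and counts scans at a given risk level per year and importer. It is a hedged example rather than part of the project scripts: sqlite3 stands in for MySQL, only a subset of the columns defined above is recreated, and the sample rows and the 'HIGH'/'LOW' risk level values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down copies of dim_date, dim_importer and fact_scan with sample data.
conn.executescript("""
CREATE TABLE dim_date (DATE_ID INTEGER PRIMARY KEY, CAL_YEAR INTEGER, QUARTER INTEGER);
CREATE TABLE dim_importer (IMPORTER_DIM_ID INTEGER PRIMARY KEY, IMPORTER_NAME TEXT);
CREATE TABLE fact_scan (
    FACT_SCAN_ID INTEGER PRIMARY KEY,
    DATE_ID INTEGER REFERENCES dim_date (DATE_ID),
    IMPORTER_DIM_ID INTEGER REFERENCES dim_importer (IMPORTER_DIM_ID),
    RISK_LEVEL TEXT,
    NUM_CONTAINERS INTEGER
);
INSERT INTO dim_date VALUES (1, 2013, 4), (2, 2014, 1);
INSERT INTO dim_importer VALUES (1, 'ACME LTD'), (2, 'ZENITH PLC');
INSERT INTO fact_scan VALUES
    (1, 1, 1, 'HIGH', 2), (2, 1, 1, 'LOW', 1),
    (3, 2, 1, 'HIGH', 3), (4, 2, 2, 'HIGH', 1);
""")

# Scans and containers per year and importer for one risk level.
QUERY = """
SELECT d.CAL_YEAR, i.IMPORTER_NAME,
       COUNT(*) AS scans, SUM(f.NUM_CONTAINERS) AS containers
FROM fact_scan AS f
JOIN dim_date AS d ON d.DATE_ID = f.DATE_ID
JOIN dim_importer AS i ON i.IMPORTER_DIM_ID = f.IMPORTER_DIM_ID
WHERE f.RISK_LEVEL = ?
GROUP BY d.CAL_YEAR, i.IMPORTER_NAME
ORDER BY d.CAL_YEAR, i.IMPORTER_NAME
"""

rows = conn.execute(QUERY, ("HIGH",)).fetchall()
for year, importer, scans, containers in rows:
    print(year, importer, scans, containers)
```

The same SELECT statement, with a MySQL-style `%s` placeholder or a literal value, could be saved as a script and run by analysts from a client tool such as MySQL Workbench.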