Building a minimalist DNA database management system

Building a minimalist DNA database management system using Pruem tools Florin STANCIU, PhD

ABBREVIATIONS

DNA-DB-MS - DNA database management system
CC - Communication component
ME - Match engine
GUI - Graphical user interface

ABSTRACT

When it comes to DNA data exchange based on Council Decisions 2008/615/JHA and 2008/616/JHA, each EU country is free to choose its own means of implementation. Romania has experience in implementing three different types of IT solution: a purchased one (Dimensions 2.0), an in-house solution built with Pruem tools, and a freeware one (CODIS 7.0), each with its specific advantages and disadvantages. Starting from the Pruem DNA data exchange infrastructure and based on Pruem tools, any DNA database laboratory with some Pruem DNA data exchange experience can build from scratch a simple DNA database management system for internal comparison of DNA profiles, with a minimum of human and IT resources. This paper reviews the efforts of the Romanian National DNA Database Laboratory to build a minimalist DNA database management system using the most common and freely available Pruem tools: the Communication component (Germany), the Match engine (Austria) and the Graphical user interface (Netherlands).

Keywords: Pruem DNA data exchange, Pruem tools, DNA database management system, Romanian National DNA Database Laboratory.

INTRODUCTION

A forensic DNA-DB-MS is a software application, or a group of closely interacting applications, whose main purpose is to let the users of a DNA database laboratory manage DNA profiles (from human biological samples) by inserting, removing, comparing, grouping, filtering, updating, editing, indexing and storing them in a forensic database. These DNA profiles usually come from convicted persons, suspects, crime scene stains, unidentified human remains, etc.

The main difference between the Pruem DNA data exchange architecture and a DNA-DB-MS lies in the location of the databases. Pruem involves a minimum of two databases located in two different EU countries, whereas a DNA-DB-MS has a maximum of two databases located on the same SQL server, each with a different role. The general DNA-DB-MS architecture is also simpler than that of Pruem (Fig. 1 and 2).

The two DNA-DB-MS databases (Fig. 2) have the following characteristics:
a) the first database (DB1) is built as a standard Pruem database for CODIS Users (versions [...]

[...] 1024 MB
- Processors: =>1
- HDD: =>10 GB

Other
- Apache Tomcat v7.0.12
- Java SE Development Kit 6 Update 25
- ArGoSoft Mail Server Freeware v1.8.8.9
- Microsoft SQL Server 2008 Enterprise Edition
- Microsoft Office Professional Edition 2003, SP3
- SQL Manager 2008 for SQL Server v3.2.0.2

In addition, the following were built:
- a SQL Manager 2008 for SQL Server import template file, for importing DNA profiles into the DNA database from the Microsoft Office Excel files listed below;
- a DNA profile Import Excel template file with the AmpFlSTR Identifiler Kit loci arrangement;
- a DNA profile Import Excel template file with the Interpol DNA Profile Search Request Form loci arrangement.

PRUEM TOOLS CUSTOMIZATION

PruemDNA

The RequestOut and ResponseIn threads for DB2 were deactivated by removing them from each PruemDNA config folder and also from the main HTML CC web page, which manages the injection tool threads for both CCs.


DNA Match (for PRUEM)

One Match engine was connected to DB2 (the main database), for which a single specific System DSN was configured. The default Pruem parameters are also valid for normal DNA-DB-MS tasks.


CODIS-Pruem User Interface

By default, this Microsoft Office Access application is intended to be used only with the Netherlands test set, and for this reason the Search other countries sub-panel lists only those DNA profiles whose names start with the letters “TST”. Because each laboratory uses its own sample names, the CODIS-Pruem User Interface (CPU) had to be modified by removing the conditioning filter from the Form_frmMainform code, so that all other sample name formats can be viewed. Also by default, the GUI has the option of switching between an Operational server and a Test server. For the DNA-DB-MS, DB1 was treated as the Operational server and DB2 as the Test server, and two specific User DSNs were configured for them.

Note: one bug was observed at this level. Switching between the two DBs requires starting the GUI application twice, in the following order: open GUI > choose YES/NO > close GUI > open GUI > choose YES/NO as before, where YES opens DB1 and NO opens DB2.
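The effect of the default filter can be illustrated with a small Python sketch. This is not the Access/VBA code from Form_frmMainform, which is not reproduced in this paper; the function name and the sample names are hypothetical and serve only to show what the sub-panel lists with and without the “TST” condition.

```python
def visible_profiles(sample_names, test_set_only):
    """Return the sample names the sub-panel would list.

    test_set_only=True mimics the default behaviour (only names that
    start with "TST"); False mimics the GUI after the filter is removed.
    """
    if test_set_only:
        return [name for name in sample_names if name.startswith("TST")]
    return list(sample_names)


# Hypothetical sample names: one test-set style, two laboratory style.
names = ["TST0001", "RO-2011-0042", "LAB-17"]
default_view = visible_profiles(names, test_set_only=True)
modified_view = visible_profiles(names, test_set_only=False)
```

With the filter in place only the test-set style name is listed; once it is removed, every name format passes through unchanged.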

Figure 1 – Pruem communication flow chart (adapted from Jimmy Wang's initial chart)


To differentiate DNA profiles into more categories than the Pruem tools provide (samples and persons), the AGENCY field from CODIS_EXP and other database tables was used for further labeling. The new categories (convicted offenders, suspects, etc.) were assigned through the DNA profile Import Excel template files (e.g. “RO-S”, where “RO” stands for Romania and “S” is the label for suspects).
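As a minimal sketch of this labeling scheme, the Python snippet below splits an AGENCY value such as “RO-S” into its country and category parts. The function and class names are hypothetical, and only the “S” (suspects) code is taken from the text; any other category code would be laboratory-specific and is deliberately not listed.

```python
from typing import NamedTuple

# Only "S" (suspects) is given in the text; further codes are
# laboratory-specific and intentionally left out of this sketch.
CATEGORY_NAMES = {"S": "suspects"}


class AgencyLabel(NamedTuple):
    country: str   # two-letter country part, e.g. "RO"
    category: str  # category code, e.g. "S"


def parse_agency_label(label: str) -> AgencyLabel:
    """Split an AGENCY value of the form '<country>-<category>'."""
    country, separator, category = label.strip().partition("-")
    if not separator or len(country) != 2 or not category:
        raise ValueError(f"unexpected AGENCY label: {label!r}")
    return AgencyLabel(country.upper(), category.upper())


def describe_category(label: str) -> str:
    """Return a readable category name, or a fallback for unknown codes."""
    parsed = parse_agency_label(label)
    return CATEGORY_NAMES.get(parsed.category, "unknown category")


parsed = parse_agency_label("RO-S")
description = describe_category("RO-S")
```

Keeping the country and the category in one field means that no Pruem table has to be altered; the price is that every consumer of the AGENCY field has to parse the label in the same way.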


After one CC with default settings had been built, it was duplicated (one CC for each database, DB1 and DB2), and all the parameters in the PruemDna tools and config folders were updated for the new name and the new location of each CC folder, so that in the end both CCs were located in one main root folder.

DATA WORKFLOW

To insert and compare DNA profiles in the DNA-DB-MS, a rigorous workflow was needed to ensure that new profiles are always searched against the DNA profiles in the main database before being inserted into it for further searches:
1. The DNA profile Import Excel template file is formatted according to the following rules:
   a) the decimal separator should be “.”;
   b) all wildcards should be “*”;
   c) no more than two alleles per locus; where there are more than two, the first allele and just one wildcard “*” are used.
2. The DNA profiles are imported into DB1 using SQL Manager 2008 for SQL Server.
3. An interrogation request is sent to DB2 using the CODIS-Pruem User Interface opened for DB1.
4. The results are viewed using the CODIS-Pruem User Interface opened for DB2.
5. If needed, the initial DNA profile Import Excel template file is edited (by deleting unnecessary DNA profiles) and imported into DB2 using SQL Manager 2008 for SQL Server.
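The three formatting rules from step 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: in the workflow described here the rules are applied in the Excel template, not in code; the function name is hypothetical, and so is the set of alternative wildcard marks that are mapped to “*”.

```python
# Hypothetical marks that a raw profile might use for a wildcard;
# the text only states that the final wildcard must be "*".
WILDCARD_MARKS = {"*", "?", "X"}


def normalize_locus(raw_alleles):
    """Apply the three import-template rules to the alleles of one locus."""
    cleaned = []
    for allele in raw_alleles:
        value = str(allele).strip()
        if value in WILDCARD_MARKS:
            cleaned.append("*")                      # rule b: wildcards become "*"
        else:
            cleaned.append(value.replace(",", "."))  # rule a: "." as decimal separator
    if len(cleaned) > 2:
        return [cleaned[0], "*"]                     # rule c: first allele plus one wildcard
    return cleaned


three_alleles = normalize_locus(["9,3", "10", "11"])  # ["9.3", "*"]
with_wildcard = normalize_locus(["15", "?"])          # ["15", "*"]
unchanged = normalize_locus(["12", "13"])             # ["12", "13"]
```

Normalizing before the import into DB1 keeps both databases in the same format, so the profile that is searched against DB2 in step 3 is the same profile that may later be inserted there in step 5.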

Both HTML CC web pages, which manage the injection tool threads, were merged into a single page. Also, the RequestIn and ResponseOut threads for DB1, like the RequestOut and ResponseIn threads for DB2, were deactivated.

Although, from a technical perspective, certificates are no longer necessary in a DNA-DB-MS, using the German CC meant that a separate certificate had to be created for each database. For the country code, two random characters (accepted by ISO 3166) were used, so that DNA profiles are differentiated even further in the GUI.

Figure 2 – DNA database management system chart

CONCLUSIONS

A DNA-DB-MS is very valuable in a DNA database laboratory for the following reasons:
a) it can be used as an exclusion tool, by comparing all new DNA profiles with the DNA profiles of the laboratory staff or with other DNA profile categories;
b) it can find matches between DNA profiles from reference samples and crime scene samples before they are sent into the Pruem workflow;
c) it can find matches between DNA profiles from reference samples that came from the same person (e.g. persons with multiple identities or persons who were entered into the DNA database more than once).

The DNA-DB-MS presented here is far from equalling, or even approaching, established DNA-DB-MSs such as CODIS; it still has some bugs, and it can be further customized and minimized with some additional Java and Visual Basic knowledge. Even so, it is a rapid and free solution that a DNA database laboratory may take into consideration, one that does not require special hardware (e.g. it works well on any Windows XP version and can be made portable by virtualizing the DNA-DB-MS operating system) and does not require very highly specialized staff. After all, I am a biologist and I could think of all of this.

ACKNOWLEDGMENTS

I would like to thank all my colleagues from the Romanian National DNA Database, National Forensic Science Institute (Bratu Tan?a, Cotolea Adnana, Cu?ãr Veronica, Iancu Florentina, Pîrlea Sorina, Stoian Ionut Marius and Vladu Simona) for their valuable support.

USEFUL CONTACTS

If you would like further information about the Pruem tools described above, contact your National Contact Point in charge of Pruem DNA data exchange or the Pruem tools developers: CC - Jimmy Wang ([email protected]); ME - Karl Obermayer ([email protected]); CPU - Margreet Kamp ([email protected]).

REFERENCES

[1] Margreet Kamp, CODIS-Pruem User Manual, January 2010, Version 1.1, Netherlands Forensic Institute.
[2] Margreet Kamp, CODIS-Pruem Implementation Guide, 8 December 2010, Version 1.2, Netherlands Forensic Institute.
[3] Jimmy Wang, Installation instructions for the Prüm DNA Communication Center, 30 August 2006, BKA, Federal Criminal Police Office, Germany.
[4] Karl Obermeyer, PRÜM Implementation. Integration of the PRÜM-Process into the Austrian DNA Database, 19 April 2010, Criminal Intelligence Service, Austria.

National Forensic Science Institute
Ștefan cel Mare, No 13-15, Sector 2, 020123
Bucharest - ROMANIA

[email protected]