2nd Int'l Conf. on Electrical Engineering and Information & Communication Technology (ICEEICT) 2015 Iahangirnagar University, Dhaka-1342, Bangladesh, 21-23 May 2015
Towards Development of Health Data Warehouse: Bangladesh Perspective Shahidul Islam Khan; Abu Sayed Md. Latiful Hoque Department of Computer Science and Engineering (CSE) Bangladesh University of Engineering and Technology (BUET) Dhaka, Bangladesh
[email protected];
[email protected]
Abstract-Health informatics is one of the top most focuses of
cases with different definitions and structures in information
researchers now a days. Availability of timely and accurate data
system (structured data) [I]-[8]. For the above reason Clinical
is essential for informed medical decision making. Health care
Data Stores (CDS) with isolated information islands across
organizations face a common problem with large amount of data
various
they have in numerous systems. Such systems are unstructured
administrative processes,
and
unorganized,
integration.
requires
Researchers,
computational
medical
time
practitioners,
for
health
data care
providers and patients will not be able to utilize the knowledge stored in different repositories unless synthesize the information from disparate sources. This problem can be solved by Data warehousing. Data warehousing techniques share a common set of tasks, include requirements analysis, data design, architectural design,
implementation
and deployment.
Developing
Clinical
data warehouse is complex and time consuming but is essential to
hospitals,
departments,
laboratories
and
related
which are time consuming and
demanding reliable integration [7]. Data required to make informed medical decisions are trapped within fragmented, disparate,
and
heterogeneous
clinical
and
administrative
systems that are not properly integrated or fully utilized. Ultimately, health care suffer because medical practitioners and health care providers are unable to access and use this information to perform activities such as diagnostics, and treatment optimization to improve patient care.
deliver quality patient care. Data integration tasks of medical data store are much challenging when designing clinical data warehouse architecture. This research identifies prospects and complexities
of
Health
data
warehousing
and
Mining
in
Bangladesh perspective and proposes a data-warehousing model suitable for integrating data from different health care sources.
Keywords-Data Warehouse; Data Mining; Pathological Data; Health Data Preprocessing;
I.
INTRODUCTION
Informed decision making in health care is vital to provide timely, accurate, and relevant advice to the right person, to reduce the cost of health care and to improve the overall quality of health care services. Since medical decisions are very complex, making choices about medical decision-making processes, procedures and treatments can be overwhelming One of the major Information Technology challenge in clinical practice
is
how
to
integrate
several
disparate,
isolated
information repositories into a single logical repository. A massive amount of health records, related documents and medical images created by clinical diagnostic equipment are generated daily. Medical documents are owned by different hospitals, health professionals and patients. These valuable data are stored in various medical information systems such as HIS (Hospital Information System), PACS (Picture Archiving and Communications System), RIS (Radiology Information System) in various hospitals, departments and
diagnostic
laboratories. There are not only a tremendous volume of imaging
files (unstructured data), but also many medical
information such as medical records, diagnosis reports and This research is supported by the ICT Division, Ministry of Posts, Telecommunications and Information Technology, Government of the People's Republic of Bangladesh.
978-1-4673-6676-2/15/$31.00 ©2015 IEEE
Fig. l. Components of health data
Clinical or Health data refers to any information that is contained in a patient's medical record.
This information
may
from
be acquired
admission
from
notes
or a doctor's visit.
forms such as
text
or
derived
numbers
demographics, history,
laboratory
digital
EEG,
signals (ECG,
(histological, radiological, Furthermore, clinical
(patient data,
EMG,
ultrasound,
studies
a
hospital
This data comes in various
involve
identification,
etc),
analog
or
ENG
etc),
images
etc),
and
videos.
specimen
collection
from multiple patients. Further complicating the storage of this data
is
the
fact
that
because
patient
information cannot be publicly used.
identification
Health data are very
complex in nature. Fig. I represents the various components of health data. [3], [6], [9], [13].
The advantages and disadvantages of data warehousing are given below [8], [13]: Advantages: 1. Allow existing systems to continue in operation without
Successful healthcare data management is an important factor in developing support systems for the clinical decision making process. Traditional database systems do not satisfy the requirements for critical data analysis tasks of the clinical decision-making users. An operational database supports daily data transactional operations. Operational databases contain detailed data but they do not include important historical
any modification 2.
Consolidate unstructured data from various legacy
systems into one standard set 3. Improve quality of data and service 4. Allow users to retrieve necessary data by themselves
patient data, and since they are usually highly normalized,
Disadvantage:
they perform poorly for complex queries that need to join
I. Oevelopment cost and time
many relational tables or to aggregate large volumes of data in order to generate various clinical reports. This aggravates the data fragmentation. A clinical data warehouse is a data store that is different from the hospital's operational databases. It can be used for the analysis of consolidated historical data. Implementing a Oata Warehouse (OW) is a complex task containing two major phases. Firstly, in the configuration phase, a conceptual view of the warehouse is first specified
The main aim of this research is to identify the main obstacles for healthcare data integration and proposing a data warehousing model suitable for integrating fragmented data in respect to Bangladesh.
The result will contribute to the
advancement of knowledge in the field of medical informatics. The proposed data warehousing model will help in efficient medical decision making process.
according to user requirements (DW design). Secondly, the
The rest of this paper is organized as follows. In Section II
related data sources and the Extraction-Load- Transform
we have presented selected literature reviews on OW and
(ETL) process (data acquisition) are determined.
Finally,
Health DW. Section III describes briefly some design issues of
decisions about persistent storage of the warehouse using
National Health OW. In Section IV we have shown the
database technology and the various ways data will be
calculation of approximate size of our DW. Section V gives
accessed during analysis are made. After the initial load (the
readers ideas about how our OW will be used for knowledge
first load of the DW according to the configuration), during
discovery. Finally Section VI concludes the paper.
the
operation
phase,
warehouse
data
must
be
regularly
refreshed, i. e. , modifications of operational data since the last
II.
LITERATURE REVIEW
OW refreshment must be propagated into the warehouse such that data stored in the data warehouse reflect the state of the underlying operational systems. All the above steps must be carefully applied to fragmented healthcare data to guarantee the efficient and reliable retrieval at the point of care[5], [13]. According to Inmon[8] A data warehouse is a subject
Health informatics or healthcare informatics is an intersection of computer science and health care services. It deals with resources and methods needed to optimize the acquisition, storage, retrieval and use of information in medical research and applied to the areas of health care management, diagnosis,
oriented, integrated, non-volatile, and time-variant collection
clinical care, pharmacy, nursing and public health. Knowledge
of data in support of management's decisions.
discovery is the non-trivial process of identifying valid, novel,
The data
potentially useful and ultimately understandable patterns in
warehouse contains granular corporate data. •
Subject-oriented:
Classical operations systems are
organized around the applications of the company. Each company has its own unique set of subjects. •
Integrated:
is converted, reformatted, re-sequenced, summarized, and so forth. Non-volatile:
one of
the
steps in the process
of
learning algorithms that produce a particular enumeration of
A
Oata
Warehouse
(OW)
unifies
the
data
scattered
throughout an organization into a single centralized data structure
with
a
common
format.
It
is
a
repository
of
integrated information, available for querying and analysis [5]. Data warehouse data is loaded and
Thus, data warehousing may be considered a "proactive"
accessed, but it is not updated. Instead, when data in
approach to information integration, as compared to the more
the data warehouse is loaded, it is loaded in a
traditional
snapshot, static format. When subsequent changes
integration starts when a query arrives. [6].
occur, a new snapshot record is written. •
Data mining,
knowledge discovery, consists of applying data analysis and patterns over the data [1-4].
Oata is fed from multiple disparate
sources into the data warehouse. As the data is fed it
•
data.
Time-variant:
Every
unit
of
data
"passive"
approaches
where
processing
and
A medical data warehouse is a repository where healthcare in
the
data
warehouse is accurate as of some one moment in time. In some cases, a record is time stamped. In other cases, a record has a date of transaction.
providers can gain access to medical data gathered in the patient care process. Extracting medical domain information to a data warehouse can facilitate efficient storage, enhances timely analysis and increases the quality of real time decision making processes. Currently medical data warehouses need to
address the issues of data location, technical platforms, and
providing low-cost screening using disease models that require
data formats; organizational behaviors on processing the data
easily-obtained attributes from previous cases. It can perform
and culture across the data management population. Today's
automated analysis of pathological signals (ECG/ EMG), and
healthcare organizations require not only the quality and
medical images (CT scan, Ultrasonography etc.) [3], [15].
effectiveness of their treatment, but also reduction of waste and unnecessary costs. By effectively leveraging enterprise wide
data
on
labor
expenditures,
supply
utilization,
procedures, medications prescribed, and other costs associated with patient care, healthcare professionals can identify and correct wasteful practices and unnecessary expenditures[7].
Medical domain has certain unique data requirements such as high volumes of unstructured data (e.g. digital image files, voice
clips,
radiology
confidentiality.
information,
Data
warehousing
etc.)
and
models
data should
accommodate these unique needs. There are huge constraints and issues that limit the way the data mining is performed for
A Clinical Data Warehouse (COW) can facilitate efficient
medical datasets. Some of these issues are the way the data is
storage, enhances timely analysis and increases the quality of
collected; accuracy of the data, ethical, privacy and social
real time decision making processes.
Such methodologies
issues that comes with patient's records [15-16]. Agrawal et
share a common set of tasks, including business requirements
al.
analysis, data design, architecture design, implementation and
problems which were further divided into four classes in [18].
deployment[8].
They are classification, clustering, association and Sequences.
The
COW
is
a
place
where
healthcare
in [17] proposed three basic classes of data mining
providers can gain access to clinical data gathered in the patient care process. It is also anticipated that such data warehouse may provide information to users in areas ranging from research to management. The idea of using a
OW
for
accessing clinical data is not new, but there is no easy solution.
Clinical
information
systems,
like
any
other
enterprise information system, are diverse and individual. Even if based on a standard solution there has so to be done
Research is also done to find out impact of missing values and explore the impact of noise and how this can influence the output.
Zhu et al.
classified noises into class noise and
attributes noise. The research mainly concentrated on attribute noise it is more difficult to handle. Attribute noise include incorrect attribute values, missing or don't know attribute values and incomplete attributes or don't care values [19].
much customizing to fit the existing clinical processes and
Several researches have focused on the techniques that
documentation, so that there is no standardization yet to be
have built in mechanism to handle noise and missing values
usable for different hospitals.
In order to construct an
and
which
are
more
appropriate
to
use
for
medical
operational and effective OW it is essential to combine
applications. Few techniques that have been applied and are
process work, domain expertise and high quality database
more suited to medical data sets are studied in [20-21]. For
design [9]. This seems to correspond to concurring literature,
example decision tree, logic programs, K-nearest neighbour,
where we found either a holistic approach with theoretical
and Bayesian classifiers. Lee et al recommended that Bayesian
conceptualization but lacking practical implementation [10] or
networks and decision trees are the primary techniques applied
practical implementation
focused on one single domain, i. e.
lacking generalizability of such concepts [II].
in medical information systems [22]. Obenshain claimed that that neural networks performed better then logistic regression,
Electronic Health Records (EHRs) which describe the diseases and treatments of patients are normally stored in the
but the decision tree did better in identify active compounds most likely to have biological activity [23].
hospitals or clinics, where they are created. However, patients
Wang and Wang claimed that most process models focus
may be treated in different hospitals, clinics and, therefore,
on the results but not in gaining new knowledge. Medical data
there is a need for integrating health records from different
mining applications is expected to discover new knowledge
hospitals to enable any hospital to obtain a total overview of a
and should follow a five stage data mining development cycle:
patient's health history[12].
planning tasks, developing data mining hypotheses, preparing
The trend of adopting data warehouses for health systems in presented in [14], where the design experience in the University of Virginia Health System is reported. Here the
data, selecting data mining tools, and evaluating data mining resu Its [24]. Handling Missing Data in Pathology Databases using
data warehouse is used to provide clinicians and researchers
Multiple
with
Optimizing data collection for public health using feature
direct,
rapid
access
to
retrospective
clinical
and
Imputation
technique
is
discussed
in
[25].
administrative patients' data. In addition, they use the data
selection is studied in [26]. Cubillas et. al. proposed a model
warehouse also for educational and research aims.
for improvement in appointment scheduling in health care
Discovering patterns in medical datasets is difficult and challenging task but also rewarding [3]. Recently data mining techniques have improved significantly but it does not benefit the health professionals until the mined data is turned into
centers [27]. Hoque et. al. discussed present structure of pathological data, requirements to formulate efficient models and
the
necessity
to
reform
the
present
structure
for
predicative data mining in [28].
useful information. By using effective data mining tools and
Kumari and Singh used Neural Network{or the diagnosis
algorithms and a step by step data mining process it is possible
of diabetes [29]. Yilmaz et. at. proposed a modified K-means
to produce useful and new information from the dataset [4].
Algorithm based data preparation method for diagnosis of
With data mining the screening, diagnosis and detection of diseases may get more efficient by reducing both time and costs for the corresponding procedures. It can be effective in
heart and diabetes diseases [30]. Herland et. at. present recent research using Big Data tools and approaches for the analysis of Health Informatics [31].
III. SYSTEM DESIGN The architecture of National DW model is illustrated in
hospitals,
diagnostic
centers
and
other
sources,
different
Fig. 2. Health data from different govt. and private sources
preprocessing steps will be executed on data such as cleaning,
such as hospitals, clinics, diagnostic centers, research centers
noise
will be collected. Then data preprocessing, one of the major
Extraction-Load-Transform (ETL) process, the preprocessed
task for developing a DW from heterogeneous sources, will be
data will be integrated into a temporary repository. After that
performed.
data will be loaded into DW. Online Analytical Processing
[t
includes
data
cleaning,
missing
values
reduction
queries
and
and
normalization
imputation, normalization, transformation etc. As for National
(OLAP)
Health DW, data are coming from different public and private
performed over the pathological DW.
mining
techniques.
operations
can
be
Using
easily
Fig. 2. Brief Architecture of Data Warehouse 4D Pathological data cube that will be used for national DW development is shown in Fig. 3. Here O-D apex cube will provide highest level of summarization of national health data and 4-D base cube contains lowest level of summarization. Partial materialization is used rather than full materialization of cuboids to reduce huge space requirements.
involve multiple join operations. Most suitable Logical Data Warehousing Model is the Star Schema[13], [14], [24]. We have used
Star
requires little more space than Snowflake but Star Schema is much faster to response complex join queries.
Using the building blocks of the fact table and the various dimension tables, one has thousands of ways to aggregate the data.
For
clinical
analysis
purposes,
frequently
needed
aggregated datasets should be created in advance for the users. Having data readily and easily available is a major tenet of data warehousing. For our DW, some aggregated datasets could be: •
Patient count by Diagnosis, Gender, Age, Date
•
Count of Procedures by Provider and Date
•
Billing and discount information.
•
Count of retesting
Logical design of DW involves the definition of structures that enable an efficient access to information. There are many logical models like Flat schema, Star schema, Star Cluster schema, Snowflake schema, Fact Constellation schema etc. Among
them,
constellation
star
schema,
schema
are
snowflake mostly
schema
used
Schema in our design which is
illustrated in Fig. 4. Because of de-normalization, Star Schema
and
fact
commercially.
Efficiency is the most important factor in DW modeling because many queries access large amounts of data that may
Fig. 3. Pathological Data Cube
shown
in
Fig.
4.
Our
METADATA FOR TYP E ID GENERATION
TABLE!.
The major fact and dimension tables used in the OW development process are
sample
Colour
Sample_Count
Type_ID
the fact table and up to 1,88,000 records in each of the
Straw
9439
0
different dimension tables. Fact table records will grow very
Yellow
58
1
pathological data repository currently has 12, I 0,000 records in
fast comparing dimension tables' records and fact table records are dominant to calculate the size of the OW with time.
Reddish
32
2
L. Yellow
29
3
Milkly white
2
4
D. Yellow
2
5
Yellowish white
I
6
Reddish black
I
7
L.Reddish
I
8
Hazy
1
9
v. USE OF NATIONAL HEALTH OW
National Health OW can be used in many ways to improve national health standard, to provide better and prompt services to the patients and to facilitate health related research among the doctors, clinical researchers etc. Following is an example of the use of National Health OW.
Fraud Testing Awareness
Fig. 4. Fact Table and Dimension Tables
From OW we can easily find out support and confidence values of different attributes and generate association rules.
IV. DW SIZE-CASE STUDY According
to Directorate General
of
For example Health
Services
(DGHS) under the Ministry of Health and Family Welfare (MoHFW):
Age(X, Negative (X, T3)
[Support =67%, Confidence=96%]
Bangladesh Private Clinic and Diagnostic Owners Association
From support and confidence values it can be clearly
(BPCDOA), 8,000 diagnostic centers of the country have
identified that this test T3 has almost no impact of disease
DGHS approvals till December 2014.[32], [33], [34], [35]. So
diagnosis [36]. National awareness can be developed not to
minimum number of places where pathological tests are
perform the test at initial level for Young Males.
performed is 8717. If for simplicity of calculation we consider that average 500 patients go for diagnosis in each place every day and each patient go through 3 tests and each test has 5 attributes then:
In this way many other interesting patterns can be derived from National Health DW by using various data mining algorithms like association or clustering.
R= 8717 X 500 X 3 X 5 = 65377500 records per day. So more than 65 million records will be added in the fact tables of National Health OW. Considering 1 record takes O. IKE memory space than the DW will consume 6.23GB
memory each day. It is certainly falls under big data category [31] and Bangladesh Government should go for Cloud storage and services for this. A case of metadata generation for the test result of Urine
Colour Diagnosis is shown in Table I. This metadata is used in the normalization process of the data preprocessing step of National Health OW development.
VI. CONCLUSION This paper presented the developmental stages
of a DW
platform for the management, processing and analysis of large-scale Health data.
it summarizes the experience in
designing a national health data warehouse modeled for e health system. In this paper, widely accepted conceptual and logical
design
Considering
approaches
the
in
quality
OW
factors
design and
are
the
discussed. information
requirements, Star schema was chosen as the most suitable logical model for the purpose. Establishing a data warehouse gathering huge data from existing Health databases should give easier and better access to interesting data for both researchers,
health
service
providers and govt. authorities. In order to get maximum benefit
from
the
model
presented
in
this
research
in
be
Platt, S. Cohen, and W. a. Knaus, "Data mining and clinical data repositories: Insights from a 667,000 patient data set," Comput. BioI. Med.,vol. 36,pp. 1351-1377,2006.
There should be a document which clearly defines the
[15] K. Cios, "Uniqueness of medical data mining," Artificial intelligence in medicine, vol. 26,no. 1-2,pp. 1-24,2002.
Bangladesh,
the
conditions
mentioned
below
should
satisfied. I.
structure of the data tables currently used by the concerned
[16] A loham,"Discover Patterns in Adverse Drug Reaction," University of South Australia,B.Sc. Engineering Thesis,2010.
Pathological centers. 2. The stakeholders should clearly know what are the data retrieval operations going to be executed using the data warehouse. 3. There should be a cooperative mind among different health service providers to help the governmental bodies such as
Ministry of Health and Family Welfare and Ministry of
Posts, Telecommunications and Information Technology to enrich the health data warehouse so that the Govt. can achieve the
goal
of
Health,
Population
and
Nutrition
Sector
Development Program (HPNSDP 2011-2016). REFERENCES
[17] R. Agrawal, T. Imielinski, and A Swami, "Database Mining: A Performance Perspective," IEEE Transactions on Knowledge and Data Engineering 5(6): pp.914-925,1993. [18] A L. Houston, "Medical Data Mining on the Internet: Research on a Cancer Information System," Artificial Intelligence Review 13: pp.437466,1999. [19] X. Zhu, T. Khoshgoftaar, 1. Davidson, and S. Zhang, "Editorial: Special issue on mining low-quality data," Knowledge and Information Systems, vol. I I ,no. 2,pp. 131-136,2007. [20] M.L. Brown and J.F. Kros, "Data mining and the impact of missing data," Industrial Management and Data Systems, vol. 103, pp. 611-621, 2003. [21] N. Lavrai:, "Selected techniques for data mining in medicine, Artificial intelligence in medicine," vol. 16,no. I, pp. 3-23,1999.
[1]
U. M. Fayyad, G. P. Shapiro, and P. Smyth, "From Data Mining to Knowledge Discovery: An Overview," Advances in Knowledge Discovery and Data Mining,1-36. AAAI Press/MIT Press,1996.
[22] I.N. Lee,S.C. Liao,and M. Embrechts, "Data mining techniques applied to medical information," Medical Informatics and the Internet in Medicine,vol. 25,no. 2,pp. 81-102,2000.
[2]
R. Khosla and T. Dillon, "Knowledge Discovery, Data Mining and Hybrid Systems," In Engineering Intelligent Hybrid Multi-Agent Systems,Kluwer Academic Publishers pp.143-177,1997.
[23] M.K. Obenshain, "Application of Data Mining Techniques to Healthcare Data," Infection Control and Hospital Epidemiology, vo1.25, no 8, pp. 690-695,2004
[3]
J.F.Roddick, P. Fule, and W.J. Graco, Exploratory medical knowledge discovery: experiences and issues, SIGKDD Explor. Newsl., vol. 5, no. I, pp. 94-99,2003.
[24] H. Wang, and S. Wang, "Medical knowledge acquisition through data mining," ITME 2008. IEEE International Symposium on,Xiamen,2008.
[4]
AM. Wilson, L. Thabane and A Holbrook, Application of data mining techniques in pharmacovigilance, British Journal of Clinical Pharmacology,vol. 57,no. 2,pp. 127-134,2003.
[5]
[6]
[7]
W.H. Inmon,EIS and the data warehouse: a simple approach to building an effective foundation for EIS. Database Programming and Design, 5(11): 70-73,1992. N .Stolba, M. Banek, and AM. Tjoa, "The Security Issue of Federated Data Warehouses in the Area of Evidence-Based Medicine," Proc. of the First International Conference on Availability, Reliability and Security (ARES'06,IEEE),20-22 April,2006. T. R. Sahama, and P. R. Croll, "A Data Warehouse Architecture for Clinical Data Warehousing," In Roddick, J. F. and Warren, J. R., Eds. Proceedings Australasian Workshop on Health Knowledge Management and Discovery (HKMD 2007) CRPIT, 68, pages pp. 227-232, Ballarat, Victoria.
[25] S. Faisal, "Missing Data in Pathology Databases," MSc Thesis, Australian National University, 2011. [26] S. N. Partington, V. Papakroni, and T. Menzies, "Optimizing data collection for public health decisions: a data mining approach.," BMC Public Health, vol. 14,p. 593,2014. [27] J. J. Cubillas,M. 1. Ramos, F. R. Feito, and T. Urena, "An improvement in the appointment scheduling in primary health care centers using data mining.," J. Med. Syst.,Springer,vol. 38,p. 89,2014 [28] AS.M. L Hoque,S. Galib and M. Tasnim,"Mining Pathological Data to Support Medical Diagnostics," Proc.s of Workshop on Advances on Data Management: Applications and Algorithms, Department of Computer Science and Engineering,BUET,Dhaka,pages 71-74,2013. [29] S. Kumari and A Singh, "A data mining approach for the diagnosis of diabetes mellitus," IEEE 7th International Conference on Intelligent Systems and Control (ISCO),2013.
[8]
W. Inmon, Building the Data Warehouse, 3rd edition, Wiley-New York, 2002.
[30] N. Yilmaz,O. Inan, and M. S. Uzer, "A New Data Preparation Method Based on Clustering Algorithms for Diagnosis Systems of Heart and Diabetes Diseases," J. Med. Syst.,Springer,vol. 38,2014
[9]
J. A Lyman, K. Scully, and J.H. Harrison Jr, "The development of health care data warehouses to support data mining," Clin Lab Med. 2008 Mar;28(l):55-71,vi.
[31] M. Herland, T. M. Khoshgoftaar, and R. Wald, "A review of data mining using big data in health informatics," J. Big Data,Springer, vol. I, p. 2, 2014.
[10] R. Kush, L. Alschuler, R. Ruggeri, S. Cassells, N. Gupta, L. Bain, K. Claise, M. Shah, and M. Nahm, "Implementing Single Source: the STARBRITE proof-of-concept study," J. Am. Med. Inform. Assoc. 14: 662-673,2007.
[32] HEALTH BULLETIN 2014,2nd Edition, Directorate General of Health Services, Ministry of Health and Family Welfare, Government of the People's Republic of Bangladesh,December 2014.
[11] c.J. Brooks, J.W. Stephen, D.E. Price, D.V. Ford, R.A. Lyons, S.L. Prior, and S.c. Bain, "Use of a patient linked data warehouse to facilitate diabetes trial recruitment from primary care," Prim Care Diabetes. 2009 Nov;3(4):245-8. Epub 2009 lui 14. [12]
G. Fette, M. Ertl, A Worner, P. Kluegl, S. Stork, and F. Puppe, "Information Extraction from Unstructured Electronic Health Records and Integration into a Data Warehouse," unpublished
[13] S. Nugawela, "Data Warehousing Model For Integrating Fragmented Electronic Health Records From Disparate And Heterogeneous Clinical Data Stores," M.Sc. Thesis, Queensland University of Technology, 2013. [14] I. M. Mullins, M. S. Siadaty, J. Lyman, K. Scully, C. T. Garrett, W. Greg Miller, R. Muller, B. Robson, C. Apte, S. Weiss, 1. Rigoutsos, D.
[33] http://www.dghs.gov.bdlindex.php/enlhealth-program-progress/hpnsdp20 11-16/84-english-root/ehealth-eservice/497-hpnsdp-20 11-16-brief Accessed 20 Feb 2015 [34] http://www.bpcdoa.com/clinics_and_diagnostics.htmI Accessed 20 Feb 2015 [35] http://www.thefinancialexpress-bd.com/2014!l2!l5/71077/print Accessed 20 Feb 2015. [36] H. Jiawei, K. Micheline, and P../ian, Data Mining Concepts and Techniques 3rd Edition,2012,Elsevier.