HEALTH SCIENCE JOURNAL® Volume 6, Issue 4 (October – December 2012)
_ORIGINAL ARTICLE_
Application of Multiple Linear Regression Model through GIS and Remote Sensing for Malaria Mapping in Varanasi District, INDIA
Praveen Kumar Rai1, Mahendra Singh Nathawat2, Mohhamad Onagh1
1. Department of Geography, Banaras Hindu University, Varanasi, INDIA
2. School of Science, Indira Gandhi National Open University, New Delhi, INDIA
ABSTRACT
Background: The production of malaria maps relies on modeling to predict the risk for most of the area, because actual observations of malaria prevalence are usually known only at a limited number of specific locations. However, the estimation is complicated by the fact that there is often local variation of risk that cannot easily be accounted for by the known variables. An attempt has been made for Varanasi district to evaluate the status of malaria and to develop a model by which malaria-prone zones are predicted in five classes of relative malaria susceptibility, i.e. Very Low, Low, Moderate, High and Very High.
Methodology: Multiple linear regression models were built with the malaria cases reported in the study area as the dependent variable and various time-based groupings of average temperature, rainfall and NDVI data as the independent variables. GIS is used to investigate associations between such variables and the distribution of the different species responsible for malaria transmission. Accurate prediction of risk depends on knowledge of a number of variables related to malaria transmission, i.e. land use, NDVI, climatic factors, distance to existing government health centers, population, and distance to ponds, streams and roads. Climatic factors, particularly rainfall, temperature and relative humidity, are known to have a strong influence on the biology of mosquitoes. To produce the malaria susceptibility map with this method, the values of the quantitative and qualitative variables, sampled on 50×50 networks in the form of a 38622×9 matrix, were transferred from GIS software (ILWIS 3.4 and ARC GIS-9.3) into statistical software (SPSS).
Results: The percentage of malaria area is strongly related to the distance to health facilities. It is found that 4.77% of the malaria area lies within the 0-1000 m buffer distance to health facilities and 24.10% within the 6000-10000 m buffer distance. As the distance to health facilities increases, the malaria area also increases.
Key words: Multivariate linear regression, malaria susceptibility model, GIS, Remote Sensing, Varanasi.
CORRESPONDING AUTHOR
Praveen Kumar Rai, E-mail:
[email protected]
E-ISSN: 1791-809X
Health Science Journal © All rights reserved
www.hsj.gr
Quarterly scientific, online publication by Department of Nursing A’, Technological Educational Institute of Athens
INTRODUCTION
The representation and analysis of maps of disease-incidence data is a basic tool in the analysis of regional variation in public health. Tobler’s first law of geography, which states that “things that are closer are more related,” is central to core spatial analytical techniques as well as analytical conceptions of geographic space. In the case of disease spread, individuals near or exposed to a contagious person or a tainted environmental setting are deemed more susceptible to certain types of illnesses.1
The rapid urbanization in many parts of the world is changing the context for human population and their interaction with the natural ecosystem. To understand the complex nature of the malaria-mosquito-human relationship, it is necessary to identify the type of human migration, population growth, socio-economic status, behavior and environmental aspects around them. This requirement underscores the importance of human intervention that may affect the mosquito vector population and the intensity of parasitic transmission in endemic areas, whether in rural or urban settings. The key determinants of the outcome of malaria should be related to the human host, parasite, vector or environmental parameters. However, the relative importance of these factors has yet to be determined.2, 3
Malaria (marsh fever, periodic fever) is a parasitic disease that involves infection of the red blood cells (RBCs). Malaria in humans is caused by the transmission of one or more of four parasites of the class Sporozoa. The severity and clinical presentation of symptoms depend largely on the species of Plasmodium contracted.4 The four species responsible for infection are Plasmodium falciparum, Plasmodium vivax, Plasmodium ovale and Plasmodium malariae. These species vary widely with respect to geographic distribution, physical appearance and immunogenic potential. Malaria transmission depends on the diverse factors that influence the vectors, the parasites, the human hosts, and the interactions among them. These factors may include, among others, meteorological and environmental conditions. The most apparent determinants are the meteorological and environmental parameters, such as rainfall, temperature, humidity and vegetation.5,6 That there are so few examples of the use of epidemiological maps in malaria control may be explained by the lack of suitable, spatially defined data and of an understanding of how epidemiological variables relate to disease outcome.
However, recent evidence suggests that the clinical outcomes of infection are determined by the intensity of parasite exposure, and developments in geographical information systems (GIS) provide new ways to represent epidemiological data spatially. GIS software is being used to correlate the climatic attributes of the disease collection localities with the presence or absence of the various species.7,8 This computer-based technology has been available for a number of years, but only recently has it been widely appreciated as a powerful new tool to augment existing monitoring and evaluation methods for disease mapping.5, 9
Mathematical and statistical modules embedded in GIS enable the testing of hypotheses and the estimation, explanation, and prediction of spatial and temporal trends.10 Statistical techniques model the relation between parasitaemia risk and risk factors (environmental, possible interventions, socio-economic factors) via a model, such as multivariate linear regression, which is further used for prediction.11
GIS plays a variety of roles in the planning and management of the dynamic and complex healthcare system and in disease mapping. Although still at an early stage of integration into public healthcare planning, GIS has shown its capability to answer a diverse range of questions relating to the key goals of efficiency, effectiveness, and equity of the provision of public health services.12,13 Unquestionably, GIS will play a significant part in the reorganization of public health planning in the twenty-first century, especially in response to sweeping changes taking place in the handling of health information.14
Vegetation is often associated with vector breeding, feeding, and resting locations. A number of vegetation indices have been used in remote sensing and Earth science disciplines. The most widely used index is the Normalized Difference Vegetation Index (NDVI). It is simply defined as the difference between the near-infrared and the red bands normalized by twice the mean of these two bands. For green vegetation, the reflectance in the red band is low because of chlorophyll absorption, and the reflectance in the near-infrared band is high because of the spongy mesophyll leaf structure.1
Because malaria is vector-borne, there are many remotely sensed abiotic and biotic environmental variables that are relevant to the study of malarial transmission and habitat niches of the
vector. For example, the Normalized Difference Vegetation Index (NDVI) is a characterization of vegetative density based on the amount and wavelength of the radiation reflected by the leaves of a plant. When vegetation is photosynthetically active, it has a high reflectance in the near-infrared region of the spectrum and a low reflectance in the red portion of the spectrum. In an environment where vegetation is healthy and green, the leaves of the resident vegetation will absorb a significant percentage of the visible light produced by the sun.15 The more vigorous and denser the vegetation is, therefore, the higher the NDVI becomes. NDVI has also been used as a surrogate for rainfall estimates. It is an effective measure for arid or semi-arid regions. For tropical regions where ample rainfall is normally received, the vegetation index is a less sensitive measure for estimating rainfall. The mean vegetation index over a region reflects the degree of urbanization or lack of vegetation. In this sense, NDVI in a grid cell is used as an indicator for the mean level of vegetation present in the cell.
The present study illustrates how GIS allows integration of different data sets to arrive at holistic or aggregative solutions. It shows how the technique can help in generating additional indicators (of discrete and continuous nature) from remote sensing data and incorporate these into the core of the analysis. Indeed, the level of disaggregation at which the analysis has been undertaken in the present study is not possible through conventional methods.
Objectives of the Study
The main aim of this study is to develop a malaria distribution map and malaria susceptibility models using multivariate linear regression analysis through Remote Sensing data and GIS techniques.
Study Area
The study area is Varanasi district, U.P., lying in eastern Uttar Pradesh and extending from latitude 25°10’ N to 25°37’ N and from longitude 82°39’ E to 83°10’ E. Its major area stretches towards the west and north of Varanasi city and is spread over an area of 1454.11 sq. km (Fig.1). Administratively, the district comprises two tahsils, namely Pindra and Varanasi Sadar, which are further sub-divided into eight development blocks, namely Baragaon, Pindra, Cholapur, Chiraigaon, Harhua, Sevapuri, Araziline and Kashi Vidapeeth, altogether consisting of 1336 villages.
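As a minimal illustrative sketch (not the authors' workflow, which relied on ERDAS Imagine and GIS software), the NDVI definition given in the Introduction, the near-infrared minus the red reflectance divided by their sum, can be written in Python as follows. The function names, the handling of a zero denominator and the sample reflectance values are assumptions made for illustration only.

```python
def ndvi(nir, red):
    """NDVI for one pixel: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    if total == 0:
        return 0.0  # assumption: pixels with no signal are reported as 0
    return (nir - red) / total


def ndvi_grid(nir_band, red_band):
    """Apply ndvi() cell by cell to two equally sized 2-D bands."""
    return [[ndvi(n, r) for n, r in zip(nir_row, red_row)]
            for nir_row, red_row in zip(nir_band, red_band)]


def mean_ndvi(grid):
    """Mean NDVI over a grid, as an indicator of the mean vegetation level."""
    values = [v for row in grid for v in row]
    return sum(values) / len(values)


# Hypothetical reflectance values for a 2 x 2 block of pixels.
nir_band = [[0.50, 0.40], [0.30, 0.20]]
red_band = [[0.10, 0.10], [0.30, 0.20]]

grid = ndvi_grid(nir_band, red_band)
cell_mean = mean_ndvi(grid)
```

Pixels with high near-infrared and low red reflectance (the top row here) give values closer to 1, while pixels with equal reflectance in both bands give 0, which matches the description of dense versus sparse vegetation.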
Data Collection and Assessment of Used Parameters
Successful prediction of malaria occurrence and the production of a map of the malaria-prone areas call for the collection of the relevant spatial data. A number of thematic maps (referred to as data layers in GIS) on specific parameters related to the occurrence of malaria, such as distance to water bodies, distance to river, distance to hospital, rainfall, temperature, land use/land cover and NDVI, have been generated (Fig. 2). A malaria susceptibility map (i.e. malaria susceptibility index) and malaria susceptibility zones have also been prepared. The basic data sources used to generate these layers include IRS-1C LISS III data of year 2008 and SOI topographic maps (1:50,000 scale). Census data of year 2001 were also used; from these, the projected population of year 2009 was calculated and used to derive the population density of year 2009. Besides, field surveys have been carried out to verify the condition of ponds/water tanks and the health facilities in PHCs/CHCs and government hospitals. Malaria data of year 2009 were used for this study. These data were taken from the District Malaria Office, Varanasi. The above data sources have been used to generate various thematic data layers. Climate data not only for Varanasi district but also for various neighboring districts/places such as Patna and Gaya were used for comparative variation and for interpolation of the trend of rainfall and temperature for Varanasi district.
Methodology
A number of thematic maps (referred to as data layers in GIS) on specific parameters related to the occurrence of malaria, viz. land use, NDVI, distance to water ponds/tanks, distance to river, distance to road, distance to hospital, rainfall, temperature and expected population density of year 2009, have been generated (Fig 2). In this study, Ilwis Version-3.4, ArcGIS Version-9.3 and ERDAS Imagine Version-9.1 software were used to produce the layer maps that assist in the production of the malaria susceptibility maps. The topographic map of the study area at 1:50,000 scale was used to digitize district and development block boundaries. The coordinates of important georeference points, such as road junction points, malaria-prone areas and existing health care facility units, were measured during the field surveys using Global Positioning System (GPS)
technology. In the measurement phase, one receiver served as a base station, while the other was used to collect GPS data at the selected ground control points. To establish the relationship between object space and image space, the ground control points were selected in the model area to conduct all measurements in the National Coordinate System. The vector maps were produced from the IRS LISS-III remote sensing data and the SOI topographical map. The land use map, NDVI and vector layers of water bodies and other important parameters used in this study were delineated in ERDAS Imagine-9.1 and ARC GIS-9.3 software. For the GIS platform, geo-referenced digital maps of development blocks/districts were used. In order to carry out multivariate analysis of the data and to determine all the parameters responsible for malaria in the study area, multiple linear regression has been used. Multiple linear regression models were built with the malaria cases reported in the study area as the dependent variable and various time-based groupings of temperature, rainfall and NDVI data as the independent variables. The multiple linear regression method reveals how the susceptibility of malaria changes as the standard deviation of the independent variables and predictors changes. Furthermore, it helps to build an equation and linear function (model) for malaria susceptibility in the intended study area. All these parameters were analysed in SPSS statistical software using the multiple linear regression model and crossed with each other, and finally the Malaria Susceptibility Index (MSI) and Malaria Susceptibility Zonation (MSZ) were produced.
In this study, the equation of the theoretical model is described as follows.
$$L = B_0 + B_1 X_1 + B_2 X_2 + B_3 X_3 + \dots + B_m X_m + \varepsilon$$
Where L is the occurrence of malaria in each unit, the X’s are the input independent variables (or instability parameters) observed for each mapping unit, the B’s are coefficients estimated from the data through statistical techniques, and ε represents the model error.16 To produce the malaria susceptibility map with this method, the values of the quantitative and qualitative variables, sampled on 50×50 networks in the form of a 38622×9 matrix, have been transferred from GIS software (ILWIS 3.4 and ARC GIS-9.3) into statistical software (SPSS).
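The regression step described above can be sketched in Python, assuming a small table of grid cells in place of the 38622×9 matrix. This is an illustration only: the authors fitted the model in SPSS, the predictor values and function names below are hypothetical, and the equal-interval split into five susceptibility classes is an assumed scheme, since the text does not state how the class breaks were chosen.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_mlr(x_rows, y):
    """Least-squares estimate of [B0, B1, ..., Bm] via the normal equations."""
    design = [[1.0] + list(row) for row in x_rows]  # leading 1 gives B0
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coeffs, row):
    """Evaluate L = B0 + B1*X1 + ... + Bm*Xm for one mapping unit."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], row))


CLASSES = ["Very Low", "Low", "Moderate", "High", "Very High"]


def classify(value, lo, hi):
    """Equal-interval binning into five classes (an assumed scheme)."""
    if hi == lo:
        return CLASSES[0]
    idx = int((value - lo) / (hi - lo) * 5)
    return CLASSES[min(max(idx, 0), 4)]


# Hypothetical grid cells: (rainfall, temperature, NDVI) in arbitrary units.
x_rows = [(1, 2, 3), (2, 1, 0), (3, 5, 1), (4, 2, 2), (5, 3, 6), (6, 1, 4)]
# Synthetic response built as L = 1 + 2*X1 - 1*X2 + 0.5*X3 (no error term).
y = [2.5, 4.0, 2.5, 8.0, 11.0, 14.0]

coeffs = fit_mlr(x_rows, y)                      # estimated B0..B3
msi = [predict(coeffs, row) for row in x_rows]   # index value per cell
labels = [classify(v, min(msi), max(msi)) for v in msi]
```

Because the synthetic response contains no error term, the fitted coefficients should reproduce the values used to build it; with real data the residual ε would be non-zero and the coefficients would be estimates.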
Discussion
Malaria exists in every tropical and subtropical landscape across the globe, sometimes making seasonal excursions into temperate areas as well.15 The protozoan parasites that cause it have
more complex genomes, metabolisms and life cycles than almost any other vector-borne threat. This complexity makes them a difficult target for interventions such as drugs and vaccines because the parasite’s shape-shifting ways allow it to evade chemical and immunological defenses. They pose a moving target as well, intentionally changing their outer coating during each phase of their life cycle, and creating a diverse antigenic and metabolic wardrobe through sexual recombination, a creation unavailable to simpler microbes such as viruses and bacteria.17
The origin and subsequent spread of malaria have a close relation with time and geographic location. If disease [data are recorded] in space/location and time and they contain essential disease attributes, the spatial distribution and temporal characteristics of the disease spread may be monitored and visualized for probable intervention. With the availability of disease spread models, the contagious process may be dynamically simulated and visualized in two or three dimensional spatial scales.18 Consequently, population groups may be identified and visually located while the spatial distribution of the disease may be uncovered. More effective prevention decisions may be made by the government and public health institutions through better allocation of medical resources by using the network analysis models of a GIS.19
Malaria Influencing Data Layers
The regression coefficients of this model (multiple linear regression) are given in Table 1.
Rainfall
Rainfall is considered to be the most important malaria triggering parameter causing soil saturation and a rise in pore-water pressure. However, there are not many examples of the use of this parameter in stability zonation, probably due to the difficulty in collecting rainfall data for long periods over large areas. After interpolation between the amounts of annual rainfall at the stations in the study area, the isohyet map was created. Finally, this map was grouped into five classes to prepare the rainfall data layer (Fig. 2a). It was verified that approximately the maximum [...] occurred in the >984 mm rainfall class. In the >984 mm rainfall class, 30.81% of the malaria area falls in the very low and low zones, whereas 6.29% of the malaria area was calculated for the very high zones but in
[...] 10000m buffer distance of stream, and in this zone only the very low and moderate categories are available. This is mainly because of the influence of some of the other indicators/variables; at this distance, malaria indicators or breeding sources do not have much influence on people. It was also found in this study that, as the distance to rivers/streams increases, the percentage of malaria-affected area in the high and very high categories decreases, from 8.95% within the 0-1000m buffer zone to 2.97% in the 6000-9000m buffer zone (Table 1).
Distance to Road
Similar to the effect of the distance to streams, distance to road is also one of the important parameters, used to estimate the distance of roads from existing health care facilities in the study area. Five different buffer areas were created along the path of the road to determine the effect of the road on the malaria disease (Fig. 2e). The malaria area percentage in each buffer zone is given in Table 1 and shows that 63.83% of the malaria area [...] only very low and low categories calculated, which is 0.03% (Table 1). It is also seen that, as the distance to road increases, the malaria area percentage shows a decreasing trend.
Distance to Health Facilities
Health facilities of the Varanasi district are based mainly on the modern allopathic system of treatment. To know the distributional pattern of health care facilities, data were collected from the CMO office and government hospitals located in rural areas of Varanasi district. The existing health facilities in both rural and urban areas were surveyed with the help of Leica DGPS. There are different categories of health centre providing infrastructure and treatment in the district. The PHCs are dotted across the district at intervals of 10-20 km, and the tahsil hospitals are located about 50 km apart. The hierarchical distribution of the medical centres of the district bears a close relationship with the hierarchy of central places and the population size of the settlement. Besides, the transport network has also influenced the growth of health care
facilities. The percentage of malaria area is strongly related to the distance to health facilities (Fig. 2f). Table 1 shows that 4.77% of the malaria area lies within the 0-1000 m buffer distance to health facilities and 24.10% within the 6000-10000 m buffer distance. Table 1 also shows that, as the distance to health facilities increases, the malaria area increases, except in the >10000 m buffer zone (only 7.71% of the malaria area). Here, in these areas, malaria breeding sources may not be developed as [...]
[...] the area is mainly sourced from heaps of garbage. The solid and liquid wastes generated by household and industrial activities are dumped and released at uncontrolled sites. These wastes are disposed of in the low-lying areas where the tanks and ponds are located; because of this, malaria vectors develop very easily and many cases of malaria [...] polluted pond water. In this, it was found that 44.7% of malaria occurred within