Predicting pandemics and forecasting epidemics based on influenza mortality and morbidity data

Hans WACKERNAGEL¹, Christian LAJAUNIE¹, Magali LEMAITRE², Huey Chyi LEE¹,³,⁴, Thomas ROMARY¹, Fabrice CARRAT², Laurent BERTINO³, Alex COOK⁴

¹ Equipe de Géostatistique, Centre de Géosciences, MINES ParisTech
² INSERM UMR-S 707
³ Mohn-Sverdrup Center / NERSC
⁴ National University of Singapore

eVITA Meeting - Bergen, 25 January 2010

The eVITA EnKF project

Geostatistics group • MINES ParisTech

Pre-diction or pre-vision?

Prédire: to predict, to foretell.
Prévoir: to forecast, to foresee.

(German: Vorhersage, Voraussicht)

Influenza epidemics

1) Long term prediction (mortality): extreme value analysis of epidemics

2) Short term forecast (morbidity): data assimilation by particle filtering

Extreme value analysis of US influenza excess mortality data

Magali Lemaitre¹, Geffroy Brandicourt¹,², Hans Wackernagel²

¹ Unité 707, INSERM
² Equipe de Géostatistique, MINES ParisTech

(with input from: Fabrice CARRAT, Mark WILSON)

Extreme value theory

Statistics of extremes concerns modelling risks from rare events with potentially large impacts:
- environmental hazards: rain, snow, storms, hurricanes, earthquakes, typhoons, high tides, ...
- structural failures: bridges, dams, oil rigs, ...
- reliability: mechanical failure due to corrosion etc.
- finance: market crashes.

Statistical modelling of extremes relies on the limit distributions of maxima. These distributions all belong to one family: the Generalized Extreme Value distribution, with 3 parameters (location, scale, shape).
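For reference, the Generalized Extreme Value family is usually written as below. This is the standard textbook parameterization, not a formula taken from the slides; the symbols \mu (location), \sigma (scale) and \xi (shape) are notation assumed here.

```latex
\[
G(z) = \exp\left\{ -\left[ 1 + \xi \left( \frac{z - \mu}{\sigma} \right) \right]^{-1/\xi} \right\},
\qquad 1 + \xi \, \frac{z - \mu}{\sigma} > 0, \quad \sigma > 0 ,
\]
\[
\text{with the Gumbel limit as } \xi \to 0: \qquad
G(z) = \exp\left\{ -\exp\left[ -\frac{z - \mu}{\sigma} \right] \right\}.
\]
```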

Decreasing mortality (all causes)

Mortality rates (all causes) are standardized by age, taking the 2000 US population as reference. The effect of an aging population has thus been removed. Mortality is diminishing due to improving health care, hygiene, etc.
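As an illustration of what age standardization does, here is a minimal sketch assuming direct standardization (age-specific rates weighted by the reference population's age structure). The age groups and all counts are hypothetical, and the slides do not state which standardization procedure was actually used.

```python
def age_standardized_rate(deaths, population, reference_population):
    """Directly standardized mortality rate per 100,000 (assumed method)."""
    total_ref = sum(reference_population.values())
    rate = 0.0
    for group, ref_count in reference_population.items():
        # Age-specific death rate in the study population
        age_specific = deaths[group] / population[group]
        # Weight it by the share of this age group in the reference population
        rate += age_specific * (ref_count / total_ref)
    return rate * 100_000


# Hypothetical counts, for illustration only
deaths = {"0-64": 200, "65+": 800}
population = {"0-64": 100_000, "65+": 20_000}
reference = {"0-64": 90_000, "65+": 10_000}

crude_rate = 100_000 * sum(deaths.values()) / sum(population.values())
standardized_rate = age_standardized_rate(deaths, population, reference)
```

With these made-up numbers the standardized rate is lower than the crude rate, because the study population has a larger share of elderly people than the reference population.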

Mortality in California (all causes, not standardized)

Mortality in New York (all causes, not standardized)

Excess US influenza mortality

Epidemiology is intimately linked to demography. The yearly excess mortality due to influenza has been computed following Simonsen et al. (1997, 2005).

The maximum monthly excess influenza mortality was extracted for the epidemic season in each year (block maxima approach).
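The block maxima step, and the return level it leads to, can be sketched as follows. This is only an illustration: the season labels, the monthly values and the GEV parameters are hypothetical, and the maximum-likelihood fit of the parameters is not shown. The return-level expression is the standard GEV quantile formula.

```python
import math


def block_maxima(monthly_excess):
    """Largest monthly excess mortality per epidemic season (one block per season)."""
    return {season: max(values) for season, values in monthly_excess.items()}


def gev_return_level(mu, sigma, xi, period_years):
    """Level exceeded by the seasonal maximum with probability 1/period_years."""
    y = -math.log(1.0 - 1.0 / period_years)
    if abs(xi) < 1e-12:
        # Gumbel limit (shape parameter equal to zero)
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


# Hypothetical monthly excess deaths for three seasons
monthly_excess = {
    "1990/91": [120, 950, 2100, 800],
    "1991/92": [90, 400, 1500, 1700],
    "1992/93": [50, 3000, 2200, 600],
}
maxima = block_maxima(monthly_excess)

# Hypothetical GEV parameters (in practice estimated from the maxima)
z20 = gev_return_level(mu=2000.0, sigma=600.0, xi=0.1, period_years=20)
```

By construction, the fitted GEV distribution function evaluated at `z20` equals 1 - 1/20, which is what "20-year return level" means.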

Return levels for 20 years

Excess influenza mortality in US (left) and France (right)

[Figure: negative log-likelihood (NLLH) plotted against the 20-year return level (excess flu mortality), with the 90%, 95% and 99% levels marked on each panel.]

Flu epidemics and meteorological conditions

by Etienne GOUDAL, Christian LAJAUNIE, Hans WACKERNAGEL, Laurent BERTINO

Links between the onset of influenza epidemics and meteorological parameters were analyzed using exploratory statistics, generalized linear models (GLM), logistic regression, and support vector machines (SVM).

Data from FP7 ENSEMBLES project

6-hourly data from 1/1/1984 to 31/8/2002 (ERA40, ECMWF)

ILI data from Rhône-Alpes region

ILI cases against temperature and humidity

ILI cases against temperature and humidity: deviations from seasonal means

Logistic regression

The probability of being in an epidemic state is modelled as:

\[
\operatorname{logit}\{P(Z)\} = \log\left( \frac{P(Z)}{1 - P(Z)} \right) = Z w
\]

where P(Z) = P(Y = 1 | Z) is the probability that a given day is epidemic. Binary data are used, with the first 14 years as training period and the subsequent 3 years for validation.
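The logit model can be evaluated for a single day as in the sketch below. The predictors in `z` (an intercept, a temperature anomaly, a humidity anomaly) and the weights in `w` are hypothetical values chosen for illustration, not fitted coefficients from the study.

```python
import math


def epidemic_probability(z, w):
    """P(Y = 1 | Z = z) under the model logit{P(Z)} = Z w."""
    score = sum(zi * wi for zi, wi in zip(z, w))
    return 1.0 / (1.0 + math.exp(-score))


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


# Hypothetical weights: intercept, temperature anomaly, humidity anomaly
w = [-1.0, -0.3, 0.05]
# Hypothetical day: intercept term, 4 degrees colder than usual, 10 points more humid
z = [1.0, -4.0, 10.0]

p = epidemic_probability(z, w)
is_epidemic = p > 0.5  # the 0.5 threshold is an assumption, not taken from the slides
```

Applying `logit` to `p` recovers the linear score Z w, which is the defining property of the model.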

Logistic regression using daily data

Logistic regression from temperature (left) and seasonal mean temperature (right)

Logistic regression with all meteorological parameters

Relatively good fit at the onset of epidemics

Support vector machines (SVM)

- Machine learning: the machine “learns” from part of the data, then validates on the rest of it.
- Classification: a given day is classified as epidemic or non-epidemic.
- Possibility to use a large number of predictors and a great many data, with reasonable computing times.

SVM: temperature only

                        Observations
Validation        Non-epidemic    Epidemic
Non-epidemic          1896           301
Epidemic               126           161

SVM: all parameters

                        Observations
Validation        Non-epidemic    Epidemic
Non-epidemic          1714           189
Epidemic               308           273

More epidemics, but also more false epidemics!
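The trade-off in the two tables can be quantified with standard detection scores. The sketch below assumes the "Validation" rows are the SVM classification and the "Observations" columns are the observed state; this reading is consistent with the column totals being identical in both tables, but it is an interpretation of the slide layout.

```python
def detection_scores(table):
    """Scores from table[predicted][observed] counts (assumed orientation)."""
    tn = table["non-epidemic"]["non-epidemic"]
    fn = table["non-epidemic"]["epidemic"]   # missed epidemic days
    fp = table["epidemic"]["non-epidemic"]   # false epidemics
    tp = table["epidemic"]["epidemic"]       # detected epidemic days
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }


temperature_only = {
    "non-epidemic": {"non-epidemic": 1896, "epidemic": 301},
    "epidemic": {"non-epidemic": 126, "epidemic": 161},
}
all_parameters = {
    "non-epidemic": {"non-epidemic": 1714, "epidemic": 189},
    "epidemic": {"non-epidemic": 308, "epidemic": 273},
}

scores_temperature = detection_scores(temperature_only)
scores_all = detection_scores(all_parameters)
```

Under this reading, sensitivity rises from about 0.35 to about 0.59 when all parameters are used, while precision falls from about 0.56 to about 0.47.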

Conclusions

- A relation between the meteorological conditions and the onset of epidemics could not be established so far.
- SVM results remain to be improved, perhaps by using a different kernel?
- A more explicit treatment of the seasonality of meteorological conditions is needed.

H1N1 2009 epidemic in Singapore

Alex COOK's web site (National University of Singapore)

http://www.stat.nus.edu.sg/staff/alexcook/flu/flu.html

Bayesian version of the particle filter programmed by Huey Chyi LEE (Honours thesis) in collaboration with MINES ParisTech. Alex COOK and Mark CHEN (NUS) developed the epidemic model.

ILI cases per day per doctor

[Figure: estimated ILIs per GP (vertical axis) by date, June to October.]

Estimated ILI cases per family doctor

[Figure: daily reported ILIs per GP (vertical axis) by date, June to November.]

Reported (red) & forecast (black/grey) ILI cases

Total ILI patients seeking treatment (per day)

[Figure: daily predicted ILIs in thousands (vertical axis) by date, June to November.]

Estimated & forecast total number of patients (in thousands) with ILI seeking treatment per day, aggregated across all private and poly-clinics. The weekend effect has been removed for easier interpretation.

Proportion of population infected or recovered

[Figure: predicted population infected (vertical axis, 0% to 30%) by date, June to November.]

Estimated and forecast total number of people who:
1) are currently symptomatic,
2) have recovered,
3) had pre-existing immunity (a very small proportion).

Conclusion and Perspectives

Conclusion

The particle filter can be an efficient tool for the early detection of a change in epidemic state. Improvements were examined (by Huey Chyi LEE in Bergen/Fontainebleau, 2009):
- in the particle filter algorithm (resampling algorithms),
- in the underlying SIR model and its parameterization.

An important side-product of the system is that it provides an estimate of the total number of infected people at a given time for a given region. As a scenario simulator it can also provide an error estimate in the assessment of the severity of an epidemic.
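To make the idea concrete, here is a minimal sketch of a generic bootstrap particle filter on a discrete-time SIR model with systematic resampling. It is not the Bayesian filter or the epidemic model used in the project: the noise model, the Poisson reporting assumption, every parameter value (`beta`, `gamma`, `report_rate`, `population`) and the observed counts are hypothetical.

```python
import math
import random


def sir_step(state, beta, gamma, population, rng):
    """One day of an SIR model with lognormal noise on transmission (assumed)."""
    s, i, r = state
    noisy_beta = beta * math.exp(rng.gauss(0.0, 0.2))
    new_inf = min(s, noisy_beta * s * i / population)
    new_rec = gamma * i
    return (s - new_inf, i + new_inf - new_rec, r + new_rec), new_inf


def poisson_logpmf(k, lam):
    """Log-probability of k reported cases given a Poisson mean lam."""
    lam = max(lam, 1e-9)
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def systematic_resample(particles, weights, rng):
    """Draw len(particles) particles with one uniform offset and even spacing."""
    n = len(particles)
    u = rng.random() / n
    out, cum, j = [], weights[0], 0
    for k in range(n):
        target = u + k / n
        while cum < target and j < n - 1:
            j += 1
            cum += weights[j]
        out.append(particles[j])
    return out


def particle_filter(observations, n_particles=500, population=10_000,
                    beta=0.5, gamma=0.25, report_rate=0.1, seed=1):
    """Return the filtered mean number of infected people after each observation."""
    rng = random.Random(seed)
    particles = []
    for _ in range(n_particles):
        i0 = rng.uniform(1.0, 20.0)  # uncertain initial number of infected
        particles.append((population - i0, i0, 0.0))
    estimates = []
    for y in observations:
        moved, logw = [], []
        for p in particles:
            new_state, new_inf = sir_step(p, beta, gamma, population, rng)
            moved.append(new_state)
            logw.append(poisson_logpmf(y, report_rate * new_inf))
        top = max(logw)  # subtract the maximum for numerical stability
        w = [math.exp(l - top) for l in logw]
        total = sum(w)
        w = [x / total for x in w]
        estimates.append(sum(wi * p[1] for wi, p in zip(w, moved)))
        particles = systematic_resample(moved, w, rng)
    return estimates


# Hypothetical daily reported case counts
observed_cases = [1, 1, 2, 2, 3, 4, 5, 7, 9, 11]
estimates = particle_filter(observed_cases)
```

The list `estimates` plays the role of the side-product mentioned above: a running estimate of the number of infected people, here for a toy population.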

Perspectives I: Inclusion of climate parameters. Alternative filters

Seasonal influenza: characterize the meteorological configurations leading to an outbreak. Possible links between reanalysed meteorological data and observed incidences have been explored (Etienne GOUDAL). Should relevant climatic parameters be included in the epidemics' early detection system? As the dimensionality of the system grows, the particle filter is in danger of becoming impracticable. Alternative: the Ensemble Kalman Filter.
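For comparison with the particle filter, the analysis step of a stochastic Ensemble Kalman Filter (perturbed observations) can be sketched as follows, for a single scalar observation of one state component. The two-component state, the ensemble size, the observation value and its error variance are all hypothetical; this is a textbook-style illustration, not the project's implementation.

```python
import random


def enkf_analysis(ensemble, y_obs, obs_var, obs_index=0, rng=None):
    """Stochastic EnKF update for one scalar observation of one state component."""
    rng = rng or random.Random(0)
    n = len(ensemble)
    dim = len(ensemble[0])
    mean = [sum(m[d] for m in ensemble) / n for d in range(dim)]
    hx = [m[obs_index] for m in ensemble]  # observed component of each member
    hx_mean = mean[obs_index]
    var_hx = sum((h - hx_mean) ** 2 for h in hx) / (n - 1)
    # Sample covariance between each state component and the observed component
    cov_x_hx = [
        sum((m[d] - mean[d]) * (h - hx_mean) for m, h in zip(ensemble, hx)) / (n - 1)
        for d in range(dim)
    ]
    gain = [c / (var_hx + obs_var) for c in cov_x_hx]  # Kalman gain
    updated = []
    for m, h in zip(ensemble, hx):
        # Each member assimilates its own perturbed copy of the observation
        innovation = y_obs + rng.gauss(0.0, obs_var ** 0.5) - h
        updated.append([m[d] + gain[d] * innovation for d in range(dim)])
    return updated


# Hypothetical prior ensemble: [observed quantity, correlated unobserved quantity]
rng = random.Random(42)
prior = []
for _ in range(200):
    infected = rng.gauss(50.0, 10.0)
    prior.append([infected, 0.5 * infected + rng.gauss(0.0, 1.0)])

posterior = enkf_analysis(prior, y_obs=80.0, obs_var=4.0, rng=rng)
```

Unlike the particle filter, the update shifts every ensemble member instead of reweighting and resampling, so the unobserved component is corrected through its sample covariance with the observed one.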

Perspectives II: Other diseases

The system can easily be adapted to handle diseases other than influenza:
- gastro-enteritis, chickenpox (also monitored by Sentinelles)
- bacterial meningitis (sub-Sahelian zone: meningitis belt; China)
- dengue (Singapore)
- ...
