Direct Identification of Bacteria in Blood Culture Samples using an ...

IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING

1

Direct Identification of Bacteria in Blood Culture Samples using an Electronic Nose Marco Trincavelli, Member, IEEE, Silvia Coradeschi, Amy Loutfi, Bo Söderquist, and Per Thunberg

Abstract—In this work we introduce a method for identification of bacteria in human blood culture samples using an electronic nose. The method uses features which capture the static (steadystate) and dynamic (transient) properties of the signal from the gas sensor array and proposes a means to ensemble results from consecutive samples. The underlying mechanism for ensembling is based on an estimation of posterior probability which is extracted from a support vector machine classifier. A large data set representing 10 different bacteria cultures has been used to validate the presented methods. The results detail the performance of the proposed algorithm and show that through ensembling decisions on consecutive samples significant reliability in classification accuracy can be achieved. Index Terms—Electronic nose, bacteria identification, sepsis.

I. I NTRODUCTION EPSIS also known as blood poisoning of septicaemia is caused by the presence of micro-organisms, predominantly bacteria in ciruculating blood. Rapid administration of efficient antibiotic treatment is crucial as sepsis can result in septic shock, multiple organ dysfunction and even death. The current standard procedure for diagnosis involves routine microbiological blood cultures. Such procedures can take at least 36 hours to several days before diagnosis can be made. Typically, automated blood culture monitoring systems are first used to incubate a blood culture and monitor the production or reduction of gases (this is normally done via non intrusive methods for detection of CO2 ). Once a sample indicates a change in gas tension, a lab technician will culture the sample on plates for further identification. This secondary sub-culturing may require up to 36 hours before a final identification can be made. Thus finding diagnostic methods that provide fast identification of the presence and the type of bacteria thereby allowing proper antibiotic treatment can provide an increased benefit to the patient as well as help to reduce cost to the health care system. In the past decade, compact gas sensors have been integrated into an instrument called an electronic nose which can provide fast (order of minutes) and non-intrusive measurement of gaseous agents. Within the medical field, electronic noses have been applied to the identification of bacterial agents in different contexts ranging from upper respiratory diseases [1], urine samples [2], and ENT bacteria [3]. The challenge of developing an electronic noses for bacteria identification

S

M. Trincavelli, S. Coradeschi and A. Loutfi are with the Center for Applied ¨ Autonomous Sensor Systems, School of Science and Technology, Orebro ¨ University, Orebro, SE - 70182, Sweden e-mail: [email protected]. B. ¨ Söderquist is with Orebro University Hospital and P. Thunberg is with the ¨ ¨ department of Medical Physics at Orebro University Hospital, Orebro, Sweden email:[email protected]

in blood cultures depends on a number of factors. Firstly, the way in which headspace sampling system introduces the samples to the sensor array have been shown in comparative studies to have an effect on identification rates, where samplers that control temperature and humidity provide better performance results [4]. Secondly, the blood in which the bacteria is sampled may differ from person to person and thus identification should be independent from the medium itself. Thirdly, the post-processing of the signals should be suited for the characteristics of the signals and application. In applications with a larger number of sensors, it is important to extract the relevant features from the signals that can be used for further processing. Although, trials have been made with a single sensor type for identification of bacteria in blood cultures [5], today’s off-the-shelf electronic nose devices contain anywhere from 4 to 32 sensors in an array. This allows for a wider range of odours to be detected. In this work, a new approach for classifying bacteria with an electronic nose is presented. The method evaluates the suitability of a given sample for classification by representing the output from a support vector machine (SVM) with a posterior probability estimation. This estimation is ensembled across ten consecutive reponses of the same sample in order to make the classification more reliable. An electronic nose containing 22 gas sensors with partial and overlapping selectivities and an automatic headspace sampler is used to regulate the samples. The data processing methods presented consist of extracting features that reside on the static (steady-state) and dynamic (transient) properties of the signal. These features are fed into a support vector machine and to the ensembling algorithm. The mechanism of ensembling is based on treating the posterior probabilities as a random sample and estimating the 95% confidence interval for the mean of the posterior of each class. A mean with significant superior confidence interval for a class is disjoint and above all the others then classification is performed (assigning the sequence of samples to that class); otherwise a rejection is declared. Identifications are made among 10 typical bacteria cultures leading to sepsis by directly sampling after the first stage incubation indicates a positive culture growth. II. E XPERIMENTAL M ETHOD A. Sample Preparation Bacteria isolates obtained from clinical samples were used. The bacteria were subcultured on blood agar plates and a bacterial suspension solution was adjusted at turbidity of standard MacFarland 1.0 (3x108 cfu/mL) in NaCl Solution.


2

TABLE I BACTERIA S PECIES C ODE AND TAXONOMY Species Code

Taxonomy

ECOLI

E.coli

PSAER

Pseudomonas aeruginosa

STA

Staphylococcus aureus

KLOXY

Klebsiella oxytoca

PRMIR

Proteus mirabilis

SRFCL

Enterococcus faecalis

STLUG

Staphylococcus lugdunensis

PASMU

Pasteurella multocida

HSA

Steptococcus pyogenes

HINFL

Hemophilus influenzae

After which, 2 ml of the bacteria suspension were added to an aerobic blood culture bottle (BD BACTECT M Plus + Aerobic/F) together with 5 ml of pooled human blood. Blood cultured bottles were incubated at 35◦ C for 24 hours. The bacterial isolates used along with the respective species codes are given in Table I. It is important to note that the viable count when the bottle is positive has not been established and therefore is not known at the time of sampling. In this work only one bacterial strain and one bacteria species is considered at the time as polyclonal or polymicrobial bacteremia is rare and multi-bacteria strains arising from e.g. wounds is beyond the scope of this study. B. Sampling Procedure The purpose of the sampling system is to transfer the headspace from the sample to the sensors without altering its composition and properties. The sampling system of the NST 3220 Emission Analyzer, Applied Sensors, Linköping, Sweden uses two adjacent needles, one ’in’ needle and one ’out’ needle. The sample gas drawn through the ’in’ needle is passively replaced with air through the ’out’ needle by the negative pressure created in the sample bottle as the headspace is removed. This ensure a minimal dilution factor, and therefore a minimal alteration of the properties, in particular when compared to systems that use a carrier gas. The sensor array of the NST Emission analyzer is composed of 10 MOS and 12 MOSFET sensors, for a total of 22 sensors. These two technologies complement each other enabling the array to discriminate between a much wider range of gases than by using just one sensor technology [6]. The sampling cycle, as in most e-nose based systems, is composed by three phases: baseline acquisition, odour sampling and recovery to initial state. In the baseline acquisition phase the sensor array is exposed to a reference gas (air in this case) for 10 seconds and the value of the sensors is recorded. During the odour sampling phases the headspace in the analysis bottle is injected into the sensor chamber for 30 seconds. After this, the sensors are exposed again to the reference gas for 260 seconds in order for the sensors to recover the value they had during the baseline acquisition phase. The total length of the sampling cycle is five minutes. The sampling cycle is repeated ten times in a row and

Fig. 1. Graphical interpretation of the two feature extraction methods used in this work.

we refer to a series of ten consecutive sampling cycles as a measurement. A measurement sequence is composed by one measurement for every type of bacteria. The whole data set is composed by 12 measurement sequences, 6 done with a first batch of bacteria cultures and six done with a second batch one week later. Blood samples within a batch came from the same source and different sources were used between batches. III. PATTERN R ECOGNITION A LGORITHM The pattern recognition algorithm is articulated into five phases, namely feature extraction, dimensionality reduction, classification, posterior probability estimation and ensembling decisions. The next subsections describe each of the phases of the algorithm. A. Feature Extraction In several works dealing with feature extraction from gas sensor response [7], [8], it has been demonstrated that features based on the dynamic of the signal can add information for the discrimination task. Therefore two different feature extraction methods are considered in this work: • Static Response: The static response of a sensor is defined as the difference between the value that the sensor has at the end of the sampling phase minus the baseline value for that sensor. • Response Derivative: The response derivative is calculated as the average of the derivative of the sensor during the first 3 seconds of exposition of the array to the headspace. Figure 1 gives a graphical interpretation of these two features. The extraction of these features generates a 22 dimensions feature vector in case only the static response is considered or 44 dimensions vector if both the static response and the response derivative are considered. B. Dimensionality Reduction When the dimensionality of the considered feature space is high and a finite number of samples is available, a good


estimation of the discriminant function is difficult to obtain. However, in many practical applications multivariate data in