Modeling Testing and Discharge Decisions for

0 downloads 0 Views 226KB Size Report
19 Mar 2008 - control-limit policies that are dependent on the patient's length of stay in the hospital. ... Department of Industrial Engineering, University of Pittsburgh,. Pittsburgh, PA .... such as myopic policies and active learning policies.
Modeling Testing and Discharge Decisions for Pneumonia and Sepsis Patients Jennifer E. Kreke Andrew J. Schaefer





Matthew D. Bailey

Derek C. Angus

§



Mark S. Roberts



March 19, 2008

Abstract Sepsis, the leading cause of death among critically ill patients, is the twelfth-leading cause of death in the U. S. Current medical research, focused on understanding the disease from a molecular level, is exploring the correlation between various inflammatory markers and patient survival. In this paper, we present a partially observed Markov decision process (POMDP) model that utilizes these results to explore the decision of when to test for additional cytokine information in order to gain a better understanding of the patient’s true health state. These testing decisions then influence the decision of when to discharge the patient from the hospital, expanding our previous work in this area. We parameterize this model using patient-based disease progression data from a large-scale clinical study in order to characterize optimal testing and discharge policies for various problem instances. In particular, we demonstrate multi-region control-limit policies that are dependent on the patient’s length of stay in the hospital.

KEYWORDS: partially observable Markov decision process, medical decision making, sepsis ∗ Department of Industrial Engineering, University of Pittsburgh, Pittsburgh, PA 15261, [email protected] † Management Department, Bucknell University, Lewisburg, PA 17837, [email protected] ‡ Corresponding Author. Department of Industrial Engineering, University of Pittsburgh, Pittsburgh, PA 15261, [email protected] Phone: 412-624-5045 Fax:412-624-9831 § CRISMA Laboratory, Department of Critical Care Medicine, University of Pittsburgh, Pittsburgh, PA 15261, [email protected] ¶ Department of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, [email protected]

1

1

Introduction

Sepsis, an overwhelming inflammatory response to infection (Bone et al. 1992), develops in approximately 750,000 people annually in the United States. More than one third of these cases result in death, mainly due to an improperly regulated inflammatory response, making sepsis one of the top causes of death overall (Anderson 2002) and the leading cause of death in critically ill patients (Angus et al. 2001). Sepsis is an acute disease with patient death or recovery occurring within a short period of time from the onset of the disease. The effective management of sepsis during this time results from appropriate prevention, testing, and treatment strategies. This paper considers patients that are admitted to the hospital, diagnosed with community-acquired pneumonia (CAP), and at some point during their treatment, develop sepsis. This paper assumes that diagnosis and treatment decisions, other than those explicitly considered in the model, follow standard care. In the treatment of sepsis, antibiotics may be used initially to treat the underlying infection. Unfortunately, factors such as polymicrobial infections and antimicrobial-resistant organisms make prompt diagnosis and treatment of infection difficult (Eggimann and Pittet 2001). Organ support therapies such as fluid replacement, mechanical ventilation, and blood transfusions, may also be needed throughout the treatment process to prevent organ failure. Unfortunately, research efforts into new treatment options have met with limited success. Out of the hundreds of experimental medications tested to control the body’s inflammatory response, only one was a qualified success (Bone 1996, Burchardi and Schneider 2004, Marshall 2000, Zeni et al. 1997). These failures are mainly due to the lack of understanding of sepsis at the molecular level (Wheeler and Bernard 1999). To this end, a multi-center study led by the University of Pittsburgh School of Medicine is currently investigating genetic and inflammatory markers of sepsis in a study called GenIMS (2007). The GenIMS investigation focuses on the leading cause of sepsis, community-acquired pneumonia. A key aim of the GenIMS investigation is to demonstrate that the levels of certain cytokines (inflammatory markers) are strongly correlated with patient survival. It has been discovered that in some cases, a patient may appear to be well; however, the patient’s cytokine levels indicate that the patient has a higher probability of death should the patient be discharged from the hospital as compared to if the patient were to remain in the hospital receiving standard treatment (Kellum et al. 2007). As it is widely believed that cytokines are correlated with patient survival, it is clear that knowing the levels of these cytokines will change how the clinician makes decisions, for example, when to discharge the patient from the hospital. A difficulty arises in that the clinician typically does not know the patient’s true cytokine levels with complete certainty for at least two reasons. First, if the clinician decides not to order tests in a given period, then the clinician will not have any information on the patient’s partially observable health state, causing uncertainty. Second, even if the clinician orders a test and observes its result, there may be error associated with the result. Test error can be interpreted in 2

two different ways. First, the result itself may be inaccurate from a measurement standpoint. Accuracy in this case refers to the sensitivity and specificity of the test. Alternatively, the interpretation of the test results may be inaccurate. In other words, even if the test result is numerically correct, the interpretation of the numerical result may not be completely accurate. In previous work, we explored the hospital discharge decision using a Markov Decision Process (MDP) approach (Kreke et al. 2007), where we were able to demonstrate that hospital discharge policies for patients with sepsis typically follow a non-stationary control limit-type strategy, based on the patient’s Sepsisrelated Organ Failure Assessment (SOFA) score (Vincent et al. 1996). The SOFA score was developed by the Working Group on Sepsis-related Problems of the European Society of Intensive Care Medicine to describe quantitatively the degree of organ dysfunction/failure over time (Vincent et al. 1996). A control-limit strategy for hospital discharge can be interpreted as follows: continue treating a patient in the hospital using standard methods of care until the patient’s SOFA score improves to the time-varying control limit score, then release the patient from the hospital. These types of policies are useful in that they provide structure to an otherwise ad-hoc process that stems from physician intuition and clinical uncertainty (Burns and Wholey 1991, McCormick et al. 1999). The lack of standardized discharge policies has resulted in substantial differences in length of stay between and within institutions (Cleary et al. 1991, Fine et al. 1993), mainly due to concerns about the morbidity and mortality associated with premature hospital discharge. One limitation of our MDP model was that it considered a completely observable patient health state, the SOFA score. In many medical applications, particularly those involving a testing process, the patient’s true health state cannot actually be observed directly by the clinician. Instead the clinician relies on the results of various tests to predict the value of the patient’s underlying true health state. Extending our previous MDP model, we consider a Partially Observable Markov Decision Process (POMDP) model (Monahan 1980, 1982, Smallwood and Sondik 1973) to explore how testing decisions influence the hospital discharge decision. The POMDP model extends the MDP model by allowing the health state to be partially observable. Instead of using the completely observable SOFA score, the POMDP model presented in this paper models the patient health state as the value of a single cytokine level, interleukin-6 (IL-6), and considers the hospital discharge decision in addition to the decision of when to test for the patient’s true IL-6 level. A model discussing the combination of multiple state variables, such as SOFA and IL-6, is discussed by Kreke (2007). Unfortunately, the data to calibrate such a model are not available at this time, so exploration of this model is left to future research. Because a test result may not be received in every stage and the test results that are received may either be inaccurate themselves or interpreted inaccurately, the model uses a belief vector to describe a probability distribution over the possible true patient health states. The values of the vector correspond to the clinician’s belief that the patient’s true health is in each of the possible states. The model utilizes an observation probability matrix to relate the ob3

served values to the underlying health state. Then, using an initial estimate of the belief variable, the current test results, and knowledge of the last action taken, Bayesian updating is used to form a new estimate of the belief vector. The clinician’s decision is made based on the value of this belief vector at each decision point. Hauskrecht and Fraser (2000) present a POMDP model for the management of patients with ischemic heart disease. The authors construct a hierarchical Bayesian belief network based on data from the literature and clinical expertise to represent the disease dynamics. Their work demonstrates the ability of the modeling framework to provide clinical insight, but they cite increasing computational complexity as the model size increases. Peek (1999) presents an influence diagram to describe the underlying relationships among state variables in patients with ventricular septum defect, a disorder with characteristics for congenital heart disease. He utilizes this diagram to construct a POMDP model that considers the various treatment decisions such as when to perform a chest X-ray and when to perform surgery. While the model captures many aspects of the disease and its treatment, the author does not present structural policy results or solutions obtained from patient-based data. Hu et al. (1996) present a POMDP model to determine efficient dosage policies from medical drug therapy. Specifically, they consider the effects of various information gathering policies, such as myopic policies and active learning policies. Computational results are presented to compare policy types and a passive information gathering strategy is suggested for use in clinical practice. We extend these previous modeling efforts by considering a model that is calibrated with clinical data from a large-scale trial. We consider the trade-offs between model complexity and data availability to both formulate and solve realistic models given the current data constraints. These models provide realistic policies that can begin to provide structured guidelines for clinical practice. While previous medical applications of POMDPs have presented decisionmaking frameworks, the models have not led to implementable results, largely due to insufficient data availability. As additional data become available, many of the models and techniques in the literature will prove to be increasingly valuable to clinical practice. The models presented in this paper attempt to incorporate sufficient complexity to capture the dynamic nature of disease progression, while still allowing for solutions to be calibrated with available data. The remainder of this paper is organized as follows: Section 2 formally describes the problem and presents the model formulation; structural results describing model results with respect to changes in test cost and accuracy are provided in Section 3; data sources, problem instances, and results for various computational experiments are presented and discussed in Section 4; finally, Section 5 concludes the paper with directions for future research.

4

2

Problem Description

This model considers the clinician’s decisions of when to test for cytokine information and when to discharge the patient from the hospital. Due to the short time horizon of the disease, this model considers a finite treatment horizon of N days as measured from hospital admission. It is assumed that patients that have not been discharged by day N , must be discharged from the hospital. A finite observation horizon of T days, as measured from hospital admission, is used to accurately model expected patient survival. We assume that due to the acute nature of the disease, the only cause of death possible within this observation horizon is sepsis (Angus 2000, Clermont et al. 2003). Based on conventions in the literature (Clermont et al. 2003, 2002, 2004, Halm et al. 1998, Kasal et al. 2004, Kellum et al. 2007, Kreke et al. 2007, Saka et al. 2007), we use the values N = 30 days and T = 90 days. The objective of this model is to maximize the patient’s expected survival over the observation horizon. Due to the finite observation and treatment horizons, the model is formulated as a finite-horizon POMDP, where the patient’s true health state can only be observed through an inaccurate testing procedure that measures the value of a single cytokine level. At each decision point, the clinician can decide to continue treating the patient in the hospital using standard care without testing (C), continue treating the patient in the hospital using standard care and also order a cytokine test (O), or discharge the patient from the hospital without testing (D). Other than these decisions, we assume that the patient is being treated according to standard methods of care. In this model, the patient’s true health state, the IL-6 level, can take on one of two values: low (L) or high (H), following the threshold utilized by Kellum et al. (2007). A high cytokine value is correlated with a high probability of patient death within 90 days of admission (Kellum et al. 2007). The observed cytokine level can also take on one of these same two values, L or H. Test sensitivity and specificity are modeled through an observation probability distribution that relates the observed value to the patient’s true underlying cytokine level. In the model we assume that the patient’s cytokine level cannot be known without testing, testing is done only when action O is chosen, and test results are received at the beginning of the next time period. Since a test result may not be received in every stage and the test results that are received may either be inaccurate themselves or interpreted inaccurately, a belief variable is used to describe the probability that the patient’s true cytokine level is low. The corresponding probability that the patient’s true health state is high can then be calculated from the value of this belief variable. Using an initial estimate of the belief variable, the current test result, and knowledge of the last action taken, Bayesian updating is used to form a new estimate of the probability that the cytokine level is low. The clinician’s decision is made based on the value of this belief variable at each decision point.

5

2.1

Model formulation

The following notation is used: N = {1, 2, . . . , N }: discrete stages at which a decision must be made by the clinician, where N is the treatment horizon. If a patient has not died and has not been discharged by stage N , it is assumed that the patient is discharged at stage N . We define a stage as one day; however, the model is flexible enough to consider smaller time intervals (hours, for example) as the data for solving such a model become available. T : the observation time horizon used to measure patient survival from hospital admission. yt : a scalar describing the true value of the patient’s cytokine level at stage t. It is assumed that yt can take on one of three values: low (L), high (H), or dead. ot : the observed value of the patient’s cytokine level at stage t. If the patient has died, then ot = dead; otherwise, if a test is ordered, it is assumed that ot can take on one of two values: low (L) or high (H). When no test result was ordered at time t − 1 or at time t = 0, ot = ∅. at : the action taken at stage t. Possible actions are to continue treating the patient in the hospital without ordering a cytokine test (C), to continue treating the patient in the hospital and order a cytokine test (O), or to discharge the patient from the hospital (D). It is assumed that when a cytokine test result is ordered, the test result is received at the beginning of the next period before the next decision is made. Therefore, the patient cannot be discharged before the next period. If the decision is made to discharge the patient from the hospital, the patient transitions out of the model. pt (yt+1 |yt , at ): the probability that the patient’s true health state is yt+1 at stage t + 1 given that at stage t, the patient’s true health state was yt , and action at was chosen. Note that pt (dead|dead, ·) = 1. z(ot+1 |yt+1 , at ): the test accuracy, i.e., the probability of observing cytokine level ot+1 at stage t + 1 when the patient’s true cytokine level is yt+1 at stage t + 1 and action at was chosen at stage t. Note that z(dead|dead, ·) = 1 and that z(L|dead, ·) = z(H|dead, ·) = z(∅|dead, ·) = 0. πt : the probability that the patient’s true cytokine level is low (yt = L) at stage t given that the patient is still alive at stage t. γt (ot+1 |πt , at ): the probability of observing cytokine level ot+1 at stage t + 1 given that the belief variable was πt and action at was taken at stage t. U (πt+1 |ot+1 , πt , at ): the updating function used to update the belief variable πt to πt+1 based on ot+1 , the observation at stage t + 1, πt , the belief variable at stage t, and at , the action taken at stage t. c: a scalar representing the cost of ordering a cytokine test (converted to patient life days using methods from cost-effectiveness analysis (Gold and Moellering 1996)). ft (yt , D): the expected (T − t)-day survival (in patient life days) of a patient that is discharged from the hospital at stage t with true health state yt . Note that ft (dead, D) = 0.

6

rt (πt , D): the expected (T − t)-day survival (in patient life days) of a patient that is discharged from the hospital at stage t with belief variable πt , where rt (πt , D) = ft (L, D)πt + ft (H, D)(1 − πt ). ft (yt , O): the expected reward (in patient life days) received for keeping a patient with true health state yt at stage t in the hospital for one more stage and ordering a cytokine test. Note that ft (dead, O) = 0. rt (πt , O): the expected reward (in patient life days) received for keeping a patient with belief variable πt at stage t in the hospital for one more stage and ordering a cytokine test, where rt (πt , O) = ft (L, O)πt + ft (H, O)(1 − πt ). ft (yt , C): the expected reward (in patient life days) received for keeping a patient with true health state yt at stage t in the hospital for one more stage. Note that ft (dead, C) = 0. rt (πt , C): the expected reward (in patient life days) received for keeping a patient with belief variable πt at stage t in the hospital for one more stage, where rt (πt , C) = ft (L, C)πt + ft (H, C)(1 − πt ). fN (yt ): the expected (T − N )-day survival of a patient that is discharged from the hospital at stage N with true health state yt . Note that fN (dead) = 0. rN (πN ): the expected (T − N )-day survival of a patient that is discharged from the hospital at stage N with belief variable πN , where rN (πN ) = fN (L)πN + fN (H)(1 − πN ). Vt (πt ): the value function used to calculate the total expected reward (in patient life days) at stage t when the patient is alive and in health state πt , where Vt∗ (πt ) denotes the optimal value function value. A∗t (πt ): the set of optimal actions at stage t when the system is in state πt , where a∗t (πt ) ∈ A∗t (πt ) is an action that maximizes the value function Vt (πt ). After action at is taken at stage t with current belief variable πt , an immediate expected reward rt (πt , at ) is received. If at = D, the patient is discharged and receives an expected reward, rt (πt , D). If at = O, the clinician orders a cytokine test, the patient receives an expected reward rt (πt , C) (in patient life days), and the patient’s cytokine level transitions to a new value, which includes the possibility of patient death. Finally, if at = C, no test is ordered, the patient receives an expected reward rt (πt , C) (in patient life days), and the patient’s core health state transitions to a new value, which includes the possibility of patient death. If a patient dies (i.e., yt+1 = ot+1 = dead), it is assumed that death occurs at the beginning of the next stage and that the patient exits the model and does not accumulate any future rewards. If the patient does not die, the belief variable, πt , is updated from stage t to stage t + 1 using the updating function, U (πt+1 |ot+1 , πt , at ), which performs the update using the observation ot+1 from stage t + 1 as follows: U (πt+1 |ot+1 , πt , at ) ( =

z(ot+1 |L)[pt (L|L,at )πt +pt (L|H,at )(1−πt )] , γt (ot+1 |πt ,at )

pt (L|L, at )πt + pt (L|H, at )(1 − πt ),

where, γt (ot+1 |πt , at ) 7

if ot+1 ∈ {H, L}; if ot+1 = ∅.

(1)

=

X

z(ot+1 |yt+1 , at )[pt (yt+1 |L, at )πt + pt (yt+1 |H, at )(1 − πt )].

yt+1 ∈{L,H,dead}

Let the optimal value function, Vt∗ (πt ), be the total expected reward for a living patient with belief variable πt for stage t onward. Vt∗ (πt ) can then be defined recursively as follows: VN∗ (πN ) = rN (πN ), for all πN ∈ Π,

(2)

  rt (πt , D), P ∗ rt (πt , O) − c + ot+1 ∈{L,H} γt (ot+1 |πt , O)Vt+1 (U (πt+1 |ot+1 , πt , O)), Vt∗ (πt ) = max  ∗ rt (πt , C) + γt (∅|πt , C)Vt+1 (U (πt+1 |∅, πt , C)). (3) for all πt ∈ [0, 1] and t = 1, . . . , N − 1. The structural results presented in Section 3 and the computational experiments described in Section 4 provide insight into the value of test accuracy and cost as they relate to prolonging the expected life of a patient.

3

Structural Results

This section presents structural results for the POMDP model pertaining to changes in test cost and accuracy. The purpose of this analysis is to provide insight into the robustness of the model results for a range of model parameters, as can be expected in practice. Relevant proofs are provided in the Appendix. Assumption 1 ft (L, D) ≥ ft (H, D) for all t ∈ N . Assumption 1 states that the (T − t)-day expected survival for a patient discharged in health state L is at least as great as the (T − t)-day expected survival for a patient discharged in health state H. Assumption 2 ft (yt , C) = ft (yt , O) = 1 for all yt ∈ Y and for all t ∈ N . Assumption 2 states that the immediate reward received for keeping the patient in the hospital, with or without ordering a cytokine test, is one more day of patient life. Future research could consider other reward functions, such as incorporating the cost of care or using quality-adjusted survival. Definition 1 (Barlow and Proschan 1965) An N ×N transition probability matrix P (t), with entries [P (t)]yj , is said to be IFR (Increasing Failure Rate) if the PN rows of P (t) are in nondecreasing stochastic order, that is, z(y) = j=k [P (t)]yj is monotonically nondecreasing in y for k = 1, . . . , N . Assumption 3 The cytokine transition probability matrix P (t), with entries [P (t)]yj = pt (j|yt , C) = pt (j|yt , O), is IFR for all t ∈ N .

8

From Definition 1, Assumption 3 implies that for two patients with cytokine levels L and H, it is more likely that the patient with cytokine level H will either stay in the H state or die than it is that the patient in the L state will transition to the H state or die. In other words, sicker patients are more likely to progress to being even sicker than are healthier patients. Assumption 4 The test has symmetric accuracy ρ, i.e., z(L|L) = z(H|H) = ρ. Assumption 5 Tests are more accuracte than inaccurate, i.e., ρ ∈ ( 21 , 1]. Under similar assumptions, it has been shown that the POMDP value function, Vt∗ (πt ), is continuous, nondecreasing, and convex in πt , πt ∈ [0, 1] (Monahan 1980, Smallwood and Sondik 1973). To incorporate the added impact of testing costs we extend the results of Monahan. As a result, we present the following new result with respect to the ordering of the value function. In addition, we characterize the changes in the testing regions under varying costs and testing accuracies in subsequent corollaries. Let Vt∗ (πt , c) denote the optimal value function value at stage t when the testing cost is c and the belief variable is πt . Theorem 1 Vt∗ (πt , c0 ) ≤ Vt∗ (πt , c00 ) for c0 ≥ c00 and for all t ∈ N . The ordering shown in Theorem 1 leads to Corollary 1, which shows that as the testing cost decreases, the optimal testing region does not decrease. Corollary 1 If it is optimal to test in state πt at stage t when the testing cost is c0 , then it is also optimal to test when the testing cost is c00 ≤ c0 . Corollary 1 provides the intuitive result that when it is optimal to test for a given test cost c0 , then it is also be optimal to test for a lower test cost c00 given that all other model parameters remain the same. Exploring the effects of test accuracy on the optimal value function value, let Vt∗ (πt , ρ) denote the optimal value function value at stage t when the test accuracy is ρ and the belief variable is πt . Monahan (1980) demonstrates an ordering on the value function with respect to test accuracy. Extending this result, Corollary 2 shows that as testing accuracy increases, the optimal testing region does not decrease. Corollary 2 If it is optimal to test in state πt at stage t when the test accuracy is ρ0 , then it is also optimal to test when the test accuracy is ρ00 ≥ ρ0 . In Corollary 2 we see that if it is optimal to test for a given test accuracy ρ0 , then it is also optimal to test under a test with improved accuracy. The specific results described in Corollaries 1 and 2 are demonstrated through the computational experiments described in Section 4.

9

4

Computational Results

The problem instances described in this section were solved by discretizing the state space and using backward induction (Monahan 1982).

4.1

POMDP data sources

The GenIMS trial data contains static and dynamic variables for 2320 patients that were identified as potentially having community-acquired pneumonia (CAP). Of these patients, 2032 were admitted to the hospital and went on to develop varying degrees of sepsis. The computational experiments presented in this section utilize a sample of 1096 patients, with 936 patients being excluded from the GenIMS in-patient cohort because they did not have at least two consecutive days of cytokine test results. Static variables such as age and race are provided for each patient.

4.2

Interleukin-6

The following computational experiments utilize the cytokine interleukin-6 (IL6) to represent the observed value of the patient’s health state. IL-6 has been shown to be a predictor of severe sepsis and death through studies conducted as part of the GenIMS investigation (Kellum et al. 2007). In particular, it has been shown that elevated concentrations of this cytokine were higher for those patients that died following severe sepsis compared to those who survived (Kellum et al. 2007). Based on the results of this study, the problem instances tested in this dissertation considered an IL-6 level of 5.9 pg/mL or less to be low and all levels greater than this value to be high (Kellum et al. 2007).

4.3

Problem instances

As age and race have been determined to be significant predictors of patient mortality (Johnston 2005, Kasal et al. 2004), these static variables are used to define problem instances following the conventions used in (Clermont et al. 2004, Kreke et al. 2007, Saka et al. 2007). Due to the sparsity of cytokine data, problem instances were constructed for the cohort consisting of Caucasian patients age 65 and older. As additional data become available, the model can be used for other age and race cohorts. Due to issues with data sparsity when attempting to develop non-stationary probability matrices, it is assumed that the transition and observation probabilities are stationary; however, the patient’s (T − t)-day expected survival is calculated as a function of the patient’s length of stay in the hospital. As additional controlled trials are conducted to study the role of inflammatory markers in predicting sepsis progression and survival, the availability of additional data will allow for the consideration of increasingly complex models, such as those with multiple non-stationary components.

10

Table 1: Problem Instances Tested Using the POMDP Model

A B C D E F G H I

z(L|L) = z(H|H) 1 1 1 0.95 0.95 0.95 0.90 0.90 0.90

c 0 0.5 1 0 0.5 1 0 0.5 1

Table 1 defines nine problem instances by their observation probabilities and testing costs. These instances were tested using the general model. Following Assumption 4, z(L|L) = z(H|H). Further investigation into changes in test specificity versus sensitivity is left to future research.

4.4

Results for various testing costs and accuracy levels

Figure 1 displays the optimal discharge and testing policy regions for the nine problem instances described in Table 1. Recall that these problem instances were all solved for the cohort consisting of Caucasian patients age 65 and older. In Figure 1, the problem instances are presented such that test accuracy increases from bottom to top and testing cost increases from left to right. Three values of test accuracy are considered: 1 (completely accurate), 0.95, and 0.9. Three values of test cost (in life days) are considered: 0 (no cost), 0.5, and 1. Note that the optimal policy on day 29 is the same for each problem instance. This policy can be interpreted as: If the belief variable is less than 0.37, keep the patient in the hospital for one more day and then discharge on day 30; otherwise, discharge the patient from the hospital on day 29. There is no testing region on day 29 since the model assumes that a patient that is still in the hospital on day 30 must be discharged from the hospital. A test result received on day 30 (for a test ordered on day 29) would not change this decision and is therefore not necessary. Since testing is not involved at the end of the horizon, the nine problem instances presented are the same on day 29 looking forward and therefore have the same optimal policy. For the previous stages, however, the varying test costs and accuracies impact the resulting policies, as can be seen in the differing optimal policies for each problem instance for days 1 through 28. The convergence of each policy to 0.37 on day 29 can be attributed to these end-of-horizon effects. These regions can be further explained as follows: For very low values of the belief variable, the decision maker believes with a high probability that the patient’s cytokine level is high and it is therefore not necessary to order a test for 11

Figure 1: Optimal Policy Regions for Each POMDP Problem Instance

12

more information. This additional information would not change the decision to keep the patient in the hospital, so a test is not ordered. Similarly, for very low values of the belief variable, the decision maker is fairly certain that the patient is healthy enough to be discharged from the hospital, and it is therefore not necessary to order a test for more information. This additional information would most likely not change the decision to discharge the patient from the hospital, so a test is not ordered. Notice that these observations are true even for the case when there is no testing cost and no test error. The Continue and Discharge regions increase in size as the associated testing cost increases, confirming the results of Corollary 1 in Section 3. These regions decrease in size as the associated testing accuracy increases, confirming the results of Corollary 2 in Section 3. The center testing region is highly dependent on both the cost of ordering a test and the accuracy of the test results. The testing region is largest for Problem Instance A in which there is no testing cost and the test results are completely accurate. For Problem Instance I, on the other hand, in which the testing cost is highest and the test accuracy is lowest, the testing region is very narrow. The testing region appears to be less affected when only one parameter, cost or accuracy, is changed, but decreases dramatically as both become more unfavorable.

5

Conclusions

The results presented in this paper demonstrate the potential value of inexpensive, accurate testing procedures as well as accurate interpretation of test results. More importantly, however, is the suggestion that even in light of completely accurate and cost-free tests, it is not optimal to test all the time. This at first seems to be counterintuitive until one considers that additional information is only needed if it will change the current decision. Through these results, it is clear that unless a patient’s health state is on the borderline of deciding to discharge the patient, then it is not necessary to gather more information by ordering a cytokine test. While this model considers an individual patient’s perspective, one must also consider the amount of time the clinician and other health care providers spend in administering, processing, and analyzing test results. Avoiding unnecessary tests will help to minimize health care costs. As additional data become available through large-scale clinical trials like GenIMS, models such as the POMDP presented in this paper may be able to inform clinical practice by providing policy results in addition to suggesting general strategies. From this study, however, it is clear that the POMDP framework can be used to solve medical decision making questions in which aspects of the patient’s health state can only be observed through a costly or inaccurate testing procedure. Future research efforts could include the addition of multiple variables in the state description, including both completely observable and partially observable elements.

13

6

Acknowledgments

J. Kreke was supported by a Graduate Fellowship through the AT&T Labs Fellowship Program. A. Schaefer, M. Roberts and D. Angus were supported by National Institute of General Medical Services grant R01 GM61992. A. Schaefer was also supported by grant CMMI-0546960 from the National Science Foundation. The authors would like to thank the GenIMS investigator team for their cooperation and assistance and for providing the data for this research effort. The authors gratefully acknowledge the helpful comments of Lisa Maillart, Murat Kurt and Burhan Sandikci from the University of Pittsburgh.

References Anderson, R.N. 2002. Deaths: Leading causes for 2000. National Vital Statistics Reports 50(16) 1–86. Angus, D.C. 2000. Study design issues in sepsis trials. Sepsis 4(1) 7–13. Angus, D.C., W.T. Linde-Zwirble, J. Lidicker, G. Clermont, J. Carcillo, M.P. Pinsky. 2001. Epidemiology of severe sepsis in the United States: Analysis of incidence, outcome, and associated costs of care. Critical Care Medicine 29(7) 1303–1310. Barlow, R.E., F. Proschan. 1965. Mathematical Theory of Reliability. John Wiley and Sons, New York. Bone, R.C. 1996. Why sepsis trials fail. Journal of the American Medical Association 276(7) 565–566. Bone, R.C., R.A. Balk, F.B. Cerra, R.P. Dellinger, A.M. Fein, W.A. Knaus, R.M. Schein, W.J. Sibbald. 1992. Definitions for sepsis and organ failure and guidelines for the use of innovative therapies in sepsis. the ACCP/SCCM Consensus Conference Committee. American College of Chest Physicians/Society of Critical Care Medicine. Chest 101(6) 1644–1655. Burchardi, H., H. Schneider. 2004. Economic aspects of severe sepsis: A review of intensive care unit costs, cost of illness and cost effectiveness of therapy. Pharmacoeconomics 22(12) 793–813. Burns, L., D. Wholey. 1991. The effects of patient, hospital, and physician characteristics on length of stay and mortality. Medical Care 29(3) 225–271. Cleary, P.D., S. Greenfield, A.G. Mulley. 1991. Variation in length of stay and outcomes for six medical and surgical conditions. Journal of the American Medical Association 266(1) 73–89. Clermont, G., D.C. Angus, K.G. Kalassian, W.T. Linde-Zwirble, N. Ramakrishnan, P.K. Linden, M.R. Pinsky. 2003. Reassessing the value of short-term mortality in sepsis: Comparing conventional approaches to modeling. Critical Care Medicine 31(11) 2627–2633. Clermont, G., D.C. Angus, W.T. Linde-Zwirble, M.F. Griffin, M.J. Fine, M.R. Pinsky. 2002. Does acute organ dysfunction predict patient-centered outcomes? Chest 121(6) 1963–1971. Clermont, G., V. Kaplan, R. Moreno, J.L. Vincent, W.T. Linde-Zwirble, B. Van Hout, D.C. Angus. 2004. Dynamic microsimulation to model multiple outcomes in cohorts of critically ill patients. Intensive Care Medicine 30(12) 2237–2255.

14

Eggimann, P., D. Pittet. 2001. Infection control in the ICU. Chest 120(6) 2059–2093. Fine, M.J., D.E. Singer, A.L. Phelps, B.H. Hanusa, W.N. Kapoor. 1993. Differences in length of hospital stay in patients with community-acquired pneumonia: A prospective four hospital study. Medical Care 31(4) 371–380. GenIMS. 2007. http://genims.org/. Gold, H.S., R.C. Moellering. 1996. Antimicrobial-drug resistance. New England Journal of Medicine 335 1445–1453. Halm, E.A., M.J. Fine, T.J. Marrie, C.M. Coley, W.N. Kapoor, D.S. Obrosky, D.E. Singer. 1998. Time to clinical stability in patients hospitalized with communityacquired pneumonia. Journal of the American Medical Association 279(18) 1452–1457. Hauskrecht, M., H. Fraser. 2000. Planning treatment of ischemic heart disease with partially observable Markov decision processes. Artificial Intelligence in Medicine 18(3) 221–244. Hu, C., W.S. Lovejoy, S.L. Shafer. 1996. Comparison of some suboptimal control policies in medical drug therapy. Operations Research 44(5) 696–709. Johnston, J.A. 2005. Determinants of mortality in patients with severe sepsis. Medical Decision Making 25(4) 374–386. Kasal, J., Z. Jovanovic, G. Clermont, L.A. Weissfeld, V. Kaplan, R.S. Watson, D.C. Angus. 2004. Comparison of Cox and Gray’s survival models in severe sepsis. Critical Care Medicine 32(3) 700–707. Kellum, J.A., L. Kong, M.P. Fink, L.A. Weissfeld, D.M. Yealy, M.P. Pinsky, J. Fine, A. Krichevsky, R.L. Delude, D.C. Angus. 2007. Understanding the inflammatory cytokine response in pneumonia and sepsis: Results of the GenIMS study. Archives of Internal Medicine 167(15) 1655–1663. Kreke, J. 2007. Modeling disease management decisions for patients with pneumoniarelated sepsis. Ph.D. thesis, University of Pittsburgh. Kreke, J.E., M.D. Bailey, A.J. Schaefer, D.C. Angus, M.S. Roberts. 2007. Modeling hospital discharge policies for patients with pneumonia-related sepsis. Tech. rep., University of Pittsburgh. To appear in IIE Transactions. Marshall, J.C. 2000. Clinical trials of mediator-directed therapy in sepsis: What have we learned? Intensive Care Medicine 26 S75–83. McCormick, D., M.J. Fine, C.M. Coley, T.J. Marrie, J.R. Lave, D.S. Obrosky, W.N. Kapoor, D.E. Singer. 1999. Variation in length of hospital stay in patients with community acquired pneumonia: Are shorter stays associated with worse medical outcomes? American Journal of Medicine 107(1) 5–12. Monahan, G.E. 1980. Optimal stopping in a partially observable Markov process with costly information. Operations Research 28(6) 1319–1334. Monahan, G.E. 1982. A survey of partially observable Markov decision processes: Theory, models, and algorithms. Management Science 28(1) 1–16. Peek, N.B. 1999. Explicit temporal models for decision-theoretic planning of clinical management. Artificial Intelligence and Medicine 15(2) 135–154. Saka, G., J.E. Kreke, A.J. Schaefer, D.C. Angus, M.S. Roberts. 2007. Use of dynamic microsimulation to predict disease progression in patients with pneumoniarelated sepsis. Critical Care 11(3) R65.

15

Smallwood, R.D., E.J. Sondik. 1973. The optimal control of partially observable Markov decision processes over a finite horizon. Operations Research 21(5) 1071–1088. Vincent, J.-L., R. Moreno, J. Takala, S. Willatts, A. De Medonca, H. Bruining, C.K. Reinhart, R.M. Suter, L.G. Thijs. 1996. The SOFA (Sepsis-related Organ Failure Assessment) score to describe organ dysfunction/failure. On behalf of the Working Group on Sepsis-Related Problems of the European Society of Intensive Care Medicine. Intensive Care Medicine 22(7) 707–710. Wheeler, A.P., G.R. Bernard. 1999. Current concepts: Treating patients with severe sepsis. The New England Journal of Medicine 340(3) 207–214. Zeni, F., B.F. Freeman, C. Natanson. 1997. Anti-inflammatory therapies to treat sepsis and septic shock: A reassessment. Critical Care Medicine 25(7) 1095–1100.

7

Appendix

Proofs of Statements Proof of Theorem 1: (By induction) From (2) it is known that VN∗ (πN , c0 ) = VN∗ (πN , c00 ) = rN (πN ). Assuming that Vk∗ (πk , c0 ) ≤ Vk∗ (πk , c00 ) for k = t + 1, . . . , N − 1, it suffices to show that Vt∗ (πt , c0 ) ≤ Vt∗ (πt , c00 ) . Following from (3), it follows that   rt (πt , D), P ∗ rt (πt , O) − c0 + ot+1 ∈{L,H} γt (ot+1 |πt , O)Vt+1 (U (πt+1 |ot+1 , πt , O), c0 ), Vt∗ (πt , c0 ) = max  ∗ 0 rt (πt , C) + γt (∅|πt , C)Vt+1 (U (πt+1 |∅, πt , C), c ). and   rt (πt , D), P ∗ ∗ 00 rt (πt , O) − c00 + ot+1 ∈{L,H} γt (ot+1 |πt , O)Vt+1 (U (πt+1 |ot+1 , πt , O), c00 ), Vt (πt , c ) = max  ∗ 00 rt (πt , C) + γt (∅|πt , C)Vt+1 (U (πt+1 |∅, πt , C), c ). The induction assumption implies that ∗ ∗ Vt+1 (U (πt+1 |ot+1 , πt , O), c00 ) ≥ Vt+1 (U (πt+1 |ot+1 , πt , O), c0 )

(4)

∗ ∗ Vt+1 (U (πt+1 |∅, πt , C), c00 ) ≥ Vt+1 (U (πt+1 |∅, πt , C), c0 ).

(5)

and Vt∗ (πt , c0 )

Suppose that Vt∗ (πt , c0 ). Next, suppose that

= rt (πt , D). By definition,

X

Vt∗ (πt , c0 ) = rt (πt , O)−c0 +

Vt∗ (πt , c00 )

≥ rt (πt , D) =

∗ γt (ot+1 |πt , O)Vt+1 (U (πt+1 |ot+1 , πt , O), c0 ).

ot+1 ∈{L,H}

By definition, Vt∗ (πt , c00 ) ≥ rt (πt , O)−c00 +

X

∗ γt (ot+1 |πt , O)Vt+1 (U (πt+1 |ot+1 , πt , O), c00 )

ot+1 ∈{L,H}

16

X

≥ rt (πt , O) − c0 +

∗ γt (ot+1 |πt , O)Vt+1 (U (πt+1 |ot+1 , πt , O), c00 )

(6)

∗ γt (ot+1 |πt , O)Vt+1 (U (πt+1 |ot+1 , πt , O), c0 )

(7)

ot+1 ∈{L,H}

X

≥ rt (πt , O) − c0 +

ot+1 ∈{L,H}

= Vt∗ (πt , c0 ), where (6) follows from the fact that c0 ≥ c00 and (7) follows from the induction assumption (4). ∗ Finally, suppose that Vt∗ (πt , c0 ) = rt (πt , C)+ γt (∅|πt , C)Vt+1 (U (πt+1 |∅, πt , C), c0 ). By definition, ∗ Vt∗ (πt , c00 ) ≥ rt (πt , C) + γt (∅|πt , C)Vt+1 (U (πt+1 |∅, πt , C), c00 ) ∗ (U (πt+1 |∅, πt , C), c0 ) = Vt∗ (πt , c0 ), ≥ rt (πt , C) + γt (∅|πt , C)Vt+1

(8)

where (8) follows from the induction assumption (5) and the desired result follows. 2 Proof of Corollary 1: Given πt and testing cost c0 , a∗t = O and (3) imply that X ∗ rt (πt , O) − c0 + γt (ot+1 , O|πt )Vt+1 (U (πt+1 |ot+1 , πt , O)) ≥ rt (πt , D) ot+1 ∈{L,H}

and that X

rt (πt , O) − c0 +

∗ γt (ot+1 |πt , O)Vt+1 (U (πt+1 |ot+1 , πt , O))

ot+1 ∈{L,H} ∗ ≥ rt (πt , C) + γt (∅|πt , C)Vt+1 (U (πt+1 |∅, πt , C)).

Note that c00 ≤ c0 and Theorem 1 imply that X ∗ rt (πt , O) − c00 + γt (ot+1 |πt , O)Vt+1 (U (πt+1 |ot+1 , πt , O)) ot+1 ∈{L,H}

X

≥ rt (πt , O) − c0 +

∗ γt (ot+1 |πt , O)Vt+1 (U (πt+1 |ot+1 , πt , O)).

ot+1 ∈{L,H}

Therefore, rt (πt , O) − c00 +

X

∗ γt (ot+1 |πt , O)Vt+1 (U (πt+1 |ot+1 , πt , O)) ≥ rt (πt , D)

ot+1 ∈{L,H}

and rt (πt , O) − c00 +

X

∗ γt (ot+1 |πt , O)Vt+1 (U (πt+1 |ot+1 , πt , O))

ot+1 ∈{L,H} ∗ ≥ rt (πt , C) + γt (∅|πt , C)Vt+1 (U (πt+1 |∅, πt , C)),

and the result follows.

2

17

Theorem 2 (Monahan 1980) Vt∗ (πt , ρ0 ) ≤ Vt∗ (πt , ρ00 ) for ρ0 ≤ ρ00 and for all t ∈ N. Proof of Corollary 2: Given πt and test accuracy ρ0 , a∗t = O and (3) imply that X ∗ rt (πt , O) − c + γt (ot+1 |πt , O)Vt+1 (U (πt+1 |ot+1 , πt , O), ρ0 ) ≥ rt (πt , D) ot+1 ∈{L,H}

and that X

rt (πt , O) − c +

∗ γt (ot+1 |πt , O)Vt+1 (U (πt+1 |ot+1 , πt , O), ρ0 )

ot+1 ∈{L,H} ∗ ≥ rt (πt , C) + γt (∅|πt , C)Vt+1 (U (πt+1 |∅, πt , C)).

Note that following from Theorem 2, ρ00 ≥ ρ0 implies that X ∗ rt (πt , O) − c + γt (ot+1 |πt , O)Vt+1 (U (πt+1 |ot+1 , πt , O), ρ00 ) ot+1 ∈{L,H}

X

≥ rt (πt , O) − c +

∗ γt (ot+1 |πt , O)Vt+1 (U (πt+1 |ot+1 , πt , O), ρ0 ).

ot+1 ∈{L,H}

Therefore, rt (πt , O) − c +

X

∗ γt (ot+1 |πt , O)Vt+1 (U (πt+1 |ot+1 , πt , O), ρ00 ) ≥ rt (πt , D)

ot+1 ∈{L,H}

and X

rt (πt , O) − c +

∗ γt (ot+1 |πt , O)Vt+1 (U (πt+1 |ot+1 , πt , O), ρ00 )

ot+1 ∈{L,H} ∗ ≥ rt (πt , C) + γt (∅|πt , C)Vt+1 (U (πt+1 |∅, πt )),

and the result follows.

2

18

Suggest Documents