Biology of Blood and Marrow Transplantation 12:491-505 (2006) © 2006 American Society for Blood and Marrow Transplantation 1083-8791/06/1205-0001$32.00/0 doi:10.1016/j.bbmt.2006.03.004

National Institutes of Health Consensus Development Project on Criteria for Clinical Trials in Chronic Graft-versus-Host Disease: VI. Design of Clinical Trials Working Group Report

Paul J. Martin,1 Daniel Weisdorf,2 Donna Przepiorka,3 Steven Hirschfeld,4 Ann Farrell,4 J. Douglas Rizzo,5 Ronan Foley,6 Gerard Socie,7 Shelly Carter,8 Daniel Couriel,9 Kirk R. Schultz,10 Mary E. D. Flowers,1 Alexandra H. Filipovich,11 Rima Saliba,9 Georgia B. Vogelsang,12 Steven Z. Pavletic,13 Stephanie J. Lee1

1Fred Hutchinson Cancer Research Center, University of Washington School of Medicine, Seattle, Washington; 2University of Minnesota, Minneapolis, Minnesota; 3University of Tennessee, Memphis, Tennessee; 4US Food and Drug Administration, Rockville, Maryland; 5Medical College of Wisconsin, Milwaukee, Wisconsin; 6McMaster University, Hamilton, Ontario, Canada; 7Hopital St. Louis, Paris, France; 8EMMES Corporation, Rockville, Maryland; 9University of Texas M.D. Anderson Cancer Center, Houston, Texas; 10British Columbia Children’s Hospital, University of British Columbia, Vancouver, British Columbia, Canada; 11Cincinnati Children’s Hospital Medical Center, University of Cincinnati, Cincinnati, Ohio; 12Johns Hopkins University School of Medicine, Baltimore, Maryland; 13National Cancer Institute, National Institutes of Health, Bethesda, Maryland

Correspondence and reprint requests: Paul J. Martin, MD, Fred Hutchinson Cancer Research Center, PO Box 19024, 1100 Fairview Avenue N, D2-100, Seattle, WA 98109-1024 (e-mail: [email protected]). Received March 8, 2006; accepted March 9, 2006

ABSTRACT The complexity of chronic graft-versus-host disease (GVHD) and the lack of established research methods have made it difficult to design, conduct, and analyze clinical trials involving subjects with this disease, even when promising treatment options are available. This consensus document was developed to offer an approach for overcoming these obstacles. Clinical trials in chronic GVHD should adhere to principles of good trial design and practice. Inclusion and exclusion criteria should allow as many subjects to participate as possible without compromising the interpretation of results. Pre-enrollment assessment of chronic GVHD characteristics should be standardized. The protocol should provide clear guidance about administration of study medication and other interventions. Methods of assessing response should be defined and validated in advance. Efficacy endpoints should be selected to reflect clinical benefit. Expert biostatistical support is needed to ensure the validity and reliability of trial results. The use of consistent standards in clinical trial designs to evaluate agents that have activity in pathogenic pathways could facilitate advances in the treatment of chronic GVHD. © 2006 American Society for Blood and Marrow Transplantation

KEY WORDS Chronic graft-versus-host disease ● Allogeneic hematopoietic cell transplantation ● Clinical trials ● Consensus ● Design



The opinions expressed here are those of the authors and do not represent the official position of the National Institutes of Health, US Food and Drug Administration, or the US Government.

INTRODUCTION

Previous studies have provided a good understanding of risk factors for development of chronic graft-versus-host disease (GVHD) and risk factors for mortality among patients with newly diagnosed chronic GVHD. Advances in supportive care have decreased morbidity, but survival for patients with chronic GVHD has not changed since the mid-1980s. Five-year survival rates for patients with newly diagnosed “standard-risk” chronic GVHD (platelet count >100 000/µL and onset without previous acute GVHD or after resolution of previous acute GVHD) have remained at approximately 70%, and 5-year survival rates for those with “high-risk” chronic GVHD (platelet count <100 000/µL or progressive onset from acute GVHD) have remained at 40-50% [1,2]. In aggregate, only 50% of patients with chronic GVHD are able to discontinue immunosuppressive treatment within 5 years after the diagnosis, and 10% require continued treatment beyond 5 years. The remaining 40% die or develop recurrent malignancy before chronic GVHD resolves [2].
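The two risk categories described above can be written as a simple rule. The following Python sketch is illustrative only: the function name, argument names, and the "unclassified" label are assumptions made here, and the source does not say how a platelet count of exactly 100 000/µL should be handled.

```python
def classify_risk(platelets_per_uL: int, progressive_onset: bool) -> str:
    """Apply the standard-risk / high-risk definitions quoted in the text.

    platelets_per_uL  : platelet count at diagnosis of chronic GVHD
    progressive_onset : True if chronic GVHD evolved directly from acute GVHD;
                        False if onset occurred without previous acute GVHD or
                        after resolution of previous acute GVHD
    """
    # High-risk: platelet count <100 000/uL OR progressive onset.
    if platelets_per_uL < 100_000 or progressive_onset:
        return "high-risk"
    # Standard-risk: platelet count >100 000/uL AND non-progressive onset.
    if platelets_per_uL > 100_000:
        return "standard-risk"
    # Exactly 100 000/uL is not covered by either definition (assumption:
    # report it as unclassified rather than guess).
    return "unclassified"


print(classify_risk(150_000, False))
print(classify_risk(80_000, False))
```

A rule of this kind would typically be applied at baseline so that the risk category can be used as a prognostic covariate or stratification factor.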

PURPOSE OF THIS DOCUMENT

A variety of inherent challenges must be considered in designing a clinical trial of treatment for chronic GVHD. The numbers of patients with chronic GVHD in any single institution are generally not sufficient for studies intended to establish safety and efficacy for investigational products in a reasonable period. Studies that enroll patients at several institutions are more likely to meet accrual targets and provide the necessary data to analyze outcomes in a robust manner. A major challenge is to coordinate and align practices across institutions that have evolved divergent standards based on their own resources and experience, so that trial design is efficient and analysis is unbiased.

The goal of this working group is to recommend approaches and definitions to ensure consistent and interpretable results of clinical studies designed to assess interventions for treatment of chronic GVHD. It is not a goal of this working group to recommend standards of care for clinical practice. Considerations for trials intended as formal demonstration of safety and efficacy are noted where applicable. Prevention trials and observational studies will not be extensively addressed in this document.

In designing any clinical trial, 3 critically important questions must be considered: (1) Who will be enrolled in the trial? (2) How will subjects be treated? (3) How will results be evaluated? We concentrate our recommendations on areas where consensus among members could be achieved, and where we believe that a well-accepted standard would significantly enhance the conduct and consistency of clinical trials to evaluate prevention and treatment of chronic GVHD. These recommendations will be modified with further input from the transplantation community and will be updated as new information about chronic GVHD becomes available.

An appendix, Consensus Definitions and Glossary—2006, defines the boldface terms, which either represent opinions of the working group or might be unfamiliar to readers.

SUMMARY OF RECOMMENDATIONS

1. Inclusion and exclusion criteria should encompass as many people with chronic GVHD as possible without compromising the ability to interpret results of the study.
2. Baseline evaluations should document the need for therapy, identify prognostic characteristics, and specifically characterize the condition of subjects at the time of enrollment, so that results of therapy can be interpreted.
3. Within reason, the study protocol should specify or provide guidance regarding the dosing and dose adjustment of all immunosuppressive medications, including the study drug. Reasons for deviations should be noted in case-report forms.
4. Case-report forms should be calendar driven by the protocol to provide assessment of chronic GVHD and adverse events at regular intervals. Patient-reported measures should be incorporated whenever feasible. Standardized and clinically validated measurements should be used.
5. There is urgent need for validation of consensus response measurements. To this end, enrollment in observational studies and clinical trials that test prospective markers and criteria should be encouraged.
6. Primary and secondary endpoints should be selected for their ability to demonstrate clinical benefit, which can be a prolongation of survival or an improvement in the way a patient feels or functions. Endpoints should account for competing outcomes such as death or recurrent malignancy. Composite endpoints may be required in some specific protocols.
7. Biostatistical analysis should incorporate considerations of competing events and concomitant therapy, appropriate power calculations, interim analyses, sensitivity analysis, and missing measurements.

STUDY POPULATIONS

General Considerations

The study design should specify eligibility criteria, and pre-study evaluations should document the diagnosis of chronic GVHD, account for the probable risk-benefit ratio to individual participants, and ensure interpretability of the ultimate results. The diagnosis of chronic GVHD should be established according to the definitions provided in the diagnosis and staging document [3]. Whenever feasible, the diagnosis of chronic GVHD should be documented by photography or confirmed by tissue biopsy before patients are enrolled in treatment trials [4]. Information from well-documented baseline evaluations can be used not only to demonstrate eligibility for enrollment but might also be used for future assessment of previously unidentified risk factors.

In treatment trials, inclusion criteria depend primarily on whether the population of interest has newly diagnosed chronic GVHD requiring “primary” therapy or more advanced chronic GVHD that has not improved or has recurred after primary therapy, thus requiring “secondary” treatment. Although chronic GVHD occurs in 30-60% of survivors after hematopoietic cell transplantation (HCT), the heterogeneity of the syndrome makes it difficult to accrue a homogeneous cohort in any study. Inclusion and exclusion criteria must balance the need for a representative selection of subjects (arguing that a large proportion of people with chronic GVHD should be eligible) versus the need for clear interpretation (arguing that the study population should be as homogeneous as possible). In general, phase III trials should be more inclusive, whereas phase II trials may need to be more restrictive to facilitate the use of early and intermediate endpoints.

Inclusion of children is encouraged in phase II and III studies and may require assessments of additional endpoints, such as growth and development, which are relevant to this population. Studies limited to enrollment of children may also be feasible through the participation of pediatric-oriented cooperative groups.

Inclusion Criteria in Clinical Trials Testing Topical or Systemic Agents as Primary Therapy

Current criteria define the diagnosis and scoring of chronic GVHD more precisely than previously and include considerations of symptom severity, functional impairment, and prognosis [3]. In general, systemic immunosuppressive treatment is not needed for patients who have chronic GVHD with only mild abnormalities involving only 1 or perhaps 2 sites with no symptoms or only mild symptoms and no functional impairment and no characteristics that portend a poor prognosis. These patients may be suitable for enrollment in clinical trials to evaluate topical agents [3]. Systemic treatment is needed for patients who have chronic GVHD that causes more severe abnormalities or complications such as fasciitis, contractures, or pulmonary involvement and for patients with clinical characteristics indicating a poor prognosis. These patients may be suitable for enrollment in clinical trials to evaluate systemic agents. Patients with more severe abnormalities or complications may also be suitable for enrollment in clinical trials to evaluate topical agents as long as participation in the trial permits systemic treatment and accounts for its effects.

Inclusion Criteria in Clinical Trials of Secondary Therapy

Criteria for secondary treatment should be clearly delineated in the protocol. Secondary treatment is indicated when primary therapy has failed as indicated by progression of chronic GVHD signs and symptoms during therapy. In some circumstances, secondary therapy may also be instituted for persistent steroid-dependent chronic GVHD. Steroid-intolerant patients with steroid-responsive chronic GVHD can also be enrolled in clinical trials of secondary therapy, provided that steroid intolerance is strictly defined. Inclusion of steroid-intolerant patients would not pose a problem when the primary endpoint is safety. Stratification would be needed in a randomized trial in which the primary endpoint is efficacy, because the response rate among steroid-intolerant patients could be higher than that among those with steroid-refractory or steroid-dependent chronic GVHD.

A key issue for clinical trials testing secondary therapy is whether and how to limit enrollment of patients with far advanced chronic GVHD that has produced irreversible damage or has proved resistant to more than a single previous treatment. Responsiveness of the disease will likely depend on the extent of disease activity as opposed to irreversible damage. Even if the study intervention inhibits disease activity, clinical benefit would not be evident if most manifestations represent irreversible damage that has already accumulated during the previous course of the disease. Enrollment of patients with far advanced chronic GVHD might be appropriate when the primary endpoint is related to safety and tolerance or to efficacy as measured by arrested progression of the disease, but a lack of improvement during treatment would be difficult to interpret in patients with irreversible contractures, bronchiolitis obliterans, or sicca syndrome, if these are the only manifestations of chronic GVHD.

Patients with irreversible damage as the only manifestation of chronic GVHD should be excluded from trials designed to measure improvement in chronic GVHD, although it may be appropriate to include these patients in studies in which the primary endpoint is survival, prevention of chronic GVHD progression, palliation of disease manifestations that are ordinarily considered irreversible (eg, through the use of anti-fibrotic medications), or reduction in the dose of steroid treatment. Patients who have reversible and irreversible manifestations of chronic GVHD can be enrolled in trials with endpoints related to clinical improvement, but the trial design should specify how previously established irreversible damage will be assessed. Organs or sites affected by irreversible damage could be excluded for purposes of evaluating improvement, but they should be included for purposes of evaluating possible progression of the disease. Clinical trials for secondary treatment of chronic GVHD could benefit enormously from the availability of markers indicating disease activity as distinguished from damage.
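The handling of irreversible damage described above (organs excluded when evaluating improvement, but included when evaluating progression) can be made concrete with a small sketch. The record layout, field names, and change categories below are hypothetical and are not taken from any consensus form; a protocol would define its own.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class OrganAssessment:
    """One organ or site at a follow-up visit (hypothetical record layout)."""
    organ: str
    irreversible_at_baseline: bool  # fixed deficit documented before enrollment
    change: str  # "improved", "stable", or "progressed" relative to baseline


def summarize_response(assessments: List[OrganAssessment]) -> Dict[str, List[str]]:
    # Improvement is evaluated only in organs without established
    # irreversible damage.
    evaluable = [a for a in assessments if not a.irreversible_at_baseline]
    improved = [a.organ for a in evaluable if a.change == "improved"]
    # Progression is evaluated in every organ, including those with
    # irreversible damage at baseline.
    progressed = [a.organ for a in assessments if a.change == "progressed"]
    return {
        "evaluable_for_improvement": [a.organ for a in evaluable],
        "improved": improved,
        "progressed": progressed,
    }


visit = [
    OrganAssessment("skin", False, "improved"),
    OrganAssessment("joints", True, "stable"),    # fixed contractures
    OrganAssessment("lung", True, "progressed"),  # counted for progression
]
summary = summarize_response(visit)
print(summary)
```

In this toy visit only the skin is evaluable for improvement, while worsening in the lung still counts as progression even though the lung was excluded from the improvement assessment.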


Exclusion Criteria

In clinical trials of treatment for chronic GVHD, exclusion criteria generally address the presence of (1) uncontrolled infection at the time of enrollment, (2) contraindications to administration of the study intervention or known inability of the patient to tolerate the study intervention, and (3) complications such as recurrent or progressive malignancy that might affect the study endpoints or temper the effort to control chronic GVHD. Other typical exclusion criteria include pregnancy or breast-feeding, an unwillingness to comply with any critical components of study treatment or response evaluation, or short duration of anticipated survival due to other comorbidities. Complications such as fasciitis, contractures, and pulmonary involvement may evolve into fixed, irreversible deficits that do not improve during immunosuppressive treatment. Patients may be excluded from enrollment in certain clinical trials if these deficits are truly irreversible and if their presence would interfere with the assessment of efficacy, but they could be enrolled in a trial with survival as the endpoint if stabilization of the irreversible deficit would be expected to prolong survival.

Mixed Acute and Chronic GVHD

Recognition of overlap syndromes, in which chronic GVHD occurs with manifestations typical of acute GVHD, has increased recently. Inclusion of patients with such overlap syndromes may complicate the interpretation of phase II trials because the response of individual organs may depend on whether acute or chronic GVHD is predominant. Unambiguous response criteria are needed for each organ if patients with overlap syndromes are included in phase II trials in which shorter-term clinical response is the primary endpoint. Patients with overlap syndromes can be included in phase III trials in which resolution of GVHD, discontinuation of therapy, or survival is a component of the primary endpoint. Stratification may be needed to account for different outcomes according to the presence or absence of acute GVHD.

Considerations for Drug Development

Conventionally, new agents have been tested for safety and for initial assessment of efficacy among patients with steroid-refractory GVHD, under the premise that only patients at very high risk of poor outcomes should be subjected to the unknown risks of a new agent. This approach, however, poses several practical problems. Patients with steroid-refractory GVHD have highly heterogeneous disease manifestations and risk factors, making it difficult to evaluate efficacy except through evaluations of change over time for individual patients. The number of patients with steroid-refractory GVHD is smaller than the number with newly diagnosed GVHD. In addition, steroid-refractory GVHD is less likely than newly diagnosed GVHD to respond to therapy. Because patients may be reluctant to enroll in placebo-controlled trials for steroid-refractory GVHD, protocols would benefit from designs that offer all patients some chance of benefit from participation in the study.

Comparative trials are difficult because no agents have been approved for treatment of steroid-refractory GVHD. A comparative trial of an unapproved agent against best supportive care could be considered, although there is currently no standard for defining best supportive care. A comparative trial of an agent already approved for other indications may be hampered by the availability of the agent for off-label use outside the context of a study. The use of a crossover design might improve enrollment in comparative studies for treatment of steroid-refractory chronic GVHD. If a crossover design is used, the criteria for changing therapy should be explicit and unambiguous, and the primary outcome variable should be assessed before the crossover.

If the phase II experience in patients with steroid-refractory GVHD is encouraging, a pivotal phase III controlled study may be necessary as a formal demonstration of efficacy. Because the number of patients with newly diagnosed chronic GVHD is larger than the number with steroid-refractory GVHD, and because response may be easier to assess in patients with newly diagnosed chronic GVHD, a phase III study could enroll patients with newly diagnosed GVHD, where a new agent could be tested as an “add-on” to “standard” first-line therapy.

If a phase III study in patients with newly diagnosed chronic GVHD is necessary under circumstances in which phase II results are not available in the same population, a decision must be made whether to forge ahead with the phase III study or whether to conduct a separate phase II study in patients with newly diagnosed chronic GVHD before undertaking a phase III study. This decision will depend on whether the safety and efficacy experiences from previous studies are adequate to formulate a reasonable expectation of risk and a preliminary estimate of the effect size to assess the potential for benefit in subjects to be enrolled in the phase III study. Alternatively, a “window” approach could be used in phase II studies in which a new agent is given for a limited time before beginning standard therapy for newly diagnosed chronic GVHD. A window study can be used most appropriately with interventions that can produce rapid improvement. Positive results with a window approach could lead to subsequent phase II studies to assess the response rate and collect other pharmacodynamic or pharmacokinetic measures during longer-term administration of the agent. Alternatively, phase III studies could be carried out to compare the investigational agent against standard therapy in a newly diagnosed population.

TREATMENT METHODS

Guidance Regarding the Management of Chronic GVHD

The level of guidance regarding the management of chronic GVHD in protocols should be appropriate to the lowest level of expertise among the providers who are likely to participate in the study. In the United States, most allogeneic HCT is performed in tertiary care centers. However, the median onset of chronic GVHD occurs at 4-6 months from HCT, after many patients have returned to the care of their referring physicians. These specialists typically have expertise in oncology or hematology but not in the diagnosis and management of chronic GVHD. If it is anticipated that referring hematologists and oncologists will be involved in study conduct or follow-up, management guidelines should be targeted to their level of expertise. Specific training toward study goals and proper procedures for all personnel should be considered as an important tool to ensure quality and consistency.

Administration of a Study Medication or Intervention

Protocols for treatment of chronic GVHD must define appropriate guidelines for administration of the study medication or intervention, including the initial dosage, monitoring of blood levels, any decreases in dose or frequency of administration indicated because of toxicity, and any tapering of doses or frequency of administration at the end of treatment. Clinical trials involving treatment for chronic GVHD have the additional complexity that the study medication or intervention is frequently used in conjunction with other immunosuppressive treatments or interventions that increase the potential for drug interactions. The need for or use of additional agents could be considered as secondary endpoints during a planned taper of immunosuppressive treatment, but blinding would be needed for optimal interpretation of these endpoints.

Administration of Glucocorticoids

“Standard” steroid therapy currently represents a well-established mainstay of initial treatment for chronic GVHD. Although the evaluation of a new treatment for chronic GVHD might ideally be carried out under stable conditions in which all other components of treatment are held constant, this ideal is very difficult to achieve in practice, if only because long-term high-dose glucocorticoid treatment causes unacceptable morbidity. This circumstance creates a medical and ethical imperative to prescribe the lowest dose of glucocorticoid treatment that effectively controls GVHD manifestations. This means that, if GVHD improves during investigational treatment, glucocorticoid doses should be decreased accordingly, even though the decrease in steroid doses may prolong the persistence of GVHD manifestations compared with outcomes that might have occurred with continued high-dose glucocorticoid treatment. However, tapering of steroid doses prompted by clinical improvement could mask the full benefit of an effective investigational agent, and continued administration of high-dose steroids could mask the absence of benefit from an ineffective investigational agent.

Protocols should provide guidelines for tapering of glucocorticoid doses and should specify daily or alternate-day administration. Guidelines should indicate the appropriate starting dose and any adjustments in glucocorticoid doses, depending on toxicity and changes in GVHD severity. A fixed starting dose and taper schedule may be appropriate for clinical trials in which administration of glucocorticoids is an integral component of the intervention being tested and for trials to evaluate short-term efficacy of a new regimen. In longer-term trials, however, adherence to a fixed schedule is likely to be impossible because of variation among subjects in persistence of chronic GVHD and tolerability of glucocorticoid-related side effects. Protocols should highlight the possibility of adrenal insufficiency at the end of the glucocorticoid taper and should provide guidelines for reinstitution of glucocorticoid treatment if GVHD recurs after glucocorticoid administration has been discontinued.

Administration of Other Treatments

Protocols should provide guidelines for the administration of immunosuppressive medications and any other treatments that could affect manifestations of GVHD and potentially confound the interpretation of the study. Information about the administration of any such treatments should be recorded for analysis at the end of the study (see content of case-report forms below). Study requirements must find a balance between the physician’s potential need for flexibility in deviating from a prescribed course and the study’s need for consistent medical management. This balance becomes especially difficult in open-label studies and in trials with longer-term endpoints that involve physician judgment regarding decisions to add or withdraw medications from the immunosuppressive regimen or to adjust the pace of taper. Use of any medication that could potentially affect the course of chronic GVHD or severity of organ manifestations should be recorded in the case-report forms.


Ancillary Therapy, Supportive Care, and Prevention of Steroid-Related Complications

Protocols should provide guidelines for prevention of opportunistic infection in patients with chronic GVHD [5]. Protocols should also acknowledge that organ or site-specific therapies are an important component of symptom management in patients with chronic GVHD [5]. Allowable site-specific therapies should be specified in the protocol, and the therapies used for patients should be recorded in the case-report forms. Protocols should remind providers that close attention must be paid to complications of glucocorticoid treatment and should suggest management strategies.

DATA COLLECTION

Timing of Data Collection

Calendar-driven data collection (eg, every 3 or 6 months) is strongly recommended for clinical trials of treatment for chronic GVHD. Insufficiently frequent assessment could overlook differences in time to response between study arms, whereas too frequent assessments might make it difficult to detect changes from 1 evaluation to the next and would unnecessarily add to the resource burden of the study. Calendar-driven data collection should be supplemented by event-driven collection of GVHD-related data at the time of treatment success or treatment failure. Collection of adverse event data should also be event driven. Longitudinal or time-to-event analysis can be accommodated by calendar-driven data collection, although careful planning would be required if treatment involves cycles of different lengths between study arms. For GVHD therapy trials, we strongly favor real-time collection of prespecified data from physicians and patients during clinic visits. This approach ensures the capture of detailed and directed information that may be necessary to document response assessments [6].

Case-Report Forms

Standardized case-report forms should be used, with the level of detail determined by the study goals. Data collected retrospectively from medical charts intended for clinical care lack details needed to assess the results of clinical trials. Instead, a checklist form should be used routinely for assessment and documentation of chronic GVHD manifestations during clinic visits. Consistent template-driven documentation at an appropriate level of detail through the real-time use of standard assessment forms enables comparisons with previous assessments or with the patient’s baseline condition. If time points for assessment are standardized, then data for all patients in each study arm can be easily aggregated for analysis. Three levels of data regarding chronic GVHD are outlined in the appendix. These levels describe the amount of detail to be collected, depending on whether chronic GVHD is a primary, secondary, or tertiary endpoint in the study.

Investigators are often tempted to collect any data element that might possibly be used in an analysis. Over-collection of data adds to the burden of participation experienced not only by physicians but also by subjects. Excessive data collection can compromise enthusiasm for the trial and increase the probability that critical data elements will be missed or that inappropriate analysis will occur. Conversely, elimination of unnecessary data collection, without compromising collection of safety data, facilitates the conduct of clinical trials.

Content of Case-Report Forms

Case-report forms for the baseline assessment of chronic GVHD should collect information regarding eligibility for enrollment in the trial, manifestations that indicate prognosis, and severity of manifestations in each organ or site potentially affected by the disease [6]. Previously recognized prognostic indicators at the onset of chronic GVHD include performance status, lichen planus-like versus sclerotic skin lesions, percent body surface affected by rash, diarrhea, weight loss >10%, oral involvement, platelet count, total bilirubin, progressive onset from previous acute GVHD versus onset without previous acute GVHD or after resolution of previous acute GVHD, and type and dose of previously administered immunosuppressive medications [2,7].

At each subsequent calendar-driven assessment point, the dose and schedule of the investigational agent should be recorded together with the reasons for any dose interruption or reduction, and the assessment of GVHD severity should be repeated together with any other necessary endpoint information specified by the study design. Drug diaries can be used to identify the maximum, minimum, or average dose of the investigational agent during intervals between assessment points. The amount of concomitant immunosuppressive medications, ancillary therapies, and supportive care (eg, physical therapy, massage, punctal plugs) should also be recorded. In trials in which the blood levels of an investigational medication are not monitored and adjusted as part of the study, it may also be useful to record the administration of specific concomitant medications that might affect the concentration of investigational agents in the blood. All concomitant medications must be recorded in studies to be submitted for regulatory review. The information from case-report forms of this design can effectively describe not only the initial onset and final resolution of chronic GVHD but also its severity across time.

Patient-Reported Outcomes

Information in the case-report form can include quality of life or functional assessments as reported by patients. Collection of patient-reported data should use validated instruments that have been confirmed in the specific patient population. Established methods of instrument administration and scoring should be used, whenever such instruments and methods are available. Each instrument should define clinically meaningful differences that qualify as improvement or worsening between 2 assessments based on statistical considerations or clinical perceptions. Results should report population changes and percentage of patients with clinically meaningful difference in self-reported outcomes before and after treatment. We encourage the development of instruments to capture parent or self-reported data from pediatric patients.

Reporting of Adverse Events and Serious Adverse Events

The protocol document should specify a safety-monitoring plan that defines adverse events (AEs) and serious AEs (SAEs), taking into consideration the study design, known and potential risks of the investigational product, the patient population under study, and any regulatory or sponsor requirements. SAEs generally include any adverse event that causes death or threat to life, hospitalization, disability, or congenital abnormalities, but even these definitions are somewhat open to interpretation, and pertinent details should be included in the protocol. The protocol should also specify the requirements for reporting AEs. In general, investigators are expected to provide expedited reports of SAEs to sponsors within 24 hours [8], and sponsors are expected to provide expedited reports to the US Food and Drug Administration (FDA) if there are any unexpected SAEs. Investigators and sponsors are required to report to the institutional review board any unanticipated problems involving risk to human subjects or others.

Considerations for Multi-institutional Studies

Multi-institutional studies provide faster enrollment and allow wider generalization of results. Protocols for multi-institutional studies should address standardization of routine management strategies, monitoring, and endpoint assessment, because differences in management can confound the assessment of patient-reported outcomes and quality of life. Audits and centralized training of individuals who assess patients, follow protocol-mandated procedures, examine histopathology, perform chronic GVHD grading, and assess clinical trial endpoints may improve compliance and reproducibility. Likewise, centralized review of histopathology, clinical grading, and response assessments may improve reproducibility and uniformity across institutions. These quality enhancements add greatly to the costs of conducting clinical trials.

PRIMARY AND SECONDARY ENDPOINTS

Selection of Endpoints

The choice of specific primary and secondary endpoints will be influenced by the nature and scope of claims to be made at the end of the study, the type of intervention, study phase, indication (prevention, primary treatment or secondary treatment), conduct in an academic versus community setting, and by the presence or absence of blinding. Potential primary and secondary endpoints that could be used in chronic GVHD studies are listed in Table 1. These include physician-assessed response (complete response, partial response, stable disease or no response, progression), patient-reported outcomes, time to GVHD progression, transplant-related mortality, disease-free survival, overall survival, survival to resolution of chronic GVHD, and survival to permanent discontinuation of immunosuppressive treatment. The endpoints in Table 1 are listed in ascending order according to the scope of potential conclusions that could be made for the indication(s) of the product under study. GVHD response and patient-reported outcomes could be applied globally or to specific organs, depending on the nature of the intervention and the organs or sites affected by the disease. Valid primary and secondary endpoints of chronic GVHD treatment studies should focus on the clinical benefits of central importance to patients: survival free of the underlying disease, freedom from chronic GVHD manifestations, treatment and complications, and improvement in symptoms, function, and quality of life. Intermediate endpoints that reflect the degree of chronic GVHD activity, such as physician assessment, biochemical tests, and biomarkers, are important only to the extent that they may predict or confirm 1 of the patient-experienced outcomes [9,10]. Because chronic GVHD typically has a prolonged clinical course, extended follow-up is strongly encouraged to ensure that responses are durable. 
The minimum duration of response required to designate treatment as “successful” will vary according to trial design and endpoint. Requirements for durability of response should be prespecified in the definition of endpoints. Where applicable, criteria for failure of primary therapy and failure of secondary treatment should be defined. Complete response, strictly defined partial response, and validated early surrogate markers of success or failure are highly appropriate endpoints for phase II studies. Results for these endpoints can be assessed within the first several months after enrollment and are not greatly affected by subsequent management or other competing late events such as recurrent malignancy. Outcomes with wider scope such as permanent resolution of chronic GVHD or survival are highly appropriate endpoints for phase III studies, but statistical methods must account for concomitant treatment and competing events such as death or recurrent malignancy during immunosuppressive treatment. Chronic GVHD may decrease the risk of recurrent malignancy and increase the risk of death from causes other than recurrent malignancy. Hence, assessment of overall benefit must account for tradeoffs between the 2 by using composite endpoints in which death or recurrent malignancy is considered as treatment failure. Stratification can be used to ensure that competing risks are balanced between arms.

Table 1. Potential Primary and Secondary Endpoints for Chronic GVHD Treatment Trials

| Endpoint* | Interpretation | Time to Endpoint | Biostatistical Notes |
| --- | --- | --- | --- |
| GVHD response | Response to treatment according to organ-specific or summary measurements | Short | Scales for measurement of response have not yet been validated. Results may be affected by changes in ancillary treatment and supportive care. Endpoint is subject to bias and has greater validity in blinded trials. Most appropriate as a primary endpoint in phase II studies and possibly in selected phase III studies. |
| Patient-reported outcomes | Self-assessed morbidity caused by chronic GVHD or treatment | Short | Missing data and informative censoring are major problems. Few sensitive instruments are available. Results may be affected by changes in ancillary treatment and supportive care. Endpoint is subject to bias and has greater validity in blinded trials. Best used as a secondary endpoint. |
| Transplant-related mortality | Death before recurrence or progression of malignancy | Long | Endpoint measures failure rather than success. Also known as chronic GVHD-specific mortality in the literature. Best used as a secondary endpoint. |
| Relapse-free survival | Survival without recurrent malignancy | Long | Objective endpoint, but not specific to chronic GVHD. Best used as a secondary endpoint. |
| Overall survival | Survival | Long | Objective endpoint, ultimate gold standard, but may not be specific to chronic GVHD. Best used as a secondary endpoint. |
| Relapse-free survival to resolution of GVHD† | Complete response | Long | Endpoint must account for continued immunosuppressive treatment. Best used as a primary endpoint in phase III studies. |
| Relapse-free survival to permanent discontinuation of immunosuppressive treatment† | Cure | Long | Best used as a primary endpoint in phase III studies. |

*Endpoint probabilities can be estimated as cumulative incidence or as proportions of success or failure at a specified time point with complete follow-up. Cumulative incidence estimates can be influenced by the frequency of competing risks. Endpoint rates (ie, hazards) can be estimated and compared by Cox regression.
†Secondary treatment is considered to be an indication of failure.

Response Criteria

The dimensions of response measurement should be defined, including the organ systems and sites involved by chronic GVHD, types of manifestations, and extent of involvement. The development and application of response criteria for chronic GVHD pose very difficult challenges, especially in phase II studies in which the proportion of subjects with complete responses may be low [6]. If possible, active, reversible disease manifestations should be distinguished from fixed deficits and irreversible damage. Overall response assessments should include defined and widely accepted measurements of disease activity in each organ or site. Concomitant therapy must be controlled when response is the primary endpoint of phase II studies designed to assess the efficacy of an investigational agent. The timing of response assessment should be defined (eg, 6 weeks or 3-6 months from enrollment), and results should indicate whether responses were durable after completion of treatment. When resources are available, clinical impressions can be validated through assessment by several individuals and by subspecialty experts, especially in open-label studies. Care should be taken to define and place a higher priority on clinically meaningful benefits in the definition of endpoints.

Supporting Information from Physician and Patient Participants

Physician behavior and patient self-reports can provide additional information about responses. For physicians, changes in the dose of immunosuppressive medications and addition of new medications likely reflect the clinical impression of response. Patients can be queried about quality-of-life or symptom scales, and they can report changes in performance and function with the use of standardized questionnaires. Physician behavior and “patient-reported outcomes” may be more sensitive to subtle differences in disease activity, but each involves subjective assessments with uncertain reliability and probable susceptibility to bias, especially in open-label studies. In blinded trials, however, these assessments may be used to support other more objective response criteria.

BIOSTATISTICAL CONSIDERATIONS

Prevention Trials

In some situations, a clinical trial may be designed to prevent severe chronic GVHD, eg, through alterations in the nature or source of the graft. Because of competing risks (eg, death and recurrent malignancy before the onset of chronic GVHD), cumulative incidence estimates, rather than Kaplan-Meier methods, should be used to evaluate the proportion of patients who develop chronic GVHD, and the trial arms should be balanced for competing risks. Patients can develop chronic GVHD only if they are engrafted with donor cells and survive for some minimal time, usually 80-100 days, after HCT. For chronic GVHD prevention trials, the incidence of chronic GVHD should be reported as a cumulative incidence curve from the inception of the intervention to prevent the disease, which may vary according to the design of the study. Death without previous chronic GVHD should be treated as a competing risk. Decreasing donor chimerism and recurrence or progression of an underlying malignancy may influence the speed with which immunosuppressive medications are withdrawn and thus indirectly affect onset of chronic GVHD. Protocols for prevention trials should specify whether these events will be treated as competing risks, and justification for this decision should be provided. The power calculation for prevention trials should specify the minimum clinically significant risk reduction that the study is designed to detect. The power calculation should also account for dropouts.

Treatment Trials

Primary and secondary endpoints in chronic GVHD treatment trials require careful statistical consideration, as summarized in Table 1. In all cases, endpoints, timing of assessments, and statistical methods should be prespecified. Kaplan-Meier estimates can be used to evaluate survival, and cumulative incidence estimates can be considered for evaluation of other simple and composite endpoints, depending on type of treatment and length of follow-up. All trials should be adequately powered to evaluate a prespecified statistical hypothesis regarding the primary endpoint, and the statistical hypothesis should address a clinically meaningful benefit. Late-phase studies should have data and safety monitoring boards, particularly if they involve multiple institutions. Scheduled interim analyses may be necessary to evaluate not only efficacy but also toxicity and futility, as illustrated by 3 trials that tested the use of thalidomide to prevent or treat chronic GVHD [11-13]. Data and safety monitoring boards may be appropriate for phase II studies testing agents with weakly characterized safety profiles that might jeopardize the safety of subjects and for studies that involve particularly vulnerable populations such as children.

Additional Considerations in Phase II Treatment Trials

For a variety of reasons, results of phase II studies remain difficult to interpret. Disease characteristics at the time of enrollment can influence the chance of partial or complete response, and the management of glucocorticoid doses and other concomitant treatment can have marked effects on short-term outcomes. In open-label studies, evaluation of many endpoints is highly susceptible to bias. In the absence of well-defined response criteria, judgments regarding attainment of complete response are likely to be much more robust than are those regarding partial response. However, the short duration of phase II studies and presence of fixed, irreversible deficits may limit the number of patients who have a complete response. Nonetheless, phase II studies are useful for screening treatment options and planning future phase III studies.

Changes in treatment must be taken into account even when prespecified objective criteria are used to measure changes in manifestations of chronic GVHD. The conduct and interpretation of clinical trials for treatment of chronic GVHD would be greatly enhanced by development of biostatistical methods that could account for baseline prognostic characteristics and for changes in the severity of disease manifestations and the overall intensity of immunosuppressive treatment. Crossover designs may be biostatistically robust if failure is required before patients cross over (in reality, these are extended access trials) or if treatment remains blinded before and after crossover. Conversely, crossover designs can confound the assessment of longer-term outcomes such as survival and cannot be used with add-on study designs.

Additional Considerations in Phase III Treatment Trials

Although the prespecified primary endpoint is of paramount importance in assessing the results of a phase III clinical trial, other considerations must be taken into account before reaching an overall conclusion. For example, a secondary analysis of the primary endpoint should demonstrate that a favorable difference in outcome does not reflect imbalance in the distribution of baseline risk factors among arms of the study. Large imbalances for risk factors that might affect outcome can be prevented by appropriate stratification for balance. In trials that enroll a large number of subjects, stratification is generally not necessary because the likelihood of a significant imbalance in risk factors between arms is low. The risk of an imbalance is higher in trials that enroll smaller numbers of subjects, but with limited enrollment, the trial design cannot accommodate strata for all of the many risk factors that are known to affect outcome among patients with chronic GVHD. Techniques for biased adaptive randomization can be used to prevent an imbalance in the distribution of multiple risk factors between arms. Because imbalances cannot be evaluated after the fact, and because predictive outcomes must be based on prespecified model assumptions, biased adaptive randomization is not recommended in clinical trials designed for regulatory approval. In chronic GVHD trials, differences in doses of systemic immunosuppressive medications or administration of topical therapy among study arms could affect the primary endpoint. An effective treatment for chronic GVHD would be expected to decrease the overall level of systemic immunosuppression and the amount of topical therapy. If this is not the case, then a secondary analysis should be carried out to correct for possible confounding effects of ancillary treatment and supportive care on the primary endpoint. Favorable results for the primary endpoint should be supported by favorable or neutral results for secondary efficacy endpoints and by the absence of side effects that outweigh the benefits demonstrated by the efficacy endpoints.

Handling of Missing Data

Missing data hamper the evaluation of patient-reported outcomes and other endpoints. Imputation has been used as a technique to deal with missing data for patient-reported outcomes, although the use of this technique should be discussed with the FDA before being submitted for regulatory review for potential licensing. Missing data can also affect the assessment of response when the study arms have different dropout rates, resulting in informative censoring. Protocols should specify how missing data will be treated, and sample size estimates should make allowances for dropout and informative censoring, when necessary. Sensitivity testing under worst-case assumptions for missing data can be used to ensure robust results.
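The two recommendations above (an allowance for dropout in the sample size, and sensitivity testing under worst-case assumptions) can be sketched numerically. The following is a minimal illustration only, assuming a binary response endpoint and a simple 1/(1 - dropout) inflation; the function names, the tuple layout, and the example counts are hypothetical and are not taken from this report.

```python
import math


def inflate_for_dropout(n_evaluable, dropout_rate):
    """Enrollment needed so that about n_evaluable subjects remain
    evaluable if a fraction dropout_rate is lost (assumed simple inflation)."""
    if not 0.0 <= dropout_rate < 1.0:
        raise ValueError("dropout_rate must be in [0, 1)")
    return math.ceil(n_evaluable / (1.0 - dropout_rate))


def observed_difference(treat, control):
    """Complete-case difference in response rates (treatment minus control).

    Each arm is a tuple (responders, nonresponders, missing); subjects
    with missing outcomes are ignored here."""
    t_resp, t_non, _ = treat
    c_resp, c_non, _ = control
    return t_resp / (t_resp + t_non) - c_resp / (c_resp + c_non)


def worst_case_difference(treat, control):
    """Difference in response rates under the assumption least favorable
    to the investigational arm: every missing outcome counts as a failure
    in the treatment arm and as a response in the control arm."""
    t_resp, t_non, t_miss = treat
    c_resp, c_non, c_miss = control
    t_rate = t_resp / (t_resp + t_non + t_miss)
    c_rate = (c_resp + c_miss) / (c_resp + c_non + c_miss)
    return t_rate - c_rate


if __name__ == "__main__":
    # Hypothetical counts: (responders, nonresponders, missing) per arm.
    treatment_arm = (30, 15, 5)
    control_arm = (20, 27, 3)
    print("complete-case difference:", observed_difference(treatment_arm, control_arm))
    print("worst-case difference:   ", worst_case_difference(treatment_arm, control_arm))
    print("enroll for 90 evaluable at 25% dropout:", inflate_for_dropout(90, 0.25))
```

If the treatment effect remains favorable under the worst-case assignment, the conclusion is unlikely to be an artifact of differential dropout; if it does not, the protocol-specified handling of missing data deserves closer scrutiny.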

REGULATORY REVIEW

Additional considerations of clinical trial design, endpoint choice, and study conduct are required of trials intended for licensing review by the FDA. These considerations include, but are not limited to, requirements for the use of validated measurements as primary endpoints, study designs that incorporate appropriate controls, and, whenever possible, blinding of study intervention and endpoint assessment to ensure the absence of bias. Sponsors and investigators should consider requesting a Special Protocol Assessment from the FDA before beginning a clinical trial intended to support a licensing application [14]. FDA can approve a new drug application or a new indication only after determining that the drug meets the statutory standards for safety and effectiveness, among other requirements. The FDA may refuse to approve an application if there is insufficient information about a drug to determine whether it is safe for use under the conditions prescribed or if there is a lack of substantial evidence that the drug will have its purported effect under the conditions of use prescribed, among other reasons.

Assessment of Safety

Clinical trials intended for regulatory review must meet rigorous standards of design and conduct. In all clinical trials, assessment of safety through timely evaluation and reporting of all serious adverse events requires close attention. Regulatory requirements for reporting of adverse events should be followed. Clinical trial protocols should have guidelines that require detailed evaluation of adverse events that are not serious and expedited reporting of serious adverse events to the principal investigator and sponsor. Expedited reporting to FDA is required for unexpected serious adverse events.

Assessment of Efficacy

Regulatory approval requires persuasive evidence of efficacy. Assessment of efficacy involves the measurement of an outcome variable that reflects clinical benefit and an evaluation of the statistical robustness and reproducibility of the trial results. Endpoints should be selected to provide evidence of meaningful clinical benefit for patients. Tools and instruments used to capture endpoint information must be well characterized and validated. Clear definitions of failure and success must be provided. Endpoints of limited scope and subjective endpoints that are not supported by objective measurements may be highly appropriate for the assessment of some interventions. For example, stabilization of pulmonary function tests might be appropriate in the evaluation of an inhalational agent for treatment of bronchiolitis obliterans in patients who previously showed consistent deterioration before enrollment in the study. Validated surrogate endpoints that have been demonstrated to indicate clinical benefit could serve as endpoints for phase II studies, although no such surrogate endpoints have yet been identified or validated for patients with chronic GVHD. Secondary efficacy endpoints in phase II studies can include the survival-related endpoints of wider scope that could be used as primary efficacy endpoints in phase III studies. If the study populations are similar, information from the secondary endpoints of earlier phase studies can be used to inform the estimates of effect size needed to design phase III studies that have primary endpoints of wider scope related to longer-term survival.

Design of Phase III Clinical Trials

The FDA has summarized the characteristics of adequate and well-controlled “pivotal” studies needed for regulatory approval in the United States [15]. These characteristics include (1) a clear statement of the objectives and methods of analysis, with an implicit emphasis on prospective design, (2) a valid comparison with a control to provide a quantitative assessment of effect, (3) a method of selection of subjects that provides adequate assurance that they have the disease being studied, (4) a method to avoid bias in the assignment of subjects to treatment and control groups, (5) adequate measures to minimize bias on the part of subjects, observers, and analysts of the data, (6) well-defined and reliable methods to assess subjects’ responses, and (7) an analysis that is adequate to assess the effects of the treatment.
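Characteristic (4), together with the earlier recommendation to stratify for balance in smaller trials, can be illustrated with stratified permuted-block randomization. This report does not prescribe a particular allocation algorithm; the sketch below is one conventional technique, shown under the assumptions of two arms, a fixed block size, and hypothetical stratum labels. The class and function names are illustrative only.

```python
import random


def permuted_block(arms, block_size, rng):
    """Return one block containing each arm equally often, in random order."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    block = list(arms) * (block_size // len(arms))
    rng.shuffle(block)
    return block


class StratifiedBlockRandomizer:
    """Assigns subjects to arms using independent permuted blocks per stratum."""

    def __init__(self, arms=("A", "B"), block_size=4, seed=None):
        self.arms = tuple(arms)
        self.block_size = block_size
        self.rng = random.Random(seed)
        self.pending = {}  # stratum label -> unused assignments in current block

    def assign(self, stratum):
        """Return the next arm for a subject in the given stratum."""
        queue = self.pending.get(stratum)
        if not queue:
            # Start a new block when the stratum is new or its block is used up.
            queue = permuted_block(self.arms, self.block_size, self.rng)
            self.pending[stratum] = queue
        return queue.pop()


if __name__ == "__main__":
    randomizer = StratifiedBlockRandomizer(block_size=4, seed=2006)
    # Hypothetical stratification factor: type of chronic GVHD onset.
    for subject, stratum in enumerate(["progressive", "de novo", "progressive", "progressive"], 1):
        print(subject, stratum, randomizer.assign(stratum))
```

Within each stratum the arms are balanced after every completed block, which limits chance imbalance for the stratification factor without the model assumptions required by biased adaptive randomization.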

CHALLENGES FOR THE FUTURE

Improvements in clinical trial design can facilitate advances in the treatment of chronic GVHD only if drugs or devices previously approved for other indications or new therapeutic agents currently under development have activity against physiologic pathways leading to development or progression of the disease. Current understanding of the pathophysiology leading to chronic GVHD is limited, but deeper insight could come from nonclinical models and studies of other diseases that result in similar clinical manifestations, such as scleroderma and Sjögren syndrome. A more sophisticated understanding of the pathogenesis of chronic GVHD, combined with the availability of agents directed at specific targets in pathways leading to development of the disease, could lead to improvements in outcomes for patients after allogeneic HCT.

ACKNOWLEDGMENTS

This project was supported by the National Institutes of Health’s National Cancer Institute, Office of the Director, Cancer Therapy Evaluation Program, Intramural Research Program and Center for Cancer Research; National Heart, Lung, and Blood Institute, Division of Blood Diseases and Resources; Office of Rare Diseases, National Institutes of Health, Office of the Director; National Institute of Allergy and Infectious Diseases, Transplantation Immunology Branch; and the Health Resources and Services Administration, Division of Transplantation and the Naval Medical Research Center, C. W. Bill Young/Department of Defense Marrow Donor Recruitment and Research Program. The authors also acknowledge the following individuals and organizations that by their participation made this project possible: American Society for Blood and Marrow Transplantation, Center for International Blood and Marrow Transplant Research, Blood and Marrow Transplant Clinical Trials Network, Canadian Blood and Marrow Transplant Group, European Group for Blood and Marrow Transplantation, Pediatric Blood and Marrow Transplant Consortium, and representatives of the South American transplantation centers (Luis F. Bouzas, MD, and Vaneuza Funke, MD). This project was conducted in coordination with the American Society of Clinical Oncology and American Society of Hematology (liaisons, Michael Bishop, MD, and Jeff Coughlin). The organizers are also indebted to patients and patient and research advocacy groups who made this process much more meaningful by their engagement. Special thanks also to Paula Kim who coordinated these efforts.

APPENDIX Consensus Definitions and Glossary—2006

Appropriate controls: A comparison group (usually historical or concurrent) that provides information about the likely course of chronic GVHD without the intervention under study.

Biased adaptive randomization: A biostatistical method of randomization that allows preferential assignment of subjects in a manner that balances the distribution of specified risk factors between study arms.

Centralized training: A method to ensure valid and reliable (intra- and inter-rater) assessment of histopathology, chronic GVHD grading, and clinical trial endpoints; may include initial and ongoing training, assessment and feedback, and repeated assessments to check for intra-rater reliability.

Clinically meaningful differences: A measurement of whether effect size (differences between populations or between baseline and subsequent evaluations) has clinical and statistical significance. Two methods are available to establish clinically meaningful differences: methods derived from an estimate of minimum important differences based on the distribution of measurements in a population and methods derived from measured changes anchored on patient perception, clinician perception, or behavioral elements.

Competing events: Events that preclude or modify the risk of the outcome of interest. For example, death is generally considered a competing event for most endpoints, including chronic GVHD, because death precludes subsequent assessment of other outcomes. Similarly, progression or relapse of the underlying disease may be considered as a competing risk for GVHD because therapeutic interventions for treatment of the disease may modify the risk of GVHD.
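The practical consequence of competing events for incidence estimates can be shown with a small calculation. The sketch below is illustrative only: it assumes a toy data set of 10 hypothetical patients with no censoring, and it compares a cumulative incidence estimate (which keeps patients with competing events in the denominator) with the complement of a Kaplan-Meier estimate that censors them. The status codes and function name are assumptions made for the example.

```python
from collections import Counter

CENSORED, EVENT, COMPETING = 0, 1, 2


def incidence_estimates(records):
    """Estimate the probability of the event of interest by end of follow-up.

    records: list of (time, status) pairs, with status one of
    CENSORED, EVENT (eg, onset of chronic GVHD), or COMPETING
    (eg, death without chronic GVHD).

    Returns (cumulative_incidence, one_minus_kaplan_meier)."""
    at_risk = len(records)
    event_free = 1.0  # probability of no event of either type so far
    km_survival = 1.0  # Kaplan-Meier estimate, competing events censored
    cumulative_incidence = 0.0
    for t in sorted({time for time, _ in records}):
        counts = Counter(status for time, status in records if time == t)
        d_event = counts[EVENT]
        d_competing = counts[COMPETING]
        # Cumulative incidence: events weighted by the probability of
        # having reached this time free of any event.
        cumulative_incidence += event_free * d_event / at_risk
        event_free *= 1.0 - (d_event + d_competing) / at_risk
        # Kaplan-Meier: competing events only leave the risk set.
        km_survival *= 1.0 - d_event / at_risk
        at_risk -= sum(counts.values())
    return cumulative_incidence, 1.0 - km_survival


if __name__ == "__main__":
    # Hypothetical cohort: 4 patients develop the event, 6 have a competing event.
    cohort = [(1, COMPETING), (2, EVENT), (3, COMPETING), (4, EVENT),
              (5, COMPETING), (6, EVENT), (7, COMPETING), (8, EVENT),
              (9, COMPETING), (10, COMPETING)]
    ci, km = incidence_estimates(cohort)
    print("cumulative incidence:", round(ci, 3))
    print("1 - Kaplan-Meier:    ", round(km, 3))
```

In this toy cohort, 4 of 10 patients experience the event, and the cumulative incidence estimate equals that observed proportion, whereas the Kaplan-Meier complement is noticeably larger because patients with competing events are removed from the denominator.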
Criteria for secondary treatment: Failure of primary therapy as indicated by progression of chronic GVHD manifestations despite optimal first line therapy, eg, with prednisone at doses ≥1.0 mg · kg⁻¹ · d⁻¹ for 2 weeks; alternatively, administration of secondary therapy when primary therapy results in stable persistence of chronic GVHD manifestations and lack of improvement despite 4-8 weeks of sustained therapy, eg, with prednisone at doses ≥0.5 mg · kg⁻¹ · d⁻¹ or an inability to taper prednisone doses to <0.5 mg · kg⁻¹ · d⁻¹ without recurrence of manifestations. A specific minimum duration of primary therapy should be defined for enrollment in clinical trials of secondary therapy. Early initiation of secondary therapy would be appropriate in patients with more severe chronic GVHD, whereas a longer trial of initial therapy would be appropriate in patients with sclerotic skin changes or other slowly reversible manifestations of chronic GVHD. A longer duration of primary therapy would also be appropriate when the agent to be used or evaluated for secondary therapy has a high risk of toxicity. Depending on the study design, inability to tolerate primary therapy or lack of a satisfactory response to initial therapy may be considered an indication for secondary treatment or for enrollment in a clinical trial to evaluate secondary treatment.

Criteria for treatment of newly diagnosed chronic GVHD: Oral, ocular, and cutaneous manifestations represent the most frequent and sensitive indicators of chronic GVHD. Systemic treatment is always indicated when multiple sites are involved. Likewise, systemic immunosuppressive therapy is indicated for patients with a quiescent or progressive onset who are diagnosed with chronic GVHD during treatment with prednisone, because GVHD manifestations in these patients would almost certainly have greater severity in the absence of steroid administration. In this case, treatment of chronic GVHD would require administration of steroids at a higher dose or the addition of another immunosuppressive medication [3]. Systemic immunosuppressive therapy should be considered for patients with chronic GVHD and thrombocytopenia (platelet count <100 000/μL) or other findings or biomarkers associated with an increased risk of nonrelapse mortality [16].

Cumulative incidence: A measurement of the actual rate of an outcome of interest in the study population, which accounts for any competing events. Subjects with uninformative outcomes (eg, death before onset of chronic GVHD) are included in cumulative incidence calculations but are censored (removed from the denominator) in Kaplan-Meier calculations. The Kaplan-Meier method assumes that patients who are censored because of competing events are at the same risk of an outcome of interest as are patients who have not been censored. This assumption may not hold true.
When competing events occur in some patients before outcomes of interest in others, estimates generated according to the Kaplan-Meier method are higher than those generated using the cumulative incidence methods, leading to an overestimation of the outcome of interest. Both methods censor follow-up for surviving patients at the date of last contact.

Disease activity: Evidence of active chronic GVHD that may respond to appropriate treatment aimed at controlling pathologic processes causing chronic GVHD, as distinguished from complications of therapy, comorbid conditions, or fixed irreversible defects that will not resolve even if pathologic processes that cause chronic GVHD are controlled.

Data and safety monitoring board: An independent group of experts who review the conduct and selected outcomes such as safety, efficacy, or futility (including rate of accrual) during clinical trials.

Early surrogate markers of success or failure: Measurements (including laboratory, physiologic testing, clinical measurements, or biomarkers) validated to predict later clinical outcomes of importance. Surrogate markers must be adequately sensitive and specific for later clinical events so that they can substitute for longer-term outcomes.

Failure: As applied to evaluation of efficacy, failure of therapy to control chronic GVHD, eg, progression of chronic GVHD or addition of new systemic immunosuppressive therapy, because of judgments that manifestations of chronic GVHD are not adequately controlled.

Failure of primary therapy: This will usually include any of the following events during primary therapy: recurrent malignancy or death, inability to tolerate study treatment, progression of chronic GVHD, or addition of new systemic immunosuppressive therapy prompted by inadequately controlled chronic GVHD manifestations. An exacerbation of chronic GVHD manifestations during withdrawal of treatment is generally not considered primary treatment failure unless manifestations exceed those at the beginning of the trial or do not improve after reinstatement of previous treatment. Stable disease without new systemic immunosuppressive therapy is not considered primary treatment failure.

Failure of secondary treatment: This is indicated by progression of chronic GVHD manifestations despite second-line therapy; alternatively, administration of tertiary therapy when secondary therapy results in stable persistence of active chronic GVHD manifestations and lack of improvement despite 4 weeks of therapy. The protocol should specify how recurrent malignancy or death, inability to tolerate secondary therapy, and flare of chronic GVHD manifestations during taper of secondary therapy will be interpreted for endpoint assessment.

Fixed, irreversible deficits (“damage”): Organ manifestations that will not resolve even with adequate treatment of chronic GVHD.
Some such deficits may improve with time or with other, non-chronic GVHD directed therapy, but the period for any possible improvement is longer than the period of follow-up for a clinical trial.

Functional assessments: Self-reported or actual measurements of ability to perform tasks.

Guidelines for tapering of glucocorticoid doses: Tapering of steroids should begin soon after the first evidence of improvement in manifestations of chronic GVHD, even if manifestations have not entirely resolved. Tapering should be carried out with the goals of lowering the dose and converting daily glucocorticoid administration to an every-other-day regimen [17,18] that can be tolerated for long periods and prevent any exacerbation in manifestations of chronic GVHD. Administration of glucocorticoids should then continue at stable doses until all reversible manifestations of chronic GVHD resolve. The taper schedule should be resumed after resolution of all reversible manifestations of GVHD. At the end of the taper schedule, physicians and patients must be vigilant for recurrent GVHD and symptomatic adrenal insufficiency, which may require adjustment of glucocorticoid doses. Accelerated tapering of glucocorticoid doses may be necessary in patients who have glucocorticoid-related toxicity, but the benefit of decreased toxicity must be weighed against the risk of increased severity of GVHD.

Imputation: A statistical technique using prespecified rules to fill in missing values when data are incomplete.

Inability of the patient to tolerate the study intervention: Cessation of the study intervention for any medical reason (toxicity or side effects) or patient preference not related to whether or not the study medication is effective.

Level I data: Basic data collected for all patients whether or not they are enrolled in chronic GVHD-targeted trials (eg, demographic characteristics of the study population, survival, recurrent malignancy, date of last contact).

Level II data: More detailed data collected for patients who are involved in studies in which the occurrence of chronic GVHD is a primary or secondary endpoint (eg, baseline treatment variables in studies to measure the incidence of chronic GVHD or overall severity of chronic GVHD) or in studies of outcome among patients with chronic GVHD (eg, predictors of prognosis at diagnosis, risk factors, organ-specific severity, dose of immunosuppressive medications).

Level III data: Very detailed data collected for patients who are involved in studies to evaluate treatment of chronic GVHD (eg, clinical information to support assessment of adverse events and response, details of medical management).

Minimum clinically significant risk reduction: Minimum decrease in risk that would warrant use of a preventive strategy.

Newly diagnosed chronic GVHD: Initial onset of manifestations that are diagnostic of chronic GVHD [3].
Decisions regarding treatment require additional considerations such as the presence or absence of symptoms and risk factors for mortality or progression of the disease.
Organ or site-specific therapies: Treatments designed primarily to affect a specific organ manifestation or site of chronic GVHD. The term ancillary treatment refers to interventions directed toward controlling immune responses or inflammation at a specific site. The term supportive care refers to other interventions that are not directed toward immune responses or inflammation.
Overlap syndromes: An increasingly recognized syndrome of chronic GVHD in which manifestations typical of acute GVHD are present [3].
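The glossary entry for imputation describes filling in missing values by prespecified rules. As a minimal sketch of what one such rule could look like, the Python fragment below applies last observation carried forward (LOCF) to a series of visit scores. LOCF is chosen here only for illustration and is not a rule endorsed by this report; the function name `impute_locf` and the example scores are hypothetical.

```python
from typing import List, Optional, Sequence


def impute_locf(values: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Fill each missing visit (None) with the most recent observed value.

    Illustrative rule only: a real protocol would prespecify its own
    imputation rules. Values missing before the first observation stay None.
    """
    filled: List[Optional[float]] = []
    last_seen: Optional[float] = None
    for value in values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


# Hypothetical symptom scores at five scheduled visits; None marks a missed visit.
scores = [3.0, None, 2.0, None, None]
print(impute_locf(scores))  # [3.0, 3.0, 2.0, 2.0, 2.0]
```

Because the rule is a pure function of the recorded data, it can be written into the protocol before any outcome is known, which is the sense of "prespecified" in the definition.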

“Primary” therapy: First systemic intervention intended specifically to treat chronic GVHD.
Quality of life: A multidimensional (physical, functional, emotional, social, spiritual) self-reported assessment that in sum indicates satisfaction with life and overall sense of well-being.
Recurrent or progressive malignancy: Recurrent (if previously in remission) or progressive (if previously persistent after HCT) malignancy.
Requirements for durability of response: This will depend on the phase of the trial and will likely be longer for phase III studies than for phase II studies. Durability of response is defined as the minimum period that improvement must last, from the time it is first observed to the time that it is accepted as a response for purposes of defining outcome in a clinical trial.
Secondary treatment: Any systemic treatment administered because of efficacy failure.
“Standard” steroid therapy: Prednisone at 1 mg · kg⁻¹ · d⁻¹ or equipotent doses of other corticosteroids, initially administered once daily and generally followed by conversion to an every-other-day regimen.
Steroid-dependent: Controlled or absent manifestations of chronic GVHD during administration of steroid therapy, with exacerbation or recurrence whenever steroid doses are tapered.
Steroid-intolerant: Intolerance of steroid doses necessary to control chronic GVHD. Examples of intolerance should be clearly defined in each trial. For example, intolerance may be defined as development of insulin-dependent diabetes or symptomatic avascular necrosis.
Stratification for balance: A method of randomization according to specified subgroups of subjects so that important covariates are balanced in the overall study populations to be compared in a clinical trial. Stratification for balance can be done with or without stratification in the analysis, depending on the number of subjects in each stratum.
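One common way to implement stratification for balance is permuted-block randomization carried out separately within each stratum. The Python sketch below illustrates that approach under stated assumptions: two arms, a block size of 4, and hypothetical stratum labels. The class name `StratifiedBlockRandomizer` and all parameter values are invented for this example and are not specified by this report.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


class StratifiedBlockRandomizer:
    """Permuted-block randomization kept separately for each stratum."""

    def __init__(self, arms: Sequence[str] = ("A", "B"),
                 block_size: int = 4, seed: int = 0) -> None:
        if block_size % len(arms) != 0:
            raise ValueError("block_size must be a multiple of the number of arms")
        self.arms = tuple(arms)
        self.block_size = block_size
        self._rng = random.Random(seed)  # seeded so the sketch is reproducible
        # Assignments remaining in the current block of each stratum.
        self._pending: Dict[Hashable, List[str]] = defaultdict(list)

    def assign(self, stratum: Hashable) -> str:
        """Return the next arm for a subject enrolled in the given stratum."""
        queue = self._pending[stratum]
        if not queue:
            # Start a new block containing each arm equally often, in random order.
            block = list(self.arms) * (self.block_size // len(self.arms))
            self._rng.shuffle(block)
            queue.extend(block)
        return queue.pop()


# Hypothetical strata; a real protocol would define them from important covariates.
randomizer = StratifiedBlockRandomizer()
for stratum in ["stratum-1", "stratum-2", "stratum-1", "stratum-1", "stratum-1"]:
    print(stratum, randomizer.assign(stratum))
```

Within every completed block the arms are exactly balanced, so imbalance within a stratum can never exceed half a block. Whether the strata are also used in the analysis is a separate decision, as the definition notes.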
Systemic immunosuppressive treatment: Any immunosuppressive treatment designed to control manifestations of active chronic GVHD throughout the entire body.
Topical therapy: Any treatment delivered locally with the goal of treating the tissues to which it is applied, without significant effects at other sites.
Uncontrolled infection: Bacterial, viral, or fungal infection that is not adequately controlled by current anti-infective or surgical therapy.
Validated instruments: Methods of measuring outcomes that are reasonable (face validity) and previously demonstrated to (1) include all important aspects of disease (content validity), (2) perform as expected (construct validity), and (3) measure clinically meaningful change (sensitivity) across the full range of illness.
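The glossary defines durability of response as the minimum period that improvement must last, from first observation to acceptance as a response. A minimal sketch of that check follows. The function name `response_is_durable` and the 56-day threshold are hypothetical; this report leaves the actual period to each protocol and trial phase.

```python
from datetime import date, timedelta


def response_is_durable(first_improvement: date,
                        last_confirmed: date,
                        min_days: int) -> bool:
    """Return True if improvement has persisted for at least min_days.

    first_improvement: date improvement was first observed
    last_confirmed:    most recent date improvement was still documented
    min_days:          protocol-specified minimum duration (assumed input)
    """
    return (last_confirmed - first_improvement) >= timedelta(days=min_days)


# Hypothetical threshold of 56 days, chosen only for illustration.
print(response_is_durable(date(2006, 1, 1), date(2006, 3, 1), min_days=56))  # True
print(response_is_durable(date(2006, 1, 1), date(2006, 2, 1), min_days=56))  # False
```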

NIH CONSENSUS DEVELOPMENT PROJECT ON CRITERIA FOR CLINICAL TRIALS IN CHRONIC GVHD STEERING COMMITTEE
Steven Pavletic and Georgia Vogelsang (project chairs), LeeAnn Jensen (planning committee chair), Lisa Filipovich (diagnosis and staging), Howard Shulman (histopathology), Kirk Schultz (biomarkers), Dan Couriel (ancillary and supportive care), Stephanie Lee (design of clinical trials), James Ferrara, Mary Flowers, Jean Henslee-Downey, Paul Martin, Barbara Mittleman, Shiv Prasad, Donna Przepiorka, Douglas Rizzo, Daniel Weisdorf, and Roy Wu (members). The project group also recognizes contributions of numerous colleagues in the field of blood and marrow transplantation, medical specialists and consultants, the pharmaceutical industry, and professional staff at the National Institutes of Health and FDA for intellectual input, dedication, and enthusiasm on the road toward completion of these documents.

REFERENCES
1. Sullivan KM, Witherspoon RP, Storb R, et al. Alternating-day cyclosporine and prednisone for treatment of high-risk chronic graft-v-host disease. Blood. 1988;72:555-561.
2. Stewart BL, Storer B, Storek J, et al. Duration of immunosuppressive treatment for chronic graft-versus-host disease. Blood. 2004;104:3501-3506.
3. Filipovich AH, Weisdorf D, Pavletic S, et al. National Institutes of Health consensus development project on criteria for clinical trials in chronic graft-versus-host disease: I. Diagnosis and Staging Working Group report. Biol Blood Marrow Transplant. 2005;11:945-955.
4. Shulman HM, Kleiner D, Lee SJ, et al. Histopathologic diagnosis of chronic graft-versus-host disease: National Institutes of Health consensus development project on criteria for clinical trials in chronic graft-versus-host disease: II. Pathology Working Group report. Biol Blood Marrow Transplant. 2006;12:31-47.
5. Couriel D, Carpenter P, Cutler C, et al. Ancillary therapy and supportive care of chronic graft-versus-host disease: National Institutes of Health consensus development project on criteria for clinical trials in chronic graft-versus-host disease: V. Ancillary Therapy and Supportive Care Working Group report. Biol Blood Marrow Transplant. 2006;12:375-396.
6. Pavletic SZ, Martin P, Lee SJ, et al. Measuring therapeutic response in chronic graft-versus-host disease: National Institutes of Health consensus development project on criteria for clinical trials in chronic graft-versus-host disease: IV. Response Criteria Working Group report. Biol Blood Marrow Transplant. 2006;12:252-266.
7. Lee SJ, Vogelsang GB, Flowers MED. Chronic graft-versus-host disease. Biol Blood Marrow Transplant. 2003;9:215-233.
8. International Conference on Harmonization. Guidelines for good clinical practice; Geneva, Switzerland; May 1, 1996.
9. Wittes RE. Antineoplastic agents and FDA regulations: square pegs for round holes? Cancer Treat Rep. 1987;71:795-806.
10. American Society of Clinical Oncology. Outcomes of cancer treatment for technology assessment and cancer treatment guidelines. J Clin Oncol. 1996;14:671-679.
11. Chao NJ, Parker PM, Niland JC, et al. Paradoxical effect of thalidomide prophylaxis on chronic graft-vs-host disease. Biol Blood Marrow Transplant. 1996;2:86-92.
12. Koc S, Leisenring W, Flowers ME, et al. Thalidomide for treatment of patients with chronic graft-versus-host disease. Blood. 2000;96:3995-3996.
13. Arora M, Wagner JE, Davies SM, et al. Randomized clinical trial of thalidomide, cyclosporine, and prednisone versus cyclosporine and prednisone as initial therapy for chronic graft-versus-host disease. Biol Blood Marrow Transplant. 2001;7:265-273.
14. Food and Drug Administration. Guidance for Industry: Special Protocol Assessment. Rockville, MD: Food and Drug Administration; 2002.
15. 21 Code of Federal Regulations 324.126, 2005.
16. Schultz KR, Miklos DB, Fowler D, et al. Toward biomarkers for chronic graft-versus-host disease: National Institutes of Health consensus development project on criteria for clinical trials in chronic graft-versus-host disease: III. Biomarker Working Group report. Biol Blood Marrow Transplant. 2006;12:126-137.
17. Harter JG, Reddy WJ, Thorn GW. Studies on an intermittent corticosteroid dosage regimen. N Engl J Med. 1963;269:591-596.
18. Fauci AS, Dale DC. Alternate-day prednisone therapy and human lymphocyte subpopulations. J Clin Invest. 1975;55:22-32.
