
Journal of Evidence-Based Medicine ISSN 1756-5391

REVIEW ARTICLE

The methodological quality assessment tools for preclinical and clinical studies, systematic review and meta-analysis, and clinical practice guideline: a systematic review

Xiantao Zeng 1,2, Yonggang Zhang 3, Joey S.W. Kwong 3, Chao Zhang 2, Sheng Li 1, Feng Sun 4, Yuming Niu 2 and Liang Du 3

1 Center for Evidence-Based and Translational Medicine, Zhongnan Hospital, Wuhan University, Wuhan, China
2 Center for Evidence-Based Medicine and Clinical Research, Taihe Hospital, Hubei University of Medicine, Shiyan, China
3 Chinese Cochrane Centre, West China Hospital, Sichuan University, Chengdu, China
4 Department of Epidemiology and Biostatistics, School of Public Health, Peking University Health Science Centre, Beijing, China

Keywords: Clinical practice guideline; meta-analysis; methodological quality; primary study; risk of bias; systematic review.

Correspondence: Liang Du, Chinese Cochrane Centre, West China Hospital, Sichuan University, Guoxuexiang 37#, Chengdu 610041, China. Tel: 86-28-85426295; Fax: 86-28-85426295; Email: [email protected]

Conflict of interest: The authors declare no conflict of interest.

Received 1 December 2014; accepted for publication 10 December 2014.

doi: 10.1111/jebm.12141


Abstract

Objective: To systematically review the methodological assessment tools for preclinical and clinical studies, systematic reviews and meta-analyses, and clinical practice guidelines.

Methods: We searched PubMed, the Cochrane Handbook for Systematic Reviews of Interventions, the Joanna Briggs Institute (JBI) Reviewers' Manual, the Centre for Reviews and Dissemination, the Critical Appraisal Skills Programme (CASP), the Scottish Intercollegiate Guidelines Network (SIGN), and the National Institute for Clinical Excellence (NICE) up to May 20th, 2014. Two authors selected studies and extracted data; quantitative analysis was performed to summarize the characteristics of the included tools.

Results: We included a total of 21 assessment tools for analysis. A number of tools were developed by academic organizations, and some were developed by only a small group of researchers. The JBI developed the highest number of methodological assessment tools, with CASP coming second. Tools for assessing the methodological quality of randomized controlled studies were most abundant. The Cochrane Collaboration's tool for assessing risk of bias is the best available tool for assessing RCTs. For cohort and case-control studies, we recommend the use of the Newcastle-Ottawa Scale. The Methodological Index for Non-Randomized Studies (MINORS) is an excellent tool for assessing nonrandomized interventional studies, and the Agency for Healthcare Research and Quality (AHRQ) methodology checklist is applicable to cross-sectional studies. For diagnostic accuracy test studies, the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool is recommended; the SYstematic Review Centre for Laboratory animal Experimentation (SYRCLE) risk of bias tool is available for assessing animal studies; the Assessment of Multiple Systematic Reviews (AMSTAR) is a measurement tool for systematic reviews/meta-analyses; an 18-item tool has been developed for appraising case series studies; and the Appraisal of Guidelines, Research and Evaluation (AGREE)-II instrument is widely used to evaluate clinical practice guidelines.

Conclusions: We have identified a variety of methodological assessment tools for different types of study design. However, further development of critical appraisal tools is warranted, since such tools are still lacking for some fields (eg, genetic studies), and some existing tools (for nested case-control studies and case reports, for example) need updating to be in line with current research practice and rigor. In addition, it is important to remember that all critical appraisal tools remain subjective, so measures such as independent duplicate assessment are needed to avoid performance bias.




Figure 1 An overview of the different study types in medical research.

Introduction

Professor Archibald (Archie) Cochrane (1909–1988) is considered the pioneer and father of the idea of evidence-based medicine (EBM) in our era (1). The concept of EBM was first formally put forward in 1992 by the Evidence-Based Medicine Working Group, led by Professor Gordon H. Guyatt (2). One year later, Professor Iain Chalmers, a student of Archie Cochrane, together with a group of about 70 international colleagues, founded the Cochrane Collaboration (3). These pioneering works led us into the EBM era, in which the question of how to make better use of preclinical and clinical studies, systematic reviews (SRs) and meta-analyses, and clinical practice guidelines (CPGs) attracts extensive attention worldwide. Before such evidence is used, it is very important to assess its methodological quality. Quality encompasses internal and external validity, and methodological quality usually refers to internal validity (4, 5). Generally, medical research can be divided into two major types: primary and secondary. Figure 1 shows an overview of the different study types in medical research (6, 7). Different research fields and/or different design types have different methodological quality assessment tools. Internal validity can be influenced by selection bias, performance bias, detection bias, attrition bias, reporting bias,

and other biases arising during the research process (5). Therefore, all methodological quality assessment tools focus on these aspects, for which the term "risk of bias" is recommended by the Cochrane Handbook for Systematic Reviews of Interventions (the Cochrane Handbook) (5). There are three types of tools: scales, checklists, and items (Table 1) (8). Correctly choosing an applicable tool is very important; therefore, the purpose of our study was to systematically review the features and usability of the currently available and commonly used methodological quality assessment tools for preclinical and clinical studies, SRs and meta-analyses, and CPGs.
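As a concrete illustration of the difference between a domain-based evaluation and a scale, the short sketch below (our own Python illustration, not part of any official tool; only the domain names follow the Cochrane Handbook) records one judgement per bias domain without combining them into a summary score:

```python
from enum import Enum

class Judgement(Enum):
    LOW = "low risk"
    HIGH = "high risk"
    UNCLEAR = "unclear risk"

# Bias domains as named in the Cochrane Handbook; the container and
# function are our own illustration, not the official tool.
DOMAINS = [
    "selection bias",
    "performance bias",
    "detection bias",
    "attrition bias",
    "reporting bias",
    "other bias",
]

def assess_study(judgements: dict) -> dict:
    """Record one judgement per domain. No summary score is computed:
    that is what distinguishes a domain-based evaluation (or a checklist)
    from a scale."""
    missing = set(DOMAINS) - set(judgements)
    if missing:
        raise ValueError(f"missing domains: {sorted(missing)}")
    return judgements

# Example: a trial with adequate allocation concealment but no blinding.
trial = assess_study({
    "selection bias": Judgement.LOW,
    "performance bias": Judgement.HIGH,
    "detection bias": Judgement.UNCLEAR,
    "attrition bias": Judgement.LOW,
    "reporting bias": Judgement.LOW,
    "other bias": Judgement.UNCLEAR,
})
print({domain: j.value for domain, j in trial.items()})
```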

Methods

Literature search

We used "methodological quality," "risk of bias," and "critical appraisal" as search terms in the Cochrane Handbook (http://handbook.cochrane.org/), the Joanna Briggs Institute (JBI) Reviewers' Manual (http://www.joannabriggs.org/sumari.html), and the Centre for Reviews and Dissemination's guidance for undertaking SRs in health care (http://www.york.ac.uk/inst/crd/index_guidance.htm). The terms "systematic review," "meta-analysis," "overview," and "clinical practice guideline" were also used to search PubMed.





Table 1 The types and characteristics of methodological quality assessment tools

Type       Characteristic
Item       Consists of individual components relevant to clinical research methodology whose handling may bias the results, such as allocation concealment and blinding
Checklist  Consists of many items for assessing study quality and risk of bias, without scoring of the individual items
Scale      Consists of many items for assessing study quality and risk of bias; every item is scored, and the scores are combined to give a summary score

From the PubMed results we chose articles published in the last five years and extracted the methodological quality assessment tools from them. The terms "methodological quality," "risk of bias," "critical appraisal," "validity," "tool," "item," "checklist," and "scale" were used to search the Google search engine (http://www.google.com.hk/), and we screened the first 300 links. Reference lists of published articles were examined to identify additional sources not identified in the database searches. We also searched the websites of the Critical Appraisal Skills Programme (CASP; http://www.casp-uk.net/), the Scottish Intercollegiate Guidelines Network (SIGN; http://www.sign.ac.uk/), and the National Institute for Clinical Excellence (NICE; http://www.nice.org.uk/). We finished this process on May 20, 2014.

Eligibility criteria

The eligibility criteria were as follows: (1) the tool assesses the methodological quality of primary studies, SRs and meta-analyses, or CPGs; (2) the tool has instructions or a user's guide describing its basic characteristics and objective; (3) the tool is in current or common use; (4) guidelines for reporting primary studies, SRs and meta-analyses, or CPGs were excluded; and (5) articles that provided only general narrative guidance, without an explicit scale, item list, or checklist, were excluded.

Data extraction

Two authors independently extracted the following information: the type of study addressed by the tool; the type of tool (scale, checklist, or item); the number of items and major contents of the tool; and the organization that developed the tool. If a tool covered internal validity together with external validity (applicability) or other additional criteria, we extracted only the section on internal validity.

Data analysis

Meta-analysis was not applicable; quantitative analysis was used to summarize the information and results, including each tool's objective, version, contents, developer, and website.

Results

Characteristics of tools

We included 21 tools in this SR. An overall summary of the main tool characteristics, together with more detailed information, is presented in Appendix Tables S1 to S8.

Tools for randomized controlled trial

The first randomized controlled trial (RCT) was designed by Austin Bradford Hill (1897–1991), and the RCT has since become the "gold standard" of experimental study design (9, 10). Archie Cochrane was a staunch supporter of RCTs and spent much of his career promoting their use in research (11), and Cochrane Systematic Reviews (CSRs) have focused on RCTs from the outset. Many tools have been developed for assessing RCTs. Table S1 presents the major items of the Cochrane Collaboration's tool (5, 12), the Physiotherapy Evidence Database (PEDro) scale (13–15), the Modified Jadad Scale (16–18), the Delphi List (19), the CASP checklist for RCTs (20–22), and the NICE methodology checklist for RCTs. Other tools, such as the Chalmers Scale (23), are not commonly used nowadays, so we do not introduce them here. West et al (24) summarized these tools in 2002.

Tools for nonrandomized interventional study



In clinical research, especially in surgery, an RCT is not always feasible (25); therefore, nonrandomized designs remain common. In a nonrandomized clinical intervention study, the investigators control the allocation of participants to groups but do not attempt randomization (eg, allocation by patient or physician preference) (26). According to whether a comparison group is included, nonrandomized clinical intervention studies can be divided into comparative and noncomparative studies. To provide a tool that readers, manuscript reviewers, or journal editors could use to assess the quality of such studies, many scholars have developed methodological tools for nonrandomized intervention studies, most of which can be found in the article by Deeks et al (26). Table S2 presents the Methodological Index for Nonrandomized Studies (MINORS) tool (27) and Reisch's



tool (28). The latter was developed by Reisch et al mainly for drug therapeutic studies and is now used by the Cochrane Inflammatory Bowel Disease Group (Inflammatory Bowel Disease and Functional Bowel Disorders Review Group) (26). MINORS contains 12 methodological items: the first eight apply to both noncomparative and comparative studies, while the remaining four relate only to studies with two or more groups.
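As an illustration of how such a scale is applied, the sketch below is our own; it assumes the common MINORS convention of scoring each item 0 (not reported), 1 (reported but inadequate), or 2 (reported and adequate), giving an ideal total of 16 for noncomparative and 24 for comparative studies:

```python
def minors_score(item_scores: list, comparative: bool) -> tuple:
    """Sum MINORS item scores and return (score, ideal maximum).
    Noncomparative studies use the first 8 items; comparative
    studies use all 12."""
    n_items = 12 if comparative else 8
    if len(item_scores) != n_items:
        raise ValueError(f"expected {n_items} item scores, got {len(item_scores)}")
    if any(s not in (0, 1, 2) for s in item_scores):
        raise ValueError("each MINORS item is scored 0, 1, or 2")
    return sum(item_scores), 2 * n_items  # ideal score: 16 or 24

# Example: a noncomparative surgical case series.
score, ideal = minors_score([2, 2, 1, 2, 0, 2, 1, 2], comparative=False)
print(f"MINORS {score}/{ideal}")  # -> MINORS 12/16
```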

Tools for analytical study

Observational studies can be divided into analytical and descriptive studies; analytical studies include three types: cohort, case-control, and cross-sectional studies (6). Cohort studies comprise prospective, retrospective, and ambidirectional cohort studies (29). There are three common tools for assessing cohort studies: the CASP checklist, the SIGN methodology checklists, and the Newcastle-Ottawa Scale (NOS) (30, 31). These three tools (30, 31) also provide criteria for case-control studies. Nowadays, there is no widely accepted tool for cross-sectional studies (32). The Agency for Healthcare Research and Quality (AHRQ) recommends 11 items, and Crombie's checklist contains 7 items, for assessing the quality of cross-sectional/prevalence studies. Table S3 presents the above tools. Other relevant tools were summarized by Sanderson et al in 2007 (32).
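For orientation, the sketch below tallies stars in the NOS fashion. The domain ceilings (selection 4, comparability 2, outcome/exposure 3, for a maximum of 9 stars) follow the published scale; the function itself is our own simplification:

```python
# Star ceilings per NOS domain; in the case-control version the third
# domain is "exposure" rather than "outcome", but the arithmetic is the same.
NOS_MAX = {"selection": 4, "comparability": 2, "outcome": 3}

def nos_stars(awarded: dict) -> int:
    """Sum the stars awarded across the three NOS domains."""
    for domain, ceiling in NOS_MAX.items():
        if not 0 <= awarded.get(domain, 0) <= ceiling:
            raise ValueError(f"{domain}: 0 to {ceiling} stars allowed")
    return sum(awarded.get(d, 0) for d in NOS_MAX)

print(nos_stars({"selection": 3, "comparability": 1, "outcome": 3}))  # 7 of 9
```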

Tools for case series

A case series is an observational study describing a series of individuals, usually all receiving the same intervention, with no control group (5). Several tools have been developed for assessing the methodological quality of case series studies; the latest was developed by Moga et al (33) in 2012 using a modified Delphi technique and includes 18 items. Table S4 shows this tool.

Methodological quality assessment tools

Cochrane Collaboration, NICE, SIGN, the University of York, and AHRQ (38–42). The Cochrane Collaboration took QUADAS as its standard tool for conducting CSRs of DTA studies and built QUADAS into the Review Manager (RevMan) 5.0 software (http://srdta.cochrane.org/handbook-dta-reviews). In 2011, the revised QUADAS-2 was launched, drawing on the experience of the QUADAS Group, reports from users, and feedback from the Cochrane Collaboration (41, 43); QUADAS-2 was built into RevMan 5.2 in 2012 (44). Another tool was developed by CASP. Table S5 lists the QUADAS, QUADAS-2, and CASP checklists for assessing DTA studies. Other relevant tools were reviewed by Whiting et al in 2004 (37).
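A minimal sketch of how a QUADAS-2 assessment can be recorded is shown below. The four domains follow the published tool; the data layout and the example judgements are our own illustration:

```python
# The four QUADAS-2 domains: risk of bias is judged for all four, while
# applicability concerns are judged for the first three only.
RISK_DOMAINS = [
    "patient selection",
    "index test",
    "reference standard",
    "flow and timing",
]
APPLICABILITY_DOMAINS = RISK_DOMAINS[:3]  # no applicability judgement for flow and timing

assessment = {
    "risk of bias": {domain: "low" for domain in RISK_DOMAINS},
    "applicability": {domain: "low" for domain in APPLICABILITY_DOMAINS},
}
# eg, the interval between index test and reference standard was not reported:
assessment["risk of bias"]["flow and timing"] = "unclear"

for judgement_type, judgements in assessment.items():
    print(judgement_type, judgements)
```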

Tools for animal study (preclinical study)

Before clinical trials are carried out, the safety and effectiveness of new drugs are usually tested in animal models (45); therefore, animal studies are considered preclinical research and are of great importance (46, 47). However, unlike clinical studies, animal studies have received far less attention. In 2002, the Lancet published a commentary that first outlined the scientific rationale for SRs of animal studies (48), and many SRs and meta-analyses of animal studies have since been published (47, 49). As with clinical studies, the methodological quality of animal studies also needs to be assessed (47). Table S6 shows the updated Stroke Therapy Academic Industry Roundtable (STAIR) tool (50)—Recommendations for Ensuring Good Scientific Inquiry (51); the revised STAIR tool—the Collaborative Approach to Meta-Analysis and Review of Animal Data from Experimental Studies (CAMARADES) (52); and the SYstematic Review Centre for Laboratory animal Experimentation (SYRCLE) risk of bias tool (53) for animal studies. These are the recommended tools to date.

Other tools for primary study Tools for diagnostic accuracy study Diagnostic accuracy study belongs to screening tests; it is also called “diagnostic test accuracy (DTA).” Screening is defined as tests done among apparently well people to identify those at an increased risk of a disease or disorder. Those identifications are sometimes then offered a subsequent diagnostic test or procedure, or in some instances, a treatment or preventive medication (34, 35). DTA have several unique features in terms of design which differ from standard intervention and observational evaluations. There are two recommended tools for quality assessment of DTA. The first is called “Quality Assessment of Diagnostic Accuracy Studies” (QUADAS) (36, 37); this tool is widely used and quickly accepted and recommended by the

Qualitative study (qualitative research) designs are widely used in nursing. The JBI stipulates that its SRs must include critical appraisal of methodological quality using the corresponding tools; these tools are embedded as modules in its premier software, the JBI System for the Unified Management, Assessment and Review of Information (SUMARI; http://joannabriggs.org/sumari.html) (54, 55). For qualitative studies, the JBI gives 10 items, and each item is judged "Yes," "No," or "Unclear." CASP and NICE also provide checklists to assess the methodological quality of qualitative studies. The JBI, CASP, and NICE additionally provide tools for assessing the methodological quality of economic evaluations. Besides, CASP provides a tool for appraising clinical prediction




rules, and NICE provides a methodology checklist for prognostic studies. The JBI also gives tools for RCTs, quasi-RCTs, cohort studies, case-control studies, cross-sectional studies, case reports, and expert opinion, respectively (54, 55). All of these appraisal tools have corresponding modules in JBI-SUMARI.

Tools for SR and meta-analysis

SRs and meta-analyses are popular methods for keeping up with the current medical literature (56–58). The ultimate purpose and value of SRs and meta-analyses lie in using them to promote health care (58–60). Therefore, critically appraising SRs and meta-analyses before using them, for example when conducting overviews or evidence-based clinical practice, is also necessary and important. Assessment of SRs and meta-analyses covers both methodological and reporting quality. Reporting quality is appraised using the relevant reporting guidelines, such as the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) (61). The Enhancing the QUAlity and Transparency Of health Research (EQUATOR) Network (http://www.equator-network.org/) (62–64) and the Research Reporting Guidelines and Initiatives (http://www.nlm.nih.gov/services/research_report_guide.html) are good websites for finding reporting guidelines for both primary and secondary research. In 1987, Sacks et al developed a tool for assessing the quality of meta-analyses of RCTs, called "Sacks' Quality Assessment Checklist" (65), and Oxman and Guyatt developed another methodological quality assessment tool, the Overview Quality Assessment Questionnaire (OQAQ), in 1991 (66, 67). To overcome the shortcomings of these two tools, a measurement tool based on them, the Assessment of Multiple Systematic Reviews (AMSTAR), was developed in 2007 to assess the methodological quality of SRs (68). The AMSTAR tool is an 11-item questionnaire that asks reviewers to answer "Yes," "No," "Can't answer," or "Not applicable" (Table S7). CASP and NICE have also developed checklists for SRs and meta-analyses; Table S7 shows their major items. Besides, the JBI sets a relevant module of JBI-SUMARI for SRs, and its contents are very similar to the CASP checklists.
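To illustrate, the sketch below (our own; not an official AMSTAR implementation) tallies the four permitted responses across the 11 items:

```python
from collections import Counter

AMSTAR_RESPONSES = {"Yes", "No", "Can't answer", "Not applicable"}

def amstar_summary(answers: list) -> Counter:
    """Tally the 11 AMSTAR item responses. AMSTAR is used item by item
    rather than as a scale, so the 'Yes' count below is only an informal
    summary, not an official score."""
    if len(answers) != 11:
        raise ValueError("AMSTAR has exactly 11 items")
    invalid = set(answers) - AMSTAR_RESPONSES
    if invalid:
        raise ValueError(f"invalid responses: {invalid}")
    return Counter(answers)

tally = amstar_summary(["Yes"] * 7 + ["No"] * 2 + ["Can't answer"] * 2)
print(tally["Yes"], "of 11 items answered 'Yes'")
```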

Clinical practice guideline

CPGs are well integrated into the thinking of practicing clinicians and professional clinical organizations (69–71); they also incorporate scientific evidence into clinical practice (72). However, not all guidelines are evidence-based CPGs (73, 74), and the quality of guidelines varies considerably (75–77). Hence, before implementation in the clinic, their quality should be appraised critically. To improve the quality of guidelines and promote their use, more than 20 appraisal tools have been developed (78). Of them,



the Appraisal of Guidelines for Research and Evaluation (AGREE) instrument has the most potential to serve as a basis for developing an appraisal tool for clinical pathways (78). The AGREE instrument was first released in 2003 (79) and was updated to the AGREE II instrument in 2009 (80). AGREE II comprises 23 items for developing, reporting, and evaluating practice guidelines and is available online at http://www.agreetrust.org/resource-centre/. Table S8 presents the major items of the AGREE and AGREE II instruments.
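As an illustration of how AGREE II domain scores are commonly scaled, the sketch below implements the standard percentage formula; the code is our own, and the 1–7 item rating range is an assumption drawn from the published instrument:

```python
def agree2_domain_score(ratings):
    """Scaled AGREE II domain score as a percentage.

    `ratings` holds one list of item ratings (1-7) per appraiser. The
    published scaling is:
        (obtained - minimum possible) / (maximum possible - minimum possible)
    """
    n_appraisers = len(ratings)
    n_items = len(ratings[0])
    obtained = sum(sum(appraiser) for appraiser in ratings)
    minimum = 1 * n_items * n_appraisers
    maximum = 7 * n_items * n_appraisers
    return 100 * (obtained - minimum) / (maximum - minimum)

# Example: two appraisers rating a three-item domain.
print(round(agree2_domain_score([[5, 6, 7], [4, 5, 6]]), 1))  # 75.0
```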

Discussion

In the EBM era, with a surging number of publications, major attention is paid to "going from evidence to recommendations" (81, 82). Critical appraisal of evidence is the key step in this process (83, 84). According to the characteristics of different study types, relevant evaluation tools have been developed. Of them, some are recommended and used, some are used without recommendation, and some have been abandoned (8, 26, 32, 37–39, 46, 47, 78). These tools give significant impetus to the practice of EBM (85, 86).

For primary research, the Cochrane Collaboration's recommended tool for assessing the risk of bias of RCTs is neither a scale nor a checklist; it is a domain-based evaluation in which critical assessments are made separately for different domains (5, 12). The "Cochrane Collaboration's tool" is very widely accepted and recommended. Additionally, the Modified Jadad Scale and the PEDro Scale are also suitable tools for RCTs. Besides interventional SRs of RCTs, the Cochrane Collaboration is now also concerned with SRs of DTA studies, methodology, and overviews (5).

Nonrandomized studies include quasi-experiments, natural experiments, and observational studies, which may be prospective or retrospective cohort studies or case-control studies (6, 87). The Cochrane Collaboration therefore recommends the NOS for assessing nonrandomized studies, and QUADAS and QUADAS-2 for DTA studies (5). However, nonrandomized studies can be divided into nonrandomized interventional studies and observational studies. Hence, we believe the NOS is applicable to observational studies (including case-control and cohort studies) (30, 31), whereas MINORS is designed for nonrandomized interventional studies (27). The aim of Reisch et al (28) was to evaluate the design and performance of drug therapeutic studies; however, their tool lists a total of 57 items grouped into 12 categories, including study design, sample size, randomization, and comparison groups (Table S2). Besides, some of the criteria are rather too specific to pharmaceutical studies and would require modification for more general use (26). Therefore, its complicated items, and the need to modify some of them, limit its use.

The RCT design is also widely used in animal research, and it is attracting more and more attention (47, 88). Unlike



clinical RCTs, however, animal studies cannot be assessed with the methodological tools designed for clinical RCTs. In 1999, the initial STAIR group recommended criteria for assessing the quality of stroke animal studies, and hence this tool is also called "STAIR." In 2009, the STAIR group updated its criteria and developed the "Recommendations for Ensuring Good Scientific Inquiry." Macleod et al proposed a 10-point tool, CAMARADES, based on STAIR, to assess the methodological quality of animal studies in 2004 (52), and SYRCLE's risk of bias tool was developed from the Cochrane risk of bias tool in 2014 (53). The first two appraisal tools are both aimed mainly at stroke animal studies and are accepted and used in stroke animal experiments (89–92). Although the CAMARADES tool provides relevant modified items for animal studies beyond stroke, its public acceptance is not very high.

The JBI releases the largest variety of tools, covering RCTs, quasi-RCTs, cohort studies, case-control studies, nonexperimental (descriptive) studies, case reports, expert opinion, qualitative studies, and SRs (54, 55). However, they are not widely used outside the nursing field, perhaps because users must obtain permission and the tools focus mainly on nursing concerns. We consider the JBI tools to be sound and usable for the relevant study types.

CASP has many checklists, and every one consists of three parts: "Are the results of the trial valid?" (Section A), "What are the results?" (Section B), and "Will the results help locally?" (Section C). For evaluating methodological quality, Section A is used the most; Sections B and C serve evidence-based practice. The CASP checklist for clinical prediction rules is the only tool in that field. NICE provides seven methodology checklists: SR and meta-analysis, RCT, cohort study, case-control study, economic evaluation, qualitative study, and prognostic study; it recommends the QUADAS-2 tool for DTA studies.

An overview is designed to compile evidence from multiple SRs on a topic into one accessible and usable document (5). For SRs and meta-analyses, besides the above-mentioned JBI tool and the CASP and NICE checklists, the AMSTAR tool (68) is the accepted and recommended tool; the OQAQ is also widely used nowadays. For CPGs, the AGREE II instrument (80) is a well-accepted and recommended tool.

However, great efforts are still needed on appraisal tools. First, many study types lack assessment tools, such as the before-after study (time series) and the nested case-control study (29). Second, special fields lack corresponding tools, such as genetic studies (eg, of single-nucleotide polymorphisms) and in vitro experiments (cell studies); for single-nucleotide polymorphism research in particular, meta-analyses are abundant, but few of them perform methodological quality assessment. Third, some existing tools are not well accepted, such as the tools for cross-sectional studies and animal studies. In future work, how to develop


well-accepted tools remains significant and important work. In addition, all of these assessment tools share one common deficit: they are subjective. Therefore, users must receive formal training, have relevant epidemiological knowledge, and maintain a rigorous academic attitude. Having at least two reviewers evaluate independently and cross-check their results is a good way to avoid performance bias (93). Furthermore, reporting quality is not equal to methodological quality; hence, using reporting guidelines/statements to assess methodological quality is not appropriate.

In conclusion, making an accurate judgment of the study type is the first priority, and choosing the corresponding tool is the next important step. In a word, mastering the relevant knowledge comprehensively, and practicing often, are the basic requirements for assessing methodological quality correctly.

Funding

This research was supported (in part) by the National Scientific Fund projects (81302508, 81403276), the China Medical Board Collaboration Program in Health Policy Evidence (12-095), the Foundation of the Education and Science Planning Project of Hubei Province (2012A050), the special funds of the "985 Project" construction project of Wuhan University, and the Intramural Research Program of the Hubei University of Medicine (2011CZX01), without involvement of commercial or not-for-profit sectors. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. No additional external funding was received for this study.

References

1. Stavrou A, Challoumas D, Dimitrakakis G. Archibald Cochrane (1909–1988): the father of evidence-based medicine. Interactive Cardiovascular and Thoracic Surgery 2013; 18(1): 121–24.
2. Evidence-Based Medicine Working Group. Evidence-based medicine. A new approach to teaching the practice of medicine. JAMA 1992; 268(17): 2420–25.
3. Levin A. The Cochrane Collaboration. Annals of Internal Medicine 2001; 135(4): 309–12.
4. Campbell DT. Factors relevant to the validity of experiments in social settings. Psychological Bulletin 1957; 54(4): 297–312.
5. Higgins J, Green S. Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 [updated March 2011]: The Cochrane Collaboration, 2011.
6. Grimes DA, Schulz KF. An overview of clinical research: the lay of the land. Lancet 2002; 359(9300): 57–61.
7. Rohrig B, du Prel JB, Wachtlin D, Blettner M. Types of study in medical research: part 3 of a series on evaluation of scientific publications. Dtsch Arztebl Int 2009; 106(15): 262–68.




8. Juni P, Altman DG, Egger M. Systematic reviews in health care: assessing the quality of controlled clinical trials. BMJ 2001; 323(7303): 42–46. 9. The British Tuberculosis Association Research Committee. Treatment of pulmonary tuberculosis with streptomycin and para-aminosalicylic acid: a Medical Research Council investigation. British Medical Journal 1950; 2(4688): 1073–85. 10. Armitage P. Fisher, Bradford Hill, and randomization. International Journal of Epidemiology 2003; 32(6): 925–28; discussion 45–48. 11. Shah HM, Chung KC. Archie Cochrane and his vision for evidence-based medicine. Plastic and Reconstructive Surgery 2009; 124(3): 982–88. 12. Higgins JP, Altman DG, Gotzsche PC, Juni P, Moher D, Oxman AD, et al. The Cochrane Collaboration’s tool for assessing risk of bias in randomised trials. BMJ 2011; 343: d5928. 13. Campos TF, Beckenkamp PR, Moseley AM. Usage evaluation of a resource to support evidence-based physiotherapy: the physiotherapy evidence database (PEDro). Physiotherapy 2013; 99(3): 252–57. 14. Maher CG, Sherrington C, Herbert RD, Moseley AM, Elkins M. Reliability of the PEDro scale for rating quality of randomized controlled trials. Physical Therapy 2003; 83(8): 713–21. 15. Shiwa SR, Costa LO, Costa Lda C, Moseley A, Hespanhol, LC Jr, Venancio R, et al. Reproducibility of the Portuguese version of the PEDro scale. Cad Saude Publica 2011; 27(10): 2063–68. 16. Jadad AR, Moore RA, Carroll D, Jenkinson C, Reynolds DJ, Gavaghan DJ, et al. Assessing the quality of reports of randomized clinical trials: is blinding necessary? Controlled Clinical Trials 1996; 17(1): 1–12. 17. Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA 1995; 273(5): 408–12. 18. Hartling L, Ospina M, Liang Y, Dryden DM, Hooton N, Krebs Seida J, et al. Risk of bias versus quality assessment of randomised controlled trials: cross sectional study. BMJ 2009; 339: b4012. 19. Verhagen AP, de Vet HC, de Bie RA, Kessels AG, Boers M, Bouter LM, et al. The Delphi list: a criteria list for quality assessment of randomized clinical trials for conducting systematic reviews developed by Delphi consensus. Journal of Clinical Epidemiology 1998; 51(12): 1235–41. 20. Ibbotson T, Grimshaw J, Grant A. Evaluation of a programme of workshops for promoting the teaching of critical appraisal skills. Medical Education 1998; 32(5): 486–91. 21. Singh J. Critical appraisal skills programme. Journal of Pharmacology and Pharmacotherapeutics 2013; 4(1): 76. 22. Taylor R, Reeves B, Ewings P, Binns S, Keast J, Mears R. A systematic review of the effectiveness of critical appraisal skills training for clinicians. Medical Education 2000; 34(2): 120–25. 23. Chalmers TC, Smith H Jr., Blackburn B, Silverman B, Schroeder B, Reitman D, et al. A method for assessing the quality of a randomized control trial. Controlled Clinical Trials 1981; 2(1): 31–49.



24. West S, King V, Carey TS, Lohr KN, McKoy N, Sutton SF, et al. Systems to rate the strength of scientific evidence. Evidence Report-Technology Assessment (Summary) 2002; 47: 1–11.
25. McCulloch P, Taylor I, Sasako M, Lovett B, Griffin D. Randomised trials in surgery: problems and possible solutions. BMJ 2002; 324(7351): 1448–51.
26. Deeks JJ, Dinnes J, D'Amico R, Sowden AJ, Sakarovitch C, Song F, et al. Evaluating non-randomised intervention studies. Health Technology Assessment 2003; 7(27): 1–173.
27. Slim K, Nini E, Forestier D, Kwiatkowski F, Panis Y, Chipponi J. Methodological index for non-randomized studies (MINORS): development and validation of a new instrument. ANZ Journal of Surgery 2003; 73(9): 712–16.
28. Reisch JS, Tyson JE, Mize SG. Aid to the evaluation of therapeutic studies. Pediatrics 1989; 84(5): 815–27.
29. Grimes DA, Schulz KF. Cohort studies: marching towards outcomes. Lancet 2002; 359(9303): 341–45.
30. Wells G, Shea B, O'Connell D, Peterson J, Welch V, Losos M, et al. The Newcastle-Ottawa Scale (NOS) for Assessing the Quality of Nonrandomised Studies in Meta-Analyses. http://www.ohri.ca/programs/clinical_epidemiology/oxford.asp Accessed in June, 2014.
31. Stang A. Critical evaluation of the Newcastle-Ottawa scale for the assessment of the quality of nonrandomized studies in meta-analyses. European Journal of Epidemiology 2010; 25(9): 603–05.
32. Sanderson S, Tatt ID, Higgins JP. Tools for assessing quality and susceptibility to bias in observational studies in epidemiology: a systematic review and annotated bibliography. International Journal of Epidemiology 2007; 36(3): 666–76.
33. Moga C, Guo B, Schopflocher D, Harstall C. Development of a Quality Appraisal Tool for Case Series Studies using a Modified Delphi Technique. 2012. http://www.ihe.ca/documents/Case%20series%20studies%20using%20a%20modified%20Delphi%20technique.pdf Accessed in June, 2014.
34. Grimes DA, Schulz KF. Uses and abuses of screening tests. Lancet 2002; 359(9309): 881–84.
35. Grimes DA, Schulz KF. Refining clinical diagnosis with likelihood ratios. Lancet 2005; 365(9469): 1500–05.
36. Whiting P, Rutjes AW, Reitsma JB, Bossuyt PM, Kleijnen J. The development of QUADAS: a tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. BMC Medical Research Methodology 2003; 3: 25.
37. Whiting P, Rutjes AW, Dinnes J, Reitsma J, Bossuyt PM, Kleijnen J. Development and validation of methods for assessing the quality of diagnostic accuracy studies. Health Technology Assessment 2004; 8(25): 1–234.
38. Willis BH, Quigley M. Uptake of newer methodological developments and the deployment of meta-analysis in diagnostic test research: a systematic review. BMC Medical Research Methodology 2011; 11: 27.
39. Whiting PF, Rutjes AW, Westwood ME, Mallett S; QUADAS-2 Steering Group. A systematic review classifies sources of bias and variation in diagnostic test accuracy studies. Journal of Clinical Epidemiology 2013; 66(10): 1093–1104.



40. Whiting PF, Weswood ME, Rutjes AW, Reitsma JB, Bossuyt PN, Kleijnen J. Evaluation of QUADAS, a tool for the quality assessment of diagnostic accuracy studies. BMC Medical Research Methodology 2006; 6: 9.
41. Whiting PF, Rutjes AW, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Annals of Internal Medicine 2011; 155(8): 529–36.
42. Deeks J, Bossuyt P, Gatsonis C. Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy Version 1.0.0. 2009. http://srdta.cochrane.org/ Accessed in June, 2014.
43. Schueler S, Schuetz GM, Dewey M. The revised QUADAS-2 tool. Annals of Internal Medicine 2012; 156(4): 323; author reply 324.
44. Review Manager (RevMan). Version 5.2 [program]. Copenhagen: The Nordic Cochrane Centre, The Cochrane Collaboration, 2012.
45. Sibbald WJ. An alternative pathway for preclinical research in fluid management. Critical Care 2000; 4(Suppl 2): S8–S15.
46. Perel P, Roberts I, Sena E, Wheble P, Briscoe C, Sandercock P, et al. Comparison of treatment effects between animal experiments and clinical trials: systematic review. BMJ 2007; 334(7586): 197.
47. Hooijmans CR, Ritskes-Hoitinga M. Progress in using systematic reviews of animal studies to improve translational research. PLoS Medicine 2013; 10(7): e1001482.
48. Sandercock P, Roberts I. Systematic reviews of animal experiments. Lancet 2002; 360(9333): 586.
49. van Luijk J, Leenaars M, Hooijmans C, Wever K, de Vries R, Ritskes-Hoitinga M. Towards evidence-based translational research: the pros and cons of conducting systematic reviews of animal studies. ALTEX 2013; 30(2): 256–57.
50. Stroke Therapy Academic Industry Roundtable (STAIR). Recommendations for standards regarding preclinical neuroprotective and restorative drug development. Stroke 1999; 30(12): 2752–58.
51. Fisher M, Feuerstein G, Howells DW, Hurn PD, Kent TA, Savitz SI, et al. Update of the stroke therapy academic industry roundtable preclinical recommendations. Stroke 2009; 40(6): 2244–50.
52. Macleod MR, O'Collins T, Howells DW, Donnan GA. Pooling of animal experimental data reveals influence of study design and publication bias. Stroke 2004; 35(5): 1203–08.
53. Hooijmans CR, Rovers MM, de Vries RB, Leenaars M, Ritskes-Hoitinga M, Langendam MW. SYRCLE's risk of bias tool for animal studies. BMC Medical Research Methodology 2014; 14: 43.
54. Vardell E, Malloy M. Joanna Briggs Institute: an evidence-based practice database. Medical Reference Services Quarterly 2013; 32(4): 434–42.
55. Hannes K, Lockwood C. Pragmatism as the philosophical foundation for the Joanna Briggs meta-aggregative approach to qualitative evidence synthesis. Journal of Advanced Nursing 2011; 67(7): 1632–42.


56. Lau J, Ioannidis JP, Schmid CH. Summing up evidence: one answer is not always enough. Lancet 1998; 351(9096): 123–27. 57. Clarke M, Chalmers I. Meta-analyses, multivariate analyses, and coping with the play of chance. Lancet 1998; 351(9108): 1062–63. 58. Oxman AD, Schunemann HJ, Fretheim A. Improving the use of research evidence in guideline development: 8. Synthesis and presentation of evidence. Health Research Policy and Systems 2006; 4: 20. 59. Swennen MH, van der Heijden GJ, Boeije HR, van Rheenen N, Verheul FJ, van der Graaf Y, et al. Doctors’ perceptions and use of evidence-based medicine: a systematic review and thematic synthesis of qualitative studies. Academic Medicine 2013; 88(9): 1384–96. 60. Gallagher EJ. Systematic reviews: a logical methodological extension of evidence-based medicine. Academic Emergency Medicine 1999; 6(12): 1255–60. 61. Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gotzsche PC, Ioannidis JP, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: explanation and elaboration. BMJ 2009; 339. 62. Harms M. The EQUATOR network and the PRISMA statement for the reporting of systematic reviews and meta-analyses. Physiotherapy 2009; 95(4): 237–40. 63. Pandis N, Fedorowicz Z. The international EQUATOR network: enhancing the quality and transparency of health care research. Journal of Applied Oral Science 2011; 19(5): doi: pii:S1678-77572011000500001. 64. Kac G, Hirst A. Enhanced quality and transparency of health research reporting can lead to improvements in public health policy decision-making: help from the EQUATOR Network. Cad Saude Publica 2011; 27(10): 1872. 65. Sacks HS, Berrier J, Reitman D, Ancona-Berk VA, Chalmers TC. Meta-analyses of randomized controlled trials. New England Journal of Medicine 1987; 316(8): 450–55. 66. Oxman AD. Checklists for review articles. BMJ 1994; 309(6955): 648–51. 67. Oxman AD, Guyatt GH. Validation of an index of the quality of review articles. Journal of Clinical Epidemiology 1991; 44(11): 1271–78. 68. Shea BJ, Grimshaw JM, Wells GA, Boers M, Andersson N, Hamel C, et al. Development of AMSTAR: a measurement tool to assess the methodological quality of systematic reviews. BMC Medical Research Methodology 2007; 7: 10. 69. Davis DA, Taylor-Vaisey A. Translating guidelines into practice. A systematic review of theoretic concepts, practical experience and research evidence in the adoption of clinical practice guidelines. Canadian Medical Association Journal 1997; 157(4): 408–16. 70. Neely JG, Graboyes E, Paniello RC, Sequeira SM, Grindler DJ. Practical guide to understanding the need for clinical practice guidelines. Otolaryngology–Head and Neck Surgery 2013; 149(1): 1–7.




71. Browman GP, Levine MN, Mohide EA, Hayward RS, Pritchard KI, Gafni A, et al. The practice guidelines development cycle: a conceptual tool for practice guidelines development and implementation. Journal of Clinical Oncology 1995; 13(2): 502–12. 72. Tracy SL. From bench-top to chair-side: how scientific evidence is incorporated into clinical practice. Dental Materials 2013; 30(1): 1–15. 73. Chapa D, Hartung MK, Mayberry LJ, Pintz C. Using preappraised evidence sources to guide practice decisions. Journal of the American Association of Nurse Practitioners 2013; 25(5): 234–43. 74. Eibling D, Fried M, Blitzer A, Postma G. Commentary on the role of expert opinion in developing evidence-based guidelines. Laryngoscope 2013; 124(2): 355–57. 75. Chen YL, Yao L, Xiao XJ, Wang Q, Wang ZH, Liang FX, et al. Quality assessment of clinical guidelines in China: 1993 - 2010. Chinese Medical Journal (English) 2012; 125(20): 3660–64. 76. Hu J, Chen R, Wu S, Tang J, Leng G, Kunnamo I, et al. The quality of clinical practice guidelines in China: a systematic assessment. Journal of Evaluation in Clinical Practice 2013; 19(5): 961–67. 77. Henig O, Yahav D, Leibovici L, Paul M. Guidelines for the treatment of pneumonia and urinary tract infections: evaluation of methodological quality using the Appraisal of Guidelines, Research and Evaluation II instrument. Clinical Microbiology and Infection 2013; 19(12): 1106–14. 78. Vlayen J, Aertgeerts B, Hannes K, Sermeus W, Ramaekers D. A systematic review of appraisal tools for clinical practice guidelines: multiple similarities and one common deficit. International Journal for Quality in Health Care 2005; 17(3): 235–42. 79. Collaboration A. Development and validation of an international appraisal instrument for assessing the quality of clinical practice guidelines: the AGREE project. Quality & Safety in Health Care 2003; 12(1): 18–23. 80. Brouwers MC, Kho ME, Browman GP, Burgers JS, Cluzeau F, Feder G, et al. AGREE II: advancing guideline development, reporting and evaluation in health care. Canadian Medical Association Journal 2010; 182(18): E839–E842. 81. Guyatt GH, Oxman AD, Kunz R, Falck-Ytter Y, Vist GE, Liberati A, et al. Going from evidence to recommendations. BMJ 2008; 336(7652): 1049–51.



82. Andrews J, Guyatt G, Oxman AD, Alderson P, Dahm P, Falck-Ytter Y, et al. GRADE guidelines: 14. Going from evidence to recommendations: the significance and presentation of recommendations. Journal of Clinical Epidemiology 2013; 66(7): 719–25. 83. Tunguy-Desmarais GP. Evidence-based medicine should be based on science. South African Medical Journal 2013; 103(10): 700. 84. Muckart DJ. Evidence-based medicine - are we boiling the frog? South African Medical Journal 2013; 103(7): 447–48. 85. Swanson JA, Schmitz D, Chung KC. How to practice evidence-based medicine. Plastic and Reconstructive Surgery 2010; 126(1): 286–94. 86. Manchikanti L. Evidence-based medicine, systematic reviews, and guidelines in interventional pain management, part I: introduction and general considerations. Pain Physician 2008; 11(2): 161–86. 87. Britton A, McKee M, Black N, McPherson K, Sanderson C, Bain C. Choosing between randomised and non-randomised studies: a systematic review. Health Technology Assessment 1998; 2(13): 1–124. 88. Vesterinen HM, Sena ES, Egan KJ, Hirst TC, Churolov L, Currie GL, et al. Meta-analysis of data from animal studies: a practical guide. Journal of Neuroscience Methods 2013; 221C: 92–102. 89. O’Connor AM, Sargeant JM. Meta-analyses including data from observational studies. Preventive Veterinary Medicine 2013; 113(3): 313–22. 90. Schmidt A, Wellmann J, Schilling M, Strecker JK, Sommer C, Schabitz WR, et al. Meta-analysis of the efficacy of different training strategies in animal models of ischemic stroke. Stroke 2013; 45(1): 239–47. 91. Wei RL, Teng HJ, Yin B, Xu Y, Du Y, He FP, et al. A systematic review and meta-analysis of Buyang Huanwu decoction in animal model of focal cerebral ischemia. Evidence-Based Complementary and Alternative Medicine 2013, 2013: 138484. 92. Wu S, Sena E, Egan K, Macleod M, Mead G. Edaravone improves functional and structural outcomes in animal models of focal cerebral ischemia: a systematic review. International Journal of Stroke 2013; 9(1): 101–06. 93. Gold C, Erkkila J, Crawford MJ. Shifting effects in randomised controlled trials of complex interventions: a new kind of performance bias? Acta Psychiatrica Scandinavica 2012; 126(5): 307–14.

