Gastrointestinal Endoscopy Competency Assessment ...

ORIGINAL ARTICLE

Gastrointestinal Endoscopy Competency Assessment Tool: reliability and validity evidence

Catharine M. Walsh, MD, MEd, FRCPC,1,8,10 Simon C. Ling, MB, ChB, MRCP,1,8 Nitin Khanna, MD, FRCPC,2 Samir C. Grover, MD, MEd, FRCPC,3,9 Jeffrey J. Yu, BSc,10 Mary Anne Cooper, MD, MSc, MEd, FRCPC,4,9 Elaine Yong, MD, FRCPC,4,9 Geoffrey C. Nguyen, PhD, FRCPC,5,9 Gary May, MD, FRCPC, FASGE,3,9 Thomas D. Walters, MD, MBBChir, MRCP,1,8 Richard Reznick, MD, MEd, FRCSC, FACS,7 Linda Rabeneck, MD, MPH, FRCPC,5,6,9 Heather Carnahan, PhD,11

London, Toronto and Kingston, Ontario, Canada; St. John’s, Newfoundland, Canada

Background: Rigorously developed and validated direct observational assessment tools are required to support competency-based colonoscopy training to facilitate skill acquisition, optimize learning, and ensure readiness for unsupervised practice.

Objective: To examine reliability and validity evidence of the Gastrointestinal Endoscopy Competency Assessment Tool (GiECAT) for colonoscopy for use within the clinical setting.

Design: Prospective, observational, multicenter validation study. Sixty-one endoscopists performing 116 colonoscopies were assessed using the GiECAT, which consists of a 7-item global rating scale (GRS) and 19-item checklist (CL). A second rater assessed procedures to determine interrater reliability by using intraclass correlation coefficients (ICCs). Endoscopists’ first and second procedure scores were compared to determine test-retest reliability by using ICCs. Discriminative validity was examined by comparing novice, intermediate, and experienced endoscopists’ scores. Concurrent validity was measured by correlating scores with colonoscopy experience, cecal and terminal ileal intubation rates, and physician global assessment.

Setting: A total of 116 colonoscopies performed by 33 novice (<50 previous procedures), 18 intermediate (50-500 previous procedures), and 10 experienced (>1000 previous procedures) endoscopists from 6 Canadian hospitals.

Main Outcome Measurements: Interrater and test-retest reliability, discriminative, and concurrent validity.

Results: Interrater reliability was high (total: ICC = 0.85; GRS: ICC = 0.85; CL: ICC = 0.81). Test-retest reliability was excellent (total: ICC = 0.91; GRS: ICC = 0.93; CL: ICC = 0.80). Significant differences in GiECAT scores among novice, intermediate, and experienced endoscopists were noted (P < .001). There was a significant positive correlation (Spearman’s ρ, P < .001) between scores and number of previous colonoscopies (total: ρ = 0.78, GRS: ρ = 0.80, CL: ρ = 0.71); cecal intubation rate (total: ρ = 0.81, GRS: ρ = 0.82, CL: ρ = 0.75); ileal intubation rate (total: ρ = 0.82, GRS: ρ = 0.82, CL: ρ = 0.77); and physician global assessment (total: ρ = 0.90, GRS: ρ = 0.94, CL: ρ = 0.77).

Limitations: Nonblinded assessments.

Conclusion: This study provides evidence supporting the reliability and validity of the GiECAT for use in assessing the performance of live colonoscopies in the clinical setting. (Gastrointest Endosc 2015;-:1-10.)

Abbreviations: CL, checklist; GiECAT, Gastrointestinal Endoscopy Competency Assessment Tool; GRS, global rating scale; ICC, intraclass correlation coefficient; PGA, physician global assessment.

Copyright © 2015 by the American Society for Gastrointestinal Endoscopy 0016-5107/$36.00 http://dx.doi.org/10.1016/j.gie.2014.11.030

DISCLOSURE: All authors disclosed no financial relationships relevant to this article. This project was supported by an American Society for Gastrointestinal Endoscopy Quality in Endoscopic Research Award. Dr Walsh is a doctoral fellow of the CIHR Canadian Child Health Clinician Scientist Training Program and the recipient of a Department of Paediatrics Research Fellowship (Hospital for Sick Children) and a Postgraduate Medical Education Award, University of Toronto.

Received August 20, 2014. Accepted November 12, 2014.

www.giejournal.org

Current affiliations: Division of Gastroenterology, Hepatology and Nutrition, Hospital for Sick Children, Toronto, Ontario, Canada (1), Division of Gastroenterology, St. Joseph’s Health Centre, University of Western Ontario, London, Ontario, Canada (2), Division of (footnotes continued on last page of article)

Volume -, No. - : 2015 GASTROINTESTINAL ENDOSCOPY


Assessment is a cornerstone of high-quality endoscopic education, influencing both teaching and learning.1 There has been a shift in medical education over the past decade toward a competency-based model that is centered on the achievement of core training milestones and competency benchmarks. This necessitates formative assessment tools to document trainees’ progress toward predefined outcomes and provide a means of accumulating evidence of competence.2 It is increasingly recognized that workplace-based assessment is essential because performance in the authentic clinical environment is core to medical competence.3,4 Direct observational colonoscopy assessment tools, if rigorously developed and validated, provide a means to assess endoscopists’ performance in vivo, in the workplace, in a standardized and reproducible manner. Additionally, they allow for the integrated assessment of competencies, which is postulated to enhance learning.5

The Gastrointestinal Endoscopy Competency Assessment Tool (GiECAT) is a direct observational assessment tool designed to assess competence in performing colonoscopy at the “does” level of Miller’s pyramid.4,6 It was developed systematically by a panel of 55 international endoscopy experts by using Delphi methodology and thus is reflective of endoscopic practice across institutions.6 The GiECAT was specifically constructed to assess the full breadth of competencies required to perform colonoscopy procedures in an integrated manner: (1) technical (psychomotor); (2) cognitive (knowledge and application of endoscopically derived information to clinical practice); and (3) integrative (higher level competencies such as clinical judgment and communication that complement an individual’s technical skills and knowledge to facilitate effective delivery of safe and appropriate care in varied contexts) competencies.
Additionally, it addresses performance of all components of a colonoscopy procedure, including pre-, intra-, and postprocedural aspects of care. The GiECAT was designed for use as both a formative and summative assessment tool to monitor endoscopists’ progress throughout the learning continuum from novice to competent endoscopist.

The current study aims to prospectively examine evidence of the reliability and validity of the GiECAT in the context of formative assessment of competence in performing colonoscopy in the clinical setting. Formative assessment aims to promote reflection, guide learning, and enable competence through the provision of feedback and benchmarks to orient the learner and facilitate continuous performance improvement.7-9 Although numerous frameworks have been proposed to evaluate educational assessment tools,5,10-13 the Accreditation Council for Graduate Medical Education Advisory Committee on Educational Outcome Assessment’s framework for evaluating the quality of an assessment measure13 was used as a basis for this study. This framework outlines standards in 6 areas including reliability, validity, ease of use, resources required, ease of interpretation, and educational impact.13 Validity evidence of the use of the GiECAT as a formative assessment tool in the clinical setting is discussed using the unified, evidence-based approach to validation.12,14 This approach is based on the accumulation of 5 categories of evidence of construct validity to provide support for an intended use of an assessment tool, including validity evidence of content, response process, internal structure, associations with other variables, and consequences.12,14-16

METHODS

This was a prospective, multicenter, observational study assessing evidence of the reliability and validity of the GiECAT for use in the clinical setting. Ethical approval was obtained from the Research Ethics Boards at all involved institutions including the University Health Network, Mt. Sinai Hospital, St. Michael’s Hospital, Sunnybrook Health Sciences Centre, University of Toronto, and the University of Western Ontario. Written informed consent was obtained from all endoscopist participants and patients where required.

Participants

Participants were adult gastroenterology and general surgical residents, fellows, and attending physicians from 6 Canadian academic hospitals. Based on predefined case-volume criteria, novice (performed <50 previous colonoscopies), intermediate (performed 50-500), and experienced endoscopists (performed >1000) were recruited to participate.

The Gastrointestinal Endoscopy Competency Assessment Tool

The GiECAT was developed using Delphi methodology, whereby 55 international endoscopy experts from 44 centers rated potential checklist (CL) and global rating scale (GRS) items during 5 iterative rounds of surveys for their importance as indicators of the competence of trainees learning to perform colonoscopy.6 The GiECAT comprises a task-specific 7-item GRS and a 19-item CL. The GRS assesses holistic aspects of colonoscopy performance by using a criterion-referenced 5-point ordinal scale with descriptive anchors reflective of the level of independence demonstrated by the endoscopist (Appendix 1, available online at www.giejournal.org). Ratings on the 7 items (technical skill, strategies for endoscope advancement, visualization of mucosa, independent procedure completion [need for assistance], knowledge of procedure, interpretation and management of findings, and patient safety) are summed to generate a score from 7 to 35, with higher scores reflecting superior performance. The CL items, which detail key procedural steps, are scored on a dichotomous scale (1 = done correctly or 0 = not done/done incorrectly) with total CL scale scores ranging from 0 to 19.
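The scoring rules just described can be sketched in a few lines of code. This is an illustrative sketch only and not the authors' software: the function name `giecat_scores` and the input layout (a list of 7 GRS ratings and a list of 19 CL marks) are assumptions made for the example, and the total is taken to be the simple sum of the GRS and CL scores, which is consistent with the maximum possible total of 54 reported in Table 4.

```python
def giecat_scores(grs_ratings, cl_items):
    """Return (GRS score, CL score, total score) for one observed colonoscopy.

    grs_ratings: 7 integers, each on the 1-5 ordinal scale.
    cl_items: 19 integers, each 1 (done correctly) or 0 (not done/done incorrectly).
    Both argument names are hypothetical; they are not taken from the GiECAT form.
    """
    if len(grs_ratings) != 7 or any(r not in (1, 2, 3, 4, 5) for r in grs_ratings):
        raise ValueError("expected 7 GRS ratings, each from 1 to 5")
    if len(cl_items) != 19 or any(c not in (0, 1) for c in cl_items):
        raise ValueError("expected 19 checklist items, each 0 or 1")
    grs = sum(grs_ratings)  # ranges from 7 to 35
    cl = sum(cl_items)      # ranges from 0 to 19
    # Assumption: total = GRS + CL (maximum 35 + 19 = 54, as in Table 4).
    return grs, cl, grs + cl


# A procedure rated at the top of every GRS item with every checklist step done.
print(giecat_scores([5] * 7, [1] * 19))
```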

Data collection

Endoscopist participants were assessed in real time performing 2 clinical colonoscopies less than 10 days apart. Procedures on patients with a history of colon or rectal resection were excluded. Sedation, monitoring, and other operating procedures were carried out according to standard practice at the respective endoscopy unit. Novice and intermediate endoscopists were gastroenterology and general surgery trainees who were supervised by an attending endoscopist who provided verbal and/or practical assistance according to usual practice. Performance during each colonoscopy was assessed by an experienced attending endoscopist by using the GiECAT. Assessors varied across endoscopists. Assessors were instructed to read the item descriptions and use the full range of responses, but no formal rater training was provided. Using 5-point ordinal scales, assessors were also asked to provide an overall physician global assessment (PGA) of endoscopic competence and an overall global assessment of the endoscopist’s technical, cognitive, and integrative skills. A subset of procedures was rated independently by a second trained observer to allow for determination of interrater reliability. Demographic data were collected from all endoscopist participants including level of training (if applicable), hand dominance, sex, colonoscopy experience, and estimated cecal and terminal ileal intubation rates (based on their last 20 colonoscopies).

Reliability analysis

Reliability refers to the overall consistency of a measure. Interrater reliability refers to the level of agreement achieved by independent assessors rating the same performance.14 Interrater reliability for the GiECAT was established by comparing the total, GRS, and CL scores assigned independently by the attending endoscopists and trained observers by using intraclass correlation coefficient (ICC) model 1 (1-way random-effects model for both single measures [individual rater] and average measures [the average of 2 raters’ scores]).17 Given that participants were assessed by different sets of randomly selected evaluators, this model ensures generalizability of the results to similar raters.17

Test-retest reliability reflects the degree to which an assessment measure produces consistent results when completed by the same rater (single measures ICC) or group of raters (average measures ICC) on 2 different occasions under similar conditions.17 Test-retest reliability was determined by comparing the GiECAT total score and GRS and CL scores assigned for an endoscopist’s first and second colonoscopies by using ICC model 2 (2-way random-effects model [participant by procedure, single measure], absolute agreement definition).17,18
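For readers less familiar with the ICC forms named above, the following is a minimal sketch of how ICC(1,1), ICC(1,k), and ICC(2,1) can be computed from analysis-of-variance mean squares. It is not the software used in this study. The function names are hypothetical, the code assumes a complete table with one row per endoscopist and one column per rating (rater or occasion), and the example ratings are illustrative values rather than study data.

```python
def _mean_squares(scores):
    """ANOVA mean squares for an n-subjects x k-ratings table (list of equal-length rows)."""
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ms_rows = ss_rows / (n - 1)                       # between subjects
    ms_cols = ss_cols / (k - 1)                       # between raters or occasions
    ms_within = (ss_total - ss_rows) / (n * (k - 1))  # within subjects (1-way model)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))  # residual (2-way model)
    return n, k, ms_rows, ms_cols, ms_within, ms_error


def icc_1_1(scores):
    """1-way random effects, single measure."""
    _, k, msr, _, msw, _ = _mean_squares(scores)
    return (msr - msw) / (msr + (k - 1) * msw)


def icc_1_k(scores):
    """1-way random effects, average of the k ratings."""
    _, _, msr, _, msw, _ = _mean_squares(scores)
    return (msr - msw) / msr


def icc_2_1(scores):
    """2-way random effects, absolute agreement, single measure."""
    n, k, msr, msc, _, mse = _mean_squares(scores)
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Illustrative ratings only: 6 subjects (rows) each rated 4 times (columns).
ratings = [[9, 2, 5, 8], [6, 1, 3, 2], [8, 4, 6, 8],
           [7, 1, 2, 6], [10, 5, 6, 9], [6, 2, 4, 7]]
print(round(icc_1_1(ratings), 2), round(icc_1_k(ratings), 2), round(icc_2_1(ratings), 2))
```

In this illustrative table the columns differ systematically, so the single-measure 1-way coefficient is low; averaging ratings raises the coefficient, which is the same pattern seen between the single-measure and average-measure columns of Table 2.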


Validity analysis

To avoid bias, validity analysis was based on data from each endoscopist’s first procedure because not all participants performed 2 colonoscopies.

Internal structure validity evidence of the GiECAT was assessed by using item-total correlations, interitem correlations, and internal consistency (by using Cronbach’s α and Kuder-Richardson formula 20 for the GRS and CL, respectively).19 Additionally, endoscopists’ total combined score for the technical (GRS items 1-4,7; CL items 5-12), cognitive (GRS item 5, CL items 1,3,13,14,16), and integrative (GRS items 6,7; CL items 1-3,15-19) GiECAT items were compared with their respective overall physician global ratings of technical, cognitive, and integrative skills by using Pearson’s correlation coefficient. Finally, the relationship between GRS and CL scores was examined by using Pearson’s correlation coefficient.

To evaluate discriminative validity, which reflects whether an assessment measure adequately detects differences between groups hypothesized to score differently, the GiECAT total, GRS, and CL scores of novice, intermediate, and experienced endoscopists were compared by using separate Kruskal-Wallis tests with post hoc Mann-Whitney U pairwise comparisons. Additionally, receiver-operating characteristic curves were used to examine the ability of the GiECAT to differentiate between endoscopists who were assigned a PGA score reflective of competence (4 or 5) compared with those who were considered not yet competent (PGA score of 1, 2, or 3).

Concurrent validity refers to the degree to which an assessment relates to other measures of the same construct.10 Evidence of concurrent validity of the GiECAT was examined by using correlation analysis with Spearman’s correlation coefficient to examine the relationship between total, GRS, and CL scores with colonoscopy experience, cecal intubation rate, terminal ileal intubation rate, and PGA of skill.
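The two internal consistency coefficients named in this section can be illustrated with a short sketch. This is a generic implementation of the standard formulas under stated assumptions, not the analysis code used in the study: the function names are hypothetical, each input row holds one endoscopist's item scores, and population variances are used throughout (the coefficients are unchanged if sample variances are used consistently instead). The example data are invented for illustration.

```python
def _pvariance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores has one row per person and one column per item."""
    k = len(item_scores[0])
    item_vars = [_pvariance([row[j] for row in item_scores]) for j in range(k)]
    total_var = _pvariance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def kr20(binary_scores):
    """Kuder-Richardson formula 20 for items scored 0 or 1."""
    n, k = len(binary_scores), len(binary_scores[0])
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in binary_scores) / n  # proportion scoring 1 on item j
        pq_sum += p * (1 - p)
    total_var = _pvariance([sum(row) for row in binary_scores])
    return k / (k - 1) * (1 - pq_sum / total_var)


# Invented checklist-style data: 4 people, 3 dichotomous items.
checklist = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(checklist), cronbach_alpha(checklist))
```

On dichotomous items the two functions return the same value, which is why Kuder-Richardson formula 20 is commonly described as the special case of Cronbach's α for checklist-type data.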

Educational usefulness

Ease of use was evaluated by asking the assessors to rate how easy it was for them to rate the endoscopist’s performance by using the GiECAT on an ordinal scale of 1 (extremely easy) to 5 (extremely difficult).

RESULTS

Endoscopist characteristics

Sixty-one endoscopists from 6 Canadian hospitals participated, including 33 novice, 18 intermediate, and 10 experienced endoscopists. Data were collected for a total of 116 colonoscopy procedures. The characteristics of the endoscopists are shown in Table 1.


TABLE 1. Endoscopist participant characteristics

Demographic characteristic | % (no.) of cohort
Discipline | Gastroenterology, 52.5 (32); general surgery, 47.5 (29)
Sex | Male, 63.9 (39)
Hand dominance | Right, 93.4 (57)
No. of years performing colonoscopy | <1 y, 73.8 (45); 1-5 y, 14.7 (9); >5 y, 11.5 (7)

Reliability analysis

Interrater reliability for the 2 raters (39 procedures assessed) was good and above the cutoff of 0.75 generally considered indicative of good reliability in educational measurement (Table 2).20 The mean time between an endoscopist’s first and second colonoscopy procedures was 0.44 ± 1.64 days. The test-retest reliability of the GiECAT was above the acceptable threshold of 0.8 (Table 2).18,21

Validity analysis

Internal consistency of the GiECAT GRS was 0.98. α values remained consistent when tested against deleting each GRS item (α values of 0.97 to 0.98). Interitem and item-total correlations of the GRS are outlined in Table 3 as well as the correlation between each item and the overall PGA rating assigned. The internal consistency of the GiECAT CL was 0.91 and α values ranged from 0.89 to 0.91 when each CL item was deleted. The total combined technical, cognitive, and integrative item scores correlated strongly with their corresponding overall physician global ratings of technical, cognitive, and integrative skills, with correlation values of 0.95, 0.82, and 0.82, respectively (P < .001). Additionally, there was a significant positive correlation between GRS and CL scores (r = 0.82, P < .001).

Analysis showed a significant main effect of group (novice, intermediate, experienced) for total GiECAT scores (Kruskal-Wallis statistic = 43.392, P < .001, η² = 0.71 [Table 4]). Post hoc analysis indicated that the experienced endoscopists scored significantly higher than intermediates (P < .001) who scored higher than novices (P < .001). There was also a significant increase in GRS scores with level of expertise (Kruskal-Wallis statistic = 41.652, P < .001, η² = 0.69 [Table 4]). Post hoc analysis showed that GiECAT GRS scores differed significantly between each of the 3 groups (P < .001). Additionally, there was a significant main effect of group for GiECAT CL scale scores (Kruskal-Wallis statistic = 37.602, P < .001, η² = 0.63 [Table 4]). Once again, post hoc planned comparisons revealed that scores were significantly higher among experienced endoscopists compared with intermediates (P = .029), who scored higher than novices (P < .001).

Comparison of endoscopists who were assigned PGA scores reflecting competence (PGA scores of 4 or 5) versus noncompetence (PGA scores of 1, 2, or 3) revealed areas under the receiver-operating characteristic curve for GiECAT total, GRS, and CL scores of 0.98 (95% confidence interval, 0.95-1.00), 0.98 (95% confidence interval, 0.95-1.00), and 0.91 (95% confidence interval, 0.83-0.98), respectively. A large area under the receiver-operating characteristic curve indicates strong discriminatory performance of the GiECAT.

Concurrent validity analysis revealed a significant positive correlation (P < .001) among GiECAT total, GRS, and CL scores and the number of previous colonoscopies, the cecal intubation rate, the ileal intubation rate, and the PGA of skill (Table 5).

Educational usefulness

Assessors’ mean rating of ease of use was 1.70 ± 0.48, with a score of 1 reflecting extremely easy and a score of 2 reflecting fairly easy.

DISCUSSION

This study establishes evidence supporting the quality of the GiECAT as an assessment instrument for use in the authentic clinical context in a formative manner throughout training.13 The study findings are highlighted in the following as they relate to the concepts of reliability, validity, and educational usefulness as outlined by the Accreditation Council for Graduate Medical Education Advisory Committee on Educational Outcome Assessment’s framework for evaluating the quality of an assessment measure.13

Reliability

Evidence of both the test-retest and interrater reliability of the GiECAT was examined. The test-retest reliability of the GiECAT was above the acceptable threshold of 0.8, despite potential variation in procedural difficulty across patients.18,21 The GiECAT total, CL, and GRS scores all demonstrated interrater reliability above 0.75, the level that is considered to indicate good reliability for formative educational assessment measures.20 Interrater reliability for a single assessor, however, did not exceed 0.9, which is traditionally required for high-stakes assessments such as credentialing examinations.22 The current study was designed to assess the utility of the GiECAT for use during training; therefore, the assessors were intentionally not trained or calibrated to reflect the typical training context, and strict patient and procedure inclusion criteria were not used. These factors may have potentially acted to decrease reliability. The use of varied assessors across endoscopists also likely reduced interrater reliability compared with a study design using the same 2 raters for all endoscopists. Before use in high-stakes assessment, strategies that have been shown to be effective in improving reliability, such as rater training, calibration, and practice, could be implemented.23 Development of a user guide and practice assessment videos could also be used to increase rater consistency. Furthermore, patient variability could be reduced through implementation of strict patient inclusion criteria (eg, exclusion of patients with a previous difficult endoscopy, standardized procedural indications) and improved standardization of testing conditions (eg, outpatient endoscopy procedures only, with exclusion of inpatient or emergency colonoscopies).

TABLE 2. GiECAT interrater and test-retest reliability coefficients

Values are ICC (95% CI).

Component of GiECAT | Interrater reliability, ICC1,1 single measure | Interrater reliability, ICC1,1 average measure (2 raters) | Test-retest reliability, ICC2,1 single measure | Test-retest reliability, ICC2,1 average measure (2 raters)
Total GiECAT score* | 0.85 (0.73-0.92) | 0.92 (0.84-0.96) | 0.91 (0.85-0.95) | 0.95 (0.92-0.97)
Global rating scale score* | 0.85 (0.73-0.92) | 0.92 (0.85-0.96) | 0.93 (0.88-0.96) | 0.96 (0.94-0.98)
Checklist score* | 0.81 (0.67-0.90) | 0.90 (0.80-0.95) | 0.80 (0.68-0.88) | 0.88 (0.81-0.94)

GiECAT, Gastrointestinal Endoscopy Competency Assessment Tool; ICC1,1, intraclass correlation coefficient model 1; ICC2,1, intraclass correlation coefficient model 2; CI, confidence interval. *Correlations significant (P < .001).

TABLE 3. GiECAT global rating scale interitem and item-total correlations

Global rating scale dimension | TS | SA | VM | PC | K | IMF | Item-total correlation | Correlation with physician global assessment of skill
Technical skill | | | | | | | 0.95 | 0.91
Strategies for endoscope advancement | 0.95 | | | | | | 0.95 | 0.91
Visualization of mucosa | 0.93 | 0.93 | | | | | 0.95 | 0.92
Independent procedure completion | 0.90 | 0.88 | 0.91 | | | | 0.92 | 0.93
Knowledge of procedure | 0.92 | 0.94 | 0.91 | 0.89 | | | 0.95 | 0.85
Interpretation and management of findings | 0.83 | 0.86 | 0.86 | 0.86 | 0.91 | | 0.90 | 0.90
Patient safety | 0.79 | 0.79 | 0.81 | 0.78 | 0.79 | 0.79 | 0.83 | 0.77

GiECAT, Gastrointestinal Endoscopy Competency Assessment Tool; IMF, interpretation and management of findings; K, knowledge of procedure; PC, independent procedure completion; PS, patient safety; SA, strategies for endoscope advancement; TS, technical skills; VM, visualization of the mucosa.

TABLE 4. Comparison of scores between novice, intermediate, and experienced endoscopists for GiECAT total, global rating scale, and checklist scores

GiECAT scale component | Novice | Intermediate | Experienced | P value* | Maximum possible score
Total GiECAT score† | 25.00 (19.0-32.0) | 43.00 (39.75-46.25) | 53.5 (53.00-54.00) | <.001 | 54
Global rating scale score† | 14.00 (11.00-20.00) | 26.00 (21.75-28.00) | 35.00 (34.00-35.00) | <.001 | 35
Checklist score† | 10.00 (7.00-13.00) | 17.5 (14.75-19.00) | 19.00 (18.75-19.00) | <.001 | 19

GiECAT, Gastrointestinal Endoscopy Competency Assessment Tool. Scores reported as median (interquartile range). *Significant differences between groups (P < .001). Comparisons were carried out by using the Kruskal-Wallis test of ranks. †Post hoc Mann-Whitney U pairwise comparisons showed significant differences between novice, intermediate, and experienced endoscopists (P < .03).

TABLE 5. Concurrent validity of GiECAT total, global rating scale, and checklist components

Values are correlation coefficients (Spearman’s ρ).

GiECAT scale component | No. of previous colonoscopies | Cecal intubation rate | Terminal ileal intubation rate | Physician global assessment of skill
Total GiECAT score* | 0.78 | 0.81 | 0.82 | 0.90
Global rating scale score* | 0.80 | 0.82 | 0.82 | 0.94
Checklist score* | 0.71 | 0.75 | 0.77 | 0.77

GiECAT, Gastrointestinal Endoscopy Competency Assessment Tool. *Correlations significant (P < .001).

Validity

Content validity evidence in this context entails establishing that the GiECAT adequately captures the target construct of “competence in performing colonoscopy” that the tool is intended to assess.10,24 Content-related validity evidence of the GiECAT is provided by the comprehensive and methodical approach to instrument development that was used, including use of an a priori conceptual framework of endoscopic competence, a systematic literature review, incorporation of existing assessment tools, and the use of Delphi methodology to systematically harness expert professional judgment.6

Response process validity evidence of performance-based assessments includes measures taken to help ensure the accuracy and integrity of recorded data.15 For the GiECAT, the Delphi panel provided feedback on the wording of CL and global rating items to optimize interpretability and ease of use. Additionally, the scoring framework and anchors for the CL and GRS were developed based on a thorough literature review and Delphi panel feedback to help ensure clarity and enhance scoring reproducibility. Criterion referencing, which compares trainees’ performance with predetermined absolute standards, was used because it is known to facilitate consistent judgments and is therefore the preferred method of referencing for workplace-based assessments.1

Evidence of internal structure validity includes data evaluating the relationship among individual assessment items and how the items relate to the underlying assessment construct.19 Strong reliability, as discussed previously, is reflective of internal structure validity evidence. Validity evidence of internal structure is also supported by the high internal consistency and interitem and item-total correlations of the GiECAT CL and GRS. Additionally, internal consistency of the CL and GRS was found to remain relatively constant when tested against deleting each item, indicating that all items contribute meaningfully to the overall assessment. Furthermore, a strong correlation was found between the total combined technical, cognitive, and integrative item scores and their corresponding overall scores of technical, cognitive, and integrative skill, indicating that these items accurately reflect their respective competency domains.

Validity evidence of associations with other variables centers on the association between assessment scores and other external variables that have a hypothesized theoretical relationship with the assessment construct that the tool is intended to measure.14,19 Supportive of this, the GiECAT was able to discriminate between novice, intermediate, and experienced endoscopists (discriminative validity). Further validity evidence was provided by the strong ability of the GiECAT to discriminate between endoscopists who were assigned a PGA rating reflective of “competence” as opposed to those who were rated as being “not yet competent.” Additionally, there were high correlations between GiECAT scores and other measures hypothesized to reflect “endoscopic competence” including self-reported colonoscopy experience, cecal and terminal ileal intubation rates, and PGA of skill (concurrent validity).
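The two rank-based statistics behind the associations described above, Spearman's correlation coefficient and the area under the receiver-operating characteristic curve, can be sketched as follows. This is a generic illustration under stated assumptions and not the study's analysis code: the function names are hypothetical, the area under the curve is obtained from the Mann-Whitney rank-sum identity (with ties counted as one half), and the example scores are invented.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of sorted positions i..j, converted to 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def roc_auc(scores, is_competent):
    """Probability that a randomly chosen competent endoscopist outscores a
    randomly chosen not-yet-competent one (ties count as one half)."""
    ranks = average_ranks(scores)
    n_pos = sum(1 for c in is_competent if c)
    n_neg = len(scores) - n_pos
    rank_sum = sum(r for r, c in zip(ranks, is_competent) if c)
    u = rank_sum - n_pos * (n_pos + 1) / 2  # Mann-Whitney U for the competent group
    return u / (n_pos * n_neg)


# Invented total scores with a hypothetical competent / not-yet-competent label.
totals = [20, 45, 41, 30, 50, 53]
competent = [False, False, False, True, True, True]
print(roc_auc(totals, competent))
```

An area of 1.0 would mean that every endoscopist rated competent scored higher than every endoscopist rated not yet competent, whereas 0.5 would indicate no discrimination.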

Finally, consequences validity evidence, which involves evaluating the intended or unintended impact of the assessment and the resultant decisions and actions, was not explicitly examined due to the requirement for longitudinal data.19

Educational usefulness

Ease of use and interpretation are essential for workplace-based assessments to facilitate adoption and ensure feasibility for use within the clinical training environment.25 The GiECAT takes 2 to 5 minutes to complete, is inexpensive to administer, and can be completed by an individual assessor without additional resources, thus providing evidence of the tool’s ease of use.13 Additionally, it has a user-friendly and logical structure and achieves an acceptably high level of interrater reliability for formative low-stakes assessment with minimal rater training. Furthermore, assessors using the tool in the context of this study rated its mean ease of use as 1.7 ± 0.48 out of 5, reflecting excellent usability. Ease of interpretation is evidenced by the GiECAT’s simple scoring system and the interpretability of total scores.13 Longitudinal studies are planned to assess the educational impact of the GiECAT on an individual learner’s performance and endoscopy curricula.

Limitations

This study has several important limitations. First, it was not feasible to blind the assessors because of the clinical nature of the assessments. Even though the CL and GRS use well-defined scoring frameworks and criterion-referenced anchors, the assessors may have been influenced by their knowledge of the endoscopists’ level of experience, potentially leading to assessor bias. To address this limitation, future work is planned to examine the psychometric properties of the GiECAT in the context of blinded video-based ratings and to determine how real-time and video-based assessments compare. Second, endoscopists’ awareness of being assessed may have generated performance-altering anxiety that may have resulted in appraisals that did not reflect endoscopists’ true performance abilities.26 Third, the inherently subjective nature of the GiECAT assessment tool may be viewed by some as a limitation. There has been a long debate in the field of health professions education regarding the utility of subjectivity in assessment.27 Rigorously developed and validated direct observational assessment tools, such as the GiECAT, are currently considered one of the most accepted assessment methods to support competency-based curricula and outcomes-based assessment and are now considered to be the criterion standard for assessment of many surgical procedures.28,29 Last, as with all studies that examine psychometric properties of assessment tools, the results of our study are context specific, relating to use of the GiECAT in the context of low-stakes formative assessment within the clinical environment. Before use in high-stakes summative assessment, research is required to investigate the suitability of the GiECAT in this context and the effects of more extensive rater training to determine whether an acceptably high reliability greater than 0.9 can be achieved with a single rater.20,22 Additionally, further studies are required to determine the utility of the GiECAT for use in the context of simulation-based endoscopy training.

CONCLUSIONS AND FUTURE DIRECTIONS

Workplace-based assessment, which entails assessment of colonoscopy skills in the clinical environment, is ideal because it enables the assessment of endoscopists performing procedures in routine clinical practice and allows the assessment of all competencies required to perform endoscopy procedures in an integrated manner, 2 factors that are known to facilitate learning.1 Rigorous workplace-based assessment requires use of educationally useful, valid, and reliable direct observational tools, such as the GiECAT. Although there are other potential markers of skill in performing colonoscopy, such as independent cecal intubation rate, most do not provide trainees with targeted feedback or afford a detailed authentic representation of the way in which a trainee functions in the clinical setting where there are many potential influences on performance.30 Integration of the GiECAT into endoscopy training programs would facilitate the provision of timely, meaningful, and task-specific performance feedback to guide and support trainees’ learning. It would also provide endoscopy educators with a framework for teaching. Additionally, longitudinal use would allow for the production of assessment data that could be used to gauge trainees’ progress toward defined outcomes within a competency-based curriculum, guide improvement through the identification of specific strengths and deficits, and help identify learners in difficulty.

This study provides reliability and validity evidence of the GiECAT for use as a formative assessment tool in the clinical setting. To date, there have been a number of direct observational assessment tools developed for endoscopy; however, evidence supporting the reliability and validity of these tools for use during training is limited aside from the Mayo Colonoscopy Skills Assessment Tool.31,32 The GiECAT is unique in that it was specifically designed to reflect practice across institutions; addresses performance of pre-, intra-, and postprocedural elements of colonoscopy; allows for the integrated assessment of technical, cognitive, and integrative competencies required for competent endoscopic performance; and is capable of formative longitudinal assessment throughout the training continuum.

We are currently working to integrate the GiECAT into an online point-of-care endoscopy-tracking platform33 that would allow trainers to complete GiECAT assessments in real time and permit trainees to capture endoscopic quality indicators such as polyp detection rate, cecal and terminal ileal intubation rates, and withdrawal time.34 Data collected from large numbers of endoscopy trainees could then be used to generate an average learning curve of GiECAT scores based on aggregated data. This will allow for the determination of specific milestones for endoscopists at different levels of training and the establishment of minimal performance-based benchmark criteria for competence based on GiECAT scores. Additionally, establishment of a learning curve based on aggregated data would allow program directors to track trainees’ progress, pinpoint trainees’ skill deficits to aid in the development of personalized learning plans, and facilitate identification of trainees requiring remedial attention.

Endoscopy training programs aim to equip learners with the technical, cognitive, and integrative competencies required to perform high-quality colonoscopy procedures independently, safely, and effectively. Integration of the GiECAT into training programs would enable assessments to be carried out in a structured and rigorous manner in an authentic clinical context, with sampling across cases and evaluators. Furthermore, it would provide a framework for teaching and feedback provision and ultimately reinforce trainees’ learning of desired competencies because assessment is known to drive learning.35
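The idea of an average learning curve built from aggregated data can be sketched as follows. This is a minimal illustration under stated assumptions, not the tracking platform's method: the record layout (trainee identifier, number of prior procedures, GiECAT total score), the function name `average_learning_curve`, the bin width, and the example scores are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_learning_curve(records, bin_width=50):
    """Average score per experience bin.

    records: iterable of (trainee_id, prior_procedures, score) tuples
             (a hypothetical layout chosen for this sketch).
    Returns a dict mapping the lower bound of each experience bin to the
    mean score of all assessments that fall in that bin.
    """
    bins = defaultdict(list)
    for _trainee_id, prior_procedures, score in records:
        bin_start = (prior_procedures // bin_width) * bin_width
        bins[bin_start].append(score)
    return {start: mean(scores) for start, scores in sorted(bins.items())}


# Made-up example data; the score values are arbitrary placeholders.
example_records = [
    ("trainee_a", 10, 40.0),
    ("trainee_b", 30, 50.0),
    ("trainee_a", 60, 70.0),
]
curve = average_learning_curve(example_records)
```

Plotting the resulting bin averages against experience would give the aggregate curve against which an individual trainee's scores could be compared. A real implementation would likely need to account for repeated measurements per trainee, which this sketch ignores.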

ACKNOWLEDGMENTS

The authors thank Drs Brian Hodges and Dorcas Beaton for their insightful comments.

REFERENCES

1. Beard JD, Marriott J, Purdie H, et al. Assessing the surgical skills of trainees in the operating theatre: a prospective observational study of the methodology. Health Technol Assess 2011;15:i-xxi, 1-162.
2. Donato AA. Direct observation of residents: a model for an assessment system. Am J Med 2014;127:455-60.
3. Beard J. Workplace-based assessment: the need for continued evaluation and refinement. Surgeon 2011;9(Suppl 1):S12-3.
4. Miller GE. The assessment of clinical skills/competence/performance. Acad Med 1990;65(9 Suppl):S63-7.
5. Van der Vleuten CPM, Schuwirth LWT. Assessing professional competence: from methods to programmes. Med Educ 2005;39:309-17.
6. Walsh CM, Ling SC, Khanna N, et al. Gastrointestinal Endoscopy Competency Assessment Tool: development of a procedure-specific assessment tool for colonoscopy. Gastrointest Endosc 2014;79:798-807.e5.
7. Shute VJ. Focus on formative feedback. Rev Educ Res 2008;78:153-89.
8. Epstein RM. Assessment in medical education. N Engl J Med 2007;356:387-96.
9. Ben-David MF. The role of assessment in expanding professional horizons. Med Teach 2000;22:472-7.
10. American Educational Research Association, American Psychological Association, National Council on Measurement in Education. Standards for educational and psychological testing. Washington (DC): American Educational Research Association; 1999.
11. Kane MT. An argument-based approach to validity. Psychol Bull 1992;112:527-35.
12. Messick S. Validity. In: Educational measurement. New York (NY): American Council on Education and Macmillan; 1989. p. 13-104.
13. Swing SR, Clyman SG, Holmboe ES, et al. Advancing resident assessment in graduate medical education. J Grad Med Educ 2009;1:278-86.
14. Cook DA, Beckman TJ. Current concepts in validity and reliability for psychometric instruments: theory and application. Am J Med 2006;119:166.e7-16.
15. Downing SM. Validity: on meaningful interpretation of assessment data. Med Educ 2003;37:830-7.
16. Kane MT. Current concerns in validity theory. J Educ Meas 2001;38:319-42.
17. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull 1979;86:420-8.
18. Weir JP. Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. J Strength Cond Res 2005;19:231-40.
19. Cook DA, Zendejas B, Hamstra SJ, et al. What counts as validity evidence? Examples and prevalence in a systematic review of simulation-based assessment. Adv Health Sci Educ Theory Pract 2014;19:233-50.
20. Watkins M, Portney L. Foundations of clinical research: applications to practice. Upper Saddle River (NJ): Pearson Education, Inc; 2009.
21. Gallagher AG, Ritter EM, Satava RM. Fundamental principles of validation, and reliability: rigorous science for the assessment of surgical education and training. Surg Endosc 2003;17:1525-9.
22. Downing SM. Reliability: on the reproducibility of assessment data. Med Educ 2004;38:1006-12.
23. Wright JG, Feinstein AR. Improving the reliability of orthopaedic measurements. J Bone Joint Surg Br 1992;74:287-91.
24. Haynes SN, Richard DC, Kubany ES. Content validity in psychological assessment: a functional approach to concepts and methods. Psychol Assess 1995;7:238-47.
25. Bould MD, Crabtree NA, Naik VN. Assessment of procedural skills in anaesthesia. Br J Anaesth 2009;103:472-83.
26. Williams RG, Klamen DA, McGaghie WC. Cognitive, social and environmental sources of bias in clinical performance ratings. Teach Learn Med 2003;15:270-92.
27. Eva KW, Hodges BD. Scylla or Charybdis? Can we navigate between objectification and judgment in assessment? Med Educ 2012;46:914-9.
28. Jelovsek JE, Kow N, Diwadkar GB. Tools for the direct observation and assessment of psychomotor skills in medical trainees: a systematic review. Med Educ 2013;47:650-73.
29. Van Hove PD, Tuijthof GJM, Verdaasdonk EGG, et al. Objective assessment of technical surgical skills. Br J Surg 2010;97:972-87.
30. Davies H, Howells R. How to assess your specialist registrar. Arch Dis Child 2004;89:1089-93.
31. Sedlack RE. The Mayo Colonoscopy Skills Assessment Tool: validation of a unique instrument to assess colonoscopy skills in trainees. Gastrointest Endosc 2010;72:1125-33, 1133.e1-3.
32. Sedlack RE. Training to competency in colonoscopy: assessing and defining competency standards. Gastrointest Endosc 2011;74:355-66.e1-2.
33. Xenodemetropoulos T, Armstrong D, Tse F, et al. Resident practice audit in gastroenterology (RPAGE): an innovative approach to gastroenterology trainee evaluation and professional development [abstract]. Can J Gastroenterol 2013;27(Suppl A):A069.
34. Armstrong D, Barkun A, Bridges R, et al. Canadian Association of Gastroenterology consensus guidelines on safety and quality indicators in endoscopy. Can J Gastroenterol 2012;26:17-31.
35. Kromann CB, Jensen ML, Ringsted C. The effect of testing on skills learning. Med Educ 2009;43:21-7.

Gastroenterology, St. Michael’s Hospital, Toronto, Ontario, Canada (3), Division of Gastroenterology, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada (4), Division of Gastroenterology, Mount Sinai Hospital, Toronto, Ontario, Canada (5), Cancer Care Ontario, Toronto, Ontario, Canada (6), Faculty of Health Sciences, Queen’s University, Kingston, Ontario, Canada (7), Departments of Paediatrics (8) and Medicine (9) and the Wilson Centre (10), University of Toronto, Toronto, Ontario, Canada, and School of Human Kinetics and Recreation, Memorial University of Newfoundland, St. John’s, Newfoundland, Canada (11).

The abstract of an earlier version of this article was presented as an oral presentation at the 2014 Canadian Digestive Diseases Week Conference.

Reprint requests: Catharine M. Walsh, MD, MEd, PhD, FRCPC, Hospital for Sick Children, Division of Gastroenterology, Hepatology, and Nutrition, 555 University Ave., Room 8417, Black Wing, Toronto, Ontario M5G 1X8, Canada.

If you would like to chat with an author of this article, you may contact Dr Walsh at [email protected].

APPENDIX: Gastrointestinal Endoscopy Competency Assessment Tool (GiECAT)

1. Using the scale provided, please rate the candidate’s performance on the following global rating items:

Please note: A score of 4 should be assigned to those individuals who are competent to perform the tasks independently, without the need for supervision (ie, do not take the candidate’s level of training into account when assigning a score).

SCALE
1 Unable to achieve tasks despite significant verbal and/or hands-on guidance
2 Achieves some of the tasks but requires significant verbal and/or hands-on guidance
3 Achieves most of the tasks independently, with minimal verbal and/or manual guidance
4 Competent for independent performance of all tasks
5 Highly skilled performance of all tasks

Global rating item (a Score is recorded for each)

TECHNICAL SKILL: Demonstrates an ability to manipulate the endoscope by using torque steering, angulation control knobs, and advancement/withdrawal for effective navigation of the gastrointestinal tract.

STRATEGIES FOR ENDOSCOPE ADVANCEMENT: Demonstrates an ability to use loop reduction, insufflation, pullback, suction, external pressure, and patient position change to advance the endoscope independently, expediently and safely.

VISUALIZATION OF MUCOSA: Demonstrates an ability to achieve a clear luminal view required for safe endoscope navigation and complete mucosal evaluation, including good visualization around corners and folds and appropriate use of mucosal cleaning techniques (eg, lavage, suction).

INDEPENDENT PROCEDURE COMPLETION (NEED FOR ASSISTANCE): Demonstrates an ability to complete the endoscopic procedure expediently and safely without verbal and/or manual guidance.

KNOWLEDGE OF PROCEDURE: Demonstrates general procedural knowledge including indications and contraindications, potential complications, endoscopy techniques, equipment maintenance, and troubleshooting.

INTERPRETATION AND MANAGEMENT OF FINDINGS: Demonstrates an ability to accurately identify and interpret pathology and/or procedural complications and form an appropriate management plan.

PATIENT SAFETY: Demonstrates an ability to perform the procedure in a manner that minimizes patient risk and assures optimal patient safety (eg, atraumatic technique, minimal force, minimal red-out, recognition of personal and procedural limitations, safe sedation practices, and appropriate communication).

2. Please rate the candidate’s performance based on the following checklist items (check the appropriate box):

Response columns: Not done or done incorrectly | Done correctly | Not observed

Checklist item

PRE-PROCEDURE

1. Reviews relevant patient information (health records, relevant investigations) and obtains history as appropriate (indications, contraindications, medical history, medications, allergies). ☐ ☐ ☐

2. Takes action in response to patient history and investigations where appropriate (eg, prophylactic antibiotics, anesthetic risk factors, etc.). ☐ ☐ ☐

3. Demonstrates a sound knowledge of the indications and contraindications to colonoscopy, its benefits and risks, potential alternative investigations and/or therapies, and an awareness of the sequelae of endoscopic or nonendoscopic management. ☐ ☐ ☐

4. Explains to the patient and/or caregivers the perioperative process and procedure (likely outcome, time to recovery, benefits, potential risks/complications, and rates), checks for understanding, and addresses concerns and questions. ☐ ☐ ☐

PROCEDURE: TECHNICAL

5. Recognizes loop formation and avoids or reduces appropriately during the procedure (using pullback, torque, external pressure, patient position change). ☐ ☐

6. Uses rotation and/or torque appropriately. ☐ ☐

7. Uses withdrawal (as an advancement strategy) appropriately. ☐ ☐

8. Uses abdominal pressure and changes in patient position appropriately to aid endoscope advancement. ☐ ☐

9. Advances to the cecum (in an appropriate time). ☐ ☐

10. Withdraws from the cecum/terminal ileum to the rectum in an appropriate time (>6 minutes). ☐ ☐

11. Withdraws the endoscope in a controlled manner. ☐ ☐

12. Performs therapeutic maneuvers (biopsy and/or polypectomy) independently, appropriately, and safely. ☐ ☐

PROCEDURE: COGNITIVE

13. Demonstrates recognition of anatomic landmarks (eg, rectum, flexures, ileocecal valve, appendiceal orifice) and/or incomplete examination. ☐ ☐

14. Demonstrates recognition of pathological and anatomic abnormalities. ☐ ☐

15. Describes findings accurately, interprets abnormalities in the context of the patient, and selects the appropriate strategy/technique to deal with them. ☐ ☐

PROCEDURE: NONTECHNICAL

16. Administers sedation appropriately (type, dose), monitors the patient’s vital signs and comfort level throughout the procedure, and responds appropriately AND/OR demonstrates appropriate interaction with the anesthetist to ensure appropriate sedation and monitoring throughout the procedure. ☐ ☐

17. Demonstrates appropriate interaction and communication with the procedure nurses and/or assistants throughout the procedure. ☐ ☐

POST-PROCEDURE

18. Educates the patient and/or caregiver about the colonoscopic findings (explanation, significance) and follow-up plan and provides advice regarding potential postprocedure complications, recommended course of action, etc. ☐ ☐ ☐

19. Appropriate and timely documentation of procedure (written/dictated/EMR). ☐ ☐ ☐

3. Please indicate whether a lapse in professionalism occurred:

☐ No
☐ Minor lapse (inadvertent and/or did not cause any substantial harm)
☐ Major lapse (evidence of full knowledge that this action was not right and/or the lapse does cause harm)

4. ASSESSOR GLOBAL ASSESSMENT

Please provide your expert global assessment of the endoscopist’s skill level independent of the above checklist and global ratings.

☐ Requires significant guidance with all aspects of performing colonoscopy
☐ Able to perform a colonoscopy under supervision, but requires guidance with many aspects of the procedure
☐ Able to perform a colonoscopy independently, under supervision, with minimal guidance
☐ Competent to perform a colonoscopy independently, safely, and expediently without the need for supervision
☐ Highly skilled advanced ability to perform a colonoscopy independently with optimal efficiency and safety
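For readers who want to record completed forms electronically, the structure of the tool (7 global rating items scored 1 to 5 and 19 checklist items) can be tallied as sketched below. This is an assumption-laden illustration: the appendix does not specify how a total score is computed, so the function `tally_giecat`, the response labels, and the choice to report simple counts are hypothetical and should not be read as the published scoring method.

```python
GRS_ITEM_COUNT = 7         # global rating scale items, each scored 1 to 5
CHECKLIST_ITEM_COUNT = 19  # checklist items

# Hypothetical labels for the three checklist response columns.
DONE_CORRECTLY = "done_correctly"
NOT_DONE_OR_INCORRECT = "not_done_or_incorrect"
NOT_OBSERVED = "not_observed"
VALID_RESPONSES = {DONE_CORRECTLY, NOT_DONE_OR_INCORRECT, NOT_OBSERVED}


def tally_giecat(grs_scores, checklist_responses):
    """Validate one completed form and return simple summary counts."""
    if len(grs_scores) != GRS_ITEM_COUNT:
        raise ValueError("expected 7 global rating scores")
    if any(score not in range(1, 6) for score in grs_scores):
        raise ValueError("global rating scores must be integers from 1 to 5")
    if len(checklist_responses) != CHECKLIST_ITEM_COUNT:
        raise ValueError("expected 19 checklist responses")
    if any(resp not in VALID_RESPONSES for resp in checklist_responses):
        raise ValueError("unknown checklist response")

    # Items marked "not observed" are excluded from the observed count.
    observed = [r for r in checklist_responses if r != NOT_OBSERVED]
    return {
        "grs_total": sum(grs_scores),
        "checklist_done_correctly": sum(r == DONE_CORRECTLY for r in observed),
        "checklist_observed": len(observed),
    }


# Made-up example form: every global rating item scored 4.
example = tally_giecat(
    [4] * 7,
    [DONE_CORRECTLY] * 17 + [NOT_OBSERVED, NOT_DONE_OR_INCORRECT],
)
```

Keeping "not observed" separate from "not done or done incorrectly" preserves the distinction the form itself draws between the two.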