RESEARCH ARTICLE
The Relationship Between Formative and Summative Examination and PANCE Results: A Multi-Program Study
Scott L. Massey, PhD, PA-C; Louise Lee, MHA, PA-C; Sonia Young, PT, DPT; David Holmerud, BA
Purpose: This correlational research study was conducted to externally validate a previous pilot study that examined the relationship between formative and summative examinations and PANCE scores at one physician assistant (PA) program. The study was expanded to five additional PA programs to test the hypothesis that PA programs throughout the country could use comprehensive examinations that mirrored the Physician Assistant National Certifying Exam (PANCE), along with Physician Assistant Clinical Knowledge Rating and Assessment Tool (PACKRAT) scores, to successfully predict future PANCE performance. Methods: The six participating PA programs administered the SUMM I, a 360-question PANCE simulation exam based upon the National Commission on Certification of Physician Assistants (NCCPA) blueprint; the SUMM II, a 700-question examination based upon the NCCPA blueprint; and the PACKRAT examination to all second-year students. Participating programs were provided cut scores for each of the test instruments that identified students at critical risk for failing the PANCE. Results: The scores of the study participants from the originating institution were gathered between 2007 and 2010 and from five additional programs in 2011–2012 to determine correlational strength between the instruments and the PANCE. The SUMM I, SUMM II, and PACKRAT scores (N = 421) were combined in a regression model to generate a predicted PANCE score. The strength of correlation between the predicted PANCE score and the actual PANCE score demonstrated an R value of 0.79 and an R² of 0.62. Discussion: This correlational study suggests that summative examination instruments can be developed and triangulated with PACKRAT scores to predict future performance on the PANCE and to determine thresholds of risk for failing the PANCE based upon stratified levels of past performance. J Physician Assist Educ 2013;24(1):24-34
Scott Massey, PhD, PA-C, is the founding program director and professor of physician assistant studies, School of Physician Assistant Studies, Misericordia University, Dallas, Pennsylvania. Louise Lee, MHA, PA-C, is an assistant director and assistant professor, Physician Assistant Program, Tufts University School of Medicine, Boston, Massachusetts. Sonia Young, PT, DPT, is the assistant dean and associate director, School of Physician Assistant Studies, University of the Cumberlands, Williamsburg, Kentucky. David Holmerud, BA, is the content developer, planning and information management, National Fire Academy, Emmitsburg, Maryland. Correspondence should be addressed to: Scott Massey, PhD, PA-C School of Physician Assistant Studies Misericordia University 301 Lake Street Dallas, Pennsylvania 18612-1090 Telephone: (570) 674-3047 Email:
[email protected]
INTRODUCTION
The identification of students at risk of failing the Physician Assistant National Certifying Exam (PANCE) is one of the largest challenges facing physician assistant (PA) educators. With the national landscape demonstrating rapid growth in PA programs, there will continue to be an increasing number of PAs entering education positions from clinical practice. The tasks of learning pedagogical methods and assessing student performance challenge experienced educators and can overwhelm young, developing faculty. This reinforces the need for a standardized practice to analyze student performance and identify at-risk students while there is still adequate time for effective intervention. This research study has attempted to define a process that can be replicated at PA programs throughout the country. The researchers chose a simple correlational approach that may help to demystify the complexity of statistically analyzing student performance using formative and summative examinations and generating useful data to identify at-risk students. This correlational study builds upon a previously published report from 2011 in which Massey et al1 demonstrated that the use of formative and summative examinations could generate accurate predicted scores that correlated strongly with PANCE results. External validation is needed to show that this process can be replicated at any PA program by demonstrating that the design and administration of valid formative and summative examinations will generate results that can be statistically analyzed to produce useful data.
Background to the Problem
In the field of PA education, with the exception of the Physician Assistant Clinical Knowledge Rating and Assessment Tool (PACKRAT), there is a lack of validated formative and summative instruments that can accurately predict future PANCE performance.
There is also a need for summative assessment instruments that can both meet the Accreditation Review Commission on Education for the Physician Assistant (ARC-PA) Standards and provide accurate information to predict student performance on the PANCE.
Purpose of the Study
The purpose of the study was to determine whether program-specific formative and summative assessment tools and the PACKRAT could be administered to PA students in multiple programs and achieve a strength of correlation equal to that previously documented in a pilot study. This research study could thereby demonstrate that a system of program-designed examinations could achieve external validity. This could provide a model for PA programs to develop multiple tools that can be triangulated to better predict future PANCE performance and used in combination with a system of early remediation and mentoring during the clinical year.
Definition of Terms
• SUMM I is a 360-question formative examination patterned after the National Commission on Certification of Physician Assistants (NCCPA) blueprint. The instrument contains the exact organ-system distribution and weighting of questions as the 2011 blueprint.
• PACKRAT is a 225-question examination developed annually by the Physician Assistant Education Association's PACKRAT Committee.
• SUMM II is a 700-question, comprehensive examination patterned after the NCCPA blueprint. The instrument contains the exact organ-system distribution and weighting of questions as the 2011 blueprint.
• Predicted PANCE score is a numerical figure generated through a regression model derived from retrospective data. Students in the study received predicted scores generated from each of the three instruments separately and one triangulated predicted score from all three instruments together (the general form of that equation is sketched after this list).
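The triangulated score described in the last definition takes the general form of a single regression equation applied to the three raw scores. The equation below is only a sketch of that form; the coefficients b0 through b3 are placeholders standing in for values a program would estimate from its own retrospective data, not figures reported in this study:

Predicted PANCE score = b0 + b1 × (SUMM I raw score) + b2 × (PACKRAT raw score) + b3 × (SUMM II raw score)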
Literature Review
The ARC-PA Standards, 4th edition, Standard C3.04, states: "The program must conduct and document a summative evaluation of each student within the final 4 months of the program to verify that each student is prepared to enter clinical practice."2 Programs are therefore left with the task of developing summative instruments that meet this standard and correlate with PANCE scores. Limited research has been published within PA education on the use of formative and summative examinations to generate scores that could be used to predict PANCE scores. A few researchers have studied examination score correlations using PACKRAT scores, while others have looked at different summative examination scores. There are also studies that looked at learning systems and methodologies, as well as demographics, and their effects on PANCE performance.3,4 Massey et al5 studied an interactive learning system and its effects on PANCE performance over a 2-year period. This entailed the preparation of concise, high-impact review notes on each blueprint item throughout the learner's didactic and clinical years. Their research demonstrated significant improvements in PANCE pass rates and mean scores. The PACKRAT exam has likely been studied most frequently because of the nature and purpose of its use and has been shown to be predictive in various studies conducted prior to extensive research examining other types of summative assessments. Cody et al,6 using logistic regression analyses, showed a significant relationship between PACKRAT scores and PANCE performance (P < .001, with a correlation coefficient of 0.67). Further, the study demonstrated a PACKRAT
score of < 55% (raw score = 124) as having a sensitivity of 77.2% and specificity of 83.3%, concluding that the PACKRAT has a strong predictive value for PANCE pass/fail performance. Other educational training programs within the medical sciences have been researched and may prove helpful to PA educators, since data are limited within this population. Medical students are also evaluated at the end of their didactic training with the US Medical Licensing Examination (USMLE). Coumarbatch et al7 evaluated the correlation between the Medical College Admission Test (MCAT), cumulative grades from years one and two of medical school, and performance on the USMLE. They concluded that year-2 standard scores and the MCAT biological sciences scores were significant predictors of USMLE performance. Swanson et al8 also developed a method to predict success on the USMLE. They investigated the link between success on Step 2 of the USMLE and Step 1 scores, subject test scores, and clerkship characteristics. They confirmed that success on early testing in Step 1 is predictive of success on Step 2. Holmboe et al9 investigated the correlation between the In-Training Examination in Internal Medicine (ITE-IM), taken at any time during residency, and the American Board of Internal Medicine Certifying Examination (ABIM-CE). The ITE-IM is a formative assessment tool used to evaluate postgraduate year-2 knowledge during residency training. They discovered that cutoff scores and percentile rank compared with the postgraduate-year national peer group demonstrated sensitivity and specificity for predicting passing versus failing the ABIM-CE. Nursing education has also performed in-depth studies on board score predictability and found similar results. Research revealed a statistically significant relationship between first
Exit Exam scores (versus final Exit Exam scores) and National Council Licensure Examination for Registered Nurses (NCLEX-RN) outcomes.10 More recent research over a 3-year period by Ennulat et al11 found that, although the PACKRAT alone showed relational predictability to the PANCE, the most accurate statistical predictor was the combination of a summative exam given at the end of the clinical year and the average multiple-choice-exam results from the didactic year in students who performed in the bottom quartile of the class. In addition to the above variables, they also measured prerequisite and overall preadmission grade-point averages (GPAs). Their study suggested that PA programs might consider developing their own internal analyses of exam assessments in order to help identify students "at high risk" of PANCE failure. A pilot study completed by Massey et al1 demonstrated that using two summative examinations along with the PACKRAT has statistically significant predictive value in determining students' PANCE scores. That study incorporated the PACKRAT and two examinations patterned after the PANCE, one comprising 360 questions and the other 700 questions. The study also incorporated three end-of-rotation clinical exams (surgery, women's health, and pediatrics) into the regression model. The study found that the predicted PANCE score based upon this model had a final correlation of 0.79 with the actual PANCE score (R value of 0.87 and R² of 0.75). Once again, the combination of several different program assessment tools appears to be more helpful in identifying students at risk of failing the PANCE than using any one alone. Although each variable alone showed a linear correlation to the PANCE scores, the combination approach showed the strongest correlation value. Moreover, this approach equips PA educators with tools for early intervention and enhancement of study skills for those deemed at risk of failing the PANCE.
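The cut scores discussed above (for example, the PACKRAT threshold reported by Cody et al6) are typically judged by their sensitivity and specificity for identifying students who go on to fail the PANCE. The sketch below, written in Python with entirely hypothetical scores and outcomes, illustrates how such figures can be computed; it is not the procedure used by any of the cited studies.

```python
# Illustrative sketch: sensitivity and specificity of a cut score for
# flagging students at risk of failing the PANCE. The scores and outcomes
# below are hypothetical example data, not study data; the cutoff of 124
# echoes the raw PACKRAT threshold cited from Cody et al.

def cut_score_performance(scores, passed_pance, cutoff):
    """Treat 'score < cutoff' as an at-risk flag and compare it with the
    actual outcome, where failing the PANCE is the positive condition."""
    true_pos = sum(1 for s, p in zip(scores, passed_pance) if s < cutoff and not p)
    false_neg = sum(1 for s, p in zip(scores, passed_pance) if s >= cutoff and not p)
    true_neg = sum(1 for s, p in zip(scores, passed_pance) if s >= cutoff and p)
    false_pos = sum(1 for s, p in zip(scores, passed_pance) if s < cutoff and p)
    sensitivity = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else float("nan")
    specificity = true_neg / (true_neg + false_pos) if (true_neg + false_pos) else float("nan")
    return sensitivity, specificity

# Hypothetical PACKRAT raw scores and PANCE pass/fail outcomes.
packrat_scores = [110, 118, 126, 131, 140, 152, 119, 145, 128, 136]
passed_pance   = [False, False, True, True, True, True, False, True, False, True]

sens, spec = cut_score_performance(packrat_scores, passed_pance, cutoff=124)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```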
METHODS
Description of the Population and Sample
The population sample was defined by the number of former students who completed the formative and summative examinations during the clinical year and the PANCE after graduation. The total number of subjects in the study, which began in 2007, is 421. Between 2007 and 2010, there were 204 subjects at Massachusetts College of Pharmacy and Health Sciences. In 2011 and 2012, 217 subjects were enrolled from six different PA programs. The programs were thereafter identified as Programs A–F. All of the PA programs taking part in the study awarded a master's degree. It is noteworthy that all participating programs were located in private institutions and situated as follows: one program was housed in a small college in the Ohio Valley; three universities were distributed in the Midwest, the Middle Atlantic States, and the Southeast; and two programs were housed in a Carnegie health-science-focused college in the Northeast. The breakdown of subjects from the six participating programs in 2011–2012 is as follows: Program A (N = 79), Program B (N = 15), Program C (N = 26), Program D (N = 35), Program E (N = 20), and Program F (N = 42). The cumulative sample size for each of the formative and summative examinations and the derived Predicted PANCE (PP) included SUMM I (N = 401), SUMM II (N = 217), PACKRAT (N = 421), and PP (N = 421).
Identification of the Independent and Dependent Variables
This nonexperimental correlational study did not use independent and dependent variables. Although the students were provided feedback and mentoring if they were determined to be at risk for future failure of the PANCE, this did not constitute manipulation of the aforementioned quantitative variables. The study did not determine causation, nor were there any control groups.
Statement of the Research Questions and Hypotheses
Research Question 1: Is there a statistically significant correlation between formative examination SUMM I and PANCE scores?
Research Question 2: Is there a statistically significant correlation between formative examination PACKRAT and PANCE scores?
Research Question 3: Is there a statistically significant correlation between comprehensive exam SUMM II and PANCE scores?
Research Question 4: Is there a statistically significant correlation between predicted PANCE scores generated by triangulating the SUMM I/PACKRAT/SUMM II scores into a regression formula and actual PANCE scores?
Research Question 5: Is there a statistically significant correlation between SUMM I/PACKRAT/SUMM II and predicted PANCE scores when expanding the study from a single program to a multiple-program design?
Hypothesis 1: There will be a statistically significant correlation between formative examination SUMM I and PANCE scores.
Hypothesis 2: There will be a statistically significant correlation between formative examination PACKRAT and PANCE scores.
Hypothesis 3: There will be a statistically significant correlation between comprehensive SUMM II examination and PANCE scores.
Hypothesis 4: There will be a statistically significant correlation between predicted PANCE scores generated by triangulating the SUMM I/PACKRAT/SUMM II scores into a regression formula and actual PANCE scores.
Hypothesis 5: There will be a statistically significant correlation between SUMM I/PACKRAT/SUMM II and predicted PANCE scores when expanding the study from a single program to a multiple-program design.
Instrumentation
The study used the SUMM I, a 360-question formative examination patterned after the NCCPA blueprint; the instrument contains the exact organ-system distribution and weighting of questions as the 2011 blueprint. Also used were the PACKRAT, a 225-question examination developed annually by the Physician Assistant Education Association (PAEA) PACKRAT Committee, and the SUMM II, a 700-question comprehensive examination patterned after the NCCPA blueprint. The data were stored in a secure location in an Excel spreadsheet at each of the participating PA programs. After the graduates took the PANCE, the data were de-identified, with all names eliminated to maintain confidentiality. Each of the participating programs consulted its institutional review board; because the study analyzed de-identified data with no names or numerical identifiers, it was determined to be exempt. The data were statistically analyzed using Microsoft Excel. The correlational analyses included Pearson correlation coefficients, R² values, and linear regression.
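The study's analyses were run in Microsoft Excel; the following Python sketch shows an equivalent computation of a Pearson correlation, an R², and a three-predictor linear regression that yields a predicted PANCE score. The scores below are hypothetical illustrations, not study data, and the snippet is not the authors' actual workflow.

```python
# Illustrative sketch (not the authors' Excel workflow): Pearson r, R-squared,
# and a three-predictor linear regression combining SUMM I, PACKRAT, and
# SUMM II raw scores into a predicted PANCE score. All scores are hypothetical.
import numpy as np

summ1   = np.array([228, 215, 250, 201, 260, 233, 245, 210])
packrat = np.array([147, 135, 162, 128, 170, 150, 158, 132])
summ2   = np.array([527, 500, 560, 470, 580, 530, 555, 490])
pance   = np.array([520, 460, 610, 400, 650, 540, 590, 430])

# Pearson correlation and R-squared for a single instrument versus the PANCE.
r_summ1 = np.corrcoef(summ1, pance)[0, 1]
print(f"SUMM I vs PANCE: r = {r_summ1:.2f}, R^2 = {r_summ1**2:.2f}")

# Multiple regression: PANCE ~ b0 + b1*SUMM I + b2*PACKRAT + b3*SUMM II.
X = np.column_stack([np.ones_like(summ1), summ1, packrat, summ2])
coef, *_ = np.linalg.lstsq(X, pance, rcond=None)
predicted_pance = X @ coef

# Multiple R and R^2: correlation between predicted and actual PANCE scores.
multiple_r = np.corrcoef(predicted_pance, pance)[0, 1]
print(f"Predicted vs actual PANCE: R = {multiple_r:.2f}, R^2 = {multiple_r**2:.2f}")
```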
Research Design
This study used a nonexperimental correlational research design that analyzed the relationship between the raw scores of the quantitative variables (SUMM I, SUMM II, PACKRAT, and Predicted PANCE [PP]) and the actual metric PANCE score. Therefore, there was no manipulation of variables and no independent or dependent variables. The researchers sought to determine whether the aforementioned quantitative variables demonstrated strength of correlation and the ability to predict future performance on the PANCE.
RESULTS
Analysis of Hypothesis 1: There will be a statistically significant correlation between formative examination SUMM I and PANCE scores
The null hypothesis is rejected; therefore, the hypothesis is supported by the data analysis. When the cumulative database of SUMM I scores gathered between 2007 and 2012 (N = 401) was combined in a regression model to determine the strength of relationship between the raw score and the actual PANCE score, an R value of 0.70 and an R² of 0.49 were obtained (see Table 1).
Figure 1. Scatter Plot Comparing SUMM I to PANCE (combined SUMM I vs PANCE score; R² = 0.49)
Table 1. Regression Statistics for All Instruments

Regression Statistics | All Testing Instruments | SUMM I | SUMM II | PACKRAT | SUMM I/II | SUMM I/PACKRAT | SUMM II/PACKRAT | PP Only
Multiple R            | 0.80  | 0.70  | 0.64   | 0.68  | 0.72   | 0.76  | 0.79  | 0.79
R²                    | 0.64  | 0.49  | 0.40   | 0.47  | 0.53   | 0.57  | 0.63  | 0.62
Adjusted R²           | 0.63  | 0.49  | 0.40   | 0.46  | 0.52   | 0.57  | 0.62  | 0.62
Standard Error        | 93.98 | 94.81 | 121.60 | 98.46 | 107.33 | 87.04 | 96.31 | 83.40
Observations          | 197   | 401   | 217    | 421   | 197    | 401   | 217   | 421
Table 2. Descriptive Statistics, Cumulative Database 2007–2012 (Total Database)

Statistic                | SUMM I | SUMM II | PACKRAT | PANCE    | PP
Mean                     | 228.88 | 527.26  | 147.10  | 532.20   | 522.04
Standard Error           | 1.21   | 2.95    | 0.85    | 6.55     | 5.56
Median                   | 228    | 527     | 146     | 517      | 513
Mode                     | 216    | 539     | 158     | 800      | 482
Standard Deviation       | 24.20  | 43.42   | 17.52   | 134.48   | 114.04
Sample Variance          | 585.62 | 1885.15 | 306.82  | 18083.58 | 13005.45
Kurtosis                 | 0.10   | -0.00   | 0.11    | -0.17    | -0.33
Skewness                 | 0.30   | -0.15   | 0.14    | -0.00    | 0.03
Range                    | 151    | 231     | 121     | 646      | 612
Minimum                  | 164    | 404     | 89      | 200      | 179
Maximum                  | 315    | 635     | 210     | 846      | 791
Sum                      | 91780  | 114415  | 61930   | 224058   | 219780
Count                    | 401    | 217     | 421     | 421      | 421
Confidence Level (95.0%) | 2.38   | 5.81    | 1.68    | 12.88    | 10.93
The descriptive statistics displayed in Table 2 provide the mean scores for the total number of subjects (N = 401). The performance range revealed scores of 164–315 out of a total of 360 points possible. The scatter plot (see Figure 1) demonstrates the SUMM I-to-PANCE correlation.
Analysis of Hypothesis 2: There will be a statistically significant correlation between formative examination PACKRAT and PANCE scores
The null hypothesis is rejected; therefore, the hypothesis is supported by the data analysis. When the cumulative database of PACKRAT scores gathered between 2007 and 2012 (N = 421) was combined in a regression model to determine the strength of relationship between the raw score and the actual PANCE score, an R value of 0.68 and an R² of 0.47 were obtained (see Table 1). The descriptive statistics displayed in Table 2 provide the mean scores for the total number of subjects (N = 421). The performance range revealed scores of 89–210 out of a total of 225 points possible. The scatter plot (see Figure 2) demonstrates the PACKRAT-to-PANCE correlation.
Analysis of Hypothesis 3: There will be a statistically significant correlation between comprehensive SUMM II examination and PANCE scores
The null hypothesis is rejected; therefore, the hypothesis is supported by the data analysis.
Figure 2. Scatter Plot Comparing PACKRAT to PANCE (combined PACKRAT vs PANCE score; R² = 0.47)
Figure 3. Scatter Plot Comparing SUMM II to PANCE (combined SUMM II vs PANCE score; R² = 0.40)
When the cumulative database of SUMM II scores gathered between 2009 and 2012 (N = 217) was combined in a regression model to determine the strength of relationship between the raw score and the actual PANCE score, an R value of 0.64 and an R² of 0.40 were obtained (see Table 1). The descriptive statistics displayed in Table 2 provide the mean scores for the total number of subjects (N = 217). The performance range revealed scores of 404–635 out of a total of 700 points possible. The scatter plot (see Figure 3) demonstrates the SUMM II-to-PANCE correlation.
Analysis of Hypothesis 4: There will be a statistically significant correlation between predicted PANCE scores generated by triangulating the SUMM I/PACKRAT/SUMM II scores into a regression formula and actual PANCE scores
The null hypothesis is rejected; therefore, the hypothesis is supported by the data analysis. When the SUMM I, PACKRAT, and SUMM II raw scores were combined in a regression model, a predicted score, based upon retrospective data gathered between 2007 and 2012 (N = 421), was generated. These data are displayed in a scatter plot (see Figure 4). The strength of correlation between the predicted score and the actual PANCE score was determined using linear regression. An R value of 0.79 and an R² of 0.62 were obtained.
Figure 4. Scatter Plot Comparing Predicted PANCE to PANCE (combined predicted PANCE vs actual PANCE score; R² = 0.61)
Figure 5. Pearson Correlation to PANCE Scores (combined test score combinations: SUMM I, SUMM II, PACKRAT)
Figure 6. Relationship Between PANCE and Total Raw Score (total raw score = SUMM I + SUMM II + PACKRAT; R² = 0.98)
Table 1 presents regression statistics for each of the separate testing instruments, for several combinations of the instruments, and for the combination of all three. When the simple raw scores from the SUMM I, PACKRAT, and SUMM II together were correlated with the actual numerical PANCE score, an R value of 0.80 and an R² of 0.64 (N = 197) were obtained. When SUMM I and SUMM II were combined into a regression model, an R value of 0.72 and an R² of 0.52 (N = 197) were obtained. When the simple raw scores from SUMM II and PACKRAT together were combined into a regression model, an R value of 0.79 and an R² of 0.62 (N = 217) were obtained. A graphic representation of these data is provided in Figure 5. When the combined total raw score achieved on the three testing instruments (SUMM I [360 questions] + SUMM II [700 questions] + PACKRAT [225 questions] = 1,285 total points possible) was correlated with actual PANCE performance stratified in increments of 100 points, a surprisingly high correlation was revealed. Figure 6 provides a graphic depiction of these data, with the total raw score on the y-axis and the stratified levels of PANCE performance on the x-axis, and shows the correlation between these two variables (R² = 0.98).
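One way to reproduce a stratified comparison of this kind is sketched below: PANCE scores are grouped into 100-point strata and the mean total raw score within each stratum is correlated with the stratum midpoint. The data are hypothetical, and the exact stratification procedure used in the study may differ.

```python
# Minimal sketch of one way to stratify PANCE performance in 100-point
# increments and correlate each stratum with the mean total raw score
# (SUMM I + SUMM II + PACKRAT) of the students in it. Hypothetical data.
import numpy as np

total_raw = np.array([880, 920, 955, 1000, 1030, 1060, 1095, 1140])  # out of 1,285
pance     = np.array([360, 410, 455, 520, 555, 640, 660, 730])

bins = np.arange(200, 801, 100)            # strata 200-299, 300-399, ...
stratum = np.digitize(pance, bins)         # stratum index for each student
levels = sorted(set(stratum))
mean_raw_by_level = [total_raw[stratum == s].mean() for s in levels]
stratum_midpoints = [bins[s - 1] + 50 for s in levels]

r = np.corrcoef(stratum_midpoints, mean_raw_by_level)[0, 1]
print(f"R = {r:.2f}, R^2 = {r**2:.2f}")
```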
Analysis of Hypothesis 5: There will be a statistically significant correlation between SUMM I/PACKRAT/SUMM II and predicted PANCE scores when expanding the study from a single program to a multiple-program design
The null hypothesis is rejected; therefore, the hypothesis is supported by the data analysis. In 2011 and 2012, the instruments were administered to a total of 217 students in six PA programs. Table 4 provides the descriptive statistics for the six PA programs, and Table 3 provides a breakdown of the number of students and the regression statistics for each of the respective programs. The data from each program were analyzed separately. Pearson correlation coefficients and R² values were generated to determine the strength of correlation between the SUMM I, SUMM II, PACKRAT, and Predicted PANCE scores and the PANCE scores of the program graduates. No SUMM I data were available for Program E, as this exam was not administered to that cohort. A limitation of these data was the small number of subjects in some of the programs (Program B, N = 15; Program E, N = 20). There was some variation between programs in the strength of correlation; the reason for these differences is unknown. Figure 7 depicts the PANCE scores for each of the programs, and Figure 8 depicts the predicted PANCE scores for each of the programs. The predicted PANCE score demonstrated the strongest correlation for all of the respective programs (Table 3).
DISCUSSION
Summary of Findings Addressing the Hypotheses
The relationship between the formative and summative examination instruments and the PANCE scores was consistent with the findings of the pilot study by Massey et al.1
Table 3. Program-Specific Data Denoting Strength of Correlation Between Examination Instrument and Actual PANCE Score

School | N  | SUMM I                    | SUMM II                   | PACKRAT                   | PP
A      | 79 | R² = 0.42, Pearson = 0.65 | R² = 0.57, Pearson = 0.75 | R² = 0.70, Pearson = 0.84 | R² = 0.72, Pearson = 0.85
B      | 15 | R² = 0.61, Pearson = 0.78 | R² = 0.43, Pearson = 0.65 | R² = 0.52, Pearson = 0.72 | R² = 0.60, Pearson = 0.77
C      | 26 | R² = 0.27, Pearson = 0.52 | R² = 0.43, Pearson = 0.66 | R² = 0.39, Pearson = 0.62 | R² = 0.44, Pearson = 0.66
D      | 35 | R² = 0.39, Pearson = 0.62 | R² = 0.53, Pearson = 0.73 | R² = 0.64, Pearson = 0.80 | R² = 0.66, Pearson = 0.81
E      | 20 | No data                   | R² = 0.76, Pearson = 0.87 | R² = 0.65, Pearson = 0.81 | R² = 0.72, Pearson = 0.85
F      | 42 | R² = 0.57, Pearson = 0.75 | R² = 0.40, Pearson = 0.64 | R² = 0.59, Pearson = 0.77 | R² = 0.63, Pearson = 0.79
Table 4. Combined Program Data (Programs A–F)

Statistic                | SUMM I | SUMM II | PACKRAT | PANCE    | PP
Mean                     | 224.91 | 527.26  | 147.60  | 514.10   | 506.88
Standard Error           | 1.70   | 2.95    | 1.23    | 10.67    | 8.50
Median                   | 225    | 527     | 147     | 494      | 504.37
Mode                     | 225    | 539     | 148     | 800      | N/A
Standard Deviation       | 23.85  | 43.42   | 18.17   | 157.16   | 125.22
Sample Variance          | 568.67 | 1885.15 | 329.97  | 24700.09 | 15680.94
Kurtosis                 | -0.29  | -0.00   | -0.12   | -0.70    | -0.53
Skewness                 | 0.12   | -0.15   | -0.17   | 0.16     | 0.04
Range                    | 128    | 231     | 96      | 600      | 598.76
Minimum                  | 164    | 404     | 89      | 200      | 182.46
Maximum                  | 292    | 635     | 185     | 800      | 781.22
Sum                      | 44308  | 114415  | 32030   | 111559   | 109992.65
Count                    | 197    | 217     | 217     | 217      | 217
Confidence Level (95.0%) | 3.35   | 5.81    | 2.43    | 21.03    | 16.76
Figure 7. Combined PANCE Scores (mean, median, minimum, and maximum PANCE scores for the combined sample and for Schools A–F)
Figure 8. Combined Predicted PANCE Scores (mean, median, minimum, and maximum predicted PANCE scores for the combined sample and for Schools A–F)
The expansion to a multi-program study validated that the use of testing instruments that follow the NCCPA blueprint and use questions consistent with the PANCE has external validity for predicting future PANCE performance. The null hypotheses were rejected for hypotheses 1–5. Anecdotally, the author has provided tutoring over the years to a large number of students who initially failed the PANCE. The qualitative analysis of these subjects demonstrated an inability to prioritize the most salient information and
disorganization during the study process. A large number of these subjects also reported significant test anxiety and difficulty with multiple-choice examinations. Further investigation into these students' study processes could provide valuable information.
Implications for Further Research
This research needs to be replicated on a larger scale to determine if there are widespread implications for PA programs. The test instruments (SUMM I,
SUMM II) developed by the principal investigator can be replicated by PA programs using a standardized approach. Constructing test instruments according to the NCCPA blueprint and incorporating questions at higher levels of Bloom's taxonomy are essential to replicating a similar instrument. This research determined that a larger number of test questions administered during the clinical year (N = 1,285 across all three test instruments) had a strong correlation with PANCE results.
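As a rough illustration of blueprint-driven construction, the sketch below allocates items for a 360-question SUMM I-style examination in proportion to organ-system weights. The weights shown are placeholders rather than actual NCCPA blueprint percentages; a program would substitute the values from the current blueprint.

```python
# Rough sketch: allocate questions for a 360-item SUMM I-style exam in
# proportion to blueprint organ-system weights. The weights below are
# hypothetical placeholders; substitute the current NCCPA blueprint values.
blueprint_weights = {
    "Cardiovascular": 0.16,
    "Pulmonary": 0.12,
    "Gastrointestinal/Nutritional": 0.10,
    "Musculoskeletal": 0.10,
    # ...remaining organ systems, with all weights summing to 1.0
    "Other systems (combined)": 0.52,
}

def allocate_questions(total_items, weights):
    """Distribute total_items across categories using largest-remainder rounding."""
    raw = {k: total_items * w for k, w in weights.items()}
    counts = {k: int(v) for k, v in raw.items()}
    leftover = total_items - sum(counts.values())
    # Give any remaining items to the categories with the largest fractional parts.
    for k in sorted(raw, key=lambda k: raw[k] - int(raw[k]), reverse=True)[:leftover]:
        counts[k] += 1
    return counts

print(allocate_questions(360, blueprint_weights))
```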
CONCLUSION
The data derived from the study are encouraging. The study suggests that PA programs could develop a system to better predict future performance by using program-specific, comprehensive examinations. The data analysis suggests that there was external validity when these test instruments were administered to students from several PA programs, as similar correlational data were demonstrated. The guidelines developed to mentor students who were identified as at risk have promise and could be replicated and implemented by PA programs throughout the nation. These efforts are worthwhile even if only a few students, who might have failed the PANCE without intervention, are guided toward success.

ACKNOWLEDGEMENTS
The authors wish to thank the following individuals who contributed to this article. Participating Investigators: Susan White, MD; Helen Martin, DHSc, PA-C; Susan Wulff, MS, PA-C; Rosalind Becker, MS, PA-C; Maha Lund, MS, MPAS, PA-C. Writing Contributor: Andrew Linder, BS.
REFERENCES
1. Massey S, Stallman J, Lee L, Klingaman K, Holmerud D. The relationship between formative and summative examinations and PANCE scores. J Physician Assist Educ. 2011;22(1):41-45.
2. Accreditation Review Commission on Education for the Physician Assistant. Accreditation Standards for Physician Assistant Education. http://www.arcpa.org/documents/Standards4theditionwithclarifyingchanges12.2012fnl.pdf. Accessed January 14, 2013.
3. Duma RD, Wilson DE. PACKRAT: a predictor of success on the PANCE. In: Proceedings: 2nd Annual Symposium on Graduate Research and Scholarly Projects. Wichita, KS: Wichita State University; 2006:122-123.
4. Asprey D, Dehn R, Kreiter C. The impact of age and gender on the Physician Assistant National Certifying Examination scores and pass rates. Perspect Physician Assist Educ. 2004;15(1):38-41.
5. Massey S, Sedrak M, Lee L. The effect of an interactive learning system on PANCE performance. J Physician Assist Educ. 2008;19(3):30-34.
6. Cody JT, Adamson KA, Parker RL, Brakhage CH. Evaluation of the relationship between student performance on the PACKRAT and the PANCE examinations. Perspect Physician Assist Educ. 2004;15(1):42-46.
7. Coumarbatch J, Robinson L, Thomas R, Bridge P. Strategies for identifying students at risk for USMLE Step 1 failure. Medical Student Education. 2010;42(2):105-110.
8. Swanson DB, Ripkey DR, Case SM. Identifying students at risk for poor performance on the USMLE Step 2. Acad Med. 1999;74(10):S45-S48.
9. Holmboe ES, Babbott SF, Beasley BW, Hinchey KT, Blotzer JW. The predictive validity of the Internal Medicine In-Training Examination. Am J Med. 2007;120(8):735-740.
10. Nibert AT. Predicting NCLEX success with the HESI Exit Exam: fourth annual validity study. Nurse Educ. 2008;Sept/Oct suppl:28S-34S.
11. Ennulat C, Garrubba C, DeLong D. Evaluation of multiple variables predicting the likelihood of passage and failure of PANCE. J Physician Assist Educ. 2011;22(1):7-18.