A Test Development Life Cycle Framework for Testing Program Planning

Pamela Ing Stemmer, Ph.D.

February 29, 2016

How can an organization (test sponsor) successfully execute a testing program? Successfully operating a testing program with a valid, legally defensible examination is no small task. Reviewing the necessary information and making numerous decisions can be overwhelming. Effective planning and successful execution of a testing program may be achieved by using a test development life cycle framework.

In this paper, we present a framework for planning and decision making to guide successful execution of new and existing testing programs. We recommend the use of this test development life cycle framework for initial planning, and also as a structure for evaluating how an existing testing program is functioning. In this document, we introduce each test development stage. Further details, including issues to address at each stage of the test development cycle, questions to consider as a test sponsor, and questions to ask potential or existing vendors, will be presented throughout the rest of this white paper series.

This white paper series is focused on testing programs in the domain of certification and licensure. The purpose of a credential is to provide evidence to the public that an individual meets the minimum level of competence required for safe practice in a particular profession. Many certification and licensure programs develop one or more valid tests to assure that the knowledge and performance capabilities of their credential holders meet those standards. To achieve validity, legal defensibility, and public confidence in the credential, effort is required throughout the planning, execution, and evaluation of a testing program. Important questions need to be considered by test sponsors, and information needs to be gathered from potential and/or existing vendors.

Planning ahead using a test development life cycle framework, such as the one presented in Figure 1, can ensure that important questions and decisions are not overlooked throughout the planning and execution of a testing program. The benefits of planning ahead with such a framework include, but are not limited to, ensuring security, quality assurance, validity evidence, legal defensibility, and preparation for an accreditation application.

Measurement professionals employ the 2014 Joint Standards for Educational and Psychological Testing (AERA/APA/NCME, 2014) and the National Commission for Certifying Agencies (NCCA) 2016 Standards for the Accreditation of Certification Programs (Institute for Credentialing Excellence, 2014) as psychometric criteria for validating examinations, such as licensing and certification examinations.


The test development life cycle includes the following stages: job analysis, test specifications, item development, test assembly, standard setting, equating, test administration, scoring and reporting, and technical report.

[Figure 1: A circular diagram depicting the nine stages of the test development life cycle: 1. Job Analysis; 2. Test Specifications; 3. Item Development; 4. Test Assembly; 5. Standard Setting; 6. Equating; 7. Test Administration; 8. Scoring and Reporting; 9. Technical Report.]

Figure 1. The test development life cycle provides a useful framework for addressing important questions and decisions throughout the planning, execution, and evaluation of a testing program.


1. Job Analysis

A job analysis (or practice analysis) is an empirical method used to define the content that is to be measured by an examination (Downing, 2006; Figure 1). Use of a job analysis study as a method of defining examination content is clearly endorsed by both the 2016 NCCA Standards (Standards 13 and 14) and the 2014 Joint Standards (Standards 4.12, 11.03, and 11.13). For credentialing programs, NCCA Standard 14 states that a job analysis must be conducted and documented in order to delineate and evaluate the job responsibilities and content domains associated with the credential's purpose. Numerous approaches may be taken when conducting a job analysis. Common methods and issues to consider when planning a job analysis will be discussed in the next installment of this white paper series.

2. Test Specifications

Test specifications provide a detailed, comprehensive description of an examination's components and features (Figure 1). Test specifications include a test blueprint, which outlines the number or proportion of items assigned to measure each content domain (and subdomain, if applicable). Other elements of test specifications include the following: test length; item format(s); maximum testing time; candidate directions; test administration procedures; any permissible materials; and procedures for scoring and reporting (AERA/APA/NCME, 2014). According to the 2014 Joint Standards (Standards 4.01 and 4.02) and the 2016 NCCA Standards (Standard 15), comprehensive test specifications must be established and documented. Issues to consider regarding creating and updating test specifications, such as timing and processes, will be presented in an upcoming installment of this white paper series.

3. Item Development

Item development comprises item writing and item review (Figure 1). Important item development issues to consider include, but are not limited to, the following: subject matter expert (SME) recruitment and training; guidelines for item content and structure; security of item content; item authoring methods; and item banking. The content domains outlined in the test blueprint guide the item development process. The 2014 Joint Standards (Standards 4.07 and 4.08) and the 2016 NCCA Standards (Standards 13 and 16) emphasize the importance of developing and documenting systematic item development procedures. Questions to address regarding item development will be discussed in an upcoming installment of this white paper series.
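To make the relationship between the test blueprint and item development concrete, the short sketch below (written in Python, purely for illustration) converts a set of hypothetical domain weights into item-writing targets for a fixed-length form. The domain names, weights, and form length are invented examples rather than recommendations; real programs would document such targets in their test specifications.

    # Illustration only: hypothetical blueprint weights and form length.
    blueprint = {
        "Domain A": 0.30,
        "Domain B": 0.25,
        "Domain C": 0.30,
        "Domain D": 0.15,
    }
    form_length = 150  # total scored items planned for the form

    # Convert each domain weight into a target item count, then absorb any
    # rounding remainder into the most heavily weighted domain so the
    # targets sum to the intended form length.
    targets = {domain: round(weight * form_length) for domain, weight in blueprint.items()}
    remainder = form_length - sum(targets.values())
    targets[max(blueprint, key=blueprint.get)] += remainder

    for domain, count in targets.items():
        print(f"{domain}: {count} items")
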
4. Test Assembly

Item selection for an examination form should be guided by the test blueprint (Figure 1). In addition to alignment with the test blueprint, there are many issues for test sponsors to consider regarding test assembly, including: key balance; placement of anchor items and accounting for an equating model; inclusion and coverage of pretest items; test security; and quality control. There are many approaches to test form assembly, which may be appropriate for paper-and-pencil administration, computer-based testing (CBT), or both. Test form assembly should be guided by psychometric principles, such as classical test theory (CTT) or item response theory (IRT). Tests may be assembled to be presented in a fixed format or with individualized elements, such as with computer adaptive testing (CAT) or linear on-the-fly testing (LOFT). Expectations for test assembly are addressed in the 2016 NCCA Standards (Standard 16) and the 2014 Joint Standards (Standard 4). Further discussion regarding test assembly issues and questions will be presented in an upcoming installment of this white paper series.

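As one simplified illustration of fixed-form assembly, the sketch below draws items from a mock item bank to meet blueprint targets and then runs a basic key-balance check. It is a non-optimized, hypothetical example: the item metadata, targets, and greedy selection are invented for illustration, and operational programs typically rely on specialized assembly tools and additional constraints (statistical targets, enemy items, anchor placement, and so on).

    import random

    random.seed(0)  # reproducible illustration

    # Illustration only: a mock item bank with hypothetical metadata.
    item_bank = [
        {"id": i,
         "domain": random.choice(["A", "B", "C", "D"]),
         "key": random.choice(["A", "B", "C", "D"]),
         "p_value": round(random.uniform(0.30, 0.90), 2)}
        for i in range(1, 501)
    ]
    targets = {"A": 45, "B": 38, "C": 45, "D": 22}  # counts taken from the blueprint

    # Greedy draw: fill each domain's quota from the bank.
    form = []
    for domain, needed in targets.items():
        candidates = [item for item in item_bank if item["domain"] == domain]
        form.extend(random.sample(candidates, needed))

    # Simple quality-control check: distribution of keyed answers on the form.
    key_counts = {}
    for item in form:
        key_counts[item["key"]] = key_counts.get(item["key"], 0) + 1
    print("Items per keyed answer:", key_counts)
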
5. Standard Setting

Standard setting refers to the process by which a passing score for an examination is established (Cizek, 2006; Figure 1). In certification and licensure, the goal of standard setting is to determine the level of performance (i.e., knowledge, skill, and/or ability) required of an individual to demonstrate minimal competence in a particular profession. Issues for test sponsors to consider include: method selection; SME panel composition, recruitment, and training; and determining when standard setting is appropriate versus equating a new form with an existing examination form. Standard setting is endorsed by the 2016 NCCA Standards (Standards 13 and 17) and the 2014 Joint Standards (Standards 5.22 and 11.16). Methods of standard setting and issues to consider when planning a standard setting study will be presented in a future installment of this white paper series.
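This series does not prescribe a particular standard-setting method, but as a simple illustration of the kind of computation involved, the sketch below derives a raw cut score from fabricated modified-Angoff ratings, in which each SME estimates the probability that a minimally competent candidate would answer each item correctly. All numbers are invented for illustration.

    # Illustration only: fabricated modified-Angoff ratings from five SMEs
    # on a ten-item form. Each rating is the judged probability that a
    # minimally competent candidate answers the item correctly.
    ratings = {
        "SME 1": [0.60, 0.70, 0.50, 0.80, 0.60, 0.70, 0.40, 0.90, 0.60, 0.50],
        "SME 2": [0.50, 0.80, 0.60, 0.70, 0.60, 0.60, 0.50, 0.80, 0.70, 0.60],
        "SME 3": [0.70, 0.70, 0.50, 0.80, 0.50, 0.70, 0.50, 0.90, 0.60, 0.60],
        "SME 4": [0.60, 0.60, 0.60, 0.70, 0.60, 0.80, 0.40, 0.80, 0.70, 0.50],
        "SME 5": [0.60, 0.70, 0.40, 0.80, 0.60, 0.70, 0.50, 0.90, 0.60, 0.60],
    }

    # Each SME's expected raw score for a minimally competent candidate is
    # the sum of that SME's item ratings; the panel's recommended cut score
    # is the mean of those expected scores.
    panel_scores = [sum(item_ratings) for item_ratings in ratings.values()]
    cut_score = sum(panel_scores) / len(panel_scores)
    print(f"Recommended raw cut score: {cut_score:.2f} out of 10")

In practice, a panel would typically complete multiple rounds of ratings, with discussion and impact data, before a final recommendation is adopted by the test sponsor.
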
6. Equating

According to the 2016 NCCA Standards (Standard 21), testing programs must establish that candidates will not receive any advantage or disadvantage resulting from differences in content structure and/or difficulty across different forms of an examination (Figure 1). Issues for test sponsors to consider regarding equating include: selection of appropriate statistical procedures (i.e., equating models); establishing equivalence across examination forms that have been translated into different languages; and ensuring that examination forms are compliant with the requirements of the test blueprint and the selected equating model. Questions to ask and issues to consider regarding equating will be discussed in an upcoming installment of this white paper series.
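Equating designs and models vary widely, and the appropriate choice depends on the program. Purely as an illustration of the simplest case, the sketch below applies a linear (mean-sigma) conversion that places raw scores from a new form onto the scale of a reference form, using fabricated summary data and assuming comparable candidate groups; operational equating would typically rely on anchor items or more sophisticated designs.

    import statistics

    # Illustration only: fabricated raw scores from (assumed) comparable
    # groups of candidates on two forms of the same examination.
    form_x_scores = [52, 61, 58, 70, 66, 49, 73, 64, 55, 68]  # new form (X)
    form_y_scores = [55, 63, 60, 72, 69, 52, 75, 67, 58, 70]  # reference form (Y)

    mean_x, sd_x = statistics.mean(form_x_scores), statistics.stdev(form_x_scores)
    mean_y, sd_y = statistics.mean(form_y_scores), statistics.stdev(form_y_scores)

    def equate_linear(x):
        """Place a Form X raw score onto the Form Y scale (mean-sigma method)."""
        return sd_y / sd_x * (x - mean_x) + mean_y

    # Example: the Form Y equivalent of a raw score of 60 on the new form.
    print(f"Form X raw score 60 -> Form Y equivalent {equate_linear(60):.1f}")
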

7. Test Administration

Test administration is the most visible component of the test development life cycle and, when executed in an organized, consistent, and effective manner, delivers vital evidence of test score validity (Downing, 2006; Figure 1). There are many elements of test administration for test sponsors to consider, such as: proper candidate identification; standardization of testing environments; examination security; proctor training and monitoring; quality control; and candidate accommodations. The 2014 Joint Standards (Standards 6.01-6.07) and the 2016 NCCA Standards (Standard 18) assert the need for testing programs to establish and comply with policies and procedures designed to protect examination content and ensure standardized candidate experiences. Further discussion regarding test administration issues to consider will be presented in a future installment of this white paper series.

8. Scoring and Reporting

There are many different ways to apply a scoring key to candidate responses from an examination, ranging from simple (e.g., the number correct out of the total number of items) to complex (e.g., partial credit or penalties for incorrect responses; Figure 1). The range of reported scores may be adapted to suit the program's needs or the candidates' understanding through the use of raw, percentage, or scaled scores. Other important scoring issues for test sponsors to address include key validation, item analysis, and any additional statistical analyses required for the testing program (e.g., quarterly reports, annual reports).

With respect to score reporting, decisions need to be made regarding: what type of information will be provided to candidates; what information will be reported to the test sponsor (e.g., total scores, subscores, average examination performance statistics); in what format information will be provided to candidates and the sponsor organization (e.g., paper, electronic, preliminary, official); whether contextual cues such as explanatory text, visual representations, or uncertainty information will be included; and on what timeline the information will be provided (e.g., instant, delayed). Issues related to scoring and reporting are addressed by both the 2016 NCCA Standards (Standard 19) and the 2014 Joint Standards (Standards 6.10-6.16). Further detail regarding scoring and reporting issues to consider will be discussed in an upcoming installment of this white paper series.
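As a small illustration of two of the computations mentioned above, the sketch below converts a number-correct raw score to a reporting scale through a linear transformation anchored at the passing score, and computes a basic item-difficulty statistic (the p-value) of the kind produced during item analysis. The scale anchors, passing score, and response data are fabricated; real programs choose their own reporting scale and scoring rules.

    # Illustration only: fabricated scoring parameters.
    raw_passing_score = 105      # raw cut score (e.g., from standard setting)
    max_raw_score = 150
    scaled_passing_score = 500   # reporting scale value anchored at the cut score
    scaled_max_score = 700

    def to_scaled(raw):
        """Linearly map a raw score onto the reporting scale, anchoring the
        raw passing score at the scaled passing score."""
        slope = (scaled_max_score - scaled_passing_score) / (max_raw_score - raw_passing_score)
        return round(scaled_passing_score + slope * (raw - raw_passing_score))

    print(to_scaled(105), to_scaled(120), to_scaled(150))  # 500 567 700

    # Basic item analysis: item difficulty (p-value) is the proportion of
    # candidates answering the item correctly.
    responses = [  # rows are candidates, columns are items (1 = correct)
        [1, 0, 1, 1],
        [1, 1, 0, 1],
        [0, 1, 1, 1],
    ]
    p_values = [round(sum(column) / len(column), 2) for column in zip(*responses)]
    print("Item p-values:", p_values)
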

9. Technical Report

Thorough documentation of the entire test development process provides validity evidence for the testing program (Figure 1). With respect to technical reports, it is important to consider the level of documentation that test sponsors should receive from contracted vendors, which is guided by the 2014 Joint Standards (Standard 7.04) and the 2016 NCCA Standards (Standards 13-21). Additional considerations regarding technical documentation will be provided in a future edition of this white paper series.

Operating a successful testing program with a valid, legally defensible examination is no small feat. Information gathering, review, and decision making can be overwhelming. Using a test development life cycle framework may facilitate effective planning and successful execution of a testing program. For more detailed descriptions, issues, and questions to ask regarding each stage of the test development life cycle, check the Comira website at the end of each month for the next white paper publication. If you would like a free one-hour consultation on the quality of your testing program or planning, please use the following link to contact Comira's Psychometric Team.

References

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: AERA.

Cizek, G. J. (2006). Standard setting. In S. M. Downing & T. M. Haladyna (Eds.), Handbook of test development (pp. 225-258). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.

Downing, S. M. (2006). Twelve steps for effective test development. In S. M. Downing & T. M. Haladyna (Eds.), Handbook of test development (pp. 3-25). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.

Institute for Credentialing Excellence. (2014). National Commission for Certifying Agencies Standards for the Accreditation of Certification Programs. Washington, DC: ICE.
