How can invariant measurement based on Rasch models inform educational assessments?

George Engelhard, Jr.
The University of Georgia

Presentation at the International Conference on Educational Measurement, Evaluation and Assessment, November 2017 (Abu Dhabi, United Arab Emirates)

November 4, 2017

Three keystones

- Power of educational assessment (purposes and decisions)
- Network of stable measures (invariant measurement)
- Rasch measurement theory

Three influential mentors

- Benjamin Bloom
- Benjamin D. Wright
- Georg Rasch

1 - Influential mentors: Professor Benjamin Bloom

Bloom (1970, p. 26)

It is no great exaggeration to compare the power of testing on human affairs with the power of atomic energy. Both are capable of great positive benefit to all of mankind and both contain equally great potential for destroying mankind. If mankind is to survive, we must continually search for the former and seek ways of controlling or limiting the latter.

Assessment as a powerful technology

Examples

- We assess what we value … in the United States, reading and mathematics are stressed, with less attention to science and social studies
- Promotion and graduation tests
- Certification and licensure tests

Assessment is where the rubber meets the road …

2 - Influential mentors: Professor Ben Wright

Examples

- Objective and invariant measurement
- Using the Rasch model to solve measurement problems
- Theory into practice

Wright (1968, p. 87)

First, the calibration of measuring instruments must be independent of those objects that happen to be used for the calibration. Second, the measurement of objects must be independent of the instrument that happens to be used for the measuring.

Science is impossible without an evolving network of stable measures (Wright, 1997, p. 33)

3 - Influential mentors: Georg Rasch

- Specific objectivity
- Rasch model

φ_ni1 = exp(θ_n − δ_i1) / [1 + exp(θ_n − δ_i1)]
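As a concrete illustration of the model above, here is a minimal Python sketch; the person ability and item difficulty values are made-up, and φ_ni1 is read here as the probability that person n answers item i correctly.

```python
import math

def rasch_probability(theta, delta):
    """Dichotomous Rasch model: probability of a correct response,
    phi = exp(theta - delta) / (1 + exp(theta - delta))."""
    return math.exp(theta - delta) / (1.0 + math.exp(theta - delta))

# Illustrative values: a person at 1.0 logits answering items of varying difficulty.
for delta in (-1.0, 0.0, 1.0, 2.0):
    print(f"delta = {delta:+.1f}  P(correct) = {rasch_probability(1.0, delta):.3f}")
```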

Rasch’s motivation

[Figures: two slides illustrating Rasch’s motivation]

Three keystones

- Power of educational assessment (purposes and decisions)
- Network of stable measures (invariant measurement)
- Rasch measurement theory

Overview

I. Educational assessment
  - Purposes and decisions
  - Scores

II. Invariant measurement
  - Item-invariant measurement of persons
  - Person-invariant calibration of items
  - Invariant continuum

III. Rasch measurement theory
  - Wright Map
  - Item and person fit
  - Person response functions

I. Educational Assessments

- The purposes of educational assessments are to make decisions on the basis of scores
- Macro to micro focus (international, national, state, school, teacher, or student)
- Operational definition of the educational outcomes that we value as a society
  - What is reading?
  - What is mathematics?
  - What is science?
  - What is English language proficiency?

Definition of a score

… the term score is used generically in its broadest sense to mean any coding or summarization of observed consistencies or performance regularities on a test, questionnaire, observation procedure, or other assessment devices such as work samples, portfolios, and realistic problem simulations. (Messick, 1995, p. 741)

Assessment of individual students

- Focus on individual students
- Purposes of classroom assessments
  - Before instruction: readiness, placement
  - During instruction: formative, diagnostic
  - After instruction: summative, strategic

What do students know, what can students do, and what should students learn next … ?

Test Standards (AERA, APA, & NCME, 2014)

Validity refers to the degree to which evidence and theory support the interpretations of test scores for proposed uses of tests (p. 11)

Standard 1.0: Clear articulation of each intended test score interpretation for a specified use should be set forth, and appropriate validity evidence in support of each intended interpretation should be provided (p. 23)

Validity

Validity is not a property of the test or assessment as such, but rather of the meaning of the test scores. These scores are a function not only of the items or stimulus conditions, but also of the persons responding as well as the context of the assessment. (Messick, 1995, p. 741)

Scores

- Scores are a function not only of the items, but also of the persons responding as well as the context of the assessment
- Scores = f(persons, items, context)
- Context: purpose and decisions based on scores
- Student motivation

[Figure: Scores as a function of Persons, Items, and Context, arrayed along a continuum from Low/Easy to High/Hard]

II. Invariant measurement

The scientist is usually looking for invariance whether he knows it or not (Stevens, 1951, p. 20)

The scientist seeks measures that will stay put while his back is turned (Stevens, 1951, p. 21)

Albert Einstein (Physics)

Einstein, however, was not truly a relativist … beneath all of his theories, including relativity, was a quest for invariants … and the goal of science was to discover it (Isaacson, 2007, p. 3)

Relativity/invariance

Invariant measurement


“Invariance of item and person measures remains the exception rather than the rule … the context-dependent nature of estimates in human science research … seems to be the antithesis of the invariance we expect across thermometers and temperatures” (Bond & Fox, 2015, p. 85)


Science is impossible without an evolving network of stable measures (Wright, 1997, p. 33)

- Wright Map as part of a "road map" to teaching and learning
- No one way: origin, path, and destination will vary for each person
- Learning maps (dynamic learning maps and learning progressions)

Maps!

III. Rasch measurement theory

- Wright Map: latent continuum (line)
- Item and person fit
- Person response functions

φ_ni1 = exp(θ_n − δ_i1) / [1 + exp(θ_n − δ_i1)]

The latent continuum (line): Wright map

[Figure: the latent continuum, running from Low/Easy to High/Hard]
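As a rough illustration of what a Wright map displays, here is a minimal Python sketch with made-up person measures and item difficulties; a real Wright map would come from calibrated estimates, but the essential idea is that persons and items are placed on the same logit scale, read from High/Hard at the top to Low/Easy at the bottom.

```python
# A crude text rendering of a Wright map: persons and items on one shared logit scale.
persons = {"P1": -1.2, "P2": 0.3, "P3": 1.8}              # hypothetical person measures (logits)
items = {"i1": -2.0, "i2": -0.6, "i3": 0.4, "i4": 1.6}    # hypothetical item difficulties (logits)

print(f"{'logit':>5} | {'persons':<8} | items")
for level in range(3, -4, -1):  # top of the map (hard/high) down to the bottom (easy/low)
    person_names = " ".join(n for n, m in persons.items() if round(m) == level)
    item_names = " ".join(n for n, d in items.items() if round(d) == level)
    print(f"{level:>5} | {person_names:<8} | {item_names}")
```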

MetaMetrics: Lexiles (https://lexile.com/tools/lexile-map/)

Continuum with item sets A, B, and C

[Figure: item sets A, B, and C located along the θ continuum from Low/Easy to High/Hard]

Item set      Item 1   Item 2   Item 3   Score
A: Hard          1        0        0       1
B: Medium        1        1        0       2
C: Easy          1        1        1       3
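If the rows are read as one person answering three items from each set, a minimal Python sketch (with made-up item difficulties and person location) makes the point behind the table explicit: a single person at a fixed θ is expected to earn quite different raw scores on item sets of different difficulty, so raw scores by themselves are not item-invariant, whereas the person's location on the continuum is.

```python
import math

def p_correct(theta, delta):
    """Dichotomous Rasch model: probability that a person at ability theta
    answers an item of difficulty delta correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

# Made-up item difficulties (in logits) for three item sets of three items each.
item_sets = {
    "A: Hard":   [1.0, 2.0, 3.0],
    "B: Medium": [-1.0, 0.0, 1.0],
    "C: Easy":   [-3.0, -2.0, -1.0],
}

theta = 0.5  # one hypothetical person at a fixed location on the continuum
for name, difficulties in item_sets.items():
    expected_score = sum(p_correct(theta, d) for d in difficulties)
    print(f"{name:<10} expected raw score = {expected_score:.2f} out of {len(difficulties)}")
```

With these made-up values the expected raw scores come out near 0.6, 1.8, and 2.7, mirroring the 1-2-3 pattern in the table above.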

Guttman and Rasch


Continuum with three response patterns with scores of 3 …

[Figure: the θ continuum from Low/Easy to High/Hard, with the locations of the three persons marked "???"]

Person   Item 1   Item 2   Item 3   Item 4   Item 5   Item 6   Score
A           1        1        1        0        0        0       3
B           0        1        1        0        1        0       3
C           0        0        0        1        1        1       3

- Bring the person back into measurement
- Identify unusual response patterns
- Different ways of getting a "3"

What are the implications for the decisions made for these students who received a "3" in different ways?
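A minimal Python sketch, with made-up item difficulties ordered from easy to hard, illustrates how the Rasch model separates these three ways of earning a 3: the Guttman-consistent pattern (A) is far more probable under the model than the mixed pattern (B), and the reversed pattern (C) is extremely improbable, which is exactly the kind of signal person-fit statistics pick up.

```python
import math

def p_correct(theta, delta):
    # Dichotomous Rasch model probability of a correct response.
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

def pattern_probability(responses, difficulties, theta):
    """Probability of an entire response pattern under the Rasch model."""
    prob = 1.0
    for x, delta in zip(responses, difficulties):
        q = p_correct(theta, delta)
        prob *= q if x == 1 else (1.0 - q)
    return prob

# Made-up difficulties for items 1-6, ordered from easy to hard (logits).
difficulties = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
patterns = {
    "A (Guttman-consistent)": [1, 1, 1, 0, 0, 0],
    "B (mixed)":              [0, 1, 1, 0, 1, 0],
    "C (reversed)":           [0, 0, 0, 1, 1, 1],
}

# All three patterns share the same raw score (3), so under the Rasch model they share
# the same ability estimate; what differs is how plausible each pattern is given that
# estimate (theta = 0 here by the symmetry of the made-up difficulties).
theta = 0.0
for name, pattern in patterns.items():
    prob = pattern_probability(pattern, difficulties, theta)
    print(f"{name:<24} P(pattern | theta) = {prob:.5f}")
```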


Four Components of Scores

- Theta: location on the line
- SEM: probabilistic uncertainty
- Person fit: validity of response pattern
- Visual display for unusual response patterns
  - Crossing person response functions
  - Residual analyses
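A minimal Python sketch of how the first three components might be computed for one response pattern under the dichotomous Rasch model; the item difficulties and responses are hypothetical, θ is estimated by Newton-Raphson maximum likelihood, the SEM comes from the test information, and person fit is summarized here with an outfit-style mean squared standardized residual.

```python
import math

def p_correct(theta, delta):
    # Dichotomous Rasch model probability of a correct response.
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

def score_components(responses, difficulties):
    """Return (theta_hat, sem, outfit) for a single response pattern."""
    assert 0 < sum(responses) < len(responses), "a finite estimate needs a non-extreme raw score"
    # 1. Theta: Newton-Raphson maximum likelihood estimate of the location on the line.
    theta = 0.0
    for _ in range(50):
        probs = [p_correct(theta, d) for d in difficulties]
        information = sum(q * (1.0 - q) for q in probs)
        theta += (sum(responses) - sum(probs)) / information
    probs = [p_correct(theta, d) for d in difficulties]
    information = sum(q * (1.0 - q) for q in probs)
    # 2. SEM: probabilistic uncertainty of the estimate.
    sem = 1.0 / math.sqrt(information)
    # 3. Person fit: mean squared standardized residual (an outfit-style statistic).
    outfit = sum((x - q) ** 2 / (q * (1.0 - q))
                 for x, q in zip(responses, probs)) / len(responses)
    return theta, sem, outfit

# Hypothetical item difficulties (logits) and one response pattern.
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
responses = [1, 1, 0, 1, 0]
theta, sem, outfit = score_components(responses, difficulties)
print(f"theta = {theta:.2f}, SEM = {sem:.2f}, outfit mean square = {outfit:.2f}")
```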

Learning about Rasch measurement theory

[In press]


Summary

- Assessment is a technology that defines what we value in education
- Assessment systems define a stable system of measures to represent the constructs
- Test scores are used to make decisions about individual students, and we should consider:
  - Four components of scores: theta, SEM, person fit, visual display
  - Validity of the response pattern

Final Word: Professor Ben Wright

What is the construct?

Where is the Wright map?

Is the Wright map a valid representation of the construct?
