Assessing numeracy and other mathematical skills in psychology students as a basis for learning statistics
Gerry Mulhern and Judith Wylie Queen’s University Belfast
The Higher Education Academy Psychology Network Department of Psychology University of York York YO10 5DD Email:
[email protected] Telephone: 01904 433154 Fax: 01904 433181 www.psychology.heacademy.ac.uk
£5 This document is also available online from www.psychology.heacademy.ac.uk ISBN 0-9550372-0-4
A mini-project funded by
TABLE OF CONTENTS

Project report
Assessing numeracy and other mathematical skills in psychology students as a basis for learning statistics

Tutors pack
Section 1: The test
Section 2: Description of sub-components and details of internal reliabilities
Section 3: Test answers and scoring instructions
Section 4: Results, diagnostic advice and implications for statistics teaching

References
Project report
Background

This mini-project was undertaken in light of widespread concern among educators, employers and politicians about levels of basic numeracy among the UK population as a whole – a concern shared by those in higher education. Commentators have pointed to low standards of numeracy and other core mathematical skills among students entering, and indeed completing, undergraduate degree courses (Cartright, 1996; Brown, Askew, Baker, Denvir and Millett, 1998; Hutton, 1998; Phoenix, 1999). Of particular concern was the suggestion of a growing deficit among university entrants in basic numerical dexterity, appreciation of number, and basic algebraic reasoning (Phoenix, 1999).

In undergraduate psychology courses, the ability to undertake empirical work and to evaluate empirical evidence in published material had been seen as central to psychological training. Mulhern and Wylie (2004) went so far as to argue that the teaching of quantitative methods to large, heterogeneous groups of students was the greatest pedagogic challenge in psychology. Despite this, little was known about the specific problems encountered by psychology undergraduates, especially in terms of the mathematical skills needed to equip students for learning quantitative methods.

In a study of Northern Ireland psychology undergraduates, Mulhern and Wylie (2004) investigated levels of numeracy and mathematical knowledge among two cohorts of students entering university a decade apart (1992 and 2002). They reported evidence of declining levels of skill among students, and significant deficits among females in both cohorts. This latter finding was considered particularly worrying in view of the substantial predominance of female psychology undergraduates throughout the UK (Radford and Holdstock, 1996). While these results were considered interesting in their own right, there was clearly a need to collect further data to reveal the national picture.
Specific objectives

The project aimed to extend Mulhern and Wylie’s (2004) findings to a wider sample of UK university psychology departments. Specifically, it sought to:

1. undertake an assessment of numeracy and mathematical reasoning among students entering courses of study in a range of UK psychology departments;
2. use the results of this assessment to identify students’ conceptual difficulties in specific components of mathematical thinking;
3. consider the implications of these conceptual difficulties for the learning of statistics in psychology;
4. make available the findings of this research, including the test materials, to psychology departments throughout the UK.
Outcomes

1. A data set of national baselines consisting of approximately 1000 students representing both pre- and post-1992 universities in England and Wales, Scotland and Northern Ireland.
2. Knowledge about students’ difficulties in specific components of numeracy and mathematical reasoning relevant to the learning of statistics.
3. A publication and conference presentations on the results of this national survey.
4. A pack containing the test items and diagnostic advice for teachers.
Materials

The items in this study were the same as those used by Mulhern and Wylie (2004). The test comprised 20 questions, some of which were multi-part, giving a total of 32 items (see appendix). Items were classified into six broad components of mathematical reasoning relevant to statistics. This was done on the basis of descriptions provided by Greer and Semrau (1984), as well as the authors’ agreed judgements of the principal mathematical concept underlying each item. These components were:

Calculation involving decimals and fractions (9 items) - Questions 1, 2, 4, 8, 11, 12;
Algebraic reasoning (10 items) - Questions 3, 6, 9, 13, 16, 19;
Graphical interpretation (6 items) - Questions 10, 18, 20;
Proportionality and ratio (3 items) - Questions 7, 17;
Probability and sampling (2 items) - Questions 5, 15;
Estimation (2 items) - Question 14.
Procedure

Groups were tested in their first term/semester before any substantial statistics teaching had begun. Copies of the test were mailed to participating institutions prior to the beginning of the 2003/04 academic year. The test was distributed at the beginning of lectures and participants were informed that they had 30 minutes to complete all questions. Because participants were seated close to one another, lecturers were asked to stress that the purpose of the test was to obtain information about students’ mathematical knowledge in order to inform teaching methods and that the exercise would therefore be invalid if students shared answers. It was also stressed that the test was entirely anonymous, so that no individual could be identified.
Achievement of outcomes

Outcome 1.
A data set of national baselines consisting of approximately 1000 students representing both pre- and post-1992 universities in England and Wales, Scotland and Northern Ireland.

A total of 890 psychology undergraduates drawn from eight universities participated (173 males, 676 females, 41 did not specify). The universities represented a broad range of UK psychology departments, including pre- (N = 656) and post-1992 (N = 234) institutions from England and Wales, Scotland and Northern Ireland (see Table 1).

Table 1: Sample details

Institution   N     Location   Pre/post 1992   Entry
A             166   NI         pre             BBB
B             107   NI         pre             BB
C             101   Eng/W      post            CC
D             236   Scot       pre             BBB
E             32    Eng/W      post            CC
F             101   Eng/W      post            BB/CCC
G             130   Eng/W      pre             AAB
H             17    Eng/W      pre             BBB
All students were in their first term at university and their highest qualifications in mathematics are shown in Table 2.
Table 2: Students’ highest qualifications in mathematics

Qualification            N
A-level A/B              77
A-level C-E              127
GCSE Additional maths    9
GCSE A                   94
GCSE B/C                 435
GCSE D/E                 10
O-level                  19
Access course            30
European                 41
None                     17
Blank                    31

Outcome 2.
Knowledge about students’ difficulties in specific components of numeracy and mathematical reasoning relevant to the learning of statistics.
Results and analysis

One mark was awarded for each correct answer, giving a total of 32 available marks. For each participant, separate mean scores were calculated for each of the six mathematical components (range 0-1), and an ‘unweighted’ total consisting of the sum of these six component means was obtained (range 0-6). This measure was intended to avoid a disproportionate contribution to the total score from those components with a greater number of items.

Internal consistency of scales

Since the magnitude of the reliability coefficient is a function, among other things, of the number of items in a scale, in order to permit direct comparison between components, α values were adjusted to 10 item scales (equal to the number of items in algebraic reasoning) using the Spearman-Brown formula. Alpha reliability coefficients for the entire sample were calculated for the unweighted total and the six component scales. Internal consistency was found to be high for the unweighted total (α = .83). High consistencies were also found for proportionality and ratio (.87) and graphical interpretation (.86), and moderately high coefficients for estimation (.68), algebraic reasoning (.62), calculation (.59) and probability and sampling (.56).

Total score by institution, gender and qualification

The mean total correct for the entire sample was 13.75 (43.0%), with individual scores ranging from 0 (0.0%) to 29 (90.6%) and males outperforming females (m = 15.53, f = 13.24). Figure 1 shows mean total correct scores by institution and gender. Analysis revealed highly significant main effects of both Institution (F(7, 833) = 33.94, p < .0002) and Gender (F(1, 833) = 9.06, p = .003) and no Institution x Gender interaction. Analysis also revealed highly significant main effects of both Qualification (F(8, 787) = 30.49, p < .0004) and Gender (F(1, 787) = 9.71, p = .002) and no Qualification x Gender interaction (see Figure 2).
In terms of current UK school-based mathematics qualifications only (A-level grade A through GCSE grade D/E), a highly significant linear trend was found (p < .0004).
Figure 1: Mean total correct by Institution and Gender
[Axes: mean total correct (0 to 32) against Institution (A to H), with separate series for male and female.]

[Figure 2 axes: mean total correct (0 to 32) against highest mathematics qualification (A-level grade A/B, A-level grade C-E, GCSE Additional Maths, GCSE grade A, GCSE grade B/C, GCSE grade D/E, O-level, Access course, European), with separate series for male and female.]
Figure 2: Mean total correct by Qualification and Gender

Component scores by institution and gender

There were no significant Institution x Gender interactions for any of the six components. Analysis revealed a highly significant main effect of Institution for all six components (in all cases F(7, 833) > 6.929, p < .005) and a significant main effect of Gender for calculation (F(1, 833) = 10.140, p = .002), proportionality and ratio (F(1, 833) = 7.521, p = .006) and estimation (F(1, 833) = 12.688, p < .005), with algebraic reasoning approaching significance (F(1, 833) = 3.551, p = .060).
Figure 3 shows the overall proportion of correct answers to be low for probability and sampling, algebraic reasoning and estimation. Indeed, with the exception of graphical interpretation, the proportion correct was around .4 or less for all components.

Figure 3: Mean proportion correct for each component by Institution and Gender
[Six panels: Calculation; Proportionality and ratio; Algebraic reasoning; Probability and sampling; Estimation; Graphical interpretation. Each panel plots mean proportion correct (0.0 to 1.0) against Institution (A to H), with separate series for male and female.]
Common errors

Of the 32 items, 23 were found to discriminate between participants and/or to have diagnostic value. Correct responses (shaded) and frequently occurring incorrect responses are presented below. For multiple choice items, mean percentage of the total sample (i.e. unweighted means across institution) is given for each option. For open-ended items, only the most common or diagnostically revealing responses are presented, with all other incorrect responses included in the category ‘other’.

Calculation

Item 2: If the petrol tank of a car holds 5.5 gallons, how much is this in litres? (One litre equals 0.22 gallons)

  Options                  Unweighted %
  5.5 ÷ 0.22 (correct)     62
  5.5 x 0.22               28
  0.22 ÷ 5.5               2
  5.5 – 0.22               1
  5.5 + 0.22               0
  no answer                7

Item 4: Calculate each of the following:

Item 4.1: √0.09

  Answers                  Unweighted %
  .03                      43
  .3 (correct)             17
  .81                      2
  other                    12
  no answer                26

Item 4.2: 0.02 x 0.12

  Answers                  Unweighted %
  .0024 (correct)          28
  .24                      27
  .024                     14
  .06                      3
  other                    12
  no answer                15

Item 4.3: 40 ÷ 0.8

  Answers                  Unweighted %
  50 (correct)             43
  .5                       10
  320                      3
  .1                       2
  other                    24
  no answer                19

Item 8: Arrange the following in order of size, starting with the smallest: 0.25, .0099, 2/3, 1/50, 3/200

  Answers                  Unweighted %
  incorrect                50
  correct                  44
  no answer                6

Item 12: These two blocks are the same shape but different sizes. If the measurements of block A are all 0.75 the lengths of those for block B, and block B is 14cm tall, how tall is block A in cm?

[Figure: blocks A and B]

  Options                  Unweighted %
  14 x 0.75 (correct)      60
  14 ÷ 0.75                22
  14 – 0.75                6
  14 + 0.75                2
  no answer                10
Algebraic reasoning

Item 3: Suppose we define a * b to mean a + 2b. Is it true that:

Item 3.1: a * b = b * a

  Options                  Unweighted %
  always                   46
  never                    30
  sometimes (correct)      18
  no answer                6

Item 3.2: a * (b * c) = (a * b) * c

  Options                  Unweighted %
  never                    48
  sometimes (correct)      22
  always                   17
  no answer                14

Item 6.2: If –5 – 2x = 1, what is x?

  Answer                   Unweighted %
  -3 (correct)             52
  3                        16
  2                        5
  other                    14
  no answer                12

Item 9: a and b are two numbers. If a and b are both doubled, what effect will this have on each of the following?

Item 9.1: (a + b) / (a – b)

  Answer                   Unweighted %
  no effect (correct)      35
  doubling effect          28
  other                    11
  no answer                26

Item 9.2: a² + b²

  Answer                   Unweighted %
  doubling effect          34
  increase                 14
  x4 (correct)             9
  other                    16
  no answer                28

Item 9.3: 1 / (2a + b)

  Answer                   Unweighted %
  doubling effect          33
  decrease                 13
  halving (correct)        11
  other                    11
  no answer                32

Item 13: x + y = 16. What is 100 – x – y ? If there isn’t enough information, tick here

  Answer                   Unweighted %
  not enough information   47
  84 (correct)             43
  other                    4
  no answer                5

Item 16: If a = 3, b = -2 and c = 7, what is the value of 3b² – abc ?

  Answer                   Unweighted %
  54 (correct)             35
  -30                      6
  30                       5
  other                    35
  no answer                18

Item 19: Which of the following is NOT equal to any of the other three? a – b + c; (a – b) + c; a – (b + c); a + (c – b)

  Options                  Unweighted %
  a + (c – b)              39
  a – (b + c) (correct)    35
  a – b + c                7
  (a – b) + c              4
  no answer                15
Graphical interpretation

Item 20: The following graph represents the speed of a racing car around a complete lap of a racing circuit.

[Graph: speed plotted against distance]

Choose whichever of the following is most likely to be a map of the circuit:

[Figure: three candidate circuit maps, labelled 1, 2 and 3]

  Options                  Unweighted %
  2 (correct)              46
  3                        32
  1                        11
  no answer                11
Proportionality and ratio

Item 7: You can see the height of Mr Tiny measured with paper clips. Mr Tiny has a friend, Mrs Large. When we measure their height with matchsticks, Mr Tiny’s height is 4 matchsticks and Mrs Large’s is 6 matchsticks. How many paper clips are needed for Mrs Large’s height?

[Figure: Mr Tiny’s height measured with paper clips]

  Answer                   Unweighted %
  9 (correct)              66
  8                        15
  other                    13
  no answer                6

Item 17: These two letters are the same shape, but one is larger. AC is 4 units long, RT is 6 units.

[Figure: two similar letter shapes, the smaller labelled A, B, C, D, E and the larger labelled R, S, T, U, V]

Item 17.1: AB is 7 units. How long is RS?

  Answer                   Unweighted %
  10.5 (correct)           42
  9                        26
  other                    14
  no answer                19

Item 17.2: UV is 15 units. How long is DE?

  Answer                   Unweighted %
  10 (correct)             36
  13                       18
  22.5                     9
  other                    16
  no answer                21

Probability and sampling

Item 15:
A game of squash can be played either to 9 or 15 points. Holding all other rules of the game constant, if player A is better than player B, which scoring system will give player A a higher probability of winning?

  Options                  Unweighted %
  no difference            54
  9 point game             19
  15 point game (correct)  19
  no answer                8

Item 5: Forty fish were caught from a pond; each one was marked and thrown back into the pond. On another day 60 fish were caught from the same pond and there were four marked fish among them. Estimate the total number of fish in the pond.

  Answer                   Unweighted %
  96                       27
  600 (correct)            18
  100                      6
  240                      4
  400                      3
  other                    25
  no answer                18

Estimation

Item 14: Estimate (do not attempt to calculate) the following:

Item 14.1: (85.63 – 1.2384) / (101.46 – 97.88)

  Answers                            Unweighted %
  within acceptable range (15-35)    51
  outside range                      32
  no answer                          17

Item 14.2: (5.6832 x 0.623) / 0.07689

  Answers                            Unweighted %
  outside range                      62
  no answer                          29
  within acceptable range (30-70)    9
Summary of findings

The study has identified marked deficiencies in mathematical reasoning among psychology undergraduates in a cross section of UK universities. Of particular concern is the poor performance of students from most institutions on probability and sampling, estimation and, to a lesser extent, proportionality and ratio and calculation. Gender differences are also apparent across institutions, which is a source of concern given the demography of undergraduate cohorts. We contend that the test items discriminate well between individuals (and indeed institutions) and elicit consistent patterns of error with diagnostic value. As such, the test and Instructor’s Pack should prove useful in assisting teachers of statistics to the diverse groups of undergraduates typical of UK psychology departments. Our findings will be discussed in detail in the Instructor’s Pack, along with their implications for statistics teaching.
Outcome 3.
A publication and conference presentations on the results of this national survey.
A paper summarising the outcomes of the research has been submitted to Psychology Learning and Teaching and a second paper presenting more detailed analysis of students’ errors is in preparation. The research has been presented at The Annual Conference of the British Psychological Society (April 2004) and the Annual Conference of the Northern Ireland Branch of the British Psychological Society (April 2004). One further conference presentation is planned.
Outcome 4.
A pack containing the test items and diagnostic advice for teachers.
The instructor pack aims to allow teachers in psychology departments to better understand the needs of their students with respect to the skills required for effective learning of statistics and research methods. The pack contains the following:

• original test used in the research;
• test answers and scoring details;
• a description of sub-components;
• details of internal reliabilities of scales;
• annotated research findings including implications for statistics teaching.
References

Brown, M., Askew, M., Baker, D., Denvir, H. & Millett, A. (1998). Is the National Numeracy Strategy research-based? British Journal of Educational Studies, 46, 362-385.
Cartright, M. (1996). Numeracy needs of the beginning registered nurse. Nurse Education Today, 16, 137-143.
Hutton, B.M. (1998). Do school qualifications predict competence in nursing calculations? Nurse Education Today, 18, 25-31.
Mulhern, G. & Wylie, J. (2004). Changing levels of numeracy and other core mathematical skills among psychology undergraduates between 1992 and 2002. British Journal of Psychology, 95, 355-370.
Phoenix, D. (1999). Numeracy and the life scientist. Journal of Biological Education, 34, 3-4.
Radford, J. & Holdstock, L. (1996). The growth of psychology. The Psychologist, 9, 548-550.
Tutors pack
Section 1 The test
1     Write a fraction in the box to complete the statement.

      6.28 = 6 x 1 + 2 x [    ] + 8 x 1/100

2     If the petrol tank of a car holds 5.5 gallons, how much is this in litres? Choose (tick) one of the following answers. (One litre equals 0.22 gallons).

      5.5 + 0.22 ____   5.5 ÷ 0.22 ____   5.5 – 0.22 ____   0.22 ÷ 5.5 ____   5.5 x 0.22 ____

(3)   Suppose we define a * b to mean a + 2b. Is it true that:

3.1   a * b = b * a
      Always ____   Never ____   Sometimes ____

3.2   a * (b * c) = (a * b) * c
      Always ____   Never ____   Sometimes ____

(4)   Calculate each of the following:

4.1   √0.09 ______

4.2   0.02 x 0.12 ______

4.3   40 ÷ 0.8 ______

5     Forty fish were caught from a pond; each one was marked and thrown back into the pond. On another day 60 fish were caught from the same pond and there were four marked fish among them. Estimate the total number of fish in the pond.

      Answer ________
6.1   Complete the following: 6 – (-3) = ______

6.2   If –5 – 2x = 1, what is x? ______

7     You can see the height of Mr Tiny measured with paper clips. Mr Tiny has a friend, Mrs Large. When we measure their height with matchsticks, Mr Tiny’s height is 4 matchsticks and Mrs Large’s is 6 matchsticks. How many paper clips are needed for Mrs Large’s height?

      [Figure: Mr Tiny’s height measured with paper clips]

      Answer ______

8     Arrange the following in order of size, starting with the smallest:

      0.25   .0099   2/3   1/50   3/200

(9)   a and b are two numbers. If a and b are both doubled, what effect will this have on each of the following?

9.1   (a + b) / (a – b)

9.2   a² + b²

9.3   1 / (2a + b)
10    Jane planted a flower in her garden and measured it every week. This is a graph of its growth:

      [Graph: height in cm (0 to 30) plotted for each week ending 1 May, 8 May, 15 May, 22 May, 29 May, 5 June, 12 June, 19 June, 26 June and 2 July]

      During which week did it grow fastest? The week ending _________

11.1  Tick whichever of the following is closest to 7.416

      7.426 ___   7.42 ___   7.411 ___   7.41 ___   7.516 ___

11.2  Fill in the next two numbers in this sequence:

      7.76   7.80   7.84   7.88   7.92   _____   _____

12    These two blocks are the same shape but different sizes. If the measurements of block A are all 0.75 the lengths of those for block B, and block B is 14cm tall, how tall is block A in cm?

      [Figure: blocks A and B]

      14 + 0.75 ____   14 x 0.75 ____   14 ÷ 0.75 ____   14 – 0.75 ____

13    x + y = 16
      What is 100 – x – y ? _______
      If there isn’t enough information, tick here ____

(14)  Estimate (do NOT attempt to calculate) the following:

14.1  (85.63 – 1.2384) / (101.46 – 97.88) ____________

14.2  (5.6832 x 0.623) / 0.07689 ____________

15    A game of squash can be played either to 9 or 15 points. Holding all other rules of the game constant, if player A is better than player B, which scoring system will give player A a higher probability of winning? (tick one)

      9 point game ____   15 point game ____   Doesn’t make any difference ____

16    If a = 3, b = -2 and c = 7
      What is the value of 3b² – abc ? __________
(17)  These two letters are the same shape, but one is larger. AC is 4 units long, RT is 6 units.

      [Figure: two similar letter shapes, the smaller labelled A, B, C, D, E and the larger labelled R, S, T, U, V]

17.1  AB is 7 units. How long is RS? ______

17.2  UV is 15 units. How long is DE? ______

(18)  Read the following scales and write your answers in the boxes provided. Give all your answers as decimals.

      [Figure: four scales, labelled 18.1 to 18.4, each with a point to be read; the surviving tick labels are 8, 9, 2, 2.1, 3, 4, 5 and 6]

19    Which of the following is NOT equal to any of the other three? (tick one)

      a – b + c ____
      (a – b) + c ____
      a – (b + c) ____
      a + (c – b) ____

20    The following graph represents the speed of a racing car around a complete lap of a racing circuit.

      [Graph: speed plotted against distance]

      Tick whichever of the following is most likely to be a map of the circuit:

      [Figure: candidate circuit maps]
Section 2 Description of sub-components and details of internal reliabilities
Sub-components

The test was originally compiled by Greer & Semrau (1984), although their study reported performance on a small number of items. The test comprises 20 questions, some of which are multi-part, giving a total of 32 items. We have classified these items into six broad components of mathematical reasoning relevant to the learning and teaching of statistics. It is important to point out that these sub-components were not empirically derived. Instead, they were identified using existing descriptions provided by Greer & Semrau (1984) as well as our own agreed judgements of the principal mathematical concept underlying each item. The components are:

Calculation involving decimals and fractions (9 items) - Questions 1, 2, 4, 8, 11, 12;
Algebraic reasoning (10 items) - Questions 3, 6, 9, 13, 16, 19;
Graphical interpretation (6 items) - Questions 10, 18, 20;
Proportionality and ratio (3 items) - Questions 7, 17;
Probability and sampling (2 items) - Questions 5, 15;
Estimation (2 items) - Question 14.
Internal reliabilities Note: Since the magnitude of the reliability coefficient is a function, among other things, of the number of items in a scale, in order to permit direct comparison between components, α-values were adjusted to 10 item scales (equal to the number of items in algebraic reasoning) using the Spearman-Brown formula.
Alpha reliability coefficients for the entire sample were calculated for the scale as a whole and for the six components. Internal consistency of the total test was found to be high (α = .83). High consistencies were also found for proportionality and ratio (.87) and graphical interpretation (.86), and moderately high coefficients for estimation (.68), algebraic reasoning (.62), calculation (.59) and probability and sampling (.56). Given the fact that these components were not empirically derived, the reliabilities are reassuringly high. Inevitably though, results, particularly marginal differences between individuals and groups, should be treated with some caution.
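As a rough illustration of the length adjustment described in the note above, the Spearman-Brown formula can be sketched in a few lines of Python. The function name and the raw coefficient of .30 are hypothetical and chosen only for illustration; the unadjusted α values for the six components are not reported in this pack.

```python
def spearman_brown(alpha: float, n_items: int, target_items: int = 10) -> float:
    """Predict the reliability of a scale of n_items if its length were
    changed to target_items (Spearman-Brown prophecy formula)."""
    k = target_items / n_items  # factor by which the scale length changes
    return (k * alpha) / (1 + (k - 1) * alpha)


# Hypothetical example: a 2-item scale with a raw alpha of .30,
# rescaled to the 10-item length used for comparison between components.
adjusted = spearman_brown(0.30, n_items=2)
print(round(adjusted, 2))
```

A scale that already has 10 items is left unchanged by the adjustment, since the length factor k then equals 1.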
Section 3 Test answers and scoring instructions
Answers to test items

1      1/10
2      5.5 ÷ 0.22
3.1    sometimes
3.2    sometimes
4.1    .3
4.2    .0024
4.3    50
5      600
6.1    9
6.2    -3
7      9
8      .0099, 3/200, 1/50, .25, 2/3
9.1    none
9.2    x4
9.3    x .5
10     5 June
11.1   7.42
11.2   7.96, 8.0
12     14 x 0.75
13     84
14.1   all answers between 15 and 35 accepted
14.2   all answers between 30 and 70 accepted
15     15 point game
16     54
17.1   10.5
17.2   10
18.1   8.05
18.2   2.03
18.3   2.8
18.4   5.4
19     a – (b + c)
20     middle shape (rounded triangle)
Scoring the test
Award one mark for each correct answer. This means that each individual can obtain a total test score anywhere between 0 and 32.

You may also work out a score for each individual for each of the six mathematical components. To do this, add up the number of correct responses for the items relating to the component in question (see below), then divide by the number of items in that component. This will give six component scores between 0 and 1.

Calculation (9 items) - 1, 2, 4.1, 4.2, 4.3, 8, 11.1, 11.2, 12
Algebraic reasoning (10 items) - 3.1, 3.2, 6.1, 6.2, 9.1, 9.2, 9.3, 13, 16, 19
Graphical interpretation (6 items) - 10, 18.1, 18.2, 18.3, 18.4, 20
Proportionality and ratio (3 items) - 7, 17.1, 17.2
Probability and sampling (2 items) - 5, 15
Estimation (2 items) - 14.1, 14.2

You can also obtain an unweighted total score for the test by adding together the six component means. This gives a score between 0 and 6 which avoids a disproportionate contribution to the total score from components with a greater number of items. In other words, every sub-component contributes equally to the total score.
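For tutors who record responses electronically, the scoring rules above might be sketched as follows. This is only an illustrative sketch: the function name, the dictionary layout and the example student are assumptions, while the item-to-component mapping is the one listed above.

```python
# Item labels grouped by component, as listed in the scoring instructions.
COMPONENTS = {
    "Calculation": ["1", "2", "4.1", "4.2", "4.3", "8", "11.1", "11.2", "12"],
    "Algebraic reasoning": ["3.1", "3.2", "6.1", "6.2", "9.1", "9.2", "9.3",
                            "13", "16", "19"],
    "Graphical interpretation": ["10", "18.1", "18.2", "18.3", "18.4", "20"],
    "Proportionality and ratio": ["7", "17.1", "17.2"],
    "Probability and sampling": ["5", "15"],
    "Estimation": ["14.1", "14.2"],
}


def score_test(correct_items):
    """Return (raw total, component scores, unweighted total) for one student.

    correct_items is the set of item labels the student answered correctly.
    """
    component_scores = {}
    raw_total = 0
    for name, items in COMPONENTS.items():
        n_correct = sum(1 for item in items if item in correct_items)
        raw_total += n_correct                            # one mark per item, 0 to 32
        component_scores[name] = n_correct / len(items)   # proportion, 0 to 1
    unweighted_total = sum(component_scores.values())     # 0 to 6
    return raw_total, component_scores, unweighted_total


# Hypothetical student who answered only items 1, 2, 5 and 14.1 correctly.
raw, by_component, unweighted = score_test({"1", "2", "5", "14.1"})
print(raw, by_component, round(unweighted, 2))
```

In this hypothetical case the two calculation marks contribute less to the unweighted total than the single marks for probability and sampling and for estimation, which shows how the unweighted total gives every component equal weight.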
Section 4 Results, diagnostic advice and implications for statistics teaching
Results

Here we report some of the main findings of this HEA-funded project (see also Mulhern & Wylie, 2004 for details of a further study comparing two cohorts of psychology undergraduates entering university ten years apart).

We administered this test to 890 undergraduate psychology students in eight universities (173 males, 676 females, 41 did not specify). The universities represented a broad range of UK psychology departments, including pre- (N = 656) and post-1992 (N = 234) institutions from England & Wales, Scotland and Northern Ireland. We identified marked deficiencies in mathematical reasoning among psychology undergraduates in all institutions, although there were marked differences between institutions in terms of overall performance. Performance was particularly poor for probability and sampling, estimation and, to a lesser extent, proportionality and ratio and calculation. Significant gender differences were also apparent across all institutions.

Total score by institution, gender and qualification

The mean total correct for the entire sample was 13.75 (43.0%), with individual scores ranging from 0 (0.0%) to 29 (90.6%) and males outperforming females (m = 15.53, f = 13.24). Figure 1 shows mean total correct scores by institution and gender. Figure 2 presents performance against highest mathematics qualification. In terms of current UK school-based qualifications only (i.e. ignoring O-level, access courses and European qualifications), a marked linear trend is apparent.
Figure 1: Mean total correct by Institution and Gender
[Axes: mean total correct (0 to 32) against Institution (A to H), with separate series for male and female.]

Figure 2: Mean total correct by Qualification and Gender
[Axes: mean total correct (0 to 32) against highest mathematics qualification (A-level grade A/B, A-level grade C-E, GCSE Additional Maths, GCSE grade A, GCSE grade B/C, GCSE grade D/E, O-level, Access course, European), with separate series for male and female.]
Component scores by institution and gender

Figure 3 shows the overall proportion of correct answers to be low for probability and sampling, algebraic reasoning and estimation. Indeed, with the exception of graphical interpretation, the proportion correct was around .4 or less for all components.

Figure 3: Mean proportion correct for each component by Institution and Gender
[Six panels: Calculation; Algebraic reasoning; Graphical interpretation; Probability and sampling; Proportionality and ratio; Estimation. Each panel plots mean proportion correct (0.0 to 1.0) against Institution (A to H), with separate series for male and female.]
Diagnostic advice & implications for teaching statistics

Below, we present the most common errors exhibited by individuals along with guidelines regarding associated misconceptions. Of the 32 items in the test, 23 were found to discriminate between individuals and/or to have diagnostic value. Correct responses (shaded) and frequently occurring incorrect responses are presented. For items requiring open-ended answers, only the most common or diagnostically revealing responses are presented, with all other incorrect responses included in the category ‘other’.

CALCULATION

Item 2: If the petrol tank of a car holds 5.5 gallons, how much is this in litres? (One litre equals 0.22 gallons)

  Options                  %
  5.5 ÷ 0.22 (correct)     62
  5.5 x 0.22               28
  0.22 ÷ 5.5               2
  5.5 – 0.22               1
  5.5 + 0.22               0
  no answer                7
Diagnostic comment Here, more than a quarter of students chose multiplication rather than division. One likely source of error is the commonly held implicit assumption that “multiplication makes bigger” and “division makes smaller”. Not so, of course, when multiplying or dividing by numbers less than 1. Here most students will know instinctively that the number of litres will be larger than the number of gallons and this may guide the erroneous choice of operation.
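A short worked check may help when discussing this item with students. The sketch below uses Python's decimal module (to avoid floating-point rounding noise) to compare the correct operation with the most common wrong one; the variable names are illustrative only.

```python
from decimal import Decimal

gallons = Decimal("5.5")
gallons_per_litre = Decimal("0.22")  # one litre equals 0.22 gallons

# Correct operation: each litre is a fraction of a gallon, so divide.
litres = gallons / gallons_per_litre

# Most common error: multiplying instead of dividing.
wrong_answer = gallons * gallons_per_litre

print(litres)        # larger than 5.5: dividing by a number below 1 makes bigger
print(wrong_answer)  # smaller than 5.5: multiplying by a number below 1 makes smaller
```

Asking students to predict, before calculating, whether the answer should be larger or smaller than 5.5 gives them a quick way to reject the result of the wrong operation.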
Item 4: Calculate each of the following:

4.1 √0.09

  Answers                  %
  .03                      43
  .3 (correct)             17
  .81                      2
  other                    12
  no answer                26

4.2 0.02 x 0.12

  Answers                  %
  .0024 (correct)          28
  .24                      27
  .024                     14
  .06                      3
  other                    12
  no answer                15

4.3 40 ÷ 0.8

  Answers                  %
  50 (correct)             43
  .5                       10
  320                      3
  .1                       2
  other                    24
  no answer                19
Diagnostic comment Among other things, the above three items assess students’ basic knowledge of the manipulation of quantities involving decimals. It is striking that only 17% of the entire sample could correctly report √ 0.09 with almost half giving the answer .03. Errors may be greatly reduced by encouraging students to square their answer and check that the result comes to .09. Detailed treatment of this and similar examples may also prove pedagogically effective. Item 4.2 is also worrying, in that three quarters of students simply don’t know the convention for multiplying two numbers each with two decimal places. The response .06 suggests that some students (26 in fact) were at a complete loss by resorting to dividing .12 by 2. Such random manipulations are common features of individuals lacking any conceptual knowledge. Similar comments may be made about item 4.3.
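The checking strategy suggested above (square the answer and see whether it comes to .09) can be demonstrated with a minimal sketch. The helper name is hypothetical, and Python's decimal module is used so that the arithmetic is exact.

```python
from decimal import Decimal


def is_square_root(candidate: str, target: str) -> bool:
    """Check a proposed square root by squaring it, as a student could by hand."""
    value = Decimal(candidate)
    return value * value == Decimal(target)


# The three answers most often given for the square root of 0.09.
for answer in [".03", ".3", ".81"]:
    print(answer, is_square_root(answer, "0.09"))

# Item 4.2: the product has as many decimal places as the two factors combined.
product = Decimal("0.02") * Decimal("0.12")
print(product)
```

Squaring .03 gives .0009 and squaring .81 gives .6561, so only .3 survives the check. For item 4.2, two factors with two decimal places each give a product with four decimal places.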
Item 8: Arrange the following in order of size, starting with the smallest: 0.25, .0099, 2/3, 1/50, 3/200

  Answers                  %
  incorrect                50
  correct                  44
  no answer                6
Diagnostic comment The most common error by far involved choosing 3/200 as the smallest number, indicating that students had difficulty converting .0099 to a fraction or 3/200 to a decimal for comparison.
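The comparison that students found difficult can be made explicit by putting all five quantities on a common footing. The sketch below uses Python's fractions module, which accepts both decimal and fraction strings; it is offered only as an illustration of the conversion step.

```python
from fractions import Fraction

values = ["0.25", ".0099", "2/3", "1/50", "3/200"]

# Convert each value to an exact fraction and sort from smallest to largest.
ordered = sorted(values, key=Fraction)
print(ordered)

# The two values most often confused, written as decimals for comparison.
print(float(Fraction("3/200")), float(Fraction(".0099")))
```

Written as decimals, 3/200 is .015 and 1/50 is .02, so .0099 is the smallest of the five values.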
Item 12
These two blocks are the same shape but different sizes. If the measurements of block A are all 0.75 the lengths of those for block B, and block B is 14cm tall, how tall is block A in cm?
[Figure: two blocks of the same shape, labelled A and B]
Options        %
14 x 0.75      60
14 ÷ 0.75      22
14 – 0.75      6
14 + 0.75      2
no answer      10
Diagnostic comment This problem produced a similar pattern of results to item 2 and, again, this is likely to be due, at least in part, to the commonly held implicit assumption that “multiplication makes bigger” and “division makes smaller”. The picture of the two blocks did not improve students’ performance on this item compared to item 2.
Implications for teaching statistics – Calculation The need for students to perform basic calculations accurately and to feel comfortable with numerical quantities is self-evidently an important prerequisite for handling quantitative data. Results for the above items suggest that such skills among undergraduate psychology students cannot be relied upon. It may be reasonable to assume that many students will survive by relying on the number-crunching of statistical packages. However, it is surely desirable that these students should be capable of summarising and exploring their data by hand if required. One is reminded of the old chestnut of what to do in the event of a power failure! It is certainly important that students have sufficient calculative competence to realise that multiplication of a quantity by a value between 0 and 1 makes it smaller, and division bigger (see items 2, 4.3 and 12). More generally, concepts of place value and conventions regarding decimal places and fractions in calculation are essential (see items 4.2, 4.3 and 8). Perhaps the most worrying result from the point of view of learning statistics is the inability of students to give the square root of .09, given the importance of r and r² in correlational statistics in psychology. If, for example, a student were told that the proportion of variance in a set of scores explained by a particular variable was .09, or 9%, it would be a matter of extreme concern if that student were to conclude that the corresponding correlation between the two variables was .03 or, worse, .81 (see item 4.1).
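The point about multiplying and dividing by a value between 0 and 1 can be illustrated with the numbers from items 2, 4.3 and 12. The following is a minimal sketch, with a hypothetical helper `scale_effects` that is not part of the test materials.

```python
from decimal import Decimal

def scale_effects(quantity: str, factor: str) -> tuple[Decimal, Decimal]:
    """Return (quantity x factor, quantity / factor) using decimal arithmetic."""
    q, f = Decimal(quantity), Decimal(factor)
    return q * f, q / f

# Item 2: 5.5 gallons at 0.22 gallons per litre (division is the correct choice).
litres_wrong, litres = scale_effects("5.5", "0.22")

# Item 12: block A is 0.75 times the height of block B (multiplication is correct).
block_a, block_a_wrong = scale_effects("14", "0.75")

# Item 4.3: 40 divided by 0.8.
_, item_4_3 = scale_effects("40", "0.8")

print(litres, block_a, item_4_3)
```

In each case the factor lies between 0 and 1, so the product is smaller than the starting quantity and the quotient is larger.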
ALGEBRAIC REASONING Item 3
Suppose we define a * b to mean a + 2b. Is it true that:
3.1
a * b = b * a
Options      %
always       46
never        30
sometimes    18
no answer    6
Diagnostic comment This item is concerned with the commutativity of operations. Unpicked, the problem becomes: is it true that a + 2b = b + 2a? This reduces to a = b. Thus the correct answer is ‘sometimes’, specifically when a = b.
3.2
a * (b * c) = (a * b) * c
Options      %
never        48
sometimes    22
always       17
no answer    14
Diagnostic comment This item deals with the associative nature of operations as well as parenthetical priorities. Unpicked, the problem becomes: a * (b + 2c) = (a + 2b) * c, and thus, a + 2b + 4c = a + 2b + 2c. This will be true when c = 0, so the answer is again ‘sometimes’. In practice, many students will attempt this problem inductively by plugging in a few examples, which is fine, but will not always allow students to come to a conclusion. Some 48% of students either guessed ‘never’, or concluded ‘never’ through failing to find an example that obeyed the rule.
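The inductive strategy described above, plugging in examples, can be carried out systematically. The sketch below tries every combination of small integers; the function name `star` and the range of values are illustrative choices, and a finite search of this kind supports the algebraic argument rather than replacing it.

```python
from itertools import product

def star(a, b):
    """The operation from item 3: a * b is defined to mean a + 2b."""
    return a + 2 * b

values = range(-3, 4)  # small integers from -3 to 3

# Item 3.1: pairs for which a * b equals b * a.
commutes = [(a, b) for a, b in product(values, repeat=2)
            if star(a, b) == star(b, a)]

# Item 3.2: triples for which a * (b * c) equals (a * b) * c.
associates = [(a, b, c) for a, b, c in product(values, repeat=3)
              if star(a, star(b, c)) == star(star(a, b), c)]

print(all(a == b for a, b in commutes))        # True: holds only when a = b
print(all(c == 0 for _, _, c in associates))   # True: holds only when c = 0
```

A student who happens to try only examples with c ≠ 0 will find no case that obeys the rule in item 3.2, which is consistent with the large number of ‘never’ responses.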
Item 6.2
If –5 – 2x = 1, what is x?
Answer       %
-3           52
3            16
2            5
other        14
no answer    12
Diagnostic comment Ultimately, this problem reveals an inability in half the sample to rearrange a relatively simple algebraic expression, particularly in respect of the convention of changing sign when elements move to the opposite side of the equals sign, as evidenced by the response ‘3’ from approximately 90 students.
Item 9
a and b are two numbers. If a and b are both doubled, what effect will this have on each of the following?
9.1
(a + b)/(a – b)
Answer             %
no effect          35
doubling effect    28
other              11
no answer          26
9.2
a² + b²
Answer             %
doubling effect    34
increase           14
x4                 9
other              16
no answer          28
9.3
1/(2a + b)
Answer             %
doubling effect    33
decrease           13
halving            11
other              11
no answer          32
Diagnostic comment These problems concern the invariance or otherwise of algebraic expressions under an operation such as doubling of the elements. In item 9.1, only one third of students appeared able to deduce that doubling a and b gives (2a + 2b)/(2a – 2b), which may be written as 2(a + b)/[2(a – b)] and thus reduces to (a + b)/(a – b). In item 9.2, the important rule that doubling an element results in a quadrupling of the square of the element is known to fewer than 10% of students. In item 9.3, only 11% realised that 1/(4a + 2b) = 1/[2(2a + b)], which is ½ times the original 1/(2a + b).
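The three results can be checked numerically by comparing each expression before and after doubling. This is a sketch under stated assumptions: the helper `doubling_factor` and the trial values of a and b are illustrative and do not come from the test.

```python
from fractions import Fraction

def doubling_factor(expr, a, b):
    """Return expr(2a, 2b) divided by expr(a, b), as an exact fraction."""
    return Fraction(expr(2 * a, 2 * b)) / Fraction(expr(a, b))

expressions = {
    "9.1 (a + b)/(a - b)": lambda a, b: Fraction(a + b, a - b),
    "9.2 a^2 + b^2": lambda a, b: a**2 + b**2,
    "9.3 1/(2a + b)": lambda a, b: Fraction(1, 2 * a + b),
}

# Arbitrary trial values; any pair avoiding a zero numerator or denominator will do.
for a, b in [(5, 3), (7, 2), (10, -4)]:
    for label, expr in expressions.items():
        factor = doubling_factor(expr, a, b)
        print(f"a={a}, b={b}: {label} changes by a factor of {factor}")
```

For every trial pair the factors are 1, 4 and 1/2 respectively, matching ‘no effect’, ‘x4’ and ‘halving’.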
Item 13
x + y = 16. What is 100 – x – y? If there isn’t enough information, tick here
Answer                    %
not enough information    47
84                        43
other                     4
no answer                 5
Diagnostic comment Here almost half of students believed that there was insufficient information to answer the question. The most likely explanation is that they were pre-occupied with the notion that “two unknowns require two equations to solve”. These students failed to realise that 100 – x – y = 100 – (x + y) which therefore equals 100 – 16.
Item 16
If a = 3, b = -2 and c = 7 what is the value of 3b² – abc?
Answer       %
54           35
-30          6
30           5
other        35
no answer    18
Diagnostic comment This problem involves the solution of a relatively straightforward algebraic expression, although knowledge of several important conventions was required, namely: (1) that squaring a negative number results in a positive number; (2) that 3b² does not equal (3b)²; (3) that multiplying two positive and one negative number gives a negative product; (4) that when subtracting a negative number you add. The 6% of students who answered ‘-30’ had misapplied (4), those who answered ‘30’ had misapplied (1), and the myriad ‘other’ responses indicate that students had misapplied various combinations of (1) – (4).
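The correct evaluation and the slips identified above can be written out step by step. In the sketch below the variable names are illustrative. The value 78 for a slip on convention (2) is shown only as an example and is not a response reported in the table.

```python
a, b, c = 3, -2, 7

# Correct evaluation: 3b² = 3 x 4 = 12, abc = -42, and 12 - (-42) = 54.
correct = 3 * b**2 - a * b * c

# Slip on (4): subtracting the size of abc instead of adding it back.
slip_subtraction = 3 * b**2 - abs(a * b * c)

# Slip on (1): treating the square of -2 as negative.
slip_square_sign = 3 * -(b**2) - a * b * c

# Slip on (2): reading 3b² as (3b)². Illustrative only.
slip_bracket = (3 * b) ** 2 - a * b * c

print(correct, slip_subtraction, slip_square_sign, slip_bracket)  # 54 -30 30 78
```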
Item 19
Which of the following is NOT equal to any of the other three?
a – b + c     (a – b) + c     a – (b + c)     a + (c – b)
Options        %
a + (c – b)    39
a – (b + c)    35
a – b + c      7
(a – b) + c    4
no answer      15
Diagnostic comment This problem taps students’ knowledge of how to remove brackets from algebraic expressions. Almost 40% appeared to be influenced by the salience of the reordering of the a, b, c sequence in a + (c – b) while failing to realise that, expanded, this expression is a + c – b which is of course equal to a – b + c, while a – (b + c) expanded is a – b – c and is hence the odd one out.
Implications for teaching statistics – Algebraic Reasoning We would not wish to overstate the importance of algebraic reasoning as a basis for learning statistics – certainly, many recent textbooks have attempted to minimise formal algebraic content in favour of more broadly conceptual, informal approaches. While this is a welcome development, there is still the need for students to possess a basic appreciation of ideas expressed in algebraic form. They should certainly be able to plug values into a relatively straightforward formula, including knowing how to treat negative numbers, and knowing the computational priorities for the different arithmetical operators, squaring, brackets, etc. (see items 3.2, 16). They should also be able to rearrange a simple linear expression in order to solve for an unknown (see item 6.2). One of the most important aspects of algebraic reasoning relating to the learning of statistics is covered in item 9, that is, the ability to recognise the impact on an algebraic expression of transforming or rescaling variables, even if that only means doubling a set of scores. This type of reasoning is crucial in allowing an individual to reason about the impact of, for example, changing the units of a measure (e.g. feet to metres, or degrees centigrade to degrees Fahrenheit) on measures such as mean, standard deviation, correlation coefficient, t-test, etc. Some will remain invariant under such rescaling, while others will not.
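The effect of a change of units on common statistics can be demonstrated directly. The following sketch converts a set of temperatures from centigrade to Fahrenheit (F = 1.8C + 32) and compares the mean, standard deviation and correlation before and after. The data are invented purely for illustration, and `pearson_r` is a hand-written helper rather than a library function.

```python
from math import isclose, sqrt
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from sums of squares."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: room temperature and a task score for five participants.
celsius = [12.0, 15.5, 19.0, 22.5, 30.0]
scores = [41, 38, 45, 50, 58]

fahrenheit = [1.8 * c + 32 for c in celsius]

# The mean is shifted and rescaled; the standard deviation is only rescaled.
print(mean(celsius), mean(fahrenheit))
print(stdev(celsius), stdev(fahrenheit))

# The correlation with the scores is unchanged by the change of units.
print(pearson_r(celsius, scores), pearson_r(fahrenheit, scores))
```

Asking students to predict each of the three comparisons before running the numbers is one way of using an example like this in class.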
GRAPHICAL INTERPRETATION
Item 20
The following graph represents the speed of a racing car around a complete lap of a racing circuit.
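[Graph: speed (vertical axis) plotted against distance (horizontal axis)]
Choose whichever of the following is most likely to be a map of the circuit:
[Figure: three candidate circuit maps, labelled 1, 2 and 3]
Options      %
2            46
3            32
1            11
no answer    11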
Diagnostic comment This example requires students to reason about the information contained in the graph, specifically, that the car speeds up and slows down in a regular manner three times as it completes the circuit. Fewer than half of the students succeeded in mapping the speed/distance information to the correct circuit shape. Students choosing option 1 are dealing with the graphical representation entirely superficially and this suggests a profound lack of understanding of graphical representation – 98 students in our sample chose this option. Pedagogically, it may be helpful to use this example as a vehicle for encouraging students to talk through their reasoning and to make explicit correspondences between the graphical information and the three circuit shapes.
Implications for teaching statistics – Graphical Interpretation The ability of students to make sense of graphical representations of quantitative events is pivotal to learning statistics – arguably the most important skill of all, particularly with the increasing trend towards greater emphasis on descriptive statistics and exploratory data analysis techniques, even in published research. Tutors should seek to maximise students’ interactions with all sorts of graphical techniques, including box plots, stem-and-leaf plots (including back-to-back), dot plots, pie charts, bar charts, histograms, frequency polygons, scatterplots and line graphs. Most importantly, tutors should present the same data in as many different appropriate forms as possible and encourage students to extract the relevant information from each, noting correspondences between, and strengths and weaknesses of, each graphical form.
PROPORTIONALITY AND RATIO Item 7
You can see the height of Mr Tiny measured with paper clips. Mr Tiny has a friend, Mrs Large. When we measure their height with matchsticks, Mr Tiny’s height is 4 matchsticks and Mrs Large’s is 6 matchsticks. How many paper clips are needed for Mrs Large’s height?
Answer       %
9            66
8            15
other        13
no answer    6
Diagnostic comment The answer ‘8’ indicates a worrying flaw in students’ concept of ratio and proportion (134 students gave this answer). This indicates an additive rather than a multiplicative concept of proportionality, i.e. since, for Mr Tiny, the number of paperclips (6) is 2 greater than the number of matchsticks (4), for Mrs Large, the number of paperclips should also be 2 greater than the number of matchsticks (6).
Item 17
These two letters are the same shape, but one is larger. AC is 4 units long, RT is 6 units.
[Figure: two similar letter shapes, the smaller with points labelled A, B, C, D, E and the larger with points labelled R, S, T, U, V]
17.1
AB is 7 units. How long is RS?
Answer       %
10.5         42
9            26
other        14
no answer    19
Diagnostic comment Of those giving an incorrect response, almost three quarters suggested ‘9’, an error which is similar to that made for item 7. Instead of recognising that they should have been making a proportional calculation, students focused on the fact that RT was 2 units greater than AC and so, RS was deemed to be 2 units bigger than AB.
17.2
UV is 15 units. How long is DE?
Answer       %
10           36
13           18
22.5         9
other        16
no answer    21
Diagnostic comment Approximately a third of students were able to perform the appropriate proportional reduction to obtain the answer ‘10’. Some 80 students performed a proportional calculation, but increased 15 by 50%, rather than decreasing 15 by a third. This suggests that incomplete mastery of the concept places a load on working memory, reducing the attentional capacity available for the precise question being asked.
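The difference between multiplicative and additive reasoning in items 7 and 17 comes down to a single scale factor. A minimal sketch follows, using exact fractions and illustrative variable names.

```python
from fractions import Fraction

# Item 17: the larger letter is RT / AC = 6/4 times the size of the smaller one.
scale = Fraction(6, 4)

rs = 7 * scale                    # 17.1: scale AB up to get RS
de = 15 / scale                   # 17.2: scale UV down to get DE

additive_rs = 7 + (6 - 4)         # additive slip: "RT is 2 bigger, so add 2"
wrong_direction_de = 15 * scale   # proportional, but scaled the wrong way

# Item 7: Mr Tiny is 6 paper clips or 4 matchsticks tall; Mrs Large is 6 matchsticks.
paperclips = 6 * Fraction(6, 4)

print(float(rs), de, additive_rs, float(wrong_direction_de), paperclips)
```

The additive slip reproduces the common wrong answer ‘9’ and scaling in the wrong direction reproduces ‘22.5’.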
Implications for teaching statistics – Proportionality and Ratio While proportionality and ratio are assessed here through comparison of physical objects, more typically, students would be required to interpret less concrete comparisons of participant performance (e.g. a percentage increase or decrease across treatments, as compared to an absolute increase or decrease), sample sizes, effect sizes, and so on. Also, judgements about the ordinal, interval or ratio properties of a measure rely on an individual’s ability to reason about ratio (e.g. in judging whether it may be valid to assert that 4 units of a variable X is half of 8 units and twice 2 units of the variable).
PROBABILITY AND SAMPLING Item 15
A game of squash can be played either to 9 or 15 points. Holding all other rules of the game constant, if player A is better than player B, which scoring system will give player A a higher probability of winning?
Options          %
no difference    54
9 point game     19
15 point game    19
no answer        8
Diagnostic comment Only 19% of our sample appeared to be aware of the fact that the 15-point game is likely to favour the better player. This goes to the heart of the issue of the effect of sample size on sampling error. Talking students through this example is likely to prove useful in consolidating these ideas.
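When talking students through this example, it may help to put numbers on it. The sketch below rests on a deliberately simplified model that is an assumption, not part of the test item: every point is won by player A independently with the same probability p, and serving rules and tie-break conventions are ignored. The function name `p_win_race` is hypothetical.

```python
from math import comb

def p_win_race(p: float, target: int) -> float:
    """Probability that a player who wins each point with probability p
    is the first to reach `target` points (simplified model, see above)."""
    q = 1 - p
    # The player wins target points while the opponent has k < target points;
    # the final point must be the player's own.
    return sum(comb(target - 1 + k, k) * p**target * q**k for k in range(target))

p = 0.55  # illustrative value: player A is slightly better on each point
print(f"9-point game:  {p_win_race(p, 9):.3f}")
print(f"15-point game: {p_win_race(p, 15):.3f}")
```

Under this model the longer game gives the better player the higher chance of winning, in the same way that a larger sample gives a more reliable estimate.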
Item 5
Forty fish were caught from a pond; each one was marked and thrown back into the pond. On another day 60 fish were caught from the same pond and there were four marked fish among them. Estimate the total number of fish in the pond.
Answer       %
96           27
600          18
100          6
240          4
400          3
other        25
no answer    18
Diagnostic comment Only 18% of students were able to apply the concept of sampling to this problem. 240 students produced the answer ‘96’. These individuals will have reasoned that, in addition to the 40 marked fish in the pond, four of which were caught on the second day, there were 56 unmarked fish, giving a total of 96. This in fact is the minimum possible number of fish in the pond.
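The sampling argument behind the answer ‘600’ is that the proportion of marked fish in the second catch (4 out of 60) should roughly match the proportion of marked fish in the whole pond (40 out of N). This is often called the Lincoln-Petersen estimate, a name not used in the test itself. A minimal sketch with illustrative function names:

```python
def capture_recapture_estimate(marked_first: int, caught_second: int,
                               marked_in_second: int) -> float:
    """Estimate population size N from marked_in_second / caught_second = marked_first / N."""
    return marked_first * caught_second / marked_in_second

def minimum_population(marked_first: int, caught_second: int,
                       marked_in_second: int) -> int:
    """Smallest possible population: all marked fish plus the unmarked fish seen."""
    return marked_first + (caught_second - marked_in_second)

print(capture_recapture_estimate(40, 60, 4))   # 600.0
print(minimum_population(40, 60, 4))           # 96
```

The second function reproduces the reasoning behind the most common response, which counts only the fish actually seen.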
Implications for teaching statistics – Probability and Sampling Given the overwhelmingly inferential nature of psychological statistics, it is particularly alarming to find that so few students appear able to reason probabilistically or to understand the relationship between samples and populations. Such reasoning goes to the very core of statistical understanding, so these topics must be at the top of our list of priorities as educators.
ESTIMATION Item 14
Estimate (do not attempt to calculate) the following:
14.1
(85.63 – 1.2384) / (101.46 – 97.88)
Answers                            %
within acceptable range (15-35)    51
outside range                      32
no answer                          17
14.2
(5.6832 x 0.623) / 0.07689
Answers                            %
outside range                      62
no answer                          29
within acceptable range (30-70)    9
Diagnostic comment In spite of quite generous ranges for acceptable answers, fewer than 10% of students were able to provide a reasonable estimate for item 14.2. Performance on item 14.1 was much better, although half of the sample were still unable to provide an estimate based on two subtractions and one division. Undoubtedly, the difficulties in item 14.2 were compounded by the need both to multiply and to divide by decimal numbers less than one, a difficulty also observed in the earlier calculation examples.
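One possible estimation strategy is to round each number before calculating. The sketch below compares such rough estimates with the exact values. The particular roundings are an illustrative choice rather than a prescribed method, and the code assumes that item 14.1 is a quotient of two differences and that item 14.2 has 0.07689 as its divisor.

```python
# Item 14.1: round each number to the nearest whole number.
exact_14_1 = (85.63 - 1.2384) / (101.46 - 97.88)
rough_14_1 = (86 - 1) / (101 - 98)

# Item 14.2: round each number to one significant figure.
exact_14_2 = 5.6832 * 0.623 / 0.07689
rough_14_2 = 6 * 0.6 / 0.08

print(f"14.1: rough {rough_14_1:.1f}, exact {exact_14_1:.1f} (acceptable range 15-35)")
print(f"14.2: rough {rough_14_2:.1f}, exact {exact_14_2:.1f} (acceptable range 30-70)")
```

Both rough estimates fall inside the acceptable ranges. In item 14.2, dividing by 0.08 makes the result much larger than the numerator, which is the step most likely to surprise students.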
Implications for teaching statistics – Estimation Following the Cockcroft (1982) report, among others, considerable emphasis has been placed on students’ ability to estimate quantities. Such estimation is known to predict a wide range of mathematical skills. Also, with the predominance of automated calculation devices, the ability to undertake quick, informal checks of calculations may be considered highly desirable. Related to this is the skill of being able to judge the plausibility of an answer, e.g. whether a result appears to be vastly too large or small, whether it is negative when it should be positive or vice versa, whether a correlation coefficient of 1.42 is meaningful, and so on. Hypothetical exercises, particularly in the case of judging the plausibility of statistical results, may prove highly beneficial for students of all ability levels.
References
Cockcroft, W. H. (1982). Mathematics Counts. London: HMSO.
Greer, B. & Semrau, G. (1984). Investigating psychology students’ conceptual problems in mathematics in relation to learning statistics. Bulletin of the British Psychological Society, 37, 123-135.
Mulhern, G. & Wylie, J. (2004). Changing levels of numeracy and other core mathematical skills among psychology undergraduates between 1992 and 2002. British Journal of Psychology, 95, 355-370.
The Higher Education Academy Psychology Network Department of Psychology University of York York YO10 5DD Email:
[email protected] Telephone: 01904 433154 Fax: 01904 433181 www.psychology.heacademy.ac.uk July 2005
£5 This document is also available online from www.psychology.heacademy.ac.uk ISBN 0-9550372-0-4