The Representational Value of HATS
Jane M. Watson, Noleine E. Fitzallen, Karen G. Wilson, and Julie F. Creed
Mathematics Teaching in the Middle School, Vol. 14, No. 1, August 2008
Copyright © 2008 The National Council of Teachers of Mathematics, Inc. www.nctm.org. All rights reserved. This material may not be copied or distributed electronically or in any other format without written permission from NCTM.
Birgitte Magnus/iStockphoto.com
Jane M. Watson, [email protected], is a professor of mathematics education at the University of Tasmania, Australia. Her major research interest is statistics education.
Noleine E. Fitzallen, Noleine.Fitzallen@utas.edu.au, is a PhD student and part-time lecturer in mathematics education at the University of Tasmania. Her research focuses on students’ understanding of covariation.
Karen G. Wilson, solentconsulting@bigpond.com, is a teacher-researcher who collaborated with the other authors in using TinkerPlots in the classroom.
Julie F. Creed, [email protected], teaches middle school at Glenora District High School in Tasmania. She is interested in the use of technology in the classroom.

The literature on representations in mathematics is vast. One commonly discussed item is graphical representations. From the history of mathematics to modern uses of technology, a variety of graphical forms are available for middle school students to use to represent mathematical ideas. The ideas range from algebraic relationships to summaries of data sets. Traditionally, textbooks delineate the rules to be followed in creating conventional graphical forms, and software offers alternatives for attractive presentations. Is there anything new to introduce in the way of graphical representations for middle school students?

This article presents a new data representation tool called the hat plot, which is a featured tool of the data analysis software TinkerPlots Dynamic Data Exploration (Konold and Miller 2005). TinkerPlots software gives students considerable freedom to create graphs to tell the stories of their data sets. Because the hat plot representation will most likely be new to middle school students, it is important to explore the concepts that the hat plot represents, how students use it, and how it links with students’ intuitive notions of distribution. Examples of student work illustrate the potential of hat plots as a representation that can be used to make sense of data.

Studies advocate that students should be allowed and encouraged to create their own forms of graphical representation before being introduced formally to conventional graphs (Watson 2006). This constructivist approach to representing data inspired the developers of TinkerPlots to allow students considerable freedom in using this graphing software. Data initially appear as haphazardly arranged icons in space (see fig. 1). Students then choose attributes they would like to display and how they would like the icons arranged, for example, in bins (see fig. 2) or stacked along a continuous scale (see fig. 3). The tools available include the ability to mark centers (such as mean or median), modes, dividers, counts, and percents. These elements, as well as the organization of data icons in the plot window using basic operators such as order, stack, and separate, can be used progressively to highlight features of interest to students (Konold, forthcoming).

Fig. 1 Random data arrangement
Fig. 2 Data separated in bins
Fig. 3 Data separated on a continuous scale

Among the summary tools available is the hat plot, so named for its appearance resembling a hat with a crown in the middle and a brim on each side (see fig. 4). The brim is a line that spans the range of the data set, and the crown is a rectangle that shows, by default, the location of the middle 50 percent of the data. The other important feature is that the hat plot is usually superimposed over a data representation that shows the individual cases (see fig. 4).

Fig. 4 A hat plot

For those familiar with recent innovations in graphing, the hat plot is similar to a box-and-whisker plot. In a box-and-whisker plot, whiskers emanate from the ends of a box representing the middle half of the data. Straightforward box-and-whisker plots are shown in figures 5 and 6, with the whiskers extending to the lowest and highest values in the data sets. Another feature of a box-and-whisker plot is the representation of the median as a line across the box showing the middle value of the ordered data set. Typically, box-and-whisker plots are shown without displaying the data they represent.

Fig. 5 A box-and-whisker plot for data in figure 4
Fig. 6 A vertical box-and-whisker plot

Although a box-and-whisker plot is often used to represent data, researchers have found that students do not find it easy to interpret because of the inverse relationship between the size of the four sections of the plot and the spread or density of the values represented (Bakker, Biehler, and Konold 2005). In the box-and-whisker plot in figure 5, for example, the smaller left-hand portion of the central box represents the same number of data values as the larger right-hand portion of the box. This means that the left quarter of the central half is more densely packed than the right quarter. Difficulties in understanding and explaining the representation, especially if the exact data values are not presented under the box-and-whisker plot, have led mathematics educators to suggest that the box-and-whisker plot not be introduced to middle school students but postponed until the secondary years (Bakker et al. 2005).

As can be seen by comparing figures 4 and 5, a hat plot is a simplified version of a box-and-whisker plot, focusing directly on the center half of the data and the extreme quarters. A hat with a short brim compared with a long brim suggests a smaller spread on one side of the middle half of a data set than on the other side. A narrow crown suggests that the data are clustered in the middle of the data set, compared with data whose hat has a broad crown, as evident in the comparison of exercise and resting heart rates in the top and bottom of figure 7, respectively.
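The inverse relationship between the width of a section and the density of the values in it can be checked with a short calculation. The following is a minimal Python sketch, not part of TinkerPlots: the function name `section_densities`, the sample values, and the choice of the standard library's inclusive quartile method are all assumptions made for illustration.

```python
from statistics import quantiles


def section_densities(values):
    """Return (width, density) for the four quarter sections of a data set.

    Each section nominally holds one quarter of the values, so density is
    taken as 0.25 divided by the section's width on the scale.
    """
    data = sorted(values)
    # Quartiles by the inclusive method (an assumption; other methods exist).
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    cuts = [data[0], q1, q2, q3, data[-1]]
    sections = []
    for low, high in zip(cuts, cuts[1:]):
        width = high - low
        # Guard against tied values that collapse a section to zero width.
        density = 0.25 / width if width > 0 else float("inf")
        sections.append((width, density))
    return sections


# Invented, right-skewed sample used only to illustrate the idea.
sample = [1, 2, 3, 4, 5, 7, 9, 12, 16]
for width, density in section_densities(sample):
    print(f"width {width:5.1f}   share of data per unit of scale {density:.4f}")
```

For this invented sample the four sections have widths 2, 2, 4, and 7. Each still stands for about a quarter of the values, so the widest section is the least dense, which is the feature students tend to read the wrong way round.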
Fig. 7 Hat plots demonstrating different spreads in two data sets
The hat plot is a more general tool than the box-and-whisker plot and is more flexible. TinkerPlots has four types of hat plots (see table 1), with each type using a different rule for constructing the central crown. The default locations of the crown edges can be adjusted by dragging the crown edges once the plot appears. In this way, for example, it is possible to make percentile hat plots that include 90 percent of the cases in the center part or standard deviation hat plots that range from –2 to +2 standard deviations. For the purposes of this article, students’ application of the default hat plot representation, which is most similar to the box-and-whisker plot, is explored.

The question for middle school mathematics teachers is whether students find the features of hat plots useful in telling the story within the data set being studied. “Telling the story” in a hat plot is a representational transition to drawing an informal inference about the data being investigated. The rest of this article uses work from students in grades 5 to 7 to illustrate (1) how students used hat plots when analyzing data and (2) the connections students made between hat plots and significant aspects of spread and clustering in data sets.

The students whose graphs are shown in this article worked either in pairs with a teacher-researcher or in a class of about thirteen with their usual teacher and a teacher-researcher. During pair work, students worked with a single computer, whereas in the class environment each student had a computer. After introductory sessions, the students analyzed data sets collected from themselves and their classmates. As the final activity, the class group considered a data set provided with the TinkerPlots package. Some comments were written by students in text boxes within TinkerPlots, and other thoughts were jotted down by observers.
Table 1 Four types of hat plots in TinkerPlots (Konold and Miller 2005)
Hat Plot Type           Default Crown Edges
Percentile (default)    25th and 75th percentiles
Range                   1/3 and 2/3 of the range
Average Deviation       –1 and +1 average deviations
Standard Deviation      –1 and +1 standard deviations
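The default rules in table 1 can be expressed as a short calculation. The sketch below is a hedged Python illustration, not TinkerPlots code: the function name `hat_plot` and the sample data are hypothetical, and it assumes that the deviation-based crowns are centered on the mean, that the average deviation is the mean absolute deviation from the mean, that the standard deviation is the population form, and that percentiles use the standard library's inclusive method. TinkerPlots may compute these quantities differently.

```python
from statistics import mean, pstdev, quantiles


def hat_plot(values, kind="percentile"):
    """Return (brim_low, crown_low, crown_high, brim_high) for one hat plot.

    The brim always spans the minimum and maximum; only the crown rule
    changes with `kind`.
    """
    data = sorted(values)
    low, high = data[0], data[-1]

    if kind == "percentile":
        # Crown from the 25th to the 75th percentile.
        q1, _, q3 = quantiles(data, n=4, method="inclusive")
        crown = (q1, q3)
    elif kind == "range":
        # Crown covers the middle third of the range.
        third = (high - low) / 3
        crown = (low + third, high - third)
    elif kind == "average_deviation":
        # Assumed: one mean absolute deviation either side of the mean.
        center = mean(data)
        avg_dev = mean([abs(x - center) for x in data])
        crown = (center - avg_dev, center + avg_dev)
    elif kind == "standard_deviation":
        # Assumed: one population standard deviation either side of the mean.
        center = mean(data)
        sd = pstdev(data)
        crown = (center - sd, center + sd)
    else:
        raise ValueError(f"unknown hat plot type: {kind}")

    return (low, crown[0], crown[1], high)


# Invented sample used only to compare the four rules.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
for kind in ("percentile", "range", "average_deviation", "standard_deviation"):
    print(kind, hat_plot(sample, kind))
```

Only the percentile rule guarantees that the crown holds about half of the cases. Under the other three rules the share of cases inside the crown depends on the shape of the data.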
Fig. 8 Height data for males and females at age 18 years
Understanding the Hat Plot
Given the results of research showing that students have difficulties interpreting box-and-whisker plots, it was somewhat surprising to teachers how easily these middle school students were able to use the hat plots, often on their own initiative. After teachers initially introduced the software, they reported observing most students employing hat plots when constructing graphical representations from new data sets, possibly because of the ease with which hat plots in TinkerPlots can be applied to a graph.

Of more interest is the way in which students used the hat plot when further analyzing data. During the final teaching session, students were given three graphs that had been produced from a data set that contained information about the heights of children at ages 2, 9, and 18 years (Konold and Miller 2005). The data from each of the age groups by gender were presented in separate graphs, which included hat plots. Students were asked to consider and comment on the differences and changes in heights between the males and females at ages 2, 9, and 18. Figure 8 shows the graph as presented to the students for the data at age 18 years. Out of the fifteen students in the class, twelve discussed the hat plot when referring to the central 50 percent of the data. The following three summaries illustrate students’ use of the hat plot.

“I found that men continue to grow through all three graphs, but the female growth rate slows between 9 and 18. The range for the first graph is particularly small compared to the others. Also the men’s middle 50 percent is bigger than the females’ middle 50 percent on all three graphs. And the brims are very big on all 3 graphs.”

“2 years: the males are spread out more than the girls in the hat plots and the hat plots overlap each other. 9 years: there’s not much difference between the middle 50 percent and most of the hat plots overlap each other. 18 years: there is a little difference between the middle 50 percent. The hat plots [the crown] start to separate from each other.”

“On the plot where both males and females are at 18, it shows the majority of the males are taller than the females. In the last graph the hats don’t overlap as they did in the middle graph. The range is not the same between boys and girls.”

In relation to summarizing data, one student reflected, “The hat plots are the main help because they make it easier to compare the boys and the girls.” This idea was discussed in greater detail by another student:

“At the start the males and females are a close cluster but as they progress the male group spreads out, whilst the females stay together, the hats help us decide this by: there is not much difference at the age of 2 and at the age of 9, but when they turn 18 the hats show a significant difference, the girl’s crown stretches from 162.4 cm to 169.8, whilst the males stretch from 176.4 cm to 183.0 cm.”

Photograph by Jane Watson; all rights reserved

Linking the Hat Plot to Percent
Understanding percent as part of proportional reasoning is an important goal of middle school mathematics, as are such links to fractions as 25 percent = one quarter and 50 percent = one half. For many students, the visual representation in the hat plot creates two links: one to the basic numerical fraction and percent facts involved and the other to a concrete context to see the utility of the part-whole relationship. Although the middle school students had been taught percents as part of numeracy and part-whole reasoning, some were insecure in their application. When students discussed what their hat plots represented, the language of percents came to the fore. At times, students made comparisons based on the relative placements of the hats and where subparts of the data were in relation to one another. There were also opportunities, however, to use students’ comments from the text boxes to detect misunderstandings related to the part-whole nature of the percent representation in the hat. The following comment was made about the graph in figure 8:

“The female’s height range is more clustered and all squashed together, whereas the male height range is more evenly spread out. When both the males and females are 18, the hat plots are starting to separate. And the crown is starting to hold more dots.”

Since the numbers of males and females were the same, the relationship of a wider crown to a greater spread rather than more values was confusing to this student. Teachers need to reinforce the interaction of quantity and spread at every opportunity.

Fig. 9 Graphs with hat plots of arm spans of middle school males and females
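A quick numeric check can support the part-whole reading of the crown: for two groups of the same size, a wider default crown reflects greater spread, not more values. The following is a minimal Python sketch under stated assumptions. The function names `crown_edges` and `share_in_crown` and both data sets are invented for illustration, they are not the students' data, and the inclusive quartile method is one choice among several.

```python
from statistics import quantiles


def crown_edges(values):
    """Return the default (percentile) crown edges: 25th and 75th percentiles."""
    q1, _, q3 = quantiles(sorted(values), n=4, method="inclusive")
    return q1, q3


def share_in_crown(values):
    """Return the fraction of values lying on or between the crown edges."""
    low, high = crown_edges(values)
    inside = sum(1 for v in values if low <= v <= high)
    return inside / len(values)


# Two invented groups of nine heights (cm): one tightly clustered, one spread out.
clustered = [160, 162, 163, 164, 165, 166, 167, 168, 170]
spread_out = [150, 156, 161, 165, 170, 175, 180, 186, 192]

for name, group in (("clustered", clustered), ("spread out", spread_out)):
    low, high = crown_edges(group)
    print(f"{name}: crown width {high - low}, share in crown {share_in_crown(group):.2f}")
```

Both crowns hold the same share of their group, a little over half here because the edge values are counted as inside, even though one crown is several times wider than the other.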
Making the Transition from Individual Data Values to Group Characteristics
Before the students in these middle school classes were introduced to TinkerPlots, they had little experience handling data. As such, they were typical of most students, who tend to be interested in individual values in the data set (Ben-Zvi 2004). When discussing a graph displaying the heights of students in their class, for example, they would focus on which student was the tallest or the shortest. Although this interest did not disappear with the appearance of the hat plot, it was complemented by an appreciation of group characteristics, such as the width of the brim or the spread of the data within a specific region. This is seen in reports written by grade 7 students about arm spans of boys and girls in the following examples.

The graph in figure 9 represents the data associated with the relationship between arm span and gender collected from students in grades 5 to 8. In the graph, the males (m) and females (f) are separated, each with its own hat plot. The appearance seemed to influence a comment on spread in the text box that might not be expected in a novice: “There is a big difference in arm spans with the boys but the girls are all concentrated in the middle of the range.”

Another student separated the data by grade and discarded the data from grades 6 and 7. The subsequent representation featuring grades 5 and 8 is in figure 10. The color of the circular icons represents gender. Extracts from the student’s notes illustrate the thinking that took place:

“I found the grade and the gender interesting because I find the difference between the age and the size of the arm span interesting. I used the hat because it shows me that the hat [crown] is covering 50 percent of the dots. In the grade 8 the hat is around 159.0 to 171.0 and in the grade 5 it’s around 136.1 to 151.6. I think it is easy to use because it shows me where the most arm spans are ranged from.”

Fig. 10 Graphs with hat plots for arm spans of students in grades 5 and 8
Fig. 11 Graph with hat plot for height of grade 6 students
Summary and Discussion
The application of the hat representation as a metaphor for summarizing data appeared easy for the middle school students to absorb. One grade 6 student, for example, offered this comment about a graph that included a hat plot: “You can learn more because you can see the stats.” This comment followed discussion of ranges, means, and medians in relation to comparing two data sets. Additionally, the notion that the brim on either side of the crown represented individually 25 percent of the data set was easy for students to relate to. When one student was describing the parts of a hat plot (see fig. 11) to another student, the following explanation was given:

“It doesn’t matter that this brim [right brim] is longer than the other. They both equal the same amount. It is just that the range of this side [left brim] is shorter than on the other side. The data is more spread out on the other side [right brim].”
Although not as sophisticated as might be expected from older students, these descriptions and the use of hat plots indicate their value as representations. As important to the representational strengths of the hat plot were three observed aspects of the students’ use of hat plots. The first aspect involved the natural way in which students either suggested the use of the tool on their own initiative to analyze data or could interpret a hat plot presented to them by the teacher. Hence, the hat plot became both an intuitive and reflective part of data representation. Second, it was encouraging to note that hat plots gave students the opportunity to apply their underlying knowledge of percent in a familiar context and in some cases exposed students’ misconceptions. The third aspect was the observed shift in focus from individual data values to characteristics of subgroups or aggregates of the entire data set. These aspects enabled students to tell the story from the data in meaningful and descriptive ways.

There are some parallels between the hat plot representation and the way in which students intuitively see the distribution of data. Students tend to use the position of “modal clumps” evident in a distribution to segment data intuitively into the middle, upper, and lower ranges (Konold et al. 2002). This fits well with the way in which the hat plot representation, consisting of the brim and crown, splits the data into three sections: the lowest 25 percent, the middle 50 percent, and the highest 25 percent.

The crown of the hat plot represents the middle 50 percent of the data and, in relation to the data set, is a measure of center. Students find the crown of the hat plot useful when comparing two data sets, summarizing data, and describing the spread and clustering of data. It allows students to develop an understanding of measures of center without focusing on individual values (median or mean), which may be meaningless if considered in isolation. Introducing the median to the hat plot could potentially help students transition from the student-created representation to the more sophisticated statistical form of the box-and-whisker plot. In so doing, students may develop a better understanding of box-and-whisker plots being a measure of spread, with the median being a measure of center.

It appears that the visual representation of the hat plot in conjunction with the data icons assists students in making connections with the context of the data. Many box-and-whisker plot representations are more abstract in form, with only an accompanying linear scale but no actual data points. The absence of concrete data may be another aspect that creates difficulties for students first introduced to box-and-whisker plots, especially in relation to the inverse relationship involving data density mentioned earlier. That the default hat plot in TinkerPlots appears with the represented data is considered important in helping students move from the concrete to the more abstract representations of data sets. Along the way it also helps them tell the stories in the data.
References
Bakker, Arthur, Rolf Biehler, and Cliff Konold. “Should Young Students Learn about Box Plots?” In Curricular Development in Statistics Education: International Association for Statistical Education (IASE) Roundtable, edited by Gail Burrill and Mike Camden, pp. 163–73. Voorburg, The Netherlands: International Statistical Institute, 2005.
Ben-Zvi, Dani. “Reasoning about Data Analysis.” In The Challenge of Developing Statistical Literacy, Reasoning and Thinking, edited by Dani Ben-Zvi and Joan Garfield, pp. 121–46. Dordrecht, The Netherlands: Kluwer Academic Publishers, 2004.
Konold, Clifford. “Designing a Data Analysis Tool for Learners.” In Thinking with Data: The 33rd Annual Carnegie Symposium on Cognition, edited by Marsha Lovett and Priti Shah. Hillside, NJ: Lawrence Erlbaum Associates, forthcoming.
Konold, Clifford, and Craig D. Miller. TinkerPlots: Dynamic Data Exploration. Emeryville, CA: Key Curriculum Press, 2005. Software.
Konold, Clifford, Amy Robinson, Khalimahtul Khalil, Alexander Pollatsek, Arnold Well, Rachel Wing, and Susanne Mayr. “Students’ Use of Modal Clumps to Summarize Data.” In Proceedings of the Sixth International Conference on Teaching Statistics: Developing a Statistically Literate Society, edited by Brian Phillips. Voorburg, The Netherlands: International Statistical Institute, 2002.
Watson, Jane. Statistical Literacy at School: Growth and Goals. Mahwah, NJ: Lawrence Erlbaum Associates, 2006.
This article grew from a research project funded by the Australian Research Council (LP050669106).