Evaluating Numerical Tabular Data Comprehension Tasks under 'Speech Only Condition' with 'Speech and Pitch Condition'

Rameshsharma Ramloll
Computing Science Department, Glasgow University
MultiVis Internal Report. Status: Work in Progress
[email protected]

ABSTRACT
This report describes an experiment comparing the workload of participants faced with data comprehension tasks under two conditions. In one condition, participants access numerical tabular data through speech only; in the other, they are also given the opportunity to access the numerical data mapped to pitch. We found a significant increase (p < 0.01) in the number of correct answers obtained and a significant decrease (p < 0.01) in the mental demand and frustration of participants when they were given the opportunity to access numerical tabular data as a piano pitch rather than as speech alone.

Keywords
Data visualisation, sound graphs, subjective workload assessment

INTRODUCTION
The purpose of this experiment is to evaluate how participants tackle data comprehension tasks under two auditory conditions, namely (1) speech only (SC) and (2) speech and pitch (S&PC). In particular, we want to investigate whether there are grounds for introducing pitch to make the experience of data browsing more effective. Arguably, the obvious way to represent a table (such as that described in Figure 1) in sound is to use speech feedback to inform users where they are in the table and what is available there. We are investigating whether there is any merit in associating a given numerical value with a representative pitch instead of always having the value read out in speech. In fact, navigating across a row or a column generates a sound graph [1] or tone plot [2] of the kind traditionally associated with line graphs in the visual medium.

[Figure 1: Prototypical auditory tabular data browser]

Apparatus and Stimuli
The application was developed in VC++ using Microsoft's SAPI 4.0 (Speech Application Programming Interface). A MIDI synthesiser generates the 'piano' pitch used to represent a given numerical value in the table. Since we are dealing with MIDI, the numerical data values are truncated so that they lie between 0 and 127, allowing a straightforward mapping of value to pitch. Participants hear the auditory messages through headphones and navigate the table through keyboard input.
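As an illustration of the truncation step just described, the following minimal C++ sketch clamps a raw cell value into the 0 to 127 range expected for a MIDI note number. It is our own illustration rather than the prototype's VC++ source, and the function name is hypothetical.

    // Truncate a raw cell value so that it can be used directly as a MIDI note number.
    // The report states that data values are truncated to lie between 0 and 127.
    int valueToMidiNote(double cellValue)
    {
        if (cellValue < 0.0)   return 0;      // values below the MIDI range map to the lowest note
        if (cellValue > 127.0) return 127;    // values above the range map to the highest note
        return static_cast<int>(cellValue);   // otherwise drop the fractional part
    }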
We next describe the functionality of the various keys and our strategies for assisting navigation of the table.

OVERVIEW OF THE INTERFACE (INPUT)
We primarily use the section of the keyboard typically used for numerical input (Figure 2). The keys used and their functionality are now presented.

[Figure 2: Some input keys used in querying the data]

SPACE BAR = TOGGLING BETWEEN SC AND S&PC MODES
Pressing the space bar toggles between the SC and S&PC modes.

ENTER KEY = OVERVIEW
Pressing Enter once gives a description of the data set in synthetic speech and describes the function of the arrow keys. Example (synthesised speech): "Auditory display shows test results. Vertical arrows select subjects. Horizontal arrows select students."

[Table 1: Typical two-dimensional table that can be browsed using our prototype]

ESCAPE KEY = STOP SPEECH
Pressing Escape once stops any speech output by freeing any queued-up speech messages.

5 KEY (CENTER OF NUMPAD) = READ CURRENT (X, Y) VALUE
Pressing the centre key 5 causes the participant's position in the table, followed by the value of the current cell, to be read out. Example: "Tom, English, thirty seven".

PgUp KEY = JUMP TO BEGINNING OF A GIVEN X
Pressing PgUp brings you to the beginning of the current column. Example: "Jumping to the beginning of French".

END KEY = JUMP TO BEGINNING OF A GIVEN Y
Pressing End brings you to the beginning of the current row. Example: "Jumping to the beginning of Tom".
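To make these bindings concrete, the sketch below maps each query key to the action described above. It is a hypothetical C++ illustration; the enumeration and function names are ours and do not appear in the actual prototype.

    #include <iostream>

    // Hypothetical dispatcher for the query keys described above.
    enum class Key { SpaceBar, Enter, Escape, Centre5, PageUp, End };

    void handleKey(Key key)
    {
        switch (key) {
        case Key::SpaceBar: std::cout << "Toggle between SC and S&PC modes\n"; break;
        case Key::Enter:    std::cout << "Speak an overview of the data set and the arrow-key functions\n"; break;
        case Key::Escape:   std::cout << "Stop speech and free any queued speech messages\n"; break;
        case Key::Centre5:  std::cout << "Read the current position and cell value\n"; break;
        case Key::PageUp:   std::cout << "Jump to the beginning of the current column\n"; break;
        case Key::End:      std::cout << "Jump to the beginning of the current row\n"; break;
        }
    }

    int main()
    {
        handleKey(Key::Centre5);   // e.g. "Tom, English, thirty seven" in the real prototype
        return 0;
    }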
UP AND DOWN ARROW KEYS = EXPLORE Y FOR A GIVEN X (see Table 1)
Pressing the Up and Down arrow keys navigates up and down a column of the table. In the SC mode, the arrow keys cause the labels of rows to be read out. In the S&PC mode, the arrow keys cause the pitch associated with the value of the relevant cell in the current column to be played.

RIGHT AND LEFT ARROW KEYS = EXPLORE X FOR A GIVEN Y
Pressing the Right and Left arrow keys navigates left and right along a row of the table. In the SC mode, the arrow keys cause the labels of columns to be read out. In the S&PC mode, the arrow keys cause the pitch associated with the value of the relevant cell in the current row to be played.

OVERVIEW OF SOUND MAPPING STRATEGY
In the S&PC mode, the pitch is directly proportional to the value of the data, i.e. the higher the value, the higher the pitch. The S&PC mode also involves panning of the sound sources: these are localised along a line joining the left and right ears and positioned according to their position in their respective row or column. Thus, when moving down a column, the first value is associated with a pitch heard in the left ear, subsequent values with pitches localised on a line between the left and right ear, and the last value of the column with a pitch heard in the right ear. The same effect is obtained while navigating across a row: the first value is heard to the left, the last value to the right, and the intermediate values in between.
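The following C++ sketch, written for illustration only and under our own naming, shows one way to compute this mapping: the MIDI note number is the truncated cell value, and the MIDI pan value moves linearly from hard left for the first cell of a row or column to hard right for the last.

    // Sketch of the S&PC sound mapping described above, assuming MIDI conventions:
    // note numbers and pan values both lie in 0..127 (pan 0 = left, 64 = centre, 127 = right).
    struct CellSound {
        int midiNote;  // higher data value -> higher pitch
        int midiPan;   // position within the row or column -> left/right placement
    };

    CellSound mapCell(double cellValue, int indexInLine, int lineLength)
    {
        CellSound sound;

        // Pitch: directly proportional to the (truncated) data value.
        if (cellValue < 0.0)        sound.midiNote = 0;
        else if (cellValue > 127.0) sound.midiNote = 127;
        else                        sound.midiNote = static_cast<int>(cellValue);

        // Pan: first cell hard left, last cell hard right, intermediate cells spaced linearly.
        if (lineLength <= 1)
            sound.midiPan = 64;                                 // a single cell sits in the centre
        else
            sound.midiPan = (indexInLine * 127) / (lineLength - 1);

        return sound;
    }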
In the current design, participants receive an auditory cue that informs them whenever they leave the table while navigating across a row or column. This strategy presumably gives them an idea of the limits of rows and columns. Once a participant is out of the table, pressing the centre '5' key causes a "not in table" message to be read out. When the user wanders away from the contents of the table, pressing the arrow keys produces messages such as "move left", "move right", "move up" and "move down", which guide her back into the table.

PILOT STUDY
Before carrying out the workload assessment, a pilot study was conducted in order to identify early on any obvious problems with the prototype that might hinder the progress of the experiment. We also took this opportunity to carry out a think-aloud procedure [3].

Participant
Subject: Jaimie X. Subject data: visually impaired; left ear hearing 100%, right ear hearing 60%.

Training
The subject found the training time pleasantly short. The commands and modes of interaction were easy to understand and remember. Training time on this occasion did not exceed 5 minutes. The ability to learn the interface quickly and get on with the task was a definite plus.

Positive Comments
1. The sounds are aesthetically pleasing; the subject compared the table browser to a musical instrument.
2. Navigation of the table was fast according to the user, and the feedback speed excellent.
3. Jaimie: 'tones stick better in the mind than spoken data'.
4. Pitch mapping allowed fast identification of trends, maxima and minima.
5. For example, while listening to rape data down the years, Jaimie said 'The police has been doing their job well' without being prompted, illustrating that a downward trend had been recognised immediately.
6. He also finds that identification of values based on pitch, especially in the mid range, 'really works'.
7. In a set task, the subject was able to identify successfully all four minima in the 'Arson curve' in two navigation strokes, one down the column and the other up, the second just as a checking measure.

After the sound spatialisation strategy was explained, the following was observed. The panning strategy was found to be useful for a number of reasons:
1. It allows the user to guess the size of the data set. This is especially useful for large data sets.
2. It also allows fast identification of points of interest and where they occurred.
3. It is more aesthetically pleasing; it surely adds something to the browsing experience.

Negative Criticisms
1. The idea of a no-table zone bordering the table is potentially confusing.
2. A more distinct sound is needed for the end of rows and columns.
3. In the speech mode, the user should be given some control over the degree of verbosity required.
4. In that respect, the use of meta-keys has been suggested to give the user more control over the amount of speech information delivered.
5. For example, in the speech mode, the user should be able to control whether he needs the row or column information all the time.
6. Voice messages associated with a given command should be interrupted as soon as a new command is launched.
7. Low-frequency sounds cannot be distinguished very easily.
8. The sound mapping is simple but needs to be explained if it is to be useful.

Overall impression of the subject
The overall impression was very positive, in the sense that the interface was found to be simple and powerful. The subject suggested that the application would be very useful and hopes to be able to use it to read data from Excel sheets, for example. This call for integration into the subject's current environment illustrates his keen interest in the prototype.

COMPARING WORKLOADS USING NASA TLX
Hypothesis
Workload for a minimal set of data inspection tasks is compared under the 'speech only condition' and the 'speech and pitch condition'. (This is a two-tailed hypothesis; the direction of the effect is not specified a priori.)

Speech only condition (SC)
A table is navigated using the arrow keys. The user is able to get speech feedback about the current position in the table and the value of the relevant cell.

Speech and pitch condition (S&PC)
A table is navigated using the arrow keys. The user is able to get both pitch and speech feedback about the current position in the table and the value of the relevant cell.
Experimental Design
16 subjects are required for this repeated measures design. Participants are subjected to the SC and S&PC conditions as described in Table 2. The order in which the tasks and auditory conditions are presented to the participants ensures that any effects due to inherent task difficulty and practice (i.e. the increase in task-tackling efficiency that comes with familiarity) are minimised. During the first and second sessions, the participants are given data analysis tasks based on a questionnaire. Each session is followed by a NASA TLX test to gather subjective workload information about the task. Our home-grown computerised version of the NASA TLX is used to speed up and facilitate data collection from the participants. We follow the NASA TLX guidelines strictly and calculate the combined workload from the subjective weights obtained after the pairwise comparison of the workload categories. This step allows us to compare the workload values obtained in this experiment with values obtained from other, independent experiments.

Table 2: Experiment schedule. Each of the four groups (A.1, A.2, B.1 and B.2; 4 subjects per group) follows the same schedule: training and explanation (15 mins); Session I: task questions (15 mins), TLX, evaluation (20 mins); Session II: task questions (15 mins), TLX, evaluation (20 mins). Tasks I and II and the SC and S&PC conditions are counterbalanced across groups and sessions.
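As an illustration of the combined-workload calculation mentioned above, the sketch below follows the standard NASA TLX procedure: each category's weight is the number of times it was chosen in the 15 pairwise comparisons, and the combined workload is the weighted mean of the six ratings. This is a generic C++ illustration, not the code of our computerised TLX tool.

    #include <array>

    // Combined NASA TLX workload: weighted mean of the six category ratings,
    // where each weight is the number of wins that category obtained in the
    // 15 pairwise comparisons (the weights therefore sum to 15).
    double combinedTlxWorkload(const std::array<double, 6>& ratings,
                               const std::array<int, 6>& pairwiseWins)
    {
        double weightedSum = 0.0;
        int totalWeight = 0;   // 15 when the pairwise comparisons are complete
        for (int i = 0; i < 6; ++i) {
            weightedSum += ratings[i] * pairwiseWins[i];
            totalWeight += pairwiseWins[i];
        }
        return totalWeight > 0 ? weightedSum / totalWeight : 0.0;
    }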
Experimental Procedure
The data analysis and inspection tasks that participants had to tackle are described below. The questions were designed according to a number of requirements. Firstly, the data analysis should not be too complicated, so that the focus of the task is on the analysis and interpretation of the question rather than on how the tabular data is perceptualised. Secondly, the questions must not favour the peculiarities of our chosen auditory tabular data browser. The two tasks that participants faced in this experiment are described under the headings of Task I and Task II. During the training phase, participants were provided with a different data set, concerning the gross national product of a country over a number of years.

TASK I: Student Performance Analysis
This dataset is about the performance of a number of students in various subjects. Answer the following questions based on the information available in the table.
1. Name the student(s) scoring the highest marks for Biology.
2. The list of students is presented in ascending order of marks for a given subject. Name the subject.
3. Name the subject most likely to have the highest number of passes.
4. Name the student(s) scoring the lowest marks for the Assembly course.
5. In which course was performance particularly poor?
6. Name the student most likely to have the highest total marks.

TASK II: London Crime Statistics
This table is about the type of crime and the number of cases reported in London from 1974 to 2000. Answer the following questions based on the information available in the table.
1. State the year(s) in which the highest number of murder cases was reported.
2. State the year(s) in which the highest number of robbery cases was reported.
3. Which type(s) of crime had a consistently high number (~>50) of cases reported?
4. Which type(s) of crime show(s) a consistently increasing trend?
5. State the year(s) in which the number of hate crime cases was lowest.
6. Which type(s) of crime show(s) a consistently decreasing trend?
Participants
The participants were paid 5 pounds per hour. All are sighted and attempted the tasks with the computer screen switched off. Those who turned up for the experiment included 8 women and 7 men, none of whom had any overtly recognisable auditory impairment. The participants were a mix of computing and information technology postgraduates.

RESULTS
We now compare the scores obtained for the individual categories deemed to contribute to the overall workload of a task according to the NASA Task Load Index. Figures 3 to 11 present the data we collected from the 15 participants who took part in the experiment. Each figure is accompanied by the four parameters obtained from a related t-test applied to our raw data.
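For reference, the related (paired) t statistic reported with each figure can be computed as in the sketch below, given the per-participant scores under the two conditions; the two-tailed p values quoted in the figures come from standard tables or statistics software, not from this code.

    #include <cmath>
    #include <vector>

    // Related-samples (paired) t statistic: t = mean(d) / (sd(d) / sqrt(n)),
    // where d[i] is the difference between the SC and S&PC scores of participant i.
    double pairedT(const std::vector<double>& sc, const std::vector<double>& spc)
    {
        const std::size_t n = sc.size();   // assumes sc.size() == spc.size() and n > 1
        double meanDiff = 0.0;
        for (std::size_t i = 0; i < n; ++i)
            meanDiff += sc[i] - spc[i];
        meanDiff /= static_cast<double>(n);

        double sumSq = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            const double dev = (sc[i] - spc[i]) - meanDiff;
            sumSq += dev * dev;
        }
        const double sd = std::sqrt(sumSq / static_cast<double>(n - 1));
        return meanDiff / (sd / std::sqrt(static_cast<double>(n)));
    }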
[Figure 3: Comparing mental demand. T14 = 3.040, p (two-tail) = 0.009, SC mean = 13.333, S&PC mean = 9.600]

[Figure 4: Comparing physical demand. T14 = 0, p (two-tail) = 1, SC mean = 2.533, S&PC mean = 2.533]

[Figure 5: Comparing temporal demand. T14 = 1.543, p (two-tail) = 0.145, SC mean = 11.333, S&PC mean = 9.600]

[Figure 6: Comparing effort. T14 = 1.681, p (two-tail) = 0.115, SC mean = 27.410, S&PC mean = 21.124]

[Figure 7: Comparing performance. T14 = 1.384, p (two-tail) = 0.188, SC mean = 11.267, S&PC mean = 9.000]

[Figure 8: Comparing frustration. T14 = 3.192, p (two-tail) = 0.007, SC mean = 10.4, S&PC mean = 6.333]
Correct answers obtained

[Figure 9: Comparing correct answers obtained. T14 = -4.012, p (two-tail) = 0.001, SC mean = 3.867, S&PC mean = 5.133]

Number of questions attempted

[Figure 10: Comparing number of questions attempted. T14 = -1.920, p (two-tail) = 0.075, SC mean = 4.667, S&PC mean = 5.467]
Comparing workloads

[Figure 11: Average of TLX scores next to combined workload. T14 = 3.585, p (two-tail) = 0.003]

GENERAL DISCUSSIONS
The results can be summarised as follows.
1. There is a significant decrease in mental demand in the speech and pitch condition compared to the speech only condition.
2. There is a significant decrease in frustration in the speech and pitch condition compared to the speech only condition.
3. There is a significant increase in the number of correct answers obtained in the speech and pitch condition compared to the speech only condition.
4. There is a significant decrease in the combined NASA TLX workload rating in the speech and pitch condition.
5. While the mean TLX scores indicate that pitch lowers effort and temporal demand and improves performance, these effects are not significant.
6. The consistently low physical demand scores indicate that most participants did not regard the task as physically demanding.

Based on these results, we can safely propose that giving participants the opportunity to access numerical information as pitch whenever required improves the effectiveness of the auditory data browsing tool.

Post-experiment discussions with participants reveal that while they found the panning of the data pitches aesthetically pleasing, they did not use it consciously during their tasks. Participants noted that listening to pitches gave them a better overview of the data.

Interviews with the participants also reveal that the significant improvement under the speech and pitch condition, as far as the number of correct answers is concerned, is not matched by their confidence about how successfully they completed a given task. The inability of participants to associate an exact value with a given pitch can perhaps explain this lack of confidence in a given answer. However, there are cases where participants do feel more confident about their answers in the speech and pitch condition: they argue that under this condition they are able to browse a larger data set and are more confident that they have not missed any important parts.

A number of participants found the speech only condition annoying because the auditory messages were long and thus time consuming. In addition, there were a number of situations where they felt the spoken message contained more information than they really needed.

Many participants also suggested that they were often not confident about their mental image of the data set. Some attempted to visualise a table; others simply tackled the exercises without constructing a mental image. A number of participants jotted down notes on paper while attempting the different tasks, although this mostly happened in the speech only condition.

Regarding table navigation, most users found the shortcut keys very useful. Many participants frequently made use of the Home key to go back to some starting point once they felt lost. However, most participants felt that they should not be allowed to leave the table even when they reach one of its edges; the current strategy did not prevent them from getting lost once they overstepped the table's boundary.
FUTURE WORK
There is plenty of room for improving the amount of control participants have over the speech output. This increased speech verbosity control can be achieved by a careful redesign of the input interface. The next prototype will also be redesigned so that participants cannot leave the table by crossing its boundaries; they will still be presented with cues that indicate the limits of rows and columns. Once the lessons learnt from this experiment have been used to inform changes in our next prototype, we plan to replicate the experiment with visually impaired individuals.

ACKNOWLEDGMENTS
We thank all the participants of this experiment.

REFERENCES
1. Mansur, D.L. Graphs in Sound: A Numerical Data Analysis Method for the Blind. Computing Science, University of California, Davis, California, 1975.
2. Bulatov, V. and Gardner, J. Visualisation by People without Vision. In Workshop on Content Visualisation and Intermediate Representations, Montreal, Canada, 1998.
3. Blackman, H.S. Overview: The Use of Think Aloud Verbal Protocols for the Identification of Mental Models. In Proceedings of the Human Factors Society 32nd Annual Meeting, 1988, pp. 872-874.