AN IMPROVED METHOD FOR LABEL MATCHING IN E-ASSESSMENT OF DIAGRAMS

Ambikesh Jayal

Martin Shepperd

Brunel University UB8 3PH [email protected]

Brunel University UB8 3PH [email protected]

ABSTRACT
A challenging problem for e-assessment is the automatic marking of diagrams. There are a number of difficulties, not least that much of the meaning of a diagram resides in its labels; hence label matching is an important process in the e-assessment of diagrams. Previous research has shown that the labels used by students in diagrams can be diverse and imprecise, which makes this problematic. In this paper we propose and evaluate a new method for label matching to support e-assessment of diagrams and address the problems of synonyms, spelling errors and differing levels of decomposition. We have implemented the syntactic part of our method and evaluated it using 160 undergraduate assessments based upon a UML design task. We have found that our method performs better than the other syntax matching algorithms. This framework has significant implications for the ease with which we may develop future e-assessment systems. The results from this pilot study have been encouraging and motivate us to implement the semantic similarity part of our method and conduct further evaluations.

Keywords: E-Assessment, Diagrams, Text Processing, Empirical Evaluation.

1. INTRODUCTION
There has been growing interest within the e-learning community in automating the process of marking student coursework [1]. One area that is important but rather challenging is assessment that involves diagrammatic notation. Diagrams are commonplace in subjects such as computer science, where they generally have semi-formal semantics. Typically, much of the meaning of a diagram resides in its labels; however, the choice of labelling is largely unrestricted. To automatically mark diagrams we need to match the labels used in the student coursework with the labels in the model solution. However, the student may use a label that differs from that provided in the specimen solution even though both have the same meaning. In addition, a label may comprise one or more words. Previous research has shown that there is a high degree of variation in the labels used by students [2, 3], and this makes the label matching process in diagrams difficult. Although some existing tools use label matching techniques such as edit distance, the authors are unaware of any generic framework that can guide the label matching process across all the tools for the e-assessment of diagrams. Because label matching is an important step in automatically marking diagrams, there is a need for such a generic framework. We therefore propose a generic framework to manage the label matching process in diagrams for the purpose of automatically marking them. This framework is named the Diagram Label Processing and Matching Framework (DLPMF) and comprises five stages. We have also introduced a new method to calculate the syntactic similarity between labels in a diagram. This method combines the existing word-to-word syntax matching algorithms into a label-to-label syntax matching algorithm. We evaluate this new method using undergraduate coursework from Brunel University and compare the accuracy of automated approaches with that of the human marker.
We also compare the accuracy of our method to that of existing algorithms. The remainder of this paper is organised as follows. We review existing approaches to automated assessment of diagrams and in particular approaches adopted for matching labels. Next we describe our proposed framework for matching labels in a diagram for the purpose of automatically marking them. Then we present the results and conclude with a discussion of the potential significance of the proposed framework for the problem of e-assessment of diagrams.

ITALICS Volume 8 Issue 1 February 2009 ISSN: 1473-7507 3

2. RELATED WORK
The e-assessment process involves a number of activities, including preparing the coursework, delivering it, marking the students' answers, providing feedback to the students and detecting plagiarism. According to [4], assessment can be used in diagnostic, formative or summative educational contexts; Table 1 describes these contexts. Providing timely feedback, in particular, is one of the important aspects of an effective assessment system, and automatically marking the coursework is one of the factors that can help provide feedback quickly. Automatic marking also addresses the problems of resource constraints and of inconsistency in human marking, both of which increase significantly with class size. Thus, automatic marking of coursework brings both pedagogic and practical benefits [5].

Educational Context | Purpose                                   | Time
Diagnostic          | To ascertain knowledge level of learner   | Before or after learning programme
Formative           | To provide feedback to the learner        | During learning programme
Summative           | To formally grade performance of learner  | During or at end of learning programme

Table 1: Types of Assessment

There has been substantial previous research work on automatically marking coursework. This has targeted objective questions [6, 7], free-response text-based questions [8, 9, 10], mathematics-based questions [11, 12] and computer programming questions [12]. However, there is now growing interest in diagram-based questions [14, 15, 17], although results to date have been limited. Diagrams are difficult to mark automatically because of problems such as the diagrams being malformed or possessing missing or extraneous features [18]. In addition, there are problems of topology, where equivalent diagrams can be laid out differently, and semantic problems, where topologically identical diagrams have different meanings due to different labelling. The following approaches have been used for automatically marking diagrams in existing e-assessment tools:

• Graph Isomorphism: In this approach a diagram is treated like a graph, with nodes representing the various entities and edges representing the relationships between them. Marking proceeds by searching for graphs or subgraphs isomorphic with the correct diagram. This approach has been implemented in the ABC system developed by the University of Manchester and is reported to have worked well [17]. However, one disadvantage is that it treats diagrams purely as graphs, whereas diagrams can have richer associated information, with the need for distinguished nodes and edges [17]. Essentially, two diagrams may be topologically equivalent yet have wildly differing semantics. This approach might also not be scalable, because of the computational power required to find isomorphic components [19]. Its advantage is that it is general and can be applied to a wide variety of diagrams and problems.



• Local Metrics: This approach complements the graph isomorphism approach by addressing the problem that richer information associated with nodes and edges, such as type and label, is ignored whilst marking. A Local Metric object containing information such as the type of box, degree and label is created for each node of the diagram, and this object is then used to compare two nodes. The ABC system uses this approach in conjunction with the graph isomorphism approach, and it is reported to have worked well [17].



• Object Oriented Metrics: This approach involves calculating various object oriented metrics for cohesion, coupling and so on, and using them to mark the diagrams. The marking tool for object oriented diagrams in the CourseMaker system [15] uses metrics to mark them for correct classes, relationships and completeness. This approach is limited to software design diagrams and cannot be used for other types of diagram. It also assumes that a metric can be a final arbiter of design quality, whereas metrics are generally offered as design heuristics or rules of thumb that are decidedly context sensitive [20].

• Graph Transformation: Although this approach, like the graph isomorphism approach, treats diagrams as graphs, it differs in the way it marks them. The student diagram is treated as a starting graph and the lecturer's correct diagram as the final graph. An attempt is made to transform the student diagram into the correct diagram, and the number of steps required for this transformation is treated as a measure of the distance between the two diagrams and used to mark them. The problem with this approach lies in selecting the transformation steps and the order in which they should be applied. We could not find any existing implementation of this approach.

All of the above approaches to automatically marking diagrams involve matching the labels in the student diagram with the labels in the model diagram. The following are some of the ways used for managing labels in a diagram:

• Manual intervention: The University of Teesside Automated Student Diagram Assessment System [14] uses manual intervention by students to manage labels. Each student is presented with a list of labels within their diagram and the model solution, and requested to select which labels of their own match the labels of the model solution [14].



• Provide a lexicon or thesaurus: Recently the Open University team has reported that they are working on NLP techniques such as handling synonyms, abbreviations, punctuation, hyphenation, stemming and so forth [21].



• Syntactic matching algorithms: The Assess By Computer system from the University of Manchester uses the Edit Distance algorithm [22] for matching labels. According to [17], the algorithm used by the system for label matching returns the maximum score if the two labels are identical; otherwise it returns a score based on the edit distance between the two labels, and above a cut-off edit distance it returns the lowest score of 0. The system also uses a human-computer collaborative approach for matching labels: all labels having an edit distance less than a threshold are considered a match and are presented to the lecturer in the form of a navigable tree, which can then be reviewed and updated by the lecturer [23].



• Semantic matching algorithms: We could not find any literature on using semantic similarity for matching labels, although in principle it is possible to envisage the use of ontologies or a semantic lexicon such as WordNet, which is maintained by the Cognitive Science Laboratory at Princeton University.

Label matching is, therefore, an important process in the e-assessment of most forms of diagram, and although existing tools use some techniques for matching labels, such as edit distance, there is a need for a generic framework for managing labels which can be used across many e-assessment tools. The authors could not find any such generic framework in the existing literature; hence the proposal of the framework, discussed below, which we believe could benefit many tools aimed at e-assessment of diagrams.
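To make the edit-distance scoring rule reported for the Assess By Computer system concrete, the following is a minimal sketch. The cut-off value of 3 and the linear fall-off between an identical match and the cut-off are illustrative assumptions for this sketch, not details taken from [17].

```python
def edit_distance(a: str, b: str) -> int:
    """Classic dynamic-programming Levenshtein edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def label_score(student: str, model: str, cutoff: int = 3) -> float:
    """Maximum score for identical labels, 0 beyond the cut-off,
    otherwise a score falling linearly with edit distance
    (the cut-off and fall-off here are assumed for illustration)."""
    if student == model:
        return 1.0
    d = edit_distance(student, model)
    if d > cutoff:
        return 0.0
    return 1.0 - d / (cutoff + 1)
```

For example, `label_score("update", "updat")` gives 0.75 under these assumptions, while two very different labels score 0.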

3. FRAMEWORK: DIAGRAM LABEL PROCESSING AND MATCHING FRAMEWORK
We propose a five stage framework to process and match the labels used in diagrams for the purpose of automatically marking them. The first stage processes the labels for any explicit or implicit imprecision and produces a set of labels that can be used in the further stages. The second stage runs various algorithms for syntactically matching the labels in the student's answer with those in the model diagram and produces a matrix for the syntactic similarity index. The third stage does the same task as the second stage but for semantic similarity, producing a matrix for a semantic similarity index. The fourth stage combines the results from the syntactic and semantic similarity index matrices and produces a combined similarity index as output. The fifth and final stage analyses the combined similarity index values from the fourth stage and automatically marks a label in the student's diagram as correct if a match can be found in the model diagram and incorrect if no match can be found. Table 2 lists the stages of the framework and Table 3 summarizes the input and output of each stage.


Reference | Stage Name          | Sub Stage Name
S1.1      | Pre-processing      | Lowercase and Trimming
S1.2      | Pre-processing      | Special Character Replacement
S1.3      | Pre-processing      | Abbreviation Expansion
S1.4      | Pre-processing      | Auto Correct
S1.5      | Pre-processing      | Removing Stopwords
S1.6      | Pre-processing      | Stemming
S2        | Syntactic Matching  | Syntactic Algorithm
S3        | Semantic Matching   | Semantic Algorithm
S4        | Combined Similarity | Combined Similarity
S5        | Analysis            | Best Match Selection

Table 2: Label Matching Framework Stages

Stage Name                    | Input                                 | Output
Pre-processing (6 Sub Stages) | Labels                                | Processed labels
Syntactic Matching            | Processed labels                      | Syntactic similarity
Semantic Matching             | Processed labels                      | Semantic similarity
Combined Similarity           | Similarity index from stages S2 and S3 | Combined similarity
Analysis                      | Metric for combined similarity        | Pairs of matching labels

Table 3: Framework Stage Input Output
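The stage inputs and outputs summarised in Table 3 compose into a simple dataflow, which the following sketch illustrates. All function bodies here are placeholder assumptions, not the authors' implementation; in particular, the paper does not specify how stage S4 combines the two indices, so an element-wise maximum is assumed.

```python
def preprocess(labels):
    """Stage S1: six sub-stages (only S1.1, lowercase and trimming, shown)."""
    return [label.strip().lower() for label in labels]

def syntactic_matrix(student, model):
    """Stage S2: placeholder syntactic similarity (exact match only)."""
    return [[1.0 if s == m else 0.0 for m in model] for s in student]

def semantic_matrix(student, model):
    """Stage S3: placeholder semantic similarity (not yet implemented)."""
    return [[0.0 for _ in model] for _ in student]

def combine(syn, sem):
    """Stage S4: assumed element-wise maximum of the two indices."""
    return [[max(a, b) for a, b in zip(r1, r2)] for r1, r2 in zip(syn, sem)]

def best_matches(combined, student, model, threshold=0.6):
    """Stage S5: keep each student label's best model match above the threshold."""
    matches = {}
    for i, row in enumerate(combined):
        j = max(range(len(row)), key=row.__getitem__)
        if row[j] >= threshold:
            matches[student[i]] = model[j]
    return matches

student = preprocess(["Send Msg ", "LOGIN"])
model = preprocess(["send msg", "logout"])
combined = combine(syntactic_matrix(student, model),
                   semantic_matrix(student, model))
print(best_matches(combined, student, model))   # -> {'send msg': 'send msg'}
```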

3.1 Stage One: Pre-processing
The Pre-processing Stage comprises measures carried out before any actual similarity matching is undertaken and includes the following methods. It takes the raw labels as input and produces a set of processed labels which are used in the subsequent stages. The advantage of this stage is that it reduces the number of labels to be matched by eliminating duplicates. Previous research has shown that this stage is effective [2].

• Lowercase and Trimming: This is fairly simple and involves converting all labels to lowercase and trimming surrounding whitespace.



• Special Character Replacement: Some special characters, such as the single quotation mark, are removed, while others, such as “_" and “&", are replaced by a single space “ ".



• Abbreviation Expansion: This involves expanding abbreviations like “msg" to “message".



• Auto Correct: This involves running a spell checker, identifying incorrectly spelled labels and automatically correcting them by replacing each with the first word suggested by the spell checker.



• Removing Stopwords: This involves removing very common words, such as “is”, “an”, “the” and “to”, that are not relevant for matching. A list of English-language stop words used by Google is given at English Stopwords [24].



• Stemming: This involves stemming all the labels, that is, converting each word to its root form [25, 26].
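The six sub-stages above can be sketched as a single Python function. The abbreviation table, the stop-word subset, the assumed spell-checker correction and the naive suffix-stripping stemmer below are all illustrative assumptions; a real system would use a full spell checker, the Google stop-word list [24] and a proper stemmer [25, 26].

```python
import re

ABBREVIATIONS = {"msg": "message", "no": "number"}   # illustrative table
STOPWORDS = {"is", "an", "the", "to", "a", "of"}      # small illustrative subset

def autocorrect(word):
    """Placeholder for S1.4: a real system would replace a misspelt word
    with the spell checker's first suggestion."""
    return {"messge": "message"}.get(word, word)      # assumed example correction

def stem(word):
    """Naive suffix stripping standing in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def preprocess_label(label):
    label = label.strip().lower()                     # S1.1 lowercase and trimming
    label = label.replace("'", "")                    # S1.2 remove quotation marks
    label = re.sub(r"[_&]", " ", label)               # S1.2 replace "_" and "&" by space
    words = label.split()
    words = [ABBREVIATIONS.get(w, w) for w in words]  # S1.3 abbreviation expansion
    words = [autocorrect(w) for w in words]           # S1.4 auto correct
    words = [w for w in words if w not in STOPWORDS]  # S1.5 stop-word removal
    words = [stem(w) for w in words]                  # S1.6 stemming
    return " ".join(words)

print(preprocess_label("Send_the Msg"))   # -> "send message"
```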


3.2 Stage Two: Syntactic Matching
The second stage calculates the syntactic similarity between labels using various syntax matching algorithms and produces as output a set of matrices for the syntactic similarity index. This stage of the framework is flexible and can be customized to use other syntax matching algorithms. For this study, we have considered a total of seven syntax matching algorithms, of which five are existing algorithms and two are modified algorithms introduced by ourselves.

• Exact Match Algorithm: This simply compares the two labels and produces the value 1 if an exact match is found, and 0 otherwise.



• Levenshtein Distance Algorithm: “The Levenshtein distance between two strings is given by the minimum number of operations needed to transform one string into the other, where an operation is an insertion, deletion, or substitution of a single character." [27]



• Q-gram Algorithm: “Q-grams are typically used in approximate string matching by sliding a window of length q over the characters of a string to create a number of 'q' length grams for matching. A match is then rated as the number of q-gram matches within the second string over possible q-grams." [28]



• Simon Algorithm: This algorithm uses the number of lexically similar adjacent character pairs contained in the two labels as a measure of their similarity [29].



• Soundex Algorithm: This uses the similarity in sound produced by two labels as a measure of the similarity between them and is based on the Soundex algorithm [30, 31]. So, for example, “Update" and “Updat" will have a Soundex similarity score of 1, whilst “go" and “come" will have a Soundex similarity score of 0. This algorithm has variations such as RefinedSoundex, Metaphone and DoubleMetaphone [32], and we intend to include them in the framework in future.
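The five existing word-to-word measures can be sketched as follows. Reducing each to a similarity score in [0, 1] requires choices the cited sources leave open; the normalisations below (dividing the Levenshtein distance by the longer word, rating q-grams of the first word against the second, treating equal Soundex codes as a score of 1) are our illustrative assumptions.

```python
def exact(a, b):
    """Exact match: 1 if identical, 0 otherwise."""
    return 1.0 if a == b else 0.0

def levenshtein_sim(a, b):
    """1 minus the edit distance normalised by the longer word [27]."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return 1.0 - prev[-1] / max(len(a), len(b), 1)

def qgram_sim(a, b, q=2):
    """Share of q-grams of a that also occur in b [28]."""
    grams = lambda s: [s[i:i + q] for i in range(len(s) - q + 1)]
    ga, gb = grams(a), grams(b)
    if not ga or not gb:
        return exact(a, b)
    return sum(g in gb for g in ga) / len(ga)

def simon_sim(a, b):
    """Adjacent-character-pair (bigram Dice) similarity [29]."""
    pairs = lambda s: [s[i:i + 2] for i in range(len(s) - 1)]
    pa, pb = pairs(a), pairs(b)
    if not pa or not pb:
        return exact(a, b)
    common, rest = 0, list(pb)
    for p in pa:
        if p in rest:
            common += 1
            rest.remove(p)
    return 2.0 * common / (len(pa) + len(pb))

def soundex(word):
    """Classic four-character Soundex code [30, 31]."""
    codes = {**dict.fromkeys("bfpv", "1"), **dict.fromkeys("cgjkqsxz", "2"),
             **dict.fromkeys("dt", "3"), "l": "4",
             **dict.fromkeys("mn", "5"), "r": "6"}
    word = word.lower()
    out, last = word[0].upper(), codes.get(word[0], "")
    for ch in word[1:]:
        code = codes.get(ch, "")
        if code and code != last:
            out += code
        if ch not in "hw":
            last = code
    return (out + "000")[:4]

def soundex_sim(a, b):
    """1 if the Soundex codes are equal, 0 otherwise (assumed normalisation)."""
    return 1.0 if soundex(a) == soundex(b) else 0.0
```

Under these assumptions `soundex_sim("update", "updat")` returns 1 and `soundex_sim("go", "come")` returns 0, matching the example above.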

The two modified syntax matching algorithms introduced by us are as follows.

1. Algo-Max: This algorithm takes as input two labels and returns a syntactic similarity index between 0 and 1, with 1 meaning a perfect match. It runs the five algorithms described above, thus obtaining five different similarity scores, and then simply returns the maximum value.

2. Combined Syntactic Algorithm: This method combines the existing word-to-word syntax matching algorithms into a label-to-label syntax matching algorithm. Like the Algo-Max algorithm, it uses the five algorithms and returns a syntactic similarity index between 0 and 1, but it differs in the way it runs these algorithms and processes their results. While Algo-Max runs the various syntax algorithms on the two complete labels as they are, this algorithm runs them on one word from the first label and one word from the second label at a time. It does this for each pair of words in the Cartesian product of the words in the first and second labels, obtaining a similarity index for each pair of words. It then selects the matching pairs of words based on a threshold value and the maximum value of the similarity index. While selecting the matching pairs, it removes duplicates, because one word in the first label can have only one corresponding matching word in the second label. The following steps describe this algorithm.

(a) Input two labels Li and Lj.

(b) Divide each label into its words, say Li has words wi1, ..., wi3 and Lj has words wj1, ..., wj3, where the number of words in each label in this example is ni = nj = 3.

(c) Prepare the Cartesian product of word pairs between the words in the first and second labels, giving (wi1, wj1), (wi1, wj2), (wi1, wj3), ..., (win, wjn).

(d) For each pair of words, calculate the Syntactic Similarity Index (SynSI) using the following formula:

    SynSI(wi, wj)csa = max over p = 1, ..., 5 of SynSI(wi, wj)algo_p

where csa denotes the Combined Syntactic Algorithm, SynSI(wi, wj)algo_p denotes the syntactic similarity value returned by executing algorithm p, and p is defined as:

i. Plain Match algorithm (p = 1)
ii. Levenshtein Distance algorithm (p = 2)
iii. Q-gram algorithm (p = 3)
iv. Simon algorithm (p = 4)
v. Soundex algorithm (p = 5)

(e) For each word wk in label Li, select the word wl in label Lj such that the pair has the maximum value of SynSI(wk, wl)csa. Remove duplicates and recalculate, because one word in a label can have only one corresponding matching word in the other label. If a word in the first label has the same similarity index value with two words in the other label, select one at random.

(f) Set the word similarity threshold T(SynSI) = 0.6.

(g) For the word pairs for which SynSI(wi, wj)csa >= T(SynSI), treat the words as matching.
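Steps (a) to (g) can be sketched as follows. For brevity the word-level SynSI here takes the maximum of only two of the five measures (exact match and normalised Levenshtein distance); a full implementation would run all five. Duplicate removal is done greedily by descending score, and the final aggregation of the kept word scores into a single label score, which the steps above do not spell out, is assumed to be an average over the longer label.

```python
from itertools import product

def levenshtein_sim(a, b):
    """1 minus the edit distance normalised by the longer word."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return 1.0 - prev[-1] / max(len(a), len(b), 1)

def syn_si(wi, wj):
    """Word-level SynSI: maximum over the measures used
    (only exact match and Levenshtein here, two of the five)."""
    return max(1.0 if wi == wj else 0.0, levenshtein_sim(wi, wj))

def combined_syntactic(label_i, label_j, threshold=0.6):
    wi, wj = label_i.split(), label_j.split()                      # (a), (b)
    scored = sorted(                                               # (c), (d)
        ((syn_si(a, b), i, j)
         for (i, a), (j, b) in product(enumerate(wi), enumerate(wj))),
        reverse=True)
    used_i, used_j, kept = set(), set(), []                        # (e)-(g)
    for s, i, j in scored:
        if s >= threshold and i not in used_i and j not in used_j:
            used_i.add(i); used_j.add(j); kept.append(s)
    # Aggregation into a label score is assumed: average over the longer label.
    return sum(kept) / max(len(wi), len(wj)) if kept else 0.0
```

For example, `combined_syntactic("update customer record", "updat customer details")` keeps the pairs (customer, customer) and (update, updat) and discards (record, details), which falls below the 0.6 threshold.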
