Ranking Student Ability and Problem Difficulty using Learning Velocities

Ananya H. A., Akhilesh Hegde I., Akshay G. Joshi and Viraj Kumar

Ananya H. A., PES Institute of Technology, Bangalore, India, e-mail: [email protected]
Akhilesh Hegde I., PES Institute of Technology, Bangalore, India, e-mail: [email protected]
Akshay G. Joshi, PES Institute of Technology, Bangalore, India, e-mail: [email protected]
Viraj Kumar, PES University, Bangalore, India, e-mail: [email protected]

Abstract Several educational software tools allow students to hone their problem-solving skills using practice problems and feedback in the form of hints. If one can meaningfully define the “distance” between any incorrect student attempt and the correct solution, it is possible to define the student’s learning velocity for that problem: the rate at which the student is able to decrease this distance. In this paper, we present an extension to one such educational software tool (JFLAP) that permits us to compute learning velocities for each student on practice problems involving finite automata construction. These learning velocities are helpful in at least two ways: (1) instructors can rank students (e.g., by identifying students whose learning velocities on most problems are significantly below the class average, and who may therefore require the instructor’s attention), and (2) instructors can rank problems according to difficulty (e.g., while designing a question paper, a “difficult” problem might be one where only a few students have quickly converged to the correct solution).
1 Introduction and Related Work
Instructors and students in several domains have access to a wealth of educational software tools, many of which are freely available. The data generated by learners as they interact with such tools can be exploited using Educational Data Mining (EDM) techniques to help educators answer a variety of interesting questions, such as “which learning material will a particular sub-category of students benefit most from?” [2], or “how do different types of student behavior impact their learning?” [3]. In this paper, we are concerned with the following question: “how effectively can a particular student solve a particular problem?” To answer such a question on the basis of data, it is necessary to quantify the “effectiveness” of the student’s approach. A rudimentary measure is the time required by the student to solve the problem: an “effective” student is one who can solve the problem quickly. Unfortunately, this measure is unsuitable for weak students, who may struggle to completely solve the problem but may nevertheless demonstrate signs of progress in their approach. In some problem domains, it is possible to define a notion of distance between the correct solution to a problem and any incorrect solution. Here, we can define “effectiveness” as the rate at which the student is able to decrease this distance (ideally, but not necessarily, to zero). We call this rate the learning velocity of the particular student for the particular problem.

The novel contributions of this paper are: (1) an extension of an educational software tool (JFLAP [7]) to measure learning velocities for a specific problem domain (constructing deterministic finite automata) as an input to other EDM tools; (2) a study of how this measure can answer the target question “how effectively can a particular student solve a particular problem?” in this domain, thereby yielding a new way to rank students by ability; and (3) a new way to use this data to rank problems according to their difficulty.

Finite automata are used extensively to model computational processes, and their study is a core component of the undergraduate Computer Science curriculum [6]. JFLAP is an extremely popular open-source tool that is used worldwide for teaching finite automata and related concepts (see [5] for details and usage statistics). The JFLAP community encourages researchers to contribute extensions to the core software [8], and one such recent extension allows students to solve practice problems [9]. In our work, we log the data generated by students as they practice these problems, and thereby compute learning velocities.

There are a number of ways to define the distance between two deterministic finite automata (DFAs), and a combination of three such distance functions has proved to be quite effective for grading DFA assignments [1]. (For non-deterministic finite automata (NFAs), a distance could be induced by determinizing both automata and computing the distance between the corresponding DFAs. This induced notion of distance seems inadequate, because small changes to an NFA can greatly alter the corresponding DFA; a distance measure for NFAs in this context remains an open problem.) The distance functions of [1] are computationally nontrivial and add substantial bulk to the JFLAP package, so we define simpler functions that nevertheless capture distance effectively. We describe these functions in Section 2 and define learning velocity for DFA construction. We describe our experiments to validate this notion of learning velocity in Section 3, and Section 4 documents our results. Lastly, in Section 5 we discuss our findings and plans for future work.
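Throughout the paper, a learning velocity is the slope of a distance-versus-time curve. As a minimal sketch (our Python illustration; the actual extension is implemented inside JFLAP's Java code base), the overall velocity of one student on one problem can be computed from the logged attempts as follows, for any distance function of the kind defined in Section 2:

```python
def learning_velocity(attempts, solution, distance):
    """Overall learning velocity for one student on one problem.

    attempts: chronological list of (seconds_elapsed, dfa) pairs,
              one pair per attempt logged by the tool.
    solution: the instructor's correct DFA.
    distance: any function mapping a pair of DFAs to a non-negative
              number, e.g. one of the functions of Section 2.
    """
    t_first, dfa_first = attempts[0]
    t_last, dfa_last = attempts[-1]
    drop = distance(dfa_first, solution) - distance(dfa_last, solution)
    elapsed = t_last - t_first
    # Rate at which the student closed the gap; zero if only one attempt.
    return drop / elapsed if elapsed > 0 else 0.0
```

Computing the slope between consecutive attempts instead of end-to-end gives the per-segment view plotted in Section 4; both are straightforward variations.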
2 Distance and Learning Velocity for DFA Problems
Our distance functions are motivated by observing student errors. An important kind of error (also pointed out in [1]) is misunderstanding what the question is asking. This kind of error is often observed in students unfamiliar with the language of instruction, but it can also occur because of inherent ambiguities in a natural language such as English. For example, consider constructing a DFA for binary strings where every 1 is “followed by” at least two 0’s. This is a fairly typical kind of problem, but there are at least two ways of interpreting the quoted expression: (1) binary strings where every 1 is immediately followed by at least two 0’s (solution DFA1, shown in Fig. 1), and (2) binary strings where every 1 is eventually followed by at least two 0’s (solution DFA2, shown in Fig. 2). Suppose DFA1 is the expected solution, and the student’s incorrect solution is DFA2. Note that the two DFAs are similar: both have only one final state, which is also the initial state.

We capture such structural information for every state q of a DFA as a Boolean vector vq. Specifically, this vector encodes the following basic information: whether state q is the initial state or not, and whether state q is a final (accepting) state or not. Also, for each letter a of the DFA’s input alphabet (a ∈ {0, 1} in this example), vq encodes whether the transition from state q on input a is to itself or to a different state, and whether the transition’s target is a final state or not. For example, the vectors associated with state p0 in DFA1 and state q0 in DFA2 are identical. In contrast, the vectors associated with states p1 and q1 are different, because the transition in DFA1 from state p1 on input 1 is to another state (p3), whereas the transition in DFA2 from state q1 on input 1 is to itself. For an input alphabet consisting of k letters, the number of distinct Boolean vectors is exactly 2^(2k+2) (for the binary alphabet, k = 2).

Fig. 1 DFA1 for binary strings where every 1 is immediately followed by at least two 0’s.
Fig. 2 DFA2 for binary strings where every 1 is eventually followed by at least two 0’s.
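To make the vector vq concrete, here is a minimal Python sketch (the representation, class name, and field names are ours, introduced purely for illustration; the remaining sketches in this section reuse them):

```python
from dataclasses import dataclass

@dataclass
class DFA:
    states: set       # state names
    alphabet: set     # input letters
    delta: dict       # total transition function: (state, letter) -> state
    start: object     # the initial state
    finals: set       # the accepting (final) states

def state_vector(dfa, q):
    """Boolean vector v_q for state q: two bits for the state itself, plus
    two bits per letter, i.e. 2k + 2 bits and 2**(2k+2) possible vectors."""
    bits = [q == dfa.start,        # is q the initial state?
            q in dfa.finals]       # is q a final state?
    for a in sorted(dfa.alphabet):
        target = dfa.delta[(q, a)]
        bits.append(target == q)           # self-loop on letter a?
        bits.append(target in dfa.finals)  # is the target a final state?
    return tuple(bits)
```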
Fig. 3 Symmetric difference of DFA1 and DFA2.
Since any DFA can be effectively minimized into a unique representation (up to the renaming of states), we define distances between minimized DFAs. This ensures that two distinct but equivalent student DFAs will be at the same distance from the solution DFA.
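Any textbook minimization algorithm suffices here; the following compact Moore-style partition refinement is one sketch, assuming every state is reachable (true by construction for the DFAs students draw and for the product automata built below):

```python
def minimize(dfa):
    """Moore-style partition refinement. Assumes all states are reachable.
    Returns the quotient DFA, whose states are block numbers."""
    letters = sorted(dfa.alphabet)
    part = {q: int(q in dfa.finals) for q in dfa.states}  # finals vs. rest
    while True:
        # Signature of q: its current block plus the blocks of its successors.
        sig = {q: (part[q], tuple(part[dfa.delta[(q, a)]] for a in letters))
               for q in dfa.states}
        ids = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        refined = {q: ids[sig[q]] for q in dfa.states}
        if len(set(refined.values())) == len(set(part.values())):
            part = refined
            break                  # partition is stable: done
        part = refined
    return DFA(states=set(part.values()),
               alphabet=set(dfa.alphabet),
               delta={(part[q], a): part[dfa.delta[(q, a)]]
                      for q in dfa.states for a in letters},
               start=part[dfa.start],
               finals={part[q] for q in dfa.finals})
```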
2.1 Normalized Vector Distance

For a given DFA, we can count the number of its states of each of the 2^(2k+2) vector types. We scale this count vector by dividing by the total number of states in the DFA. Given two DFAs, we define the normalized vector distance between them as the Euclidean distance between the two scaled vectors. For instance, the normalized vector distance between DFA1 and DFA2 is 0.5, but this jumps to 0.6455 if q2 is converted into a final state. In practice, this function seems to be the most reliable among the three distance functions we have considered.
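In code, the scaled vectors never need all 2^(2k+2) coordinates, since vector types occurring in neither DFA contribute nothing to the sum; a sketch reusing state_vector from above:

```python
from collections import Counter
from math import sqrt

def normalized_vector_distance(d1, d2):
    """Euclidean distance between the scaled vector-type counts of two DFAs."""
    def scaled(dfa):
        counts = Counter(state_vector(dfa, q) for q in dfa.states)
        n = len(dfa.states)
        return {vec: c / n for vec, c in counts.items()}
    s1, s2 = scaled(d1), scaled(d2)
    # Vector types absent from both DFAs contribute 0 and are skipped.
    return sqrt(sum((s1.get(v, 0.0) - s2.get(v, 0.0)) ** 2
                    for v in set(s1) | set(s2)))
```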
2.2 Symmetric Distance

Let us define the symmetric difference of two DFAs to be the minimized DFA that accepts precisely the strings accepted by exactly one of the two given DFAs. As an example, Fig. 3 shows the symmetric difference DFA for DFA1 and DFA2. If the two DFAs are equivalent, then the symmetric difference DFA has a single (rejecting) state, and in this special case we define the symmetric distance between the two DFAs to be zero. In all other cases, the symmetric distance between the two DFAs is defined as the number of states in the symmetric difference DFA. The symmetric distance between DFA1 and DFA2 above is 6, and this value is unchanged if q2 is converted into a final state.

This notion of distance appears quite natural, because it captures the complexity (as measured by the number of DFA states) of the language on which the two DFAs disagree. However, this distance appears to be useful only in a few cases. For instance, if the two DFAs are complements of each other (i.e., they are identical except that any accepting state in one DFA is a rejecting state in the other, and vice versa), then the symmetric distance between the DFAs is 1 (suggesting that the DFAs are “close”, which indeed they are), whereas all the other distances suggest that the DFAs are “far” from each other.
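Computationally, the symmetric difference DFA is the classical product construction with XOR acceptance, restricted to reachable state pairs and then minimized. A sketch, assuming the two DFAs share one alphabet and reusing the helpers above:

```python
def product_dfa(d1, d2, accept):
    """Product construction over reachable state pairs; `accept` decides
    acceptance from the two components' acceptance. Assumes that
    d1.alphabet == d2.alphabet."""
    start = (d1.start, d2.start)
    states, delta, frontier = {start}, {}, [start]
    while frontier:
        p, q = frontier.pop()
        for a in d1.alphabet:
            t = (d1.delta[(p, a)], d2.delta[(q, a)])
            delta[((p, q), a)] = t
            if t not in states:
                states.add(t)
                frontier.append(t)
    finals = {(p, q) for (p, q) in states
              if accept(p in d1.finals, q in d2.finals)}
    return DFA(states, set(d1.alphabet), delta, start, finals)

def symmetric_difference(d1, d2):
    # Accept exactly the strings on which the two DFAs disagree (XOR).
    return minimize(product_dfa(d1, d2, lambda x, y: x != y))

def symmetric_distance(d1, d2):
    sd = symmetric_difference(d1, d2)
    # Equivalent DFAs: one rejecting state, distance defined as zero.
    return 0 if not sd.finals else len(sd.states)
```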
2.3 Structural Distance

Using the notion of structural information defined earlier (a vector vq for each state q), we define the structural distance between two DFAs as the number of distinct vectors among the states of their symmetric difference DFA. For instance, the structural distance between DFA1 and DFA2 is 5, but this increases to 6 if q2 is converted into a final state. Once again, in the special case where the two DFAs are equivalent, the structural distance between them is defined to be zero.

In this paper, we use a linear combination of these three distance functions, tuning the relative weight of each function by linear regression so that the combined distance closely matches the edit distance function (defined in [1]) on a set of test data.
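The structural distance reuses the two previous sketches directly, and the weights can be fitted by ordinary least squares; the training-set names below are placeholders for whatever labeled corpus is available:

```python
import numpy as np

def structural_distance(d1, d2):
    """Number of distinct state vectors in the symmetric difference DFA;
    zero in the special case of equivalent DFAs."""
    sd = symmetric_difference(d1, d2)
    if not sd.finals:                  # equivalent DFAs
        return 0
    return len({state_vector(sd, q) for q in sd.states})

FEATURES = (normalized_vector_distance, symmetric_distance, structural_distance)

def fit_weights(dfa_pairs, edit_distances):
    """Least-squares weights so that the combined distance approximates
    the edit distance of [1] on labeled training pairs (placeholders)."""
    X = np.array([[f(a, b) for f in FEATURES] for a, b in dfa_pairs])
    y = np.array(edit_distances)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def combined_distance(d1, d2, w):
    """Weighted combination of the three distance functions."""
    return sum(wi * f(d1, d2) for wi, f in zip(w, FEATURES))
```

A fixed-weight closure of combined_distance (e.g., lambda a, b: combined_distance(a, b, w)) can then be passed wherever a two-argument distance function is expected, such as the learning velocity sketch in Section 1.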
3 Experiments
We have modified an existing open-source extension to JFLAP [9], which provides students with two kinds of feedback, shown in Fig. 4 and Fig. 5. Before the student gets started on a practice problem, and at any time thereafter, she can request two types of hints that help her understand the problem better (Fig. 4): she can view (randomly chosen) strings that the DFA should accept, and she can test whether specific strings should be accepted or rejected by the DFA. Once the student is ready to tackle the problem and draws a candidate DFA, additional feedback is generated (Fig. 5). This always informs the student whether her solution is correct or incorrect. In the latter case, the student can optionally request additional hints (as shown in Fig. 5), and the tool generates strings that her DFA should accept (but doesn’t), as well as strings that her DFA should reject (but doesn’t).

Fig. 4 Feedback available to students before/during problem solving.

Fig. 5 Feedback after testing a potential solution.

For this paper, we have further modified JFLAP to record the time at which these interactions take place. We also log the actual DFA constructed by the student at each attempt, and whether the student consulted any additional hints (as shown in Fig. 4 and Fig. 5) between the previous attempt and the current one. (A sketch of the log record format we assume appears after Table 1.)

For our experiments, we designed a set of 15 unique DFA construction questions, shown in Table 1. We requested 30 student volunteers to attempt to solve as many of these problems as possible within a fixed amount of time (90 minutes). Our tool presented questions in random order from the pool of 15, and all students worked independently. After they completed their work, we requested students to allow us to investigate the (anonymized) logs of their attempts recorded by JFLAP. All 30 students agreed to submit their logs, and we obtained a total of 350 attempts across all students (including multiple attempts for the same problem).

Table 1. The 15 DFA construction questions
Question  Description
1         Strings over {0, 1} whose length is a multiple of 3
2         Strings over {a, b} whose 2nd letter from the left is a
3         Strings over {0, 1} that contain 011
4         Strings over {a, b} with exactly two a’s
5         Strings over {a, b} with at least two consecutive a’s
6         Strings over {a, b, c} with at least three letters not equal to a
7         Strings over {a, b} with 0 or more a’s followed by an odd number of b’s
8         Strings over {a, b} that start with ab and end with aa
9         Strings over {a, b} that are palindromes of length 3
10        Strings over {0, 1} whose integer value is divisible by 6
11        Strings over {0, 1} that either start with 0 and have odd length, or start with 1 and have even length
12        Strings over {0, 1} that contain neither 00 nor 11
13        Strings over {a, b} whose length is a multiple of 3 and which consist of 0 or more a’s followed by 0 or more b’s
14        Strings over {0, 1} whose integer value is divisible by 3 but NOT divisible by 4
15        Strings over {0, 1} where each occurrence of 01 is immediately followed by 11
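For concreteness, one logged attempt can be pictured as a JSON line such as the following sketch produces; the field names and on-disk format here are illustrative, not the exact format used by our JFLAP extension:

```python
import json, time

def encode_dfa(dfa):
    """Serialize a DFA (as represented in Section 2) for the log."""
    return {"states": sorted(map(str, dfa.states)),
            "start": str(dfa.start),
            "finals": sorted(map(str, dfa.finals)),
            "delta": {f"{q},{a}": str(t) for (q, a), t in dfa.delta.items()}}

def log_attempt(log_file, student_id, question_id, dfa, hints, correct):
    record = {"student": student_id,       # anonymized identifier
              "question": question_id,     # 1..15, see Table 1
              "timestamp": time.time(),    # when the attempt was tested
              "dfa": encode_dfa(dfa),      # the DFA as drawn by the student
              "hints_since_last_attempt": hints,
              "correct": correct}
    log_file.write(json.dumps(record) + "\n")
```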
4 Results
We were interested in determining whether the logs could help us rank students (according to ability) and problems (according to difficulty). While analyzing the data, we discovered that several students made multiple attempts to solve a problem, sometimes spending almost 20 minutes on a single problem. We were curious to see whether these students were making “progress” over these multiple attempts. Hence, we defined our notions of distance and learning velocity to quantify the “progress” (if any) made by students.

Our tool can compare the attempts made for a particular problem by one student against the attempts made for the same problem by one or more other students. The comparison can be plotted as shown in Fig. 6, which compares the attempts made by student 9 and student 16 on a particular problem. The x-axis represents the time spent (in seconds), the y-axis represents distance (as defined in Section 2), and the slope of the line represents the learning velocity. Fig. 6 also shows the additional hints used by each student as they worked their way through the problem. Both students were able to solve the problem, but while student 16 required 1084 seconds (~18 minutes), student 9 required only 284 seconds (less than 5 minutes). Furthermore, student 9 used just one additional hint (apart from the “incorrect” feedback after each attempt except the last), whereas student 16 required several hints. (It is interesting to note that student 16 actually made more initial progress, before moving away from the solution.)

Apart from these differences, there are interesting similarities between the two plots in Fig. 6. The initial attempts of both students are quite far from the correct solution, and both make several attempts with little or no apparent “progress”, as represented by the plateaus in the two plots. Then both students finally “get it”, and the distance falls sharply. On closer investigation, we discovered that both students were making exactly the same error before they finally recognized the issue.

Fig. 6 Comparison of learning velocities of two students for the same problem.
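Plots like Fig. 6 are easy to reproduce from the logged (time, distance) points; a matplotlib sketch (the data points in the commented usage are invented, for illustration only):

```python
import matplotlib.pyplot as plt

def plot_learning_curves(curves, title=""):
    """curves: dict mapping a label to chronological (seconds, distance)
    points, precomputed from the logs with a distance function of
    Section 2. The slope of each curve is the learning velocity."""
    fig, ax = plt.subplots()
    for label, points in curves.items():
        ts, ds = zip(*points)
        ax.plot(ts, ds, marker="o", label=label)
    ax.set(xlabel="time spent (seconds)",
           ylabel="distance to correct solution", title=title)
    ax.legend()
    plt.show()

# Invented data, for illustration only:
# plot_learning_curves({"student 9":  [(0, 4.0), (150, 3.5), (284, 0.0)],
#                       "student 16": [(0, 4.0), (300, 2.5), (700, 3.0),
#                                      (1084, 0.0)]})
```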
Fig. 7 Comparison of learning velocities of the same student on two different problems.
On the basis of this data, we can confidently rank student 9 higher than student 16 for this type of problem. However, a student’s learning velocity varies (as one would expect) from problem to problem. Fig. 7 compares the learning velocities of a single student on two problems (10 and 12) of similar difficulty. (In the subjective opinion of the last author, who has experience teaching this material, students should find problem 10 slightly easier than problem 12.) The student’s initial attempt for problem 12 was quite close to being correct, but subsequent attempts moved slightly away from the solution, before finally getting very close to (but not equal to) the correct DFA. At this point, the student appears to have given up. In contrast, the same student began problem 10 from much further away, and spent considerably more time (making steady progress throughout) before obtaining the correct solution.
5 Discussion and Future Directions
Our extension to JFLAP can compute learning velocities based on any distance function. This data is presented to the instructor for visualization, as explained in Section 4, and can also be fed into other EDM tools.

Our findings raise several interesting questions. For instance, consider the two learning velocity curves shown in Fig. 6. What explains the long time both students spent while making no “progress”? In this case, we discovered that the question supported at least two valid interpretations (similar to the example presented in Section 2). Thus, our visualization can help the instructor identify such problematic questions by probing curious patterns further. Another question is raised by the visualization in Fig. 7: why did this student abandon problem 12 despite getting “so close” to the solution? When we investigated this, we discovered that JFLAP was giving feedback/hints that were confusing. Tools that automatically generate useful feedback for students are naturally desirable, but the possibility that students fail to understand these hints has also been noted elsewhere [4].

Our distance functions, being simpler than those in [1], can be computed very rapidly. Thus, we can add a “How close am I?” feedback button for students to check their progress. We are investigating the usefulness of this feature. We have also added features to JFLAP that permit instructors to identify questions for which the learning velocities are generally fast (which corresponds to easy questions) or generally slow (hard questions), based on a corpus of student logs; a sketch of such a ranking appears below. Finally, we believe that administrators can use learning velocities to assess the quality of learning for a given batch of students, by determining learning velocities for all students on a pre-defined type of problem. These statistics can, for instance, be used to measure the quality of instruction given by the course instructor, relative to similar statistics gathered for previous batches of students.
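As a sketch of both rankings (assuming learning velocities have already been computed per student and per question; the threshold below is an arbitrary illustrative choice, not a tuned value):

```python
from statistics import mean

def rank_questions(velocities_by_question):
    """velocities_by_question: dict question_id -> list of learning
    velocities, one per student attempt. Slowest average first = hardest."""
    return sorted(velocities_by_question,
                  key=lambda q: mean(velocities_by_question[q]))

def flag_struggling_students(velocities_by_student, fraction=0.5):
    """Flag students whose velocity is well below the class average on a
    majority of their problems. `fraction` is an illustrative cut-off."""
    class_avg = mean(v for vs in velocities_by_student.values() for v in vs)
    return [s for s, vs in velocities_by_student.items()
            if sum(v < fraction * class_avg for v in vs) > len(vs) / 2]
```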
References

1. Alur, R., D’Antoni, L., Gulwani, S., Kini, D. and Viswanathan, M.: Automated Grading of DFA Constructions. In: Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence (IJCAI’13), pp. 1976–1982 (2013)
2. Beck, J.E. and Mostow, J.: How who should practice: Using learning decomposition to evaluate the efficacy of different types of practice for different types of students. In: Proceedings of the 9th International Conference on Intelligent Tutoring Systems, pp. 353–362 (2008)
3. Cocea, M., Hershkovitz, A. and Baker, R.S.J.D.: The Impact of Off-task and Gaming Behaviors on Learning: Immediate or Aggregate? In: Proceedings of the 14th International Conference on Artificial Intelligence in Education, pp. 507–514 (2009)
4. D’Antoni, L., Kini, D., Alur, R., Gulwani, S., Viswanathan, M. and Hartmann, B.: How Can Automatic Feedback Help Students Construct Automata? ACM Transactions on Computer-Human Interaction, Vol. 22, No. 2, Article 9 (2015)
5. JFLAP website: http://www.jflap.org
6. Joint Task Force on Computing Curricula, Association for Computing Machinery (ACM) and IEEE Computer Society: Computer Science Curricula 2013: Curriculum Guidelines for Undergraduate Degree Programs in Computer Science. ACM, New York, NY (2013)
7. Rodger, S.H. and Finley, T.W.: JFLAP – An Interactive Formal Languages and Automata Package. Jones and Bartlett, Sudbury, MA (2006)
8. Rodger, S.H., Lim, J. and Reading, S.: Increasing interaction and support in the formal languages and automata theory course. In: Innovation and Technology in Computer Science Education (ITiCSE 2007), pp. 58–62 (2007)
9. Shekhar, V.S., Agarwalla, A., Agarwal, A., Nitish, B. and Kumar, V.: Enhancing JFLAP with automata construction problems and automated feedback. In: Proceedings of the 7th International Conference on Contemporary Computing (IC3’14), pp. 19–23 (2014)