Future Educational Technology with Big Data and Learning Analytics
Rajeev Kanth, Department of Future Technologies, University of Turku, Turku, Finland
[email protected]
Mikko-Jussi Laakso, Department of Future Technologies, University of Turku, Turku, Finland
[email protected]
Paavo Nevalainen, Department of Future Technologies, University of Turku, Turku, Finland
[email protected]
platforms for achieving high-quality benefits in education. Similarly, Gerd Kortemeyer [4] agrees that learning analytics of educational transactional data can provide insights into learning behavior. Regarding education with mobile devices, the meta-analysis in [5] reports the following. An analysis of empirical research published in peer-reviewed journals on the use of mobile devices as tools in educational interventions revealed that the overall effect of using mobile devices in education is better than that of using desktop computers or of not using mobile devices as an intervention, with a moderate effect size of 0.523. In practice, therefore, more elaborate instructional designs are needed to exploit the educational benefits of mobile devices more thoroughly. This literature supports the present direction of our research, especially education using mobile devices. Likewise, Ryan et al. [6] emphasize that contextual social conditions that support one’s feelings of competence, autonomy, and relatedness are the basis for maintaining intrinsic motivation and becoming more self-determined with respect to extrinsic motivation.
Abstract—In recent years, big data and learning analytics have emerged as fast-growing research fields. Applications of these fields are gradually addressing the contemporary challenges of school and university education. Tracing students’ misconceptions and their probability of dropping out of a course at the right moment, developing detectors for a range of educationally important phenomena, and achieving the highest level of quality in higher education are becoming more challenging. Moreover, providing timely and well-suited solutions to students at risk is even more demanding. In this concept paper, we address these contemporary challenges of school and university education and their probable solutions by drawing on our research experience in automated assessment, immediate feedback, learning analytics, and information technology. Such problems can be solved by using the history of students’ activities, submissions, and performance data. This article highlights the identification of students’ misconceptions during the learning process, the examination of behavioral patterns and significant trends by efficiently aggregating and correlating massive data, the improvement of skills in creative thinking and innovation, and the timely detection of dropouts. We aim to extract such knowledge so that adaptive and personalized learning becomes a part of the current education system. These educational challenges can be addressed not only with available supervised learning algorithms such as support vector machines, neural networks, decision trees, discriminant analysis, and the nearest-neighbor method, but also with new engineering and distillation of relevant data features.
Students’ motivation toward their studies is one of the best catalysts in the learning process. An open question, therefore, is how to enhance a student’s motivation during their studies. The increasing trend of adoption among students and teachers produces a huge amount of data, which allows us to perform data analytics research to answer the following research questions: How can the system automatically generate a report of students’ misconceptions during a course and alert the students? What is the most suitable exercise for those who are struggling during a session, and how should the mapping between students’ abilities and supportive activities be carried out? How can the system automatically detect and identify possible dropouts as early as possible? How can the system work robustly for automatic assessment and immediate feedback in teaching and learning? How can the system autonomously measure a student’s level of innovation and creativity?
Keywords—Digital Social Research, Learning Analytics, Big Data, Educational Technology
I. INTRODUCTION
A report on the “Education and Training Monitor 2016: Finland” by the European Commission shows that one of the top priority areas of the government’s strategic “Vision: Finland 2025” is to create new learning environments and digital materials for basic education [1]. ‘Data’ has become a key focus in studies of educational governance and policymaking at national, European, and global scales [2]. Klašnja-Milićević et al. [3] emphasize that big data have a significant impact on higher education practice, from improving learners’ experience and knowledge through enhanced academic studying, to more efficient decision-making, and to planned responses to varying overall trends. Big data and learning analytic techniques have also been considered as the popular
978-1-5386-3704-3/18/$31.00 ©2018 IEEE
Jukka Heikkonen, Department of Future Technologies, University of Turku, Turku, Finland
[email protected]
II. STATE-OF-THE-ART RESEARCH METHODOLOGIES IN ASSESSING STUDENTS’ PERFORMANCE
To achieve concrete research results and to answer the given research questions, it is essential to become acquainted with the state-of-the-art research methodologies for big data, machine learning, and learning analytics in the context of the
both assess and predict student teamwork effectiveness. The research methods described above, which assess students in different ways using modern approaches, are closely aligned with our work [11], which demonstrated the use of learning analytics for real-time identification of students at risk in an introductory programming course. In that article, Rolf Linden et al. showed, using novel learning analytics methods and the learning management system (LMS), that potential dropouts and the group of students at risk can be identified as early as the first two or three weeks of the course.
educational technology. Modern deep neural networks have achieved impressive results in a variety of automated tasks, such as text generation, grammar learning, and speech recognition. Steven Tang et al. from UC Berkeley [7] discuss how education research might leverage recurrent neural network architectures by training a two-layer Long Short-Term Memory (LSTM) network on two distinct forms of education data, and conclude that their early explorations demonstrate the potential for applying deep learning techniques to large education data sets. Murat Kayri [8] investigated the factors affecting the success of university students by employing two artificial neural network methods, the multilayer perceptron (MLP) and the radial basis function (RBF) network, and comparing the effects of these techniques on educational data for the predictive analysis of performance. The findings suggested that a model based on the Radial Basis Function Artificial Neural Network (RBFANN) should be utilized in engineering research, while studies showed that the performance of the model based on the Multilayer Perceptron Artificial Neural Network (MLPANN) was better in engineering science. The article also indicated that when data are collected from students via questionnaires or other instruments, the predictive ability of the MLPANN is more robust and less biased than that of the RBFANN, owing to non-linear relationships.
Based on the facts reported in this literature, we are researching effective and efficient methods of learning analytics that not only enhance a student’s learning behavior but also perform automatic assessment, give immediate feedback, and provide concrete information about students’ misconceptions and dropout tendencies as early as possible during a course. The study includes students’ background information, pre-knowledge, registry information, social activities, enrolment information, course activities, and performance, as shown in Fig. 1.
III. QUALITY, INNOVATIVE ASPECTS AND CREDIBILITY
Two different data-driven systems for predicting grades and dropouts, and the development and implementation of an efficient feedback system, have been realized [12]. For the electronic learning path in mathematics and programming, individual exercises (easy, medium, hard) are based on gamification that reinforces children’s arithmetic fluency, self-learning, and self-regulation. The system autonomously recognizes children’s misconceptions and provides them with continuous assessment and adaptive exercises. Our objective is to keep the children engaged and happy while they learn more. The chosen learning approach lets children feel that they are having fun while they study. The smiles and expressions of success (such as a raised fist) of a child clearly show motivation toward learning. The approach does not put stress on the children, and the proposed educational methodologies give sufficient scope for creative thinking and innovation. The research results will have an impact on both society and educational institutions by providing enabling
Anwar Ali Yahya [9] introduced machine learning approaches to a new application in the field of education. He explored the effectiveness of three machine learning approaches, namely k-nearest neighbors, Naïve Bayes, and support vector machines (SVM), on the task of evaluating teaching effectiveness by classifying teachers’ classroom questions into the cognitive levels identified in Bloom’s taxonomy. The study leads to several conclusions. First, machine learning approaches outperform the rule-based approach. Second, term frequency as a term selection approach plays a crucial role in the performance of machine learning approaches. Third, SVM outperforms k-nearest neighbors and Naïve Bayes, which show comparable performance to each other regarding accuracy. Dragutin Petkovic et al. [10] demonstrated a machine learning approach for the assessment and prediction of teamwork effectiveness in software engineering (SE) education. A challenge in effective SE education is the lack of objective methods for assessing how well student teams learn the critically needed teamwork practices, defined as the ability (i) to learn and efficiently apply SE processes in a teamwork setting and (ii) to work as a team to develop satisfactory software products. Petkovic et al. presented a novel assessment procedure based on (i) extracting only objective and quantitative student team activity data during the team class project; (ii) pairing these data with related independent observations and grading of student team effectiveness in the SE process and SE product components in order to create a training database; and (iii) applying a machine learning (ML) approach, namely random forest (RF) classification, to this training database in order to create ML models, ranked factors, and rules that can
Fig. 1 Components of Learning Analytics
technologies for the development of effective and efficient teaching and learning processes that help in identifying students’ misconceptions, predicting possible future dropouts early, and ultimately achieving the highest level of quality. In this context, big data and learning analytics can advance current educational technology with the following features.
A student learning model is often formulated as time series prediction: given the series of exercises a student has previously attempted and the student’s success or failure on each exercise, the task is to predict the student’s performance on a subsequent exercise. The data consist of a set of binary random variables $\{X_{st}\}$, where $X_{st}$ indicates whether student $s$ produces a correct response on trial $t$.
Feature 1: Development and implementation of an intelligent, scalable, and collaborative educational platform based on students’ data. The targeted platform should efficiently support a variety of courses, suit all levels of education from toddlers to doctoral students, and give high priority to real-time performance prediction in the development work.
C. Automatic Assessment and Immediate Feedback
As a case study, automatic assessment and immediate feedback have already been applied in a first-grade mathematics course, with very positive effects on the learning performance of the pupils involved in the experiment [16]. During the project, more extensive experimental validation and verification will be carried out across a wider range of subjects, multiple schools and classes, and larger data sets in order to exploit the positive effects of computer-assisted learning.
Feature 2: The envisioned platform will be a hierarchical and scalable composition of rounds of exercises, tutorials, and course projects, in which innovative collaboration among students and teachers is possible, yielding the highest level of learning outcome.
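As a minimal sketch of this formulation, the following Python code traces one student's responses with standard Bayesian Knowledge Tracing, the baseline technique named in Section IV-B. The class `BKTParams`, the function names, and all parameter values are assumptions chosen for illustration; they are not taken from the platform described in this paper.

```python
from dataclasses import dataclass


@dataclass
class BKTParams:
    """Hypothetical BKT parameters for one skill (illustrative values only)."""
    p_init: float = 0.2    # P(L0): skill already known before the first trial
    p_learn: float = 0.15  # P(T): chance of learning the skill after a trial
    p_slip: float = 0.1    # P(S): wrong answer although the skill is known
    p_guess: float = 0.25  # P(G): correct answer although the skill is unknown


def predict_correct(p_known: float, params: BKTParams) -> float:
    """Probability that the next response X_st is correct."""
    return p_known * (1 - params.p_slip) + (1 - p_known) * params.p_guess


def update(p_known: float, correct: bool, params: BKTParams) -> float:
    """Bayesian posterior of P(known) given one response, then a learning step."""
    if correct:
        num = p_known * (1 - params.p_slip)
        den = num + (1 - p_known) * params.p_guess
    else:
        num = p_known * params.p_slip
        den = num + (1 - p_known) * (1 - params.p_guess)
    posterior = num / den
    return posterior + (1 - posterior) * params.p_learn


def trace(responses, params):
    """Return P(correct) predicted before each trial t, and the final P(known)."""
    p_known = params.p_init
    predictions = []
    for x in responses:
        predictions.append(predict_correct(p_known, params))
        p_known = update(p_known, bool(x), params)
    return predictions, p_known


if __name__ == "__main__":
    # One student's binary responses on four trials (invented data).
    preds, final = trace([0, 1, 1, 1], BKTParams())
    print([round(p, 3) for p in preds], round(final, 3))
```

In this sketch a correct response raises the estimated probability that the skill is known and an incorrect response lowers it, so the per-trial predictions follow the student's history. In practice the four parameters would be fitted to submission data per skill.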
V. DEVELOPED COLLABORATIVE EDUCATIONAL PLATFORM
Feature 3: Development of a dynamic mapping matrix between students’ profiles and a categorization of suitable exercises based on students’ requirements. The goal is to predict the performance level automatically, using deep learning and neural network techniques, at the required stages of a course.
The long-term vision of our research work is to create an electronic learning path from toddler to doctor by utilizing learning analytics and big data. Creating a research-based improvement model for the Finnish education system is thus the prime goal of this research work. We have been researching learning environments and their pedagogic use, automatic assessment and immediate feedback, gamification and serious games, automatic detection of students’ misconceptions, and the tracking of dropouts. The ViLLE research group (www.villeteam.fi), a group of researchers, developers, and content providers at the Department of Future Technologies, University of Turku, has been carrying out this research since 2004. We are developing this innovative application carefully and jointly with primary, lower secondary, upper secondary, and university-level teachers. At the primary education level, varied exercise sets for the electronic learning path in mathematics, Finnish language skills, problem solving, and computational thinking have already been created. At the lower and upper secondary levels, not only is the digital environment for mathematics and programming courses, including electronic exams, being developed, but also
Feature 4: Establishment of a robust protocol for the early prediction of possible dropouts during a course, at a point where getting back on track is still possible.
Feature 5: Enhancement of the resilience of the current digitalization of instructional worksheets, electronic examinations, and submission systems in order to achieve the highest quality, with emphasis on weaving current information technology into education for greater benefits in learning.
IV. RESEARCH METHODS
To achieve the above features and to develop a scalable, dynamically reconfigurable, and creative educational platform, the following research methods can be employed.
A. Deep Learning and Neural Network Techniques
Measuring the academic performance of students is highly challenging, since academic performance hinges on several diverse factors such as the evaluation criteria, the subjective or objective nature of the problems, and many other related considerations. However, a students’ performance prediction system can be built by learning multi-level data representations through unsupervised pre-training of the hidden layers, followed by fine-tuning of the neural network with backpropagation [13].
B. Modeling Students’ Learning
The measure of learning is how well students perform and are able to apply the skills they have been taught. In this regard, a couple of deep knowledge tracing (DKT) architectures will be investigated alongside the Bayesian Knowledge Tracing (BKT) technique [14]-[15] for our data analytics purposes.
Fig. 2 All-time Students' Submissions
[Figure 3 flowchart: data on the students’ activity are collected into a machine learning training database, and a suitable machine learning algorithm performs feature extraction to discover the factors that determine and predict students’ performance. Students are categorized into groups by performance level, from Level 1, Excellent (SG1: S1, S21, S43, ..., Sk) and Level 2, Very Good (SG2: S1, S8, S31, ..., Sm) down to Level N, Satisfactory (SGk: S7, S9, S35, ..., Sq), and exercises are categorized into exercise sets E1, E2, ..., En. Once the performance level of each student group is known, mapping matrices MM1, MM2, ..., MMk pair each student group with its best suitable exercise set Ek.]
Fig. 3 Proposed Research Approaches
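The grouping and mapping idea of Fig. 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the level thresholds, the toy scores, the difficulty values, and the rule of matching a group to the exercise set whose difficulty is closest to the group's mean score are all hypothetical and stand in for the learned classifier and mapping matrices that the proposed approach envisions.

```python
from statistics import mean

# Hypothetical lower bounds on a student's mean score (0..1) for each level.
LEVELS = [("SG1", 0.85), ("SG2", 0.70), ("SG3", 0.0)]


def categorize_students(scores):
    """Assign each student to the first level whose threshold the mean score meets."""
    groups = {name: [] for name, _ in LEVELS}
    for student, history in scores.items():
        avg = mean(history)
        for name, threshold in LEVELS:
            if avg >= threshold:
                groups[name].append(student)
                break
    return groups


def build_mapping(groups, scores, exercise_difficulty):
    """Pair each non-empty group with the exercise set whose difficulty (0..1)
    is closest to the group's mean score. This matching rule is an assumption."""
    mapping = {}
    for name, members in groups.items():
        if not members:
            continue
        group_mean = mean(mean(scores[s]) for s in members)
        mapping[name] = min(
            exercise_difficulty,
            key=lambda e: abs(exercise_difficulty[e] - group_mean),
        )
    return mapping


# Toy data: per-student exercise scores and per-set difficulty, all invented.
scores = {
    "S1": [0.90, 0.95, 0.88],
    "S8": [0.75, 0.70, 0.80],
    "S7": [0.40, 0.55, 0.50],
    "S21": [1.00, 0.90, 0.85],
}
exercise_difficulty = {"E1": 0.3, "E2": 0.7, "E3": 0.9}

groups = categorize_students(scores)
mapping = build_mapping(groups, scores, exercise_difficulty)
print(groups)
print(mapping)
```

With this toy data the strongest group is paired with the hardest exercise set and the weakest group with the easiest one. A real system would replace the fixed thresholds with the output of the trained classifier.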
a national evaluation system and the matriculation e-exam have been created by our research group. At the university level, many schools and departments are collaboratively supporting the digitization of course contents (especially in computer science), exercise sets, and electronic exams. The number of submissions on the current system is remarkable: the average daily, weekly, and monthly submission counts exceed 20K, 300K, and 1 million, respectively. All-time submissions have already surpassed 10 million, as shown in Fig. 2, which corresponds to an average of roughly 160 submissions per hour over the last seven years. Moreover, the submissions for November 2017 alone surpassed 1.1 million, which corresponds to about 1,500 submissions per hour, or roughly 25 submissions per minute, as depicted in Fig. 4. Apart from the submissions, the public exercises on the platform already exceed 50K, and the platform has about 5K teacher users and 80K student users.
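As a rough check of the average rates implied by these totals, the calculation can be written out directly. The sketch assumes exactly seven 365-day years for the all-time figure and a 30-day November; the function name `hourly_rate` is introduced here only for illustration.

```python
HOURS_PER_DAY = 24


def hourly_rate(total_submissions, days):
    """Average submissions per hour over a period of the given length in days."""
    return total_submissions / (days * HOURS_PER_DAY)


all_time = hourly_rate(10_000_000, 7 * 365)  # seven years, leap days ignored
november = hourly_rate(1_100_000, 30)        # November has 30 days

print(f"all-time: {all_time:.0f} per hour, {all_time / 60:.1f} per minute")
print(f"November 2017: {november:.0f} per hour, {november / 60:.1f} per minute")
```

Under these assumptions, 10 million submissions over seven years average a little over 160 per hour, and 1.1 million submissions in one month average a little over 1,500 per hour.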
the desired quick feedback process in the system and to carry out the research needed for embedding electronic examinations and ensuring their compatibility. Our research group has already achieved innovative and pioneering results, such as analyzing teachers’ feedback on a technology-enhanced mathematics course and automatically assessing computer programming courses in Finnish and Vietnamese higher education [17]-[18]. We have experimented with a computer science course, which led to the automatic detection of dropouts during the second week of the eight-week course: the bulk of the students who are likely to drop out can readily be identified in the second week. Similarly, predicting whether a student will graduate with a degree (e.g., B.Sc. or M.Sc.) on schedule is crucial not only for the student’s career but also for the institutions, as the government is already investing a considerable amount. The literature also indicates that lecture attendance and
VI. PROPOSED RESEARCH APPROACHES AND ANTICIPATED OUTCOMES
A wide range of data pertinent to students’ activities is collected, as shown in Fig. 2 and Fig. 3. The data types are mostly quantitative and objective. In essence, there are three steps for assessing and predicting students’ performance, as shown in Fig. 3. First, the training database is used as input to machine learning (ML) training, which produces an ML classifier that predicts students’ performance together with the factors and rules that predict and assess it. The next step is to develop an efficient data-driven system for predicting grades and possible dropouts, which requires profiling and categorizing the students’ performance and the course materials, respectively. The hierarchical and scalable behavior of the system includes the profiling and categorization of the types of exercises suitable for several courses. The last step would be to test and validate
Monthly Submissions
Fig. 4 Submissions in OCT and NOV 2017
final exam score do not have a significant statistical correlation. However, an experiment in the advanced law course at the University of Turku provides evidence that a small bonus for attending a lecture not only increases class attendance but also has a small positive effect on the grade. Similarly, the average attendance in the undergraduate computer science course was 72%, 61%, 85%, and 75% of the maximum number of students in the years 2012 to 2015, respectively. Regarding research materials and infrastructure, the materials available on-site include the existing ViLLE collaborative platform, more than 50K public exercises, more than 25K ready-made exercises available on the intranet, and the supporting software and tools necessary for conducting the proposed research. The devices include RFID-based automatic detection of students’ attendance and content design instruments capable of supporting content of various kinds and complexity. We have also shown in [19] that the proposed teaching model and educational platform considerably enhance the learning outcomes of Indian school-level students.
VII. CONCLUSION AND FUTURE WORKS
In this paper, we have attempted to demonstrate that categorizing students’ performance data and exercise sets provides adequate parameters for identifying misconceptions and possible dropouts during a course. The proposed work also has a very positive impact on new teaching methodology and on e-learning technology. From the perspective of facilitating learning, the proposed platform will enhance the capacity to support increased student engagement and interaction with the learning materials by providing access to additional online documents, collaboration tools, and peer learning. The e-learning skills (efficient access to educational resources) and the e-assessment-based teaching methodology acquired by the teachers can be utilized in several other courses and can also be shared with other departments and academic institutions. The findings from this work will contribute to a better understanding of where, for whom, and in which way the use of this technology-based learning environment will best highlight the effects of particular educational methods, and will reveal the usefulness and limitations of technology-based education. As future work, there is ample scope for research on the statistical modeling of students’ learning behavior and on using machine learning and deep neural network algorithms beyond the state of the art to create an intelligent and innovative educational platform.
ACKNOWLEDGEMENTS
The authors would like to thank the Finnish Foundation for Economic Education (www.lsr.fi) for the research grant.
REFERENCES
[1] Education and Training Monitor 2016, Vol 2, Individual Country Report by European Commission, Available online: https://ec.europa.eu/education/sites/education/files/monitor2016-fi_en.pdf
[2] Fenwick T, Mangez E and Ozga J (eds) (2014) Governing Knowledge: Comparison, Knowledge-Based Technologies, and Expertise in the Regulation of Education. London: Routledge.
[3] Aleksandra Klašnja-Milićević, Mirjana Ivanović, Zoran Budimac, “Data Science in Education: Big Data and Learning Analytics,” Computer Applications in Engineering Education, Pages: 1-13, June 2017.
[4] Gerd Kortemeyer, “The Spectrum of Learning Analytics,” ELEED: E-Learning and Education, Issue: 12, 2017, URN: http://nbn-resolving.de/urn:nbn:de:0009-5-45384.
[5] Yao-Ting Sung, Kuo-En Chang, Tzu-Chien Liu, “The effects of integrating mobile devices with teaching and learning on students’ learning performance: A meta-analysis and research synthesis,” An International Journal of Computers and Education, Vol: 94, Pages: 252-275, March 2016.
[6] Richard M. Ryan, Edward L. Deci, “Intrinsic and Extrinsic Motivations: Classic Definitions and New Directions,” Contemporary Educational Psychology, Vol: 25, Pages: 54-67, 2000.
[7] Steven Tang, Joshua C. Peterson, Zachary A. Pardos, “Deep Neural Networks and How They Apply to Sequential Education Data,” Proceedings of the Third ACM Conference on Learning, Pages: 321-324, April 25-26, 2016.
[8] Murat Kayri, “An Intelligent Approach to Educational Data: Performance Comparison of the Multilayer Perceptron and the Radial Basis Function Artificial Neural Networks,” Educational Sciences: Theory & Practice, Vol: 15, Issue: 5, Pages: 1247-1255, June 16, 2015.
[9] Anwar Ali Yahya, Mohammad Said El Bashir, “Applying Machine Learning to Analyse Teachers’ Instructional Questions,” International Journal of Advanced Intelligence Paradigms, Vol: 6, Issue: 4, Pages: 312-327, 2014.
[10] Dragutin Petkovic, Kazunori Okada, Marc Sosnick, Aishwarya Iyer, Shenhaochen Zhu, Rainer Todtenhoefer, Shihong Huang, “Work in Progress: A Machine Learning Approach for Assessment and Prediction of Teamwork Effectiveness in Software Engineering Education,” Proceedings of the Frontiers in Education Conference (FIE), IEEE, Third ACM Conference on Learning, Pages: 1-3, October 3-6, 2012.
[11] R. Linden, T. Rajala, V. Karavirta, M.J. Laakso, “Utilizing Learning Analytics for Real-time Identification of Students-at-risk on an introductory programming course,” Proceedings of International Conference on Education and New Learning Technologies (EDULEARN16), Pages: 1466-1473, July 4-6, 2016.
[12] Sergi Rovira, Eloi Puertas, Laura Igual, “Data-driven system to predict academic grades and dropouts,” PLoS ONE, Vol: 12, Issue: 2, Pages: 1-21, February 2017.
[13] Bo Guo, Rui Zhang, Guang Xu, Chuangming Shi, Li Yang, “Predicting Students Performance in Educational Data Mining,” Conference Proceedings of International Symposium on Educational Technology, Pages: 125-128, 2015, DOI: 10.1101/IEST.2015.33.
[14] Kevin H. Wilson, Xiaolu Xiong, Mohammad Khajah, Robert V. Lindsey, Siyuan Zhao, Yan Karklin, Eric G. Van Inwegen, Bojian Han, Chaitanya Ekanadham, Joseph E. Beck, Neil Heffernan, Michael C. Mozer, “Estimating student proficiency: Deep learning is not the panacea,” Proceedings of 30th International Conference on Neural Information Processing Systems (NIPS 2016), Pages: 1-8, 2016.
[15] Chris Piech, Jonathan Bassen, Jonathan Huang, Surya Ganguli, Mehran Sahami, Leonidas J. Guibas, Jascha Sohl-Dickstein, “Deep Knowledge Tracing,” Conference Proceedings of Advances in Neural Information Processing Systems, Pages: 505-513, 2015.
[16] E. Kurvinen, R. Linden, T. Rajala, Erkki Kaila, Mikko-Jussi Laakso, Tapio Salakoski, “Automatic Assessment and Immediate feedback in first-grade mathematics,” Conference proceedings of International Conference on Computing Education Research, Pages: 15-23, 2014.
[17] Mikko-Jussi Laakso, Niko Myller, Ari Korhonen, “Comparing learning performance of students using algorithm visualizations collaboratively on different engagement levels,” Journal of Educational Technology & Society, Vol: 12, Issue: 2, Pages: 267-282, 2009.
[18] A. K. Veerasamy, M.-J. Laakso, “Cultural issues that affect computer programming: A Vietnamese at Higher Education Study,” Asian Journal of Education and E-Learning, Vol: 4, Issue: 2, Pages: 30-39, 2016.
[19] Rajeev Kanth, Mikko-Jussi Laakso, “A Preliminary study on building an E-education platform for Indian school-level Curricula,” Proceedings of International Conferences on ITS, ICEEdutech and STE 2016, Melbourne, Australia, Pages: 159-165, ISBN: 978-989-8533-58-6, 2016.