Homogeneity and Enrichment: Two Metrics for Web Applications Assessment

Stavros Valsamidis, Sotirios Kontogiannis, Alexandros Karakos
Department of Electrical and Computer Engineering, Democritus University of Thrace, Vas. Sophias 12, 67100 Xanthi, Greece
{svalsam, skontog, karakos}@ee.duth.gr

Ioannis Kazanidis
Department of Information Management, Technological Educational Institute of Kavala, Agios Loukas, 65404 Kavala, Greece
[email protected]

Abstract—Earlier studies suggest that educational institutions may benefit further from Learning Management Systems (LMSs) by generating reports on courses, that is, by using tools that assess students' paths through course content. This paper affirms that educationally meaningful information can be extracted from logged LMS student data and presents a methodology showing how such findings may assist the development of a reporting tool for educators. In the context of course assessment from logged student data, we introduce a new metric called homogeneity that extends the use of an older one, enrichment. We also propose a new algorithm that tries to qualify course content by classifying courses according to the interest shown in students' course paths. We applied our algorithm to Open eClass LMS tracking data of an academic institution, and we present the results for 12 courses along with insights into student interest. We leave for future development an automated tool implementing our algorithm for the Open eClass platform.

Keywords - Web applications, e-Learning, metrics, algorithms, LMS courses.

I. INTRODUCTION

Learning Management Systems (LMSs) offer methods and components for the distribution of information and for communication between the participants in a course. Well-known commercial LMS platforms used for educational purposes worldwide are Blackboard, WebCT and TopClass [1], while Claroline, Moodle, Ilias and aTutor are freely distributed under appropriate licenses. Academic institutions in Greece use the Open eClass LMS platform [2], which is provided under the GPL license by the Greek University Network (GUNet). Open eClass is an improved version of the Claroline LMS platform [3]. We implemented a module for the Open eClass platform that uses data mining techniques to discover patterns in students' usage of the platform [4]. Server log files store information containing the page requests of each individual user. After pre-processing, this information is visible as a per-user ordered set of web page requests, from which user navigation sessions can be inferred. The extraction of sequential patterns has proven particularly useful and has been applied to many different educational tasks [5].

In this work we also propose an algorithm called CCA (Course Classification Algorithm), which will be incorporated into the Open eClass platform for analyzing students' usage. The algorithm provides feedback intended to motivate instructors to increase their use of the platform for their LMS courses. Instructors benefit from the evaluation results by trying to achieve a good place in the rankings of course usage and course content quality, and by improving their courses according to the tool's indications. CCA's improvement suggestions for educational content in turn allow students to profit from asynchronous study of courses with up-to-date and optimal educational material.

The paper is organized as follows. Section II describes related work. Section III presents the proposed approach and the CCA algorithm. Section IV reports the results of an experimental evaluation scenario. Finally, Section V summarizes and presents conclusions along with future work.

II. RELATED WORK

There are several specialized web usage mining (WUM) tools used by LMS platforms. GISMO [6] tracks the web log data of an LMS course and provides instructors with feedback such as statistical information on students' requests for the course material. Sinergo/ColAT [7] is a tool that acts as an interpreter of students' activity in an LMS course. The tool presented in [8] uses log files to represent instructor-student interaction in a hierarchical structure. MATEP [9] is a specialized WUM tool used in e-learning platforms that acts at two levels. First, it creates a mixture of data from different sources, suitably processed and integrated. These data originate from e-learning platform log files, virtual courses, and academic and demographic data. Second, it feeds those data to a data warehouse, which in turn, through MATEP, provides static and dynamic reports. Analog [10] is another system, which consists of two main components. The first performs online and the second offline data processing according to web server activity. The online component builds active user sessions, which are then classified into one of the clusters found by the offline component.

The aforementioned works on web usage mining in LMS platforms [4,11] motivated the application of data mining techniques in LMS platforms. We propose an algorithm called CCA for the evaluation of e-learning platforms, based on the analysis of students' usage with the aid of the authors' web usage metrics.

III. APPROACH

Our approach has three main steps: i) logging, ii) pre-processing and iii) application of the proposed algorithm. The first two steps are based on the framework described in detail in [12] and facilitate the extraction of useful information from the data logged by a web server running an LMS. The first step involves the logging of specific data with the use of a data recording module, and the second the filtering of these data as well as the creation of new metrics. The third step applies an algorithm that will enable an automated system to provide authors with suggestions for course improvement according to the outcomes of the second step.

A. Logging data and procedures

More specifically, the first step involves the logging of specific data from e-learning platforms with the use of a data recording module, which is embedded in the web server of the e-learning platform and records specific e-learning platform fields. To record these data, a module was developed in the Perl programming language. This module is independent of any specific e-learning platform and stores course information rapidly, because it is executed directly from the API of the web server.

B. Pre-processing, indexes and metrics

The data in the log file contain noise such as missing values, outliers etc. These values have to be pre-processed in order to prepare them for data mining analysis. Specifically, the second step filters the recorded data delivered by the first step. After the data refinement, the new metrics are calculated. The existing and the proposed course metrics are provided in Table I.

TABLE I. INDEXES NAME AND DESCRIPTION

| Index name | Description of the index |
|---|---|
| Sessions | Number of sessions per course viewed by users. |
| Pages | Number of pages per course viewed by users. |
| Unique pages | Number of unique pages per course viewed by users. |
| Enrichment | Enrichment of courses (Unique pages / Pages) |
| Homogeneity | Homogeneity of unique pages per session (Unique Pages / Sessions) |
| UPCS | Number of unique pages per course viewed by users per session |
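The basic indexes of Table I can be illustrated with a minimal Python sketch. It assumes that pre-processing has already produced records of the form (session, course, page); this record layout, the function name `course_indexes` and the sample values are hypothetical and are not taken from the Perl recording module. Enrichment is left out here because the text refines its definition separately.

```python
from collections import defaultdict

# Hypothetical pre-processed log records: (session_id, course_id, page).
# The layout and the values are assumptions made for illustration only.
records = [
    ("s1", "C1", "index"),
    ("s1", "C1", "docs"),
    ("s1", "C1", "index"),
    ("s2", "C1", "docs"),
    ("s3", "C2", "index"),
]

def course_indexes(records):
    """Return Sessions, Pages, Unique pages and Homogeneity per course."""
    sessions = defaultdict(set)      # course -> distinct session ids
    pages = defaultdict(int)         # course -> total page views
    unique_pages = defaultdict(set)  # course -> distinct pages viewed
    for session_id, course_id, page in records:
        sessions[course_id].add(session_id)
        pages[course_id] += 1
        unique_pages[course_id].add(page)
    result = {}
    for course_id in pages:
        n_sessions = len(sessions[course_id])
        n_unique = len(unique_pages[course_id])
        result[course_id] = {
            "sessions": n_sessions,
            "pages": pages[course_id],
            "unique_pages": n_unique,
            # Homogeneity as given in Table I: Unique Pages / Sessions
            "homogeneity": n_unique / n_sessions,
        }
    return result

indexes = course_indexes(records)
```

With the sample records, course C1 has 2 sessions, 4 page views and 2 unique pages, so its Homogeneity is 2 / 2 = 1.0.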

Initially, the number of sessions and the number of pages were counted in order to calculate course activity. The Unique Pages metric measures the number of unique pages per course. To express the “enrichment” of each course we propose the Enrichment metric, which is based on the ratio of the number of unique pages to the number of pages. However, users may visit just a few pages of each course, and therefore sessions alone may lead to unreliable results. Similarly, the number of pages that users visited is not reliable enough on its own to confirm course activity. For that reason, we propose a new metric, named Homogeneity, that combines the sessions and the pages viewed by users, allowing us to evaluate course activity. Finally, the UPCS metric expresses the unique user visits per course and per session, in order to calculate activity in a more objective manner. For example, some novice users may navigate through a course visiting a page more than once. UPCS eliminates such duplicate page visits, which the Enrichment metric cannot resolve, by counting the visits of the same user to a page only once per session. A detailed description of the new metrics follows.

Enrichment is a metric that tries to express course content quality. It reflects the degree of student appreciation of the information maintained by the course instructor. This metric was first presented in [11], where it was defined as the division of total course pages over unique course pages. However, in order to provide meaningful results, we redefine Enrichment as the complement to one of the ratio of unique pages to total course pages:

\[ \text{Enrichment} = 1 - \frac{\text{Unique Pages}}{\text{Total Pages}} \tag{1} \]

where Unique Pages denotes the unique course pages. This metric is a course quality index and characterizes the percentage of LMS course information independently discovered by each user participating in an LMS course.
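As a small worked example of Eq. (1), the following Python sketch computes Enrichment from two counts. The function name `enrichment` and the input check (unique pages cannot exceed total page views) are assumptions added for illustration; the sample numbers are invented.

```python
def enrichment(unique_pages: int, total_pages: int) -> float:
    """Enrichment = 1 - Unique Pages / Total Pages, as in Eq. (1)."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    # Assumed sanity check: distinct pages cannot outnumber page views.
    if not 0 <= unique_pages <= total_pages:
        raise ValueError("unique_pages must lie between 0 and total_pages")
    return 1 - unique_pages / total_pages

# Hypothetical course: 80 page views spread over 20 distinct pages.
example = enrichment(20, 80)  # 1 - 20/80 = 0.75
```

A course in which every page view hits a different page gives Enrichment 0, while a course whose few pages are viewed repeatedly approaches 1.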

We define the unique pages per course ID per session (UPCS) metric as the ratio of the number of web pages, each counted once per session, to the total number of sessions per course ID. This metric is used to order all LMS courses in each cluster and to present evaluation feedback for each course, based on the CCA algorithm presented below.
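One possible reading of the UPCS definition is sketched below in Python: for each course, the pages seen in a session are counted once, these counts are summed over the sessions, and the sum is divided by the number of sessions. The record layout (session, course, page), the function name `upcs` and the sample data are assumptions for illustration, not the authors' implementation.

```python
from collections import defaultdict

def upcs(records):
    """Return UPCS per course from (session_id, course_id, page) records."""
    # (course, session) -> set of pages, so repeats in a session count once
    pages_in_session = defaultdict(set)
    for session_id, course_id, page in records:
        pages_in_session[(course_id, session_id)].add(page)

    unique_visits = defaultdict(int)  # course -> pages counted once per session
    sessions = defaultdict(int)       # course -> number of sessions
    for (course_id, _session_id), pages in pages_in_session.items():
        unique_visits[course_id] += len(pages)
        sessions[course_id] += 1
    return {c: unique_visits[c] / sessions[c] for c in unique_visits}

# Hypothetical sample: in session s1 the page "index" is visited twice.
sample = [
    ("s1", "C1", "index"),
    ("s1", "C1", "docs"),
    ("s1", "C1", "index"),
    ("s2", "C1", "docs"),
    ("s3", "C2", "index"),
]
scores = upcs(sample)
```

For course C1 the repeated visit to "index" in session s1 is ignored, so the two sessions contribute 2 and 1 pages and UPCS is 3 / 2 = 1.5.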

C. Proposed algorithm

The algorithm we propose is called CCA (Course Classification Algorithm). It initially tries to classify LMS courses according to whether their course information material is poor or rich in quantity. Afterwards, for the LMS courses with adequate information material, it tries to determine how often course information is added or updated by tutors (or users, based on the homogeneity classification) or followed by users (the updated information as it is discovered by users). Finally, using the UPCS metric, it tries to identify whether updates of course information can increase students' interest in the specific course. The CCA discovery schema is depicted in Fig. 1. Accordingly, the proposed algorithm is based on Enrichment, Homogeneity and UPCS and consists of the corresponding stages.

At the first stage of the algorithm, the Enrichment metric is used to identify courses with poor or rich educational content (poor corresponds to a small Enrichment value and rich to a high one). We place a set of N lessons in an N-ordered table based on Enrichment, where N

... classification of the lessons depends on the average Enrichment value of the N LMS lessons and the average Homogeneity value of the high and low Enrichment clusters accordingly.

The aim of the third stage of the algorithm is to identify whether the content can be characterized as rich or poor, and whether it is static, frequent or dynamic. To do this, we order each cluster's lessons based on the value of UPCS.
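The staged classification described in this section can be approximated by the following Python sketch. It is only an illustration under stated assumptions: courses are split at the average Enrichment, each resulting cluster is split again at its own average Homogeneity, and every group is ordered by UPCS. The tie handling (values equal to the average go to the upper group), the descending UPCS order, the neutral labels "high" and "low" for Homogeneity, the function name `cca_sketch` and the sample values are all hypothetical and are not specified in the text.

```python
from statistics import mean

def cca_sketch(courses):
    """courses: dict name -> {'enrichment', 'homogeneity', 'upcs'}."""
    # Stage 1: split on the average Enrichment of all courses.
    avg_enrichment = mean(c["enrichment"] for c in courses.values())
    clusters = {"rich": [], "poor": []}
    for name, c in courses.items():
        key = "rich" if c["enrichment"] >= avg_enrichment else "poor"
        clusters[key].append(name)

    result = {}
    for key, names in clusters.items():
        if not names:
            continue
        # Stage 2: split each cluster on its own average Homogeneity.
        avg_homogeneity = mean(courses[n]["homogeneity"] for n in names)
        for level in ("high", "low"):
            group = [
                n for n in names
                if (courses[n]["homogeneity"] >= avg_homogeneity) == (level == "high")
            ]
            # Stage 3: order each group by UPCS (descending is an assumption).
            group.sort(key=lambda n: courses[n]["upcs"], reverse=True)
            result[(key, level)] = group
    return result

# Hypothetical metric values for five courses.
courses = {
    "A": {"enrichment": 0.90, "homogeneity": 4.0, "upcs": 3.0},
    "B": {"enrichment": 0.80, "homogeneity": 1.0, "upcs": 2.0},
    "C": {"enrichment": 0.20, "homogeneity": 3.0, "upcs": 1.0},
    "D": {"enrichment": 0.10, "homogeneity": 5.0, "upcs": 2.5},
    "E": {"enrichment": 0.85, "homogeneity": 4.5, "upcs": 3.5},
}
classes = cca_sketch(courses)
```

With these sample values the average Enrichment is 0.57, so A, B and E fall in the rich cluster and C and D in the poor one; within the rich cluster, E is ranked ahead of A because of its higher UPCS.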
