Implementation of Decision Tree for Scrutinizing ...

Implementation of Decision Tree for Scrutinizing Student Results

Author’s name:-

Prof. Dhanamma Jagli1, Sahil Mutreja2, Vinay Khatri3, Gulshankumar Khubnani4

Designation:-

1 Assistant Professor

2,3,4 M.C.A. II Year Students

Organization:-

Vivekanand Education Society of Information Technology

Official Address:-

Hashu Advani memorial complex, Collector Colony, Chembur, Mumbai-400074

Personal Address:-

1 Mrs. Dhanamma Jagli: Ganesh Tailors, Nalanda Nagar, Maroli Church, Mahul Road, Chembur, Mumbai-400074.

2 Sahil: D/203, Mota Nagar, Andheri Kurla Road, Andheri (East), Mumbai – 400099.

3 Vinay: 502/B, Pali Hill Apt., Near Sadhu Vaswani Statue, Gol Maidan, Ulhasnagar-421001

4 Gulshankumar: 307, Navjeevan Tower, O. T. Section, Ulhasnagar-421003

Official Contact:-

022-61532542

Personal Contact:-

9769614365(Dhanamma Jagli) 9664192487(Sahil) 9321555143(Vinay) 8600880480(Gulshankumar)

Official Email:-

[email protected]

Personal Email:-

[email protected] [email protected](Sahil) [email protected](Vinay) [email protected](Gulshankumar)

Implementation of Decision Tree for Scrutinizing Student Results

Abstract

Intelligent evaluation, an important branch of artificial intelligence, is a decision-making process that simulates how domain experts solve complex problems. In this paper, an intelligent evaluation method based on decision tree classification is proposed and applied to the analysis of student results. Decision tree classification is one of the main analytical methods in data mining; it derives the result directly from the training data (observations, measurements, etc.). Here the method is applied to sample data to evaluate student results and find the actual reasons behind them. The students are classified according to the crucial attributes that affect them. Finally, from the classified data, some standard rules are derived to advise future students to be vigilant in the education system.

I. Introduction

Data mining is often defined as finding hidden information in a database. Through data mining we do not get a subset of the data stored in the database; instead, we get an analysis of its contents. Data mining means sorting through data to identify patterns and establish relationships. Data mining parameters include:

• Association - looking for patterns where one event is connected to another event
• Sequence or path analysis - looking for patterns where one event leads to another event
• Classification - looking for new patterns
• Clustering - finding and visually documenting groups of facts not previously known
• Forecasting - discovering patterns in data that can lead to reasonable predictions about the future (this area of data mining is known as predictive analytics)

Data mining involves the following classes of tasks:

• Anomaly detection (outlier/change/deviation detection) – the identification of unusual data records that might be interesting, or of data errors that require further investigation.
• Association rule learning (dependency modelling) – searches for relationships between variables. For example, a supermarket might gather data on customer purchasing habits. Using association rule learning, the supermarket can determine which products are frequently bought together and use this information for marketing purposes. This is sometimes referred to as market basket analysis.
• Clustering – the task of discovering groups and structures in the data that are in some way "similar", without using known structures in the data.
• Classification – the task of generalizing known structure to apply to new data. For example, an e-mail program might attempt to classify an e-mail as "legitimate" or as "spam".
• Regression – attempts to find a function which models the data with the least error.

• Summarization – providing a more compact representation of the data set, including visualization and report generation.
• Sequential pattern mining – finds sets of data items that occur together frequently in some sequences. Sequential pattern mining, which extracts frequent subsequences from a sequence database, has attracted a great deal of interest in recent data mining research because it is the basis of many applications, such as web user analysis, stock trend prediction, DNA sequence analysis, finding linguistic patterns in natural language texts, and using the history of symptoms to predict certain kinds of disease.
• Decision tree – a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility.

Decision Tree

A decision tree works like the game of twenty questions: each internal node of the tree denotes a question or a test on the value of an independent attribute, each branch represents an outcome of the test, and each leaf represents a class. To classify an object, the appropriate attribute value is tested at each node, starting from the root, to determine the branch taken. The path determined by the tests at each node leads to a leaf node, which is the class the model believes the object belongs to.

Basic Algorithm

The decision tree is an attractive technique because its results are easy to understand. Assume that each object has a number of independent attributes and a dependent attribute. The aim is to build a decision tree consisting of a root node, a number of internal nodes, and a number of leaf nodes. Building the tree starts with the root node; the data is split into two or more child nodes, which are split again into lower-level nodes, and so on until the process is complete. The analysis of student results using decision tree classification is illustrated in detail in the following sections, and the constructed decision tree is shown in Figure 1.
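The root-to-leaf walk described above can be sketched in a few lines of Python. This is only an illustration: the nested-dictionary layout, the `classify` helper, and the small `toy_tree` are assumptions made for the example and are not the tree of Figure 1.

```python
# Hypothetical toy tree (NOT the tree of Figure 1): an internal node is a dict
# holding the attribute it tests and one branch per outcome; a leaf is a class label.
toy_tree = {
    "attribute": "regular",
    "branches": {
        "No": "Fail",
        "Yes": {
            "attribute": "preparation",
            "branches": {"Good": "Distinction", "Fair": "First", "Bad": "Fail"},
        },
    },
}


def classify(node, obj):
    """Start at the root and follow the branch matching obj's value at each node."""
    while isinstance(node, dict):          # still at an internal node
        value = obj[node["attribute"]]     # the test on one independent attribute
        node = node["branches"][value]     # take the branch for that outcome
    return node                            # a leaf: the predicted class


print(classify(toy_tree, {"regular": "Yes", "preparation": "Fair"}))
```

In this sketch, a value with no matching branch raises a `KeyError`; a fuller version would need a policy for unseen values.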

II. Objectives

Decision-tree-based result analysis has many benefits for teachers as well as students:

• Teachers have easy access to the overall performance of all the students in the class, along with the reasons for their good or poor results.
• The technique gives teachers ample information to target the areas in which students need to improve.
• To help students work on these areas, teachers can use the technique to decide on training activities dedicated to those students who have performed poorly in their curriculum.
• Through this application, students can understand how important the teacher’s contribution is to them in the education system.
• Students can understand that they need to contribute on a regular basis to excel in their academics.

III. Research Hypothesis

1) Students who regularly attend class, follow up with the teachers (seeking guidelines from the subject teacher or asking about doubts), show good behaviour (submitting assignments and practicals on time, concentrating in class), and are able to prepare well in the short period before the exam, such as preparatory leave, all fall in the Distinction class.
2) Students who regularly attend class, follow up with the teachers, show good behaviour, and are able to prepare fairly in the short period before the exam all belong to First class.
3) Students who are regularly attentive in class but do not take follow-ups from teachers, and whose ability to prepare in the short period before the exam is good, all belong to First class.
4) Students who are regularly attentive in class but do not take follow-ups from teachers, and whose ability to prepare in the short period before the exam is average, all belong to Second class.
5) Students who are not regularly attentive in class but take part in the extra-curricular activities of the college, such as learning beyond the syllabus and college festivals, secure Second class owing to their efficient preparation in the short period before the exam.
6) Students who are irregular, do not take any follow-ups from teachers, do not participate in extra activities, and are not able to prepare in the short period before the exam fail, because they may be unaware of which topics are important or may have no interest in a particular subject.

IV. Methodology

The method proceeds through the following systematic steps:

1. The training data is S. Discretise all continuous-valued attributes. Let the root node contain S.
2. If all objects in the root node belong to the same class, then stop.
3. Split the next leaf node by selecting, from amongst the independent attributes, an attribute A that best divides the objects in the node into subsets, and create a decision tree node.
4. Split the node according to the values of A.
5. Stop if any of the following conditions is met; otherwise continue with step 3.
• Data in each subset belongs to a single class.
• There are no remaining attributes on which the sample may be further divided.
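The steps above do not say how the attribute that "best divides" the objects is chosen. The following is a minimal sketch under two stated assumptions: information gain (the measure used by ID3) is used as the selection criterion, and a node with no attributes left becomes a leaf labelled with its majority class. Neither choice is stated in the paper, and the function names are illustrative. All attributes are assumed to be discrete already (step 1).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting rows on attribute (assumed criterion)."""
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([row[target] for row in rows]) - remainder


def build_tree(rows, attributes, target):
    """Recursive form of steps 2-5: returns a class label or a nested dict node."""
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1:               # all objects belong to a single class
        return labels[0]
    if not attributes:                      # no remaining attributes to split on
        return Counter(labels).most_common(1)[0][0]   # assumption: majority class
    best = max(attributes, key=lambda a: information_gain(rows, a, target))  # step 3
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):                        # step 4
        subset = [row for row in rows if row[best] == value]
        branches[value] = build_tree(subset, remaining, target)
    return {"attribute": best, "branches": branches}
```

Each row is a dictionary mapping attribute names to values, for example `build_tree(rows, ["a", "b"], "y")`. Other selection measures, such as the Gini index or gain ratio, could be substituted for `information_gain` without changing the rest of the procedure.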

V. Results

A sample database of student results was extracted and used for training, based on the crucial attributes that affect the overall academic performance of the students, as shown in Table 1. The sample data for the considered attributes was collected from various sources, such as:

• Attendance records
• Defaulter lists
• Information from the students’ council
• Inputs from teachers

Table 1: Students sample data

S. No. | Name       | Regular in college | Participation in extra-curricular activities | Behaviour | Follows up with teachers | Efficient preparation | Result
1      | Prity      | Yes                | No                                           | Good      | Yes                      | Good                  | Distinction
2      | Shilpa     | Yes                | Yes                                          | Good      | Yes                      | Good                  | Distinction
3      | Vishwanath | Yes                | Yes                                          | Good      | Yes                      | Good                  | Distinction
4      | Guru       | No                 | No                                           | Good      | No                       | Fair                  | Second
5      | Salil      | Yes                | No                                           | Good      | Yes                      | Fair                  | First
6      | Vijay      | Yes                | No                                           | Good      | Yes                      | Fair                  | First
7      | Minny      | Yes                | No                                           | Good      | No                       | Bad                   | Fail
8      | Neerja     | No                 | No                                           | Fair      | No                       | Bad                   | Fail
9      | Viju       | No                 | Yes                                          | Fair      | No                       | Fair                  | Second
10     | Akaash     | No                 | No                                           | Bad       | No                       | Bad                   | Fail
11     | Falguni    | Yes                | No                                           | Bad       | No                       | Fair                  | Second
12     | Himmat     | No                 | No                                           | Fair      | No                       | Bad                   | Fail
13     | Kajal      | No                 | No                                           | Fair      | No                       | Bad                   | Fail
14     | Jeet       | Yes                | No                                           | Fair      | Yes                      | Bad                   | Fail
15     | Ajay       | Yes                | No                                           | Good      | No                       | Fair                  | First
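To see how a single attribute partitions the sample, the rows of Table 1 can be tallied by attribute value and result. The sketch below transcribes Table 1; the field names and the `tally` helper are illustrative choices, not part of the original work.

```python
from collections import Counter, defaultdict

# Table 1 transcribed; the short field names are illustrative.
FIELDS = ("name", "regular", "extra", "behaviour", "follow_up", "preparation", "result")
TABLE_1 = [
    ("Prity", "Yes", "No", "Good", "Yes", "Good", "Distinction"),
    ("Shilpa", "Yes", "Yes", "Good", "Yes", "Good", "Distinction"),
    ("Vishwanath", "Yes", "Yes", "Good", "Yes", "Good", "Distinction"),
    ("Guru", "No", "No", "Good", "No", "Fair", "Second"),
    ("Salil", "Yes", "No", "Good", "Yes", "Fair", "First"),
    ("Vijay", "Yes", "No", "Good", "Yes", "Fair", "First"),
    ("Minny", "Yes", "No", "Good", "No", "Bad", "Fail"),
    ("Neerja", "No", "No", "Fair", "No", "Bad", "Fail"),
    ("Viju", "No", "Yes", "Fair", "No", "Fair", "Second"),
    ("Akaash", "No", "No", "Bad", "No", "Bad", "Fail"),
    ("Falguni", "Yes", "No", "Bad", "No", "Fair", "Second"),
    ("Himmat", "No", "No", "Fair", "No", "Bad", "Fail"),
    ("Kajal", "No", "No", "Fair", "No", "Bad", "Fail"),
    ("Jeet", "Yes", "No", "Fair", "Yes", "Bad", "Fail"),
    ("Ajay", "Yes", "No", "Good", "No", "Fair", "First"),
]
students = [dict(zip(FIELDS, row)) for row in TABLE_1]


def tally(rows, attribute, target="result"):
    """Count how often each result occurs for each value of one attribute."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[attribute]][row[target]] += 1
    return counts


for value, results in tally(students, "preparation").items():
    print(value, dict(results))
```

On this sample, every student with Good preparation obtained Distinction and every student with Bad preparation failed, while Fair preparation is split evenly between First and Second class, so further attributes are needed only for that group.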

Based on the given data, the decision tree was constructed; it shows the results for the various possibilities, as given in Figure 1.

[Figure 1: Decision Tree for Students Results. The root node tests "Regular in college" (Yes/No); the lower nodes test "Participation in extra-curricular activities" (Yes/No), "Follows up with teachers" (Yes/No), "Behaviour" (Good/Fair) and "Efficient Preparation" (Good/Fair/Bad); the leaves are Distinction, First Class, Second Class and Fail.]
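Rules can be read off a decision tree by following each path from the root to a leaf. The sketch below shows this on a small hypothetical tree stored as nested dictionaries; the tree's structure, the `extract_rules` helper, and the resulting rules are illustrative assumptions and do not reproduce Figure 1.

```python
# Hypothetical toy tree (NOT Figure 1): internal nodes are dicts, leaves are labels.
toy_tree = {
    "attribute": "Regular in college",
    "branches": {
        "No": "Fail",
        "Yes": {
            "attribute": "Efficient preparation",
            "branches": {"Good": "Distinction", "Fair": "First Class"},
        },
    },
}


def extract_rules(node, conditions=()):
    """Yield one (conditions, class) pair for every root-to-leaf path."""
    if not isinstance(node, dict):          # reached a leaf: the path is a rule
        yield conditions, node
        return
    for value, child in node["branches"].items():
        yield from extract_rules(child, conditions + ((node["attribute"], value),))


for conditions, label in extract_rules(toy_tree):
    tests = " AND ".join(f"{attribute} = {value}" for attribute, value in conditions)
    print(f"IF {tests} THEN {label}")
```

Each printed line is one if-then rule, so a tree with n leaves gives n rules.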

VI. Conclusion

Decision tree classification was applied successfully to the recently announced results of MCA semester 3 students. The students were classified on the basis of some crucial attributes, so that upcoming students can understand the importance of personal attributes in excelling academically and achieving their goals in any education system. The implementation was analysed, and strong rules were derived that can be considered for any education system.

