Journal of Advanced Research in Dynamical and Control Systems (JARDCS), ISSN 1943-023X, Special Issue – 02 / 2017
ONTOLOGY BASED GROUPING OF PRODUCTS USING CLUSTERING AND CLASSIFICATION APPROACHES

Razia Sulthana1, Subburaj Ramasamy2
1 SRM University, Kattankulathur, Tamilnadu, India - 603203, raziasulthana.
2 SRM University, Kattankulathur, Tamilnadu, India - 603203
ABSTRACT. This paper proposes ontology based clustering and classification of items. Products need to be categorized by their similarity with respect to a few entities, and the products are then recommended to users according to their requirements. Compared with existing methods, our approach is proactive and categorizes products with a high F-measure. It uses an ontology that classifies documents with the domain as entity and the features as sub-entities. The documents under the sub-entities are clustered using different clustering approaches, and consensus clustering is applied to generate the final clusters. The ontology enables easy categorization and reuse, thereby reducing time, while consensus clustering identifies clusters with better similarity. Our experiments were evaluated on the performance metrics precision and F-measure and show a considerable increase in F-measure, making our approach surpass the existing ones. Products recommended from the same cluster thus ensure higher satisfaction for the user.
Keywords: Ontology, clustering, classification, features, similarity, recommend
Special Issue on Allied Electrical and Control Systems

1 INTRODUCTION
Electronic retailing has enabled the sale of products world-wide, supporting both business-to-business (B2B) and business-to-consumer (B2C) services. The B2C services opened the door for many e-retailers and drove the growth of numerous technologies and frameworks to retain customers and help the business progress. The global motto of e-retailing is to increase revenue by attracting more users: the needs of the users are anticipated from their current purchase pattern, and the predicted items are recommended to them. In this paper we propose ontology based clustering and classification of products. In our work we consider grouping books from Amazon, which is done by calculating the closeness, or similarity, between the items previously bought or browsed by the user.
2 LITERATURE REVIEW
There are a number of approaches in the literature for grouping similar items available in e-stores.
2.1 Related work in clustering
Web pages are classified in [1] using ontology; the domain ontology reduces time and enables reuse of the data. Three classification algorithms, namely Support Vector Machines (SVM), K-Nearest Neighbor (KNN) and Naïve Bayes (NB), are tested with and without clustering, and classification applied after clustering is shown to give better results. A brief account of where clustering algorithms are applied in real time is given in [2]. Clustering algorithms [3] are generally classified into agglomerative, partitioning-based, and probabilistic algorithms: agglomerative clustering merges documents based on similarity, partitioning algorithms group the documents in a hierarchical manner, and probabilistic algorithms group the documents based on a model. Fuzzy logic and ontology are used in [4] to cluster patent documents. The documents are processed by understanding the grammar of the sentences instead of using key phrases, and the ontology is constructed from the grammar identified in the sentences. The fuzzy approach is compared with K-Means and is found to outperform it. Correlation based clustering is proposed in [5] to cluster documents in a high dimensional space, whereas the traditional K-Means approach clusters documents in a low dimensional space; this approach uses a correlation metric as the similarity measure. The Hybrid Scheme for Text Clustering (HSTC) and Text Clustering with Feature Selection (TCFS) are compared for document clustering in [6]. TCFS uses ontology for selecting the features and is shown to give better accuracy, as it gains the advantage of ontology: the ontology reuses past comparison results and reduces execution time. An ontology based clustering approach is proposed in [7] to cluster similar students in the Moodle e-learning system based on their log activities. The ontology is created with learning attributes as entities and an ontology based similarity approach is applied to cluster them. The results show that the clusters developed using ontology were intact when compared to other clustering approaches. The K-Means algorithm is slightly modified in [8] to overcome scalability issues: a centroid selection strategy is proposed and the performance of the system is compared with the standard K-Means approach, showing a considerable increase in accuracy. A hybrid document clustering combining hierarchical and traditional K-Means is proposed in [9]. It clusters documents while overcoming some drawbacks of partitioning and hierarchical clustering techniques, and the approach is extended to cluster similar users in social media. Conceptual clustering based on matrix factorization is implemented in [10]. It develops a recommendation system to overcome the information overload problem and proposes a K-modes algorithm to reduce the matrix operations. The model substantiates the higher accuracy of the recommendation system using Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) as error measures. A text clustering algorithm is proposed in [11] to group similar documents together using k-means, k-means fast and k-medoids. The experimental results show that K-Means and K-medoids combined with cosine similarity give better results; the cosine similarity measure [12] evaluates similarity irrespective of document length. A genetic algorithm (GA), harmony search (HS) algorithm and particle swarm optimization (PSO) are used in [13] for feature selection with a Length Feature Weight (LFW) weighting scheme, and dynamic dimension reduction (DDR) is proposed to reduce the number of features. The documents are clustered using K-Means both before and after feature reduction. This was implemented on eight standard datasets, with F-measure and accuracy as the evaluation measures.
As it uses the K-Means algorithm, the variance of all the features is assumed to be the same, which is not so in real data: the features have varying variance, which has to be captured. Self-Organizing Maps (SOM) is an unsupervised clustering approach used in genetic applications to cluster neurons [14]. A detailed report on the articles using clustering and classification is given in Table 1.
Table 1. Articles listed with respective algorithms and datasets

| Article | Clustering approach | Algorithm | Similarity measure | Data set |
| 13 | Hierarchical clustering | Genetic algorithm (GA), harmony search (HS) and particle swarm optimization, along with K-Means | Cosine similarity | 8 datasets taken from the Laboratory of Computational Intelligence (LABIC) |
| 3 | Agglomerative and partitioning | k-means, k-means fast, and k-medoids | Cosine similarity, Jaccard similarity, and Correlation Coefficient | Dataset collected from the library of the College of Computer and Information Sciences, King Saud University, Riyadh |
| 4 | Hierarchical and probabilistic clustering | Fuzzy C-Means | Fuzzy based similarity measure | Documents downloaded from the World Intellectual Property Organization (WIPO) |
| 1 | Hierarchical clustering | K-means | - | WEBKB dataset with 7 classes |
| 8 | Hierarchical and partitioning clustering | K-Means, fuzzy C-Means, and Expectation Maximization | Vector similarity | SML, FT, BC, ML and FM (5 datasets) |
| 5 | Partitioning and probabilistic clustering | Correlation preserving indexing (CPI) algorithm | Correlation similarity | NG20, Reuters, and OHSUMED corpora |
| 6 | Partitioning clustering | K-means, Hybrid Scheme for Text Clustering (HSTC) and Text Clustering with Feature Selection (TCFS) | Cosine similarity | Random document dataset collected from the internet |
| 7 | Hierarchical clustering | Ontology based clustering | Ontology similarity score | Raw data from the Moodle e-learning system |
| 9 | Hierarchical clustering | Iterative document clustering | Cosine similarity | Reuters, Ohsumed and various extensive data sets |
| 15 | Classification | Ontology based classifier and Term relevance ontology classifier algorithms | Ontology terms or synonyms | Occupational Health and Security (OHS) TREC dataset, in an oil and gas application context |
| 16 | Hierarchical document clustering, with KNN for clustering and an SVM model for classification | Term frequency matrix used for calculating the similarity | Distance measure | Gmail personal e-mail dataset of 19,620 emails |
| 10 | Partitioning clustering | k-modes clustering algorithm | Cosine similarity and dissimilarity distance | Epinions dataset |
| 17 | Classification | Naive Bayes (NB), Support Vector Machine (SVM) with rbf kernel, Multilayer Perceptron (MLP) and C4.5 | Cross similarity method, cosine similarity | Kariyer.Net, a job seeking website |
2.2 Related work in classification
Ontology based classification of documents is proposed in [15]. It rationalizes the use of ontology for the classification approach: domain ontology is used to leverage existing knowledge for classifying textual information. Two algorithms were proposed, namely the Ontology Classifier and the Term Relevance Ontology Classifier, and the results show that both yielded better performance than other classification approaches. Classification and clustering are used together in [16] to group e-mails. A K-Means algorithm using term frequency is used for subjective clustering of the e-mail content, after which three classification techniques are applied; the performance of the system improved when both were applied consecutively. Several classification approaches were applied in [17] to classify candidate preferences in a job recommendation system, with Support Vector Machines (SVM) yielding the better classification. In conventional methods the items are clustered based on similar purchase patterns or by analyzing the opinions posed by the user, which becomes harder as the number of items grows.
3 MATERIALS AND METHODS
This section describes the architecture of our system and its main components. One of the major knowledge structures is the collection of books available on the web. Initially the books are classified based on their domain; this layer is named first level classification. The classification is then extended to the sub-entities, called second level classification. Next, a domain ontology for the products is constructed, which eases the clustering process and enables reuse. Finally, clustering is done to group the similar objects.
3.1 Proposed Classification Approach
Classification is a supervised learning approach in data mining, used for exploring and analyzing a huge quantity of data; the categories for grouping the collection are known in advance. We implement our approach for grouping books in Amazon, the world's largest online retailer. We applied a random sampling approach and chose 75,000 books from Amazon. The books were first broadly classified based on the domain; this is the first level classification. The domain is identified by analyzing the implicit profile available for every individual book: the publisher includes a note along with every book uploaded to Amazon. Feature extraction is done to extract the domain, the title and the author of the book from the book's profile. Domains with higher frequency are considered for further classification, while domains with fewer books are ignored. Term Frequency - Inverse Document Frequency (TF-IDF) is the feature extraction approach used to extract the domain of the book; TF-IDF gives good results here as the domain is taken from the book's profile. The steps of the TF-IDF process used in this work are given below:
Step 1: Let the book collection be a set of books represented as B = {b1, b2, ..., b75000}.
Step 2: Let there be m domains represented as D = {D1, D2, ..., Dm}.
Step 3: The domain frequency dfij is the number of times the domain Dj occurs in book bi.
Step 4: The book frequency bfj is the number of books belonging to the domain Dj in the collection B.
Step 5: The inverse book frequency of domain Dj is given by log(|B|/bfj).
Step 6: The weightage of a domain Dj in a book bi is given by Wij = dfij * log(|B|/bfj).
Step 7: The domains with the highest weights are chosen as major features and are represented in Table 2.

Table 2. List of domains with their frequency.
| Domain | Average word frequency |
| Computer Science | 31312 |
| Electrical and Electronics | 20218 |
| Medicine | 11099 |
| Science | 8999 |
| Fiction | 2199 |
| Novel | 873 |
| Fantasy | 175 |
| Novel | 125 |
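The weighting scheme in the steps above can be sketched as follows. This is a minimal illustration only: the book profiles and domain names below are hypothetical, not drawn from the Amazon sample used in the paper.

```python
import math

def domain_weights(profiles, domains):
    """Score each candidate domain for each book profile using
    W_ij = df_ij * log(|B| / bf_j)  (Steps 3-6 above)."""
    n_books = len(profiles)
    # bf_j: number of books whose profile mentions domain j at least once
    bf = {d: sum(1 for p in profiles if d in p.lower()) for d in domains}
    weights = []
    for p in profiles:
        text = p.lower()
        # df_ij times the inverse book frequency, for domains with bf_j > 0
        weights.append({d: text.count(d) * math.log(n_books / bf[d])
                        for d in domains if bf[d] > 0})
    return weights

# Hypothetical publisher notes; the domain with the highest weight is kept.
profiles = [
    "A medicine handbook for clinicians. Medicine reference.",
    "Introduction to science. A general science primer.",
    "Science of cooking.",
]
w = domain_weights(profiles, ["medicine", "science"])
```

For the first profile the weight of "medicine" (2 · log 3) dominates, so that book would proceed under the Medicine domain in the first level classification.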
The domains with a minimal number of books are ignored. The domains {Computer Science, Electrical and Electronics, Medicine, Science} are chosen to proceed further. This pre-classification of documents using TF-IDF increases clarity and removes the outliers. The architectural representation is shown in Fig. 1.
Figure 1: Architectural Framework
3.2 Building a domain ontology
The collections of books obtained for the different domains are not disjoint, as there may exist relations between the books. For example, an author X may have written books in both the science and engineering domains, so when books are retrieved by author, the above classification alone cannot return them, since they are classified under different domains. A domain ontology is constructed to handle these issues. The ontology represents a hierarchical structure which interlinks the books of different domains that share similar features. To implement the system efficiently we have divided the books into minor categories as per the user's interest, as shown in Table 3.

Table 3. Classified features under ontology.

| c1 | c2 | c3 | c4 | c5 | c6 |
| Language | Author | Publisher | Edition | Page Count | Language |
The interest span of the users is identified based on their buying trend, and the minor categories (interest span) of the user are represented in the ontology. The classification done here, based on the minor categories, is the second level classification. The Protégé tool is used to construct the ontology engine. The hierarchical relationships between the features are stored using the Web Ontology Language (OWL), and the system identifies similar books by using the relations between the book features in OWL. The ontology engine maintains is-a relations between the individual data structures. In conventional methods the constructed ontology is pruned to remove the unwanted entities or features, as the ontology is a composite collection of all the features. This is a time and effort consuming process; moreover, it is difficult to identify the outlier features and the cost of pruning is high. In our approach the outliers are pruned in the classification stage, thereby removing the overhead of pruning.
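The paper builds the ontology in Protégé and stores it as OWL; as a rough in-memory stand-in, the is-a hierarchy and cross-domain retrieval can be pictured like this (the class names, titles and authors below are invented for illustration):

```python
# Hypothetical is-a hierarchy: each class points to its parent class.
is_a = {
    "Science": "Book", "Engineering": "Book",
    "Physics": "Science", "Thermodynamics": "Engineering",
}
books = [
    {"title": "Heat Basics", "class": "Thermodynamics", "author": "X"},
    {"title": "Waves",       "class": "Physics",        "author": "X"},
    {"title": "Optics",      "class": "Physics",        "author": "Y"},
]

def ancestors(cls):
    """Follow is-a links up to the root, as the ontology engine would."""
    chain = []
    while cls in is_a:
        cls = is_a[cls]
        chain.append(cls)
    return chain

def books_by_author(author):
    """Author-based retrieval cuts across domains: books by the same
    author are found even though they sit under different classes."""
    return [b["title"] for b in books if b["author"] == author]
```

Here `books_by_author("X")` returns books from both the Engineering and Science branches, the cross-domain link the flat first level classification cannot provide.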
3.3 Clustering similar items
Clustering is an unsupervised learning approach in data mining. It enables grouping the documents further, based on the requirement posted in the user query or by analyzing the purchase trend of the user. The books grouped under the subcategories in the ontology are further clustered based on similarity. In our approach we apply two types of clustering, namely K-Means and CLARANS, after which consensus clustering is applied to unify their results. Each clustering approach generates a triplet containing the number of clusters, the book-set classified under each cluster and the feature-set based on which the clusters are categorized. K-Means generates a triplet {KD, Dset, Fset}KMeans, CLARANS generates a triplet {KD, Dset, Fset}CLARANS, and the results of both are combined in consensus clustering, which generates a triplet {KD, Dset, Fset}CC. The precision performance metric is used to evaluate the resulting clustering; the precision of the consensus clustering obtained by combining K-Means and CLARANS was considerably higher than the precision of either individual clustering.
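The paper does not spell out the internals of the consensus step, so the sketch below uses a generic evidence-accumulation scheme: two base label vectors (standing in for the K-Means and CLARANS results) vote on whether each pair of items belongs together, and pairs that agree in a majority of the base clusterings are merged.

```python
def consensus(labelings):
    """Evidence-accumulation consensus over base clusterings.
    labelings: list of label vectors, one per base clustering."""
    n = len(labelings[0])

    def co(i, j):
        # Fraction of base clusterings that put items i and j together.
        return sum(l[i] == l[j] for l in labelings) / len(labelings)

    assigned = [-1] * n
    next_cluster = 0
    for i in range(n):
        if assigned[i] == -1:
            # Start a new consensus cluster around item i.
            for j in range(i, n):
                if assigned[j] == -1 and co(i, j) > 0.5:
                    assigned[j] = next_cluster
            next_cluster += 1
    return assigned

# The two base clusterings agree on the grouping but use different label
# ids; the consensus recovers the shared structure.
kmeans_labels  = [0, 0, 1, 1, 2, 2]
clarans_labels = [1, 1, 0, 0, 2, 2]
final = consensus([kmeans_labels, clarans_labels])  # [0, 0, 1, 1, 2, 2]
```

The label ids assigned by different runs are arbitrary, which is why the combination works on pairwise co-membership rather than on the labels themselves.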
4 EXPERIMENTAL RESULTS
The items under every domain of the ontology are clustered using the K-Means and CLARANS clustering approaches, which are then brought together using the conceptual clustering approach. These are implemented using the R software. The items under each domain are further clustered; the clustered items are then considered similar and thus recommended. We show the results of clustering the "Author" domain from the ontology: the items classified under this domain are subjected to clustering, and the same is practiced for all the other domains, after which the clustered items under a domain act similarly. There are quite a lot of methods in the literature for finding similar items. The existing algorithms determine the clusters by applying an individual clustering approach or a hybrid clustering approach using generic similarity measures such as cosine similarity. But the concern is the features based on which the items are clustered, as text data is highly dimensional. There is a necessity to give preference to the features of every individual item, as every item has a unique feature; a similarity measure identifies similar items but is less effective in identifying items with unique features. This has led us to propose a combined clustering approach in our work. The major steps in the implementation process are listed below:
Step 1: Obtain the clustering results using the K-Means approach.
Step 2: Obtain the clustering results using the CLARANS approach.
Step 3: The results of steps 1 and 2 are given as input to conceptual clustering.
Step 4: The clustered items from our approach were considerably more similar when compared to the existing approaches.
4.1 K-Means
K-Means clustering is a popular analysis approach used in data mining. It partitions a set of n items into m clusters. The K-Means clustering is applied to all the domains in the ontology. It returns 7 clusters corresponding to the 7 features, as given in Table 4. The precision value returned by K-Means clustering for every domain against the features is also given; it shows that K-Means has identified a considerable number of items based on application-oriented features.

Table 4. Average precision of topics classified by K-Means.
| Topic | c1 | c2 | c3 | c4 | c5 | c6 | Precision |
| Mathematical | 0.54 | 0.53 | 0.62 | 0.72 | 0.65 | 0.60 | 0.61 |
| Descriptive | 0.75 | 0.69 | 0.68 | 0.75 | 0.78 | 0.72 | 0.73 |
| Programming | 0.65 | 0.67 | 0.68 | 0.62 | 0.61 | 0.70 | 0.66 |
| Framework | 0.65 | 0.72 | 0.81 | 0.75 | 0.70 | 0.69 | 0.72 |
| Application-oriented | 0.87 | 0.87 | 0.88 | 0.88 | 0.94 | 0.78 | 0.87 |
| Case study-oriented | 0.66 | 0.72 | 0.64 | 0.63 | 0.62 | 0.52 | 0.63 |
| Real Time Examples | 0.80 | 0.78 | 0.79 | 0.85 | 0.70 | 0.80 | 0.79 |
The precision performance measure gives the usefulness of the result: it returns the ratio of acceptable items to selected items. The simplicity of K-Means is that it allocates an item to the cluster with the nearest mean. In spite of its simplicity, it may end up with a larger number of clusters.
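The allocation-to-nearest-mean step just described can be sketched with a plain Lloyd-style K-Means on toy two-dimensional feature vectors (the points and k below are illustrative, not the paper's 75,000-book data, which is processed in R):

```python
import random

def kmeans(points, k, iters=20, seed=1):
    """Plain Lloyd's algorithm: assign each item to the nearest mean,
    then recompute the means, and repeat."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Squared Euclidean distance to each current center.
            d = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centers]
            clusters[d.index(min(d))].append(p)
        # Recompute each center as the mean of its cluster (keep the old
        # center if a cluster happens to be empty).
        centers = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return clusters, centers

# Two well-separated toy blobs.
points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
clusters, centers = kmeans(points, k=2)
```

With these well-separated blobs the two means converge near (1/3, 1/3) and (28/3, 28/3), splitting the items three and three.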
4.2 CLARANS
CLARANS, a clustering algorithm based on randomized search, identifies its neighbors dynamically at run time. In contrast to the K-Means approach, which clusters based on the mean, CLARANS applies randomized search and uses a medoid as the representative item. A notable feature of this clustering approach is that it is applicable to large data sets. K-Means uses the Euclidean distance function, whereas CLARANS is not restricted to any particular distance measure. The average precision per topic is given in Table 5, and it is found that CLARANS produces better clustering results than K-Means.

Table 5. Average precision of topics classified by CLARANS.
| Topic | c1 | c2 | c3 | c4 | c5 | c6 | Precision |
| Mathematical | 0.76 | 0.66 | 0.64 | 0.71 | 0.72 | 0.67 | 0.69 |
| Descriptive | 0.73 | 0.67 | 0.76 | 0.74 | 0.79 | 0.78 | 0.75 |
| Programming | 0.85 | 0.75 | 0.87 | 0.81 | 0.80 | 0.84 | 0.82 |
| Diagrammatic representation | 0.69 | 0.59 | 0.55 | 0.48 | 0.49 | 0.50 | 0.55 |
| Application-oriented | 0.79 | 0.75 | 0.69 | 0.69 | 0.70 | 0.76 | 0.73 |
| Case study-oriented | 0.67 | 0.81 | 0.79 | 0.59 | 0.55 | 0.69 | 0.68 |
| Fictitious | 0.74 | 0.75 | 0.73 | 0.81 | 0.80 | 0.79 | 0.77 |
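The CLARANS search itself can be pictured as the following randomized medoid search. This is a simplified sketch of the idea, on illustrative toy points; note that the distance function is pluggable, unlike the Euclidean-only K-Means.

```python
import random

def clarans(points, k, dist, num_local=5, max_neighbor=30, seed=0):
    """CLARANS-style sketch: start from random medoids, repeatedly try
    swapping one medoid for a random non-medoid, and keep the swap
    whenever the total assignment cost drops."""
    rng = random.Random(seed)

    def cost(medoids):
        return sum(min(dist(p, m) for m in medoids) for p in points)

    best, best_cost = None, float("inf")
    for _ in range(num_local):              # several random restarts
        medoids = rng.sample(points, k)
        current = cost(medoids)
        tries = 0
        while tries < max_neighbor:
            i = rng.randrange(k)
            candidate = rng.choice([p for p in points if p not in medoids])
            neighbor = medoids[:i] + [candidate] + medoids[i + 1:]
            c = cost(neighbor)
            if c < current:                 # accept an improving swap
                medoids, current, tries = neighbor, c, 0
            else:
                tries += 1
        if current < best_cost:
            best, best_cost = medoids, current
    return best

# Any distance works; here Manhattan distance instead of Euclidean.
manhattan = lambda a, b: abs(a[0] - b[0]) + abs(a[1] - b[1])
points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
medoids = clarans(points, k=2, dist=manhattan)
```

Because the representatives are medoids (actual items), the result is directly interpretable as one representative book per cluster.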
4.3 Conceptual Clustering
Conceptual clustering groups the concepts in a hierarchical manner. In addition it relies on a concept description language, thereby making the clusters inherently strong. The results of the K-Means and CLARANS baseline clustering approaches were given as input to conceptual clustering, and the outcome is represented in Table 6.

Table 6. Average precision of topics classified by Conceptual Clustering.
| Topic | c1 | c2 | c3 | c4 | c5 | c6 | Precision |
| Mathematical – F1 | 0.70 | 0.70 | 0.60 | 0.60 | 0.70 | 0.60 | 0.65 |
| Descriptive – F2 | 0.76 | 0.84 | 0.78 | 0.85 | 0.89 | 0.64 | 0.79 |
| Diagrammatic representation – F3 | 0.73 | 0.61 | 0.71 | 0.51 | 0.56 | 0.60 | 0.62 |
| Programming – F4 | 0.79 | 0.78 | 0.93 | 0.73 | 0.81 | 0.76 | 0.80 |
| Framework – F5 | 0.78 | 0.78 | 0.55 | 0.71 | 0.76 | 0.62 | 0.70 |
| Fictitious – F6 | 0.79 | 0.69 | 0.59 | 0.55 | 0.76 | 0.82 | 0.70 |
| Application-oriented – F7 | 0.91 | 0.88 | 0.86 | 0.87 | 0.95 | 0.60 | 0.85 |
| Case study-oriented – F8 | 0.53 | 0.78 | 0.63 | 0.73 | 0.69 | 0.71 | 0.68 |
| Real time Examples – F9 | 0.76 | 0.84 | 0.87 | 0.71 | 0.85 | 0.83 | 0.81 |
The conceptual clustering generates 9 clusters, an increase in the number of clusters. This increase indicates that the intra-cluster similarity is high and the inter-cluster similarity is low. The proposed method shows that the precision of the clusters formed based on features is considerably high. The comparison of the precision of the clustering approaches is shown in Fig. 2; conceptual clustering gives better precision for considerably more features.
Figure 2: Comparison of precision values of Clustering Methods over features
The clustering approaches were also evaluated against the groups classified in the ontology, as given in Table 7. Fig. 3 shows that conceptual clustering yields better precision for 4 out of the 6 classified groups. This ascertains that ontology based classification of documents supports clustering in 66% of the groups.
Table 7. Precision values of clustering methods over classified groups.

| Method | c1 | c2 | c3 | c4 | c5 | c6 |
| K-Means | 0.69 | 0.72 | 0.71 | 0.73 | 0.73 | 0.70 |
| Clarans | 0.75 | 0.71 | 0.73 | 0.69 | 0.69 | 0.71 |
| CC | 0.77 | 0.74 | 0.72 | 0.70 | 0.75 | 0.70 |
Figure 3: Comparison of precision values of Clustering Methods over classified groups.
5 EVALUATION CRITERIA
In this work, the precision metric is used to measure the performance of the clustering approaches. The quality of a clustering approach depends on how pure the clusters are. The items in a cluster are externally observed for similarity, and the valuation is done by cross-identifying the items, from which the precision is measured: Precision = xij / mj for all i and j, where j is the cluster index from 1 to 9 (in our case), i is the group index from 1 to 6, xij is the number of items of group i in cluster j, and mj is the number of items in cluster j.
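As a small worked example of this metric (the book ids and group labels below are hypothetical), a cluster's score is often summarized against its dominant group, which is the reading sketched here:

```python
def cluster_precision(cluster_items, group_of):
    """Precision of one cluster j summarized by its dominant group:
    max_i x_ij / m_j, where x_ij counts items of group i in cluster j
    and m_j is the cluster size."""
    counts = {}
    for item in cluster_items:
        g = group_of[item]
        counts[g] = counts.get(g, 0) + 1
    return max(counts.values()) / len(cluster_items)

# Toy cluster: 3 items of group "CS" and 1 of group "Medicine".
group_of = {"b1": "CS", "b2": "CS", "b3": "CS", "b4": "Medicine"}
p = cluster_precision(["b1", "b2", "b3", "b4"], group_of)  # 3/4 = 0.75
```

A perfectly pure cluster (all items from one group) scores 1.0, which matches the intuition of cluster purity described above.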
CONCLUSIONS
In this paper we presented a hybrid approach for classification and clustering of items using ontology. Previous works in the literature used either a classification or a clustering approach to group documents or items. Here, the voluminous items were initially classified using ontology; this classification groups similar items at the initial stage, and the domain ontology reduces the time taken to classify the items, as the features of the items were used as entities in the ontology. Secondly, the classified items under every entity of the ontology were subjected to clustering: K-Means and CLARANS were individually applied, and their results were then given to consensus clustering. The experimental comparison showed that consensus clustering yielded better performance than the individual clustering approaches. The paper first compared the three clustering methods over the allied features, and then extended the comparison of the clustering approaches to the classified groups. The performance of the system shows that hybrid clustering, along with domain ontology based classification, contributes to an increase in precision over conventional approaches. Thus it groups similar items and can be extended to recommend items in a real time recommendation system.
REFERENCES
1. Soltani, S., Barforoush, A. A.: Web pages classification using domain ontology and clustering. International Journal of Pattern Recognition and Artificial Intelligence. (2006) 17-29.
2. Manouselis, N., Costopoulou, C.: Analysis and classification of multi-criteria recommender systems. World Wide Web. (2007) 415-441.
3. Al-Anazi, S., AlMahmoud, H., Al-Turaiki, I.: Finding similar documents using different clustering techniques. Procedia Computer Science. (2016) 28-34.
4. Trappey, A. J., Trappey, C. V., Hsu, F. C., Hsiao, D. W.: A fuzzy ontological knowledge document clustering methodology. IEEE Transactions on Systems, Man, and Cybernetics. (2009) 806-814.
5. Zhang, T., Tang, Y. Y., Fang, B., Xiang, Y.: Document clustering in correlation similarity measure space. IEEE Transactions on Knowledge and Data Engineering. (2012) 1002-1013.
6. Punitha, S. C., Punithavalli, M.: Performance evaluation of semantic based and ontology based text document clustering techniques. Procedia Engineering. (2012) 100-106.
7. Mansur, A. B. F., Yusof, N.: Social learning network analysis model to identify learning patterns using ontology clustering techniques and meaningful learning. Computers & Education. (2013) 73-86.
8. Zahra, S., Ghazanfar, M. A., Khalid, A., Azam, M. A., Naeem, U., Prugel-Bennett, A.: Novel centroid selection approaches for KMeans-clustering based recommender systems. Information Sciences. (2015) 156-189.
9. Basu, T., Murthy, C. A.: A similarity assessment technique for effective grouping of documents. Information Sciences. (2015) 149-162.
10. Zheng, X., Luo, Y., Sun, L., Chen, F.: A new recommender system using context clustering based on matrix factorization techniques. Chinese Journal of Electronics. (2016) 334-340.
11. Huang, A.: Similarity measures for text document clustering. In: Proceedings of the Sixth New Zealand Computer Science Research Student Conference. (2008) 49-56.
12. Abualigah, L. M., Khader, A. T., Al-Betar, M. A., Alomari, O. A.: Text feature selection with a robust weight scheme and dynamic dimension reduction to text document clustering. Expert Systems with Applications. (2017).
13. Park, D. H., Kim, H. K., Choi, I. Y., Kim, J. K.: A literature review and classification of recommender systems research. Expert Systems with Applications. (2012) 10059-10072.
14. Sanchez-Pi, N., Martí, L., Garcia, A. C. B.: Improving ontology-based text classification: An occupational health and security application. Journal of Applied Logic. (2016) 48-58.
15. Alsmadi, I., Alhami, I.: Clustering and classification of email contents. Journal of King Saud University - Computer and Information Sciences. (2015) 46-57.
16. Özcan, G., Ögüdücü, S. G.: Applying different classification techniques in reciprocal job recommender system for considering job candidate preferences. In: Internet Technology and Secured Transactions (ICITST). (2016) 235-240.