International Artificial Intelligence and Data Processing Symposium (IDAP'16)
September 17-18, 2016 Malatya/TURKEY

A Data Mining Approach for Diagnosis of Diabetes Using Association Rules and Clustering

Tuba Pala (1)
(1) Electrical-Electronics and Computer [...]

Abstract

Successful applications of data mining are on the rise in e-commerce, marketing, industry and other sectors, including healthcare. The amount of information in the field of health grows day by day. Data analysis studies in this field have also increased rapidly, because they help physicians and other employees in the sector by uncovering hidden relationships and trends in medical data. Type 2 diabetes is the most common metabolic disease among adults: 5-10% of the population in developed countries has type 2 diabetes [1]. Such a high proportion of diabetes patients means the disease will probably also be the most important cause of illness and death in the future. For this reason, a diabetes dataset was selected for this study. A data mining approach is proposed using data from Type 2 diabetes patients. The approach is a hybrid structure that combines the clustering and association analysis methods of data mining. The Apriori algorithm was selected for extracting rules from the medical dataset.

Keywords: Data Mining, Clustering, Pima Diabetes Dataset, Apriori Algorithm, K-Means Algorithm

[...] is present, this form of diabetes is called Type 2 diabetes.
[...] of the hormone [...]
[...] becomes more difficult. Traditional [...]
[...] prediction [...] with [...] techniques [...]. Aljumah et al. [5] [...] performance [...]. Han J. et al. [...] analyzing diabetes data [...] K-Means and HAC (hierarchical agglomerative clustering) [...] 10]. [...] Type 2 diabetes [...]
[...] K-Means [...] for simpler prediction [...] K-Means as more suitable [...]

2. Dataset

Hussan [12] [...] the original PIMA diabetes dataset [...] Institute of Diabetes and Digestive and [...]. The data [...] Type 2 diabetes [...] is a [...] set.

Pandey, Atul Kumar et al. [13] [...]
Table 1. Pima Diabetes Dataset Attributes

No  Attribute in dataset  Description
1   Pregnant              [...]
2   Plasma-Glucose        Plasma glucose concentration (2-hour glucose concentration in an oral glucose tolerance test)
3   Diastolic BP          [...]
4   Triceps SFT           [...]
5   Serum-Insulin         2-hour serum insulin
6   BMI                   [...]
7   DPF                   [...]
8   Age                   [...]
9   Class                 1 = diabetes test result positive, 0 = diabetes test result negative
[...] found [...] 268, 34% of the data [...] support [...] association [...]
B. K-Means

[...] K-Means [...] best known [...]. The algorithm is fundamentally iterative in nature [...]

[...] denotes [...]. Minimum support [...] are not included. In each scan, a [...]
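The surviving text mentions minimum support, itemsets that are excluded, and repeated scans of the data, which matches the usual level-wise Apriori procedure. The following is a minimal sketch under that assumption, not the authors' implementation. The function name `apriori`, the item labels such as `glucose=high`, and the four transactions are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single item that occurs in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # One scan of the data per level: count how often each candidate occurs.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        # Candidates below the minimum support are not carried forward.
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical discretized records (illustrative only, not taken from the dataset).
transactions = [
    {"glucose=high", "bmi=high", "class=1"},
    {"glucose=high", "bmi=high", "class=1"},
    {"glucose=low", "bmi=high", "class=0"},
    {"glucose=high", "bmi=low", "class=1"},
]
frequent_sets = apriori(transactions, min_support=0.5)
for itemset, support in sorted(frequent_sets.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), support)
```

Association rules are then read off the frequent itemsets by comparing the support of an itemset with the support of its antecedent.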
[...] The center of the dataset is determined and [...] K-[...]
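The iterative character of K-Means described in the text can be illustrated with a short sketch of the standard assign-and-update loop. This is an assumption about the general algorithm, not a reconstruction of the numbered steps lost from the source. The function name `kmeans`, the choice of the first `k` points as initial centers, and the sample points are all hypothetical.

```python
import math


def kmeans(points, k, max_iterations=100):
    """Plain K-Means: returns (centers, labels) for a list of numeric tuples."""
    # Assumption: the first k points serve as initial centers (deterministic).
    centers = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        # Stop when no point changes cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


# Hypothetical two-dimensional points forming two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
centers, labels = kmeans(points, k=2)
print(centers, labels)
```

Fixing the initial centers keeps the example reproducible; practical implementations usually draw them at random or use a seeding heuristic.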
Application
Attributes Plasma-Glucose, Diastolic [...]
Attributes Triceps SFT and Serum-[...]
Pregnant: 110, Plasma-Glucose: 5, Diastolic BP: 35, Triceps SFT: 227, Serum-Insulin: 374, BMI: 11, DPF: 0, Age: 0, Class: 0
Pregnant, Plasma-Glucose, Diastolic BP, [...] medical [...]
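The per-attribute figures listed above read like tallies of zero-valued entries per attribute; the surrounding text is incomplete, so this interpretation is an assumption. Under it, such a tally could be computed as in the sketch below. The helper `count_zeros` and the three sample records are hypothetical and are not rows from the dataset.

```python
def count_zeros(rows, columns):
    """Count, for each named column, how many rows hold the value 0."""
    return {col: sum(1 for row in rows if row[col] == 0) for col in columns}


# Hypothetical mini-sample using three of the attribute names from Table 1.
sample = [
    {"Pregnant": 0, "Plasma-Glucose": 120, "Serum-Insulin": 0},
    {"Pregnant": 2, "Plasma-Glucose": 0, "Serum-Insulin": 0},
    {"Pregnant": 1, "Plasma-Glucose": 95, "Serum-Insulin": 88},
]
zero_counts = count_zeros(sample, ["Pregnant", "Plasma-Glucose", "Serum-Insulin"])
print(zero_counts)
```

A zero is physiologically implausible for attributes such as plasma glucose or BMI, so it is often treated as a missing value; for the pregnancy count, zero is a legitimate value.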
Table 2. [...]

Attribute       Categories                                     Source
Pregnant        low (=6) [...]                                 [21]
Plasma-Glucose  low (140) [...]                                [21]
Diastolic BP    normal (90) [...]                              [21]
BMI             low (35) [...]                                 [21]
DPF             low (0.82) [...]                               [22]
Age             young (20-39), medium (40-59), high (60 plus)  [8]
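Association rule mining works on categorical items, so numeric attributes are first mapped to categories such as those in the table above. Only the Age ranges survive intact in the source, so the sketch below covers that attribute alone. The function name `age_category` and the handling of ages below 20 are assumptions.

```python
def age_category(age):
    """Map an age in years to the Age categories: young, medium, high."""
    if 20 <= age <= 39:
        return "young"
    if 40 <= age <= 59:
        return "medium"
    if age >= 60:
        return "high"
    # Assumption: ages below 20 fall outside the listed ranges.
    return None


for years in (21, 45, 60):
    print(years, age_category(years))
```

The other attributes would be discretized in the same way once their thresholds and comparison directions are known.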
Table 3. K-
Cluster 0