
International Artificial Intelligence and Data Processing Symposium (IDAP'16)


Tuba Pala¹, Electrical-Electronics and Computer [...]


September 17-18, 2016 Malatya/TURKEY

A Data Mining Approach for Diagnosis of Diabetes Using Association Rules and Clustering

Abstract

Successful applications of data mining are on the rise in e-commerce, marketing, industry and other sectors, including healthcare. The amount of information in the field of health grows day by day, and data analysis studies in this field have increased rapidly because they help physicians and other staff in the sector. Such studies make it possible to discover hidden relationships and trends in medical data. Type 2 diabetes is the most common metabolic disease among adults: 5-10% of the population in developed countries has type 2 diabetes [1]. Such a high proportion of diabetes patients will probably also be the most important cause of illness and death in the future. For this reason, a diabetes dataset was selected for this study. A data mining approach is proposed using data from Type 2 diabetes patients. The approach is a hybrid structure that combines the clustering and association analysis methods of data mining. The Apriori algorithm was selected for extracting rules from medical datasets.

Keywords: Data Mining, Clustering, Pima Diabetes Dataset, Apriori Algorithm, KMeans Algorithm

[...] is present, this form of diabetes is called Type 2 diabetes.

[...] of the hormone [...] diabetes "Type [...]

[...] is becoming more difficult. Traditional [...]


[...] with [...] techniques.

Aljumah et al. [5] [...]

Han J. et al. [...] analyzing diabetes data with [...]

[...] K-Means and HAC (hierarchical agglomerative clustering) [...] 10].

[...] Type 2 diabetes [...]


[...] K-Means [...] for simpler prediction [...] K-Means as more suitable [...]

Hussan [12], [...]

Pandey, Atul Kumar et al. [13] [...]

2. Dataset

[...] the original PIMA diabetes dataset [...] Institute of Diabetes and Digestive and [...]. [...]

Table 1. Attributes of the Pima Diabetes Dataset

No   Name in dataset   Description
1    Pregnant          [...]
2    Plasma-Glucose    Plasma glucose concentration (2-hour glucose concentration in an oral glucose tolerance test)
3    Diastolic BP      [...]
4    Triceps SFT       [...]
5    Serum-Insulin     2-hour serum insulin
6    BMI               [...]
7    DPF               [...]
8    Age               [...]
9    Class             1 - diabetes test result positive; 0 - diabetes test result negative


These [...] that were found [...]

[...] 268, 34% of the data [...]

[...] support [...] Y [...] association [...]
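The surviving fragments mention support and an association involving Y, and the abstract names the Apriori algorithm for rule extraction. As a hedged illustration only, the textbook Apriori procedure and the usual support and confidence measures for a rule X -> Y can be sketched as follows. This is not the authors' implementation: the function names `apriori` and `confidence`, the item labels, the toy records and the 0.5 threshold are all assumptions made for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of records that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        prev = list(current)
        candidates = {a | b for a, b in combinations(prev, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # One pass over the data per level; infrequent candidates are dropped.
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


def confidence(frequent, antecedent, consequent):
    """confidence(X -> Y) = support(X and Y together) / support(X)."""
    x = frozenset(antecedent)
    xy = x | frozenset(consequent)
    return frequent[xy] / frequent[x]


# Hypothetical toy records, made up for illustration (not data from the paper).
records = [
    {"glucose=high", "bmi=high", "class=1"},
    {"glucose=high", "bmi=high", "class=1"},
    {"glucose=low", "bmi=high", "class=0"},
    {"glucose=low", "bmi=low", "class=0"},
]
freq = apriori(records, min_support=0.5)
rule_conf = confidence(freq, {"glucose=high", "bmi=high"}, {"class=1"})
```

In this toy run the rule {glucose=high, bmi=high} -> {class=1} holds in every record that contains its antecedent, so its confidence is 1.0, while its support is 0.5.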


B. K-Means

[...] best known [...]

The algorithm is fundamentally iterative in nature [...]

[...] expresses [...]. Minimum support [...] are not included. In each scan, a [...]


The center of the dataset is determined and [...]

3- [...] this [...] K-[...]

4-
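Only pieces of the numbered K-Means steps survive here. As a hedged reference point, the standard textbook K-Means loop is sketched below in four commented steps. It is not the authors' procedure: the function name `kmeans`, the random seeding from data points, the stopping rule and the toy points are assumptions of the example.

```python
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Textbook K-Means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # 1) pick k initial centers from the data
    labels = []
    for _ in range(max_iter):
        # 2) assign each point to its nearest center (squared Euclidean distance)
        labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        # 3) recompute each center as the mean of the points assigned to it
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # an empty cluster keeps its old center
        # 4) stop once the centers no longer move
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Hypothetical two-dimensional toy points, made up for illustration.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centers, labels = kmeans(points, k=2)
```

On these well-separated toy points the first three and the last three points end up in different clusters; which of the two receives label 0 depends on the seeding.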

1-

Application

2- [...] and [...]


The attributes Plasma-Glucose, Diastolic [...]

The attributes Triceps SFT and Serum-[...]

Pregnant: 110, Plasma-Glucose: 5, Diastolic BP: 35, Triceps SFT: 227, Serum-Insulin: 374, BMI: 11, DPF: 0, Age: 0, Class: 0

Pregnant, Plasma-Glucose, Diastolic BP, [...]

[...] medical [...]
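The per-attribute figures listed above read like counts of missing entries, but that reading is an assumption, because the sentence that introduced them did not survive. Under the further assumption that a missing measurement is stored as 0, such counts could be obtained roughly as follows. The function name `count_zero_entries` and the sample rows are hypothetical.

```python
# Attribute names as listed in the paper; Class is left out on purpose,
# because 0 is a legitimate label there (negative test result).
ATTRIBUTES = ["Pregnant", "Plasma-Glucose", "Diastolic BP", "Triceps SFT",
              "Serum-Insulin", "BMI", "DPF", "Age"]


def count_zero_entries(rows):
    """Count, per attribute, how many records hold the value 0."""
    counts = {name: 0 for name in ATTRIBUTES}
    for row in rows:
        # zip stops after the eight attributes, so a trailing Class value is ignored.
        for name, value in zip(ATTRIBUTES, row):
            if value == 0:
                counts[name] += 1
    return counts


# Hypothetical records, made up for illustration (not rows from the dataset).
rows = [
    (2, 120, 70, 0, 0, 31.2, 0.45, 34, 1),
    (0, 95, 0, 22, 0, 24.8, 0.21, 25, 0),
    (5, 0, 80, 30, 130, 0.0, 0.60, 51, 1),
]
zero_counts = count_zero_entries(rows)
```

Whether a 0 really means "missing" differs by attribute (a pregnancy count of 0 can be a genuine value), so counts of this kind need a domain check before any record is dropped or imputed.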

Table 2. [...]

Attribute        Categories                                        Source
Pregnant         low (=6)                                          [21]
Plasma-Glucose   low (140)                                         [21]
Diastolic BP     normal (90)                                       [21]
BMI              low (35)                                          [21]
DPF              low (0.82)                                        [22]
Age              young (20-39), medium (40-59), 60 plus (high)     [8]
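Association rule mining works on categorical items, so numeric attributes are first mapped to categories such as those in the table above. Only the Age bins survive with both bounds intact, so the minimal sketch below covers that attribute alone and does not guess the thresholds of the others. The function name `age_category` and the decision to reject ages under 20 are assumptions.

```python
def age_category(age):
    """Map an age in years to the three Age bins listed in the table."""
    if 20 <= age <= 39:
        return "young"
    if 40 <= age <= 59:
        return "medium"
    if age >= 60:
        return "high"  # the "60 plus" bin
    # Assumption: the table lists no bin below 20, so such values are rejected.
    raise ValueError(f"age {age} is below the lowest listed bin (20)")


# Hypothetical ages, made up for illustration.
categories = [age_category(a) for a in (21, 39, 40, 59, 60, 81)]
```

Each record's category label (for example `Age=young`) can then serve as one item in the transaction handed to a rule miner.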

Table 3. K-

Cluster 0