Data Mining and Exploration: A Comparison Study among Data Mining Techniques on Iris Data Set

  • Taher M. Ghazal, Mohammed A. M. Afifi, Deepak Kalra


This work investigates the efficiency of several classification methods, applied with the WEKA software to the well-known Iris data set. Receiver Operating Characteristic (ROC) curves are used to assess the performance of each classification algorithm. The classification techniques considered are neural networks, naïve Bayes and decision trees. The Iris data set is one of the oldest and most widely used data sets in data mining. A comparison of the ROC curves for the three techniques indicates that the Neural Network (NN) is the most appropriate of the methods investigated in this work. The other two methods, the Bayes network classifier and decision trees, rely on classical classification procedures that may need significant improvement.
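As an illustration of the ROC-based assessment described above, the following minimal sketch computes ROC points and the area under the curve (AUC) for one class treated as positive against the rest. It is not the WEKA procedure used in the study: the function names `roc_curve_points` and `auc`, the one-vs-rest framing, and the example labels and scores are all assumptions made for illustration, and the numbers are not results from the paper.

```python
def roc_curve_points(labels, scores):
    """Return ROC points as (false positive rate, true positive rate) pairs.

    labels: 1 for the positive class, 0 for every other class (one-vs-rest).
    scores: classifier confidence that each example is positive.
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative example")

    # Rank examples from the most to the least confident prediction.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all examples tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve, using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical scores for six examples (illustrative only, not study data).
example_labels = [1, 1, 0, 1, 0, 0]
example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
example_auc = auc(roc_curve_points(example_labels, example_scores))
print(round(example_auc, 4))  # 0.8889
```

An AUC closer to 1 means the classifier ranks positive examples above negative ones more consistently, which is one way the ROC curves of different classifiers can be compared.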