Paper:
Cluster Analysis as a First Step in the Knowledge Discovery Process
Andreas Rauber* and Jan Paralic**
*Department of Software Technology, Vienna University of Technology, Favoritenstr. 9-11/188, A-1040 Vienna, Austria
**Department of Cybernetics and Artificial Intelligence, Technical University of Kosice, Letna 9, 04200 Kosice, Slovak Republic
Cluster analysis is one of the most prominent methods for the analysis of large, unknown datasets. It is particularly well suited to obtaining a first overview of the data, which makes it a natural starting point for further evaluation. In this paper, we present some lessons learned while applying two clustering approaches to the analysis of castle admission ticket sales data. A Bayesian unsupervised classification based on AutoClass and an unsupervised neural network, the Self-Organizing Map, are used to obtain a first impression of the available data and to form the basis for further exploration. We show that this type of cluster analysis provides a suitable first step in the knowledge discovery process. The different types of result representation, and their suitability for providing a first insight into datasets, are analyzed and compared.
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.