An Indiscernibility-Based Clustering Method with Iterative Refinement of Equivalence Relations -Rough Clustering-
Shoji Hirano and Shusaku Tsumoto
Department of Medical Informatics, Shimane Medical University, School of Medicine, 89-1 Enya-cho, Izumo, Shimane 693-8501, Japan
This paper presents a new indiscernibility-based clustering method, called rough clustering, that can handle relative proximity. Relative proximity is a class of proximity measures that can be used to represent subjective similarity or dissimilarity, such as human judgment about the likeness of persons. Since relative proximity is not necessarily required to satisfy the triangle inequality, conventional centroid-based clustering methods may fail to produce good clusters because cluster representatives are assigned inappropriately. Our method is based on iterative refinement of N binary classifications, where N denotes the number of objects. First, an equivalence relation that classifies all the other objects into two classes, similar and dissimilar, is assigned to each object by referring to their relative proximity. Next, for each pair of objects, we count the number of binary classifications in which the pair is included in the same class. We call this number the indiscernibility degree. If the indiscernibility degree of a pair is larger than a user-defined threshold value, we modify the equivalence relations so that all of them classify the pair into the same class. This process is repeated until the class assignment becomes stable. Consequently, we obtain a clustering result that follows a given level of granularity without using geometric measures.
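The procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the initial relation for object i marks object j as similar when `dissim[i][j] <= cutoff` (a hypothetical fixed cutoff), pairs whose indiscernibility degree exceeds the threshold are merged transitively with a union-find structure, and each relation relabels a merged group by majority vote with ties resolved as "similar". All function and parameter names are hypothetical.

```python
from itertools import combinations

def initial_relations(dissim, cutoff):
    """Row i is object i's binary classification (True = similar to i).
    Assumption: a single fixed cutoff on the dissimilarity decides the class."""
    n = len(dissim)
    return [[dissim[i][j] <= cutoff for j in range(n)] for i in range(n)]

def indiscernibility_degree(relations, a, b):
    """Number of binary classifications that place a and b in the same class."""
    return sum(1 for r in relations if r[a] == r[b])

def refine(relations, threshold):
    """One refinement pass: pairs whose degree exceeds the threshold are
    forced into the same class in every relation."""
    n = len(relations)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Assumption: merge qualifying pairs transitively into groups.
    for a, b in combinations(range(n), 2):
        if indiscernibility_degree(relations, a, b) > threshold:
            parent[find(a)] = find(b)
    groups = {}
    for x in range(n):
        groups.setdefault(find(x), []).append(x)

    refined = []
    for r in relations:
        row = list(r)
        for members in groups.values():
            # Assumption: majority vote within the group, ties count as similar.
            label = 2 * sum(r[m] for m in members) >= len(members)
            for m in members:
                row[m] = label
        refined.append(row)
    return refined

def rough_clustering(dissim, cutoff, threshold, max_iter=100):
    relations = initial_relations(dissim, cutoff)
    for _ in range(max_iter):  # guard: this sketch does not prove convergence
        updated = refine(relations, threshold)
        if updated == relations:
            break
        relations = updated
    # Objects that no relation can tell apart form one cluster.
    clusters = {}
    for j in range(len(dissim)):
        signature = tuple(r[j] for r in relations)
        clusters.setdefault(signature, []).append(j)
    return sorted(clusters.values())

# Asymmetric toy data: object 2 judges object 0 as dissimilar (4 > cutoff),
# although object 0 judges object 2 as similar.
dissim = [
    [0, 1, 2, 8, 9],
    [1, 0, 1, 9, 8],
    [4, 1, 0, 8, 9],
    [8, 9, 8, 0, 1],
    [9, 8, 9, 1, 0],
]
print(rough_clustering(dissim, cutoff=3, threshold=3))  # [[0, 1, 2], [3, 4]]
```

In this toy example the initial relations alone would separate object 0 from objects 1 and 2, because object 2's classification disagrees with the others. Since the pair (0, 1) and the pair (0, 2) are each placed in the same class by four of the five classifications, which exceeds the threshold of 3, the refinement step overrides the single dissenting relation and the three objects end up in one cluster.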