Determining the Optimal Number of Clusters by an Extended RPCL Algorithm
Xin Li, Man Wai Mak and Chi Kwong Li
Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hong Kong
Determining an appropriate number of clusters is a difficult yet important problem. The rival penalized competitive learning (RPCL) algorithm was designed to solve it, but its performance is not satisfactory when clusters overlap or when input vectors contain dependent components. We address this problem by incorporating full covariance matrices into the original RPCL algorithm. The resulting extended RPCL algorithm progressively eliminates units whose clusters contain only a small amount of training data. The algorithm is used to determine the number of clusters in a Gaussian distribution. It is also used to optimize the architecture of elliptical basis function networks for speaker verification and vowel classification. We found that covariance matrices obtained by the extended RPCL algorithm represent the clusters better than those obtained by the original RPCL algorithm, resulting in a lower verification error rate in speaker verification and a higher recognition accuracy in vowel classification.
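To make the idea concrete, the following is a minimal sketch of one rival penalized update using a full-covariance (Mahalanobis) distance, followed by a pruning pass. It is an illustration under stated assumptions, not the exact update rules of the extended algorithm: the function names, the learning rates `lr_win` and `lr_rival`, the pruning threshold `min_share`, and the use of fixed inverse covariance (precision) matrices are all hypothetical choices made for brevity. The sketch does not re-estimate the covariance matrices.

```python
def mahalanobis_sq(x, mean, precision):
    """Squared Mahalanobis distance (x - mean)^T P (x - mean).

    `precision` is the inverse covariance matrix, given as a list of rows.
    """
    diff = [xi - mi for xi, mi in zip(x, mean)]
    n = len(diff)
    return sum(diff[i] * precision[i][j] * diff[j]
               for i in range(n) for j in range(n))


def rpcl_step(x, means, precisions, wins, lr_win=0.05, lr_rival=0.002):
    """Apply one rival penalized update for input x (modifies means and wins).

    Requires at least two units. Each distance is weighted by the unit's
    relative winning frequency, as in the original RPCL formulation.
    Learning rates are illustrative assumptions.
    """
    total = sum(wins)
    scores = [(wins[j] / total) * mahalanobis_sq(x, means[j], precisions[j])
              for j in range(len(means))]
    order = sorted(range(len(means)), key=scores.__getitem__)
    winner, rival = order[0], order[1]
    # The winner moves toward the input; the rival is pushed away from it.
    means[winner] = [m + lr_win * (xi - m) for m, xi in zip(means[winner], x)]
    means[rival] = [m - lr_rival * (xi - m) for m, xi in zip(means[rival], x)]
    wins[winner] += 1
    return winner, rival


def prune_units(means, precisions, wins, min_share=0.05):
    """Drop units that won less than `min_share` of the inputs (assumed rule)."""
    total = sum(wins)
    keep = [j for j, n in enumerate(wins) if n / total >= min_share]
    return ([means[j] for j in keep],
            [precisions[j] for j in keep],
            [wins[j] for j in keep])
```

In use, `rpcl_step` would be called once per training vector over several passes, with `prune_units` applied between passes so that units attracting little data are removed and the number of remaining units serves as the cluster count estimate. Initializing every entry of `wins` to 1 avoids a division by zero on the first step.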