Research Paper:
q-Divergence-Based Fuzzy c-Linear Gaussian State Space Model
Tomoki Nomura and Yuchi Kanzawa

Shibaura Institute of Technology
3-7-5 Toyosu, Koto-ku, Tokyo 135-8548, Japan
This paper proposes a fuzzy clustering algorithm for series data based on the linear Gaussian state space model (LGSSM), referred to as q-divergence-based fuzzy c-linear Gaussian state space models (QFCLGSSMs). QFCLGSSMs are constructed from mixtures of linear Gaussian state space models (MLGSSMs), a conventional probabilistic clustering algorithm for series data based on the LGSSM. The construction is motivated by the relationship between two clustering algorithms for vectorial data: the Gaussian mixture model with identity covariances and q-divergence-based fuzzy c-means. In numerical experiments using an artificial dataset, we revealed the effects of the fuzzification parameters on the fuzziness of the clustering results of the proposed algorithm, demonstrating the close relationship between the proposed algorithm and the conventional algorithm, MLGSSMs. Moreover, in numerical experiments using nine real datasets, the proposed algorithm outperformed MLGSSMs in terms of clustering accuracy.
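As a rough illustration of the building block named in the abstract, the sketch below simulates a scalar LGSSM and evaluates the log-likelihood of a series with the standard Kalman filter. It is not the authors' implementation and does not reproduce MLGSSMs or QFCLGSSMs. The scalar state, the parameter names (`a`, `c`, `q`, `r`, `mu0`, `p0`), and the function names are all assumptions made for this example. Model-based clustering of series data typically compares such per-cluster likelihoods, which is the only point the sketch is meant to convey.

```python
import math
import random


def simulate_lgssm(a, c, q, r, mu0, p0, length, rng):
    """Draw one series y_1..y_T from a scalar LGSSM (hypothetical helper).

    x_1 ~ N(mu0, p0), x_t = a * x_{t-1} + w_t with w_t ~ N(0, q),
    y_t = c * x_t + v_t with v_t ~ N(0, r).
    """
    x = rng.gauss(mu0, math.sqrt(p0))
    ys = []
    for _ in range(length):
        ys.append(c * x + rng.gauss(0.0, math.sqrt(r)))  # emit observation
        x = a * x + rng.gauss(0.0, math.sqrt(q))         # advance the state
    return ys


def kalman_loglik(ys, a, c, q, r, mu0, p0):
    """Log-likelihood of a series under a scalar LGSSM via the Kalman filter."""
    mean, var = mu0, p0  # predictive mean and variance of the current state
    loglik = 0.0
    for y in ys:
        s = c * var * c + r   # innovation (predictive observation) variance
        resid = y - c * mean  # innovation
        loglik += -0.5 * (math.log(2.0 * math.pi * s) + resid * resid / s)
        gain = var * c / s    # Kalman gain
        mean_f = mean + gain * resid   # filtered mean
        var_f = (1.0 - gain * c) * var  # filtered variance
        mean = a * mean_f              # predict the next state
        var = a * var_f * a + q
    return loglik


# Two hypothetical cluster models that differ only in the transition coefficient.
model_a = dict(a=0.95, c=1.0, q=0.1, r=0.1, mu0=0.0, p0=1.0)
model_b = dict(a=-0.9, c=1.0, q=0.1, r=0.1, mu0=0.0, p0=1.0)

rng = random.Random(0)
series = simulate_lgssm(length=200, rng=rng, **model_a)
ll_a = kalman_loglik(series, **model_a)
ll_b = kalman_loglik(series, **model_b)
print("log-likelihood under model_a:", ll_a)
print("log-likelihood under model_b:", ll_b)
```

A series generated by `model_a` should score a higher log-likelihood under `model_a` than under `model_b`. A probabilistic or fuzzy clustering scheme would turn such scores into soft memberships. How QFCLGSSMs do this is specified in the paper itself and is not shown here.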

Fuzzy clustering for series data
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.