Paper:

# Distributed Mining of Closed Patterns from Multi-Relational Data

## Yohei Kamiya and Hirohisa Seki

Department of Computer Science, Nagoya Institute of Technology

Gokiso-cho, Showa-ku, Nagoya 466-8555, Japan

In multi-relational data mining (MRDM), there have been proposed many methods for searching for patterns that involve multiple tables (relations) from a relational database. In this paper, we consider closed pattern mining from distributed multi-relational databases (MRDBs). Since the computation of MRDM is costly compared with the conventional itemset mining, we propose some efficient methods for computing closed patterns using the techniques studied in Inductive Logic Programming (ILP) and Formal Concept Analysis (FCA). Given a set of * local* databases, we first compute sets of their closed patterns (concepts) using a closed pattern mining algorithm tailored to MRDM, and then generate the set of closed patterns in the global database by utilizing the *merge* operator. We also present some experimental results, which shows the effectiveness of the proposed methods.

- [1] S. Dzeroski and N. Lavra (Eds.), “Relational Data Mining,” Springer, 2001.
- [2] S. Dzeroski, “Multi-Relational Data Mining: An Introduction,” SIGKDD Explorations Newsletter, Vol.5, No.1, pp. 1-16, 2003.
- [3] B. Ganter and R. Wille, “Formal Concept Analysis: Mathematical Foundations,” Springer, 1999.
- [4] B. Ganter, G. Stumme, and R. Wille (Eds.), “Formal Concept Analysis, Foundations and Applications,” Vol.3626, LNCS, Springer, 2005.
- [5] S. Kuznetsov, O. Napoli, and S. Rudolph (Eds.), “FCA4AI: “What can FCA do for Artificial Intelligence?” IJCAI 2013 Workshop, 2013.
- [6] G. Stumme, “Iceberg Query Lattices for Datalog,” Conceptual Structures at Work, Vol.3127, LNCS, pp.109-125, Springer, 2004.
- [7] B. Ganter and S. Kuznetsov, “Pattern Structures and Their Projections,” ICCS-01, Vol.2120, LNCS, pp. 129-142, Springer, 2001.
- [8] L. De Raedt and J. Ramon, “Condensed representations for Inductive Logic Programming,” Proc. KR’04, pp. 438-446. 2004.
- [9] G. C. Garriga, R. Khardon, and L. De Raedt, “On Mining Closed Sets in Multi-Relational Data,” IJCAI 2007, pp. 804-809. 2007.
- [10] C. Lucchese, S. Orlando, and R. Rergo, “Distributed Mining of Frequent Closed Itemsets: Some Preliminary Results,” Int. Workshop on High Performance and Distributed Mining, 2005.
- [11] P. Valtchev and R. Missaoui, “Building Concept (Galois) Lattices from Parts: Generalizing the Incremental Methods,” Proc. of the 9
^{th}Int. Conf. on Conceptual Structures (ICCS’01), pp. 290-303, Springer, 2001. - [12] H. Seki and S. Tanimoto, “Distributed Closed Pattern Mining in Multi-Relational Data based on Iceberg Query Lattices: Some Preliminary Results,” CLA’12, pp. 115-126, 2012.
- [13] Y. Kamiya and H. Seki, “Towards Efficient Closed Pattern Mining from Distributed Multi-Relational Data,” Proc. Joint 7
^{th}Int. Conf. on Soft Computing and Intelligent Systems and 15^{th}Int. Symp. on Advanced Intelligent Systems (SCIS&ISIS 2014), pp. 1138-1141, 2014. - [14] J. Dean and S. Ghemawat, “MapReduce: simplified data processing on large clusters,” Commun. ACM, Vol.51, No.1, pp. 107-113, 2008.
- [15] S.-H. Nienhuys-Cheng and R. de Wolf, “Foundations of Inductive Logic Programming,” Vol.1228, LNAI, Springer, 1997.
- [16] L. Dehaspe, “Frequent pattern discovery in first-order logic,” Ph.D. thesis, Dept. Computer Science, Katholieke Universiteit Leuven, 1998.
- [17] N. Helft, “Induction as nonmonotonic inference,” Proc. KR’89, pp. 149-156. 1989.
- [18] H. Seki, Y. Honda, and S. Nagano, “On Enumerating Frequent Closed Patterns with Key in Multi-relational Data,” DS’10, Vol.6332, LNAI, pp. 72-86, Springer, 2010.
- [19] N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal, “Discovering Frequent Closed Itemsets for Association Rules,” Proc. ICDT’99, pp. 398-416, Springer, 1999.
- [20] T. Uno, T. Asai, Y. Uchida, and H. Arimura, “An Efficient Algorithm for Enumerating Closed Patterns in Transaction Databases,” DS’04, Vol.3245, LNAI, pp. 16-31, Springer, 2004.
- [21] S. Muggleton, “Inverse entailment and progol,” New Generation Computing, Vol.13, No.3,4, pp. 245-286, 1995.
- [22] H. Seki and Y. Kamiya, “Merging Closed Pattern Sets in Distributed Multi-Relational Data,” CLA’14, pp. 71-82, 2014.
- [23] H. Blockeel and M. Sebag, “Scalability and efficiency in multirelational data mining,” SIGKDD Explorations Newsletter, Vol.4, No.2, pp. 1-14, 2003.
- [24] P. Krajca and V. Vychodil, “Distributed Algorithm for Computing Formal Concepts Using Map-Reduce Framework,” IDA’09, pp. 333-344. Springer, 2009.
- [25] P. Krajca, J. Outrata, and V. Vychodil, “Parallel algorithm for computing fixpoints of Galois connections,” Annals of Mathematics and Artificial Intelligence, Vol.59, No.2, pp. 257-272, 2010.
- [26] P. Valtchev, R. Missaoui, and P. Pierre Lebrun, “A Partition-based Approach towards Constructing Galois (Concept) Lattices,” Discrete Mathematics, Vol.256, No.3, pp. 801-829, 2002.