Paper:

# On Two Apriori-Based Rule Generators: Apriori in Prolog and Apriori in SQL

## Hiroshi Sakai^{*}, Kao-Yi Shen^{**}, and Michinori Nakata^{***}

^{*}Department of Basic Sciences, Graduate School of Engineering, Kyushu Institute of Technology

Tobata, Kitakyushu 804-8550, Japan

^{**}Department of Banking and Finance, Chinese Culture University (SCE)

Da’an District, Taipei City, Taiwan

^{***}Faculty of Management and Information Science, Josai International University

Gumyo, Togane, Chiba 283-8555, Japan

This paper focuses on two Apriori-based rule generators. The first is the rule generator in Prolog and C, and the second is the one in SQL. They are named *Apriori in Prolog* and *Apriori in SQL*, respectively. Each rule generator is based on the Apriori algorithm. However, each rule generator has its own properties. Apriori in Prolog employs the equivalence classes defined by table data sets and follows the framework of rough sets. On the other hand, Apriori in SQL employs a search for rule generation and does not make use of equivalence classes. This paper clarifies the properties of these two rule generators and considers effective applications of each to existing data sets.

- [1] R. Agrawal and R. Srikant, “Fast algorithms for mining association rules in large databases,” Proc. VLDB’94, pp. 487-499, 1994.
- [2] Z. Pawlak, “Rough Sets: Theoretical Aspects of Reasoning About Data,” 1991.
- [3] A. Skowron and C. Rauszer, “The discernibility matrices and functions in information systems,” Intelligent Decision Support -Handbook of Advances and Applications of the Rough Set Theory, pp. 331-362, 1992.
- [4] E. Orłowska and Z. Pawlak, “Representation of nondeterministic information,” Theoretical Computer Science, Vol.29, No.1-2, pp. 27-39, 1984.
- [5] W. Lipski, “On semantic issues connected with incomplete information databases,” ACM Trans. on Database Systems,” Vol.4, No.3, pp. 262-296, 1979.
- [6] J. W. Grzymała-Busse and P. Werbrouck, “On the best search method in the LEM1 and LEM2 algorithms,” Incomplete Information, Rough Set Analysis, Studies in Fuzziness and Soft Computing, Vol.13, pp. 75-91, 1998.
- [7] K. Y. Shen and G. H. Tzeng, “Contextual improvement planning by fuzzy-rough machine learning: A novel bipolar approach for business analytics,” Int. J. of Fuzzy Systems, Vol.18, No.6, pp. 940-955, 2016.
- [8] M. Nakata and H. Sakai, “Twofold rough approximations under incomplete information,” Int. J. of General Systems, Vol.42, No.6, pp. 546-571, 2013.
- [9] H. Sakai, M. Wu, and M. Nakata, “Apriori-based rule generation in incomplete information databases and non-deterministic information systems,” Fundamenta Informaticae, Vol.130, No.3, pp. 343-376, 2014.
- [10] H. Sakai, M. Nakata, and Y. Yao, “Pawlak’s many valued information system, non-deterministic information system, and a proposal of new topics on information incompleteness toward the actual application,” Studies in Computational Intelligence, Vol.708, pp. 187-204, 2017.
- [11] H. Sakai, “Execution logs by RNIA software tools,” http://www.mns.kyutech.ac.jp/sakai/RNIA [Accessed December 12, 2017]
- [12] K. Y. Shen, H. Sakai, and G. H. Tzeng, “Stable rules evaluation for rough-set-based bipolar model : A preliminary study for credit loan evaluation,” Proc. Int. Conf. on Rough Sets, LNCS 10313, pp. 317-328, 2017.
- [13] J. Bazan and M. Szczuka, “The rough set exploration system,” Trans. on Rough Sets, Vol.3, pp. 37-56, 2005.
- [14] L. S. Riza et al., “Implementing algorithms of rough set theory and fuzzy rough set theory in the R package RoughSets,” Information Sciences, Vol.287, No.10, pp. 68-89, 2014.
- [15] H. Sakai, M. Nakata, and D. Ślęzak, “A NIS-Apriori based rule generator in Prolog and its functionality for table data,” Proc. RSKT 2011, LNAI, Vol.6954, pp. 226-231, 2011.
- [16] H. Sakai, C. Liu, X. Zhu, and M. Nakata, “On NIS-Apriori based data mining in SQL,” Proc. Int. Conf. on Rough Sets, LNCS Vol.9920, pp. 514-524, 2016.
- [17] A. Ceglar and J. F. Roddick, “Association mining,” ACM Computing Survey, Vol.38, No.2, 2006.
- [18] M. Wu, M. Nakata, and H. Sakai, “An overview of the getRNIA system for non-deterministic data,” Procedia Computer Science, Vol.22, pp. 615-62, 2013.
- [19] A. Frank and A. Asuncion, “UCI Machine Learning Repository,” http://mlearn.ics.uci.edu/MLRepository.html [Accessed December 12, 2017]
- [20] M. Kowalski and S. Stawicki, “SQL-based heuristics for selected KDD tasks over large data sets,” Proc. FedCSIS 2012, pp. 303-310, 2012.
- [21] D. Ślęzak and H. Sakai, “Automatic extraction of decision rules from non-deterministic data systems: Theoretical foundations and SQL-based implementation,” Database Theory and Application, CCIS Vol.64, pp. 151-162, 2009.
- [22] W. Swieboda and S. Nguyen, “Rough set methods for large and spare data in EAV format,” Proc. IEEE RIVF 2012, pp. 1-6, 2012.
- [23] phpMyAdmin Web Page, http://www.phpmyadmin.net/ [Accessed December 12, 2017]
- [24] H. Sakai, C. Liu, M. Nakata, and S. Tsumoto, “A proposal of a privacy-preserving questionnaire by non-deterministic information and its analysis,” Proc. IEEE Big Data Conf., pp. 1956-1965, 2016.

*J. Adv. Comput. Intell. Intell. Inform.*, Vol.22, No.3, pp. 394-403, 2018

*J. Adv. Comput. Intell. Intell. Inform.*, Vol.22, No.3, pp. 394-403, 2018