The Impact of k-means on Association Rules Mining Algorithms Performance

Andre Hasudungan, Rizki Muliono, Nurul Khairina, Nanda Novita

Abstract


Association Rule Mining (ARM) is an unsupervised learning approach in machine learning. It is a data analysis technique that identifies frequent patterns, correlations, associations, and causal structures within datasets. The method is widely used in research and practice to discover knowledge and support decision making. However, large datasets with a high number of transactions are a weakness of ARM algorithms such as Apriori, FP-Growth, and Eclat, which then face challenges including computational complexity, long mining durations, and high memory consumption. To address this issue, this paper proposes using k-means clustering to partition the data into several groups and then running the ARM algorithms on each resulting cluster. The study used the Elbow method and the Silhouette Coefficient to determine the optimal number of clusters. The results indicate that k-means-ARM generates a greater number of rules and provides more contextually relevant and significant correlations. In terms of average Lift Ratio, k-means-ARM scores higher than ARM without k-means. The k-means-ARM combination is robust: it improves the efficiency and scalability of ARM for large datasets and enhances the interpretability of the discovered association rules.
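
The cluster-then-mine idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy transactions, the `labels` list (standing in for the output of a k-means step), and the helper names `pair_rules` and `rules_per_cluster` are all assumptions made for illustration. Instead of Apriori, FP-Growth, or Eclat, it uses a brute-force count of single-item rules, and it computes lift as confidence(A → B) divided by support(B).

```python
from collections import defaultdict
from itertools import permutations


def pair_rules(transactions, min_support=0.5):
    """Brute-force single-item rules A -> B with (support, confidence, lift).

    Stand-in for Apriori/FP-Growth/Eclat; only suitable for tiny examples.
    """
    n = len(transactions)
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for t in transactions:
        items = set(t)
        for a in items:
            item_count[a] += 1
        # Count every ordered pair (antecedent, consequent) in the transaction.
        for a, b in permutations(sorted(items), 2):
            pair_count[(a, b)] += 1

    rules = {}
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue
        confidence = count / item_count[a]
        lift = confidence / (item_count[b] / n)  # confidence / support(B)
        rules[(a, b)] = (support, confidence, lift)
    return rules


def rules_per_cluster(transactions, labels, min_support=0.5):
    """Group transactions by cluster label, then mine each group separately."""
    groups = defaultdict(list)
    for t, label in zip(transactions, labels):
        groups[label].append(t)
    return {label: pair_rules(g, min_support) for label, g in groups.items()}


# Hypothetical toy data: two obvious shopping profiles.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "jam"},
    {"beer", "chips"},
    {"beer", "chips", "salsa"},
    {"chips", "salsa"},
]
# Assumed cluster assignment, as a k-means step might produce it.
labels = [0, 0, 0, 1, 1, 1]

global_rules = pair_rules(transactions)
clustered_rules = rules_per_cluster(transactions, labels)

print("rules without clustering:", len(global_rules))
for label, rules in sorted(clustered_rules.items()):
    print("cluster", label, "rules:", len(rules))
```

With the same relative support threshold, no pair is frequent enough across the whole toy dataset, while each cluster yields several rules. Note that support and lift inside a cluster are measured relative to that cluster's transactions, so they are not directly comparable to values computed on the full dataset.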

Keywords


Unsupervised Learning, Apriori, FP-Growth, Eclat, k-means

DOI: https://doi.org/10.30596/jcositte.v5i2.20907
