Integration of Probabilistic Multi-Class Labeling and Adaptive K-Means Clustering with KNN Classification: Application to Weather Data

Husni Lubis, Ihsan Lubis, Herlina Harahap, Tommy Tommy, Rosyidah Siregar

Abstract


Clustering and classification are central to data analysis because they uncover hidden patterns in complex datasets. Despite broad use in pattern recognition, market segmentation, anomaly detection, and weather prediction, both families of techniques have significant limitations. Clustering methods such as K-Means require the number of clusters to be specified in advance and make assumptions about the data distribution, while classification approaches such as K-Nearest Neighbors (KNN) depend heavily on the quality of the labeled data. These limitations are particularly pronounced for weather data, which is dynamic, highly variable, and complex. This research addresses them by integrating probabilistic multi-class labeling with an adaptive K-Means clustering approach. Probabilistic labeling allows a data point to belong to several classes at once, each with an associated probability, which reflects the overlapping nature of weather conditions. Adaptive K-Means determines the number of clusters from the data rather than requiring it as an input. KNN classification is then applied on top of these two components, using the cluster centroids and class probabilities to improve the accuracy of weather classification. The approach provides a foundation for further research on, and optimization of, adaptive methods for other complex data types, and it contributes to data analysis methods for dynamic, multi-class datasets such as weather data.
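The abstract does not specify how the number of clusters is chosen, how class probabilities are computed, or how KNN combines them, so the following is only a minimal sketch of one way the three stages could fit together. Everything concrete in it is an assumption made for illustration: the silhouette score as the criterion for picking k, farthest-point initialization, a softmax over negative centroid distances as the probabilistic label, the `temperature` parameter, the function names, and the toy (temperature, humidity) readings.

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_idx(p, centroids):
    return min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))

def kmeans(points, k, iters=50):
    """Lloyd's K-Means with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    for _ in range(iters):
        assign = [nearest_idx(p, centroids) for p in points]
        new = []
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            # keep the old centroid if a cluster ends up empty
            new.append(tuple(sum(c) / len(members) for c in zip(*members)) if members else centroids[j])
        if new == centroids:
            break
        centroids = new
    return centroids, [nearest_idx(p, centroids) for p in points]

def silhouette(points, assign):
    """Mean silhouette score; points in singleton clusters score 0."""
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if j != i:
                by_cluster.setdefault(assign[j], []).append(dist(p, q))
        own = by_cluster.get(assign[i])
        others = [sum(d) / len(d) for c, d in by_cluster.items() if c != assign[i]]
        if not own or not others:
            scores.append(0.0)
            continue
        a, b = sum(own) / len(own), min(others)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def adaptive_kmeans(points, k_max=5):
    """Assumed adaptation rule: try k = 2..k_max, keep the best silhouette."""
    best = None
    for k in range(2, k_max + 1):
        centroids, assign = kmeans(points, k)
        score = silhouette(points, assign)
        if best is None or score > best[0]:
            best = (score, centroids, assign)
    return best[1], best[2]

def memberships(p, centroids, temperature=10.0):
    """Assumed probabilistic label: softmax over negative centroid distances."""
    weights = [math.exp(-dist(p, c) / temperature) for c in centroids]
    total = sum(weights)
    return [w / total for w in weights]

def knn_predict(query, points, soft_labels, n_neighbors=3):
    """Average the probability vectors of the nearest labeled points."""
    nearest = sorted(range(len(points)), key=lambda i: dist(query, points[i]))[:n_neighbors]
    n_classes = len(soft_labels[0])
    return [sum(soft_labels[i][j] for i in nearest) / len(nearest) for j in range(n_classes)]

# Hypothetical (temperature in deg C, relative humidity in %) readings.
data = [(30.0, 40.0), (31.0, 42.0), (29.5, 41.0), (30.5, 39.0),
        (22.0, 90.0), (21.0, 92.0), (23.0, 91.0), (22.5, 89.0),
        (10.0, 60.0), (11.0, 62.0), (9.5, 61.0), (10.5, 59.0)]

centroids, assign = adaptive_kmeans(data)
soft = [memberships(p, centroids) for p in data]
probs = knn_predict((29.0, 43.0), data, soft)
```

Here `probs` is a probability vector over the discovered clusters rather than a single hard label, which is the property the abstract attributes to probabilistic multi-class labeling. In practice the features would need scaling before distances are computed, and a different validity index or membership function may suit real weather data better.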


Keywords


Clustering; Classification; Probabilistic Labeling; Adaptive K-Means; Weather






DOI: https://doi.org/10.30596/jcositte.v5i2.20905
