Assessing the Efficiency and Accuracy of K-Means Clustering Compared to Other Clustering Techniques
DOI:
https://doi.org/10.63017/jdsi.v3i2.23
Keywords:
Accuracy, execution time, comparative analysis, clustering algorithms
Abstract
Clustering is an important method in data analysis, but it faces challenges because datasets differ in nature, which makes certain algorithms less effective and slower on some data. Choosing the most effective clustering method for a dataset, which involves evaluating both accuracy and computational speed, remains a significant challenge for researchers. To address these issues, the current study compares different clustering methods on several datasets, including iris, seed, and well log, and evaluates their accuracy and execution speed. The results show that K-means performs better with large datasets: as sample size increases, the accuracy of the K-means algorithm tends to improve. The execution time of K-means is influenced by the number of features in the dataset, with datasets that have more features typically requiring more time to process. The mean shift and spectral clustering algorithms perform well on small datasets, but they take a long time to run.
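The kind of measurement described in the abstract can be sketched as follows. This is a minimal illustration, not the study's actual procedure. It assumes synthetic Gaussian blobs in place of the iris, seed, and well log datasets, a plain Lloyd's K-means written with the standard library only, and accuracy defined as the best one-to-one match between cluster ids and class labels (the abstract does not state how accuracy was computed). The helper names `make_blobs`, `kmeans`, `clustering_accuracy`, and `benchmark` are hypothetical.

```python
import itertools
import random
import time


def make_blobs(n_per_class, n_features, seed=0):
    """Three synthetic Gaussian blobs (an assumption; not the paper's data)."""
    rng = random.Random(seed)
    points, labels = [], []
    for c in range(3):
        center = [10.0 * c] * n_features
        for _ in range(n_per_class):
            points.append(tuple(rng.gauss(m, 1.0) for m in center))
            labels.append(c)
    return points, labels


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-means; returns one cluster id per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments stable, so stop early
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def clustering_accuracy(true_labels, pred_labels, k):
    """Fraction of points matched under the best cluster-to-class mapping."""
    best = 0
    for perm in itertools.permutations(range(k)):
        hits = sum(1 for t, p in zip(true_labels, pred_labels) if perm[p] == t)
        best = max(best, hits)
    return best / len(true_labels)


def benchmark(n_per_class, n_features):
    """Return (accuracy, seconds) for one K-means run on synthetic data."""
    points, truth = make_blobs(n_per_class, n_features)
    start = time.perf_counter()
    pred = kmeans(points, k=3)
    elapsed = time.perf_counter() - start
    return clustering_accuracy(truth, pred, 3), elapsed


# Vary sample size and feature count, the two factors the abstract highlights.
for n_per_class, n_features in [(50, 4), (200, 4), (200, 16)]:
    acc, secs = benchmark(n_per_class, n_features)
    print(f"n={3 * n_per_class:4d} features={n_features:2d} "
          f"accuracy={acc:.3f} time={secs:.4f}s")
```

The same `benchmark` pattern could wrap any other clustering routine, such as mean shift or spectral clustering, so that accuracy and wall-clock time are recorded under identical conditions. Timings from a pure-Python loop like this one are only indicative and depend on the machine.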
License
Copyright (c) 2025 Data Science Insights

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.