PSO-BASED OPTIMIZATION OF K-MEANS CLUSTERING FOR DETERMINING THE OPTIMAL NUMBER OF CLUSTERS IN BREAST CANCER DATA

Adhitya Purboyo; Laksamana Rajendra Haidar Azani Fajri; Imam Syafii

doi:10.69714/drakfm71

Authors

Adhitya Purboyo, Universitas Islam Negeri Sunan Kudus
Laksamana Rajendra Haidar Azani Fajri, Universitas Islam Negeri Sunan Kudus
Imam Syafii, Universitas Islam Negeri Sunan Kudus

Keywords:

K-Means, Particle Swarm Optimization, Clustering, Breast Cancer

Abstract

Breast cancer is one of the most dangerous diseases and a leading cause of death among women worldwide. Clustering methods can support breast cancer diagnosis and thereby help in choosing a course of treatment. K-Means is a widely used clustering algorithm that handles large datasets efficiently. However, K-Means has a significant weakness: the number of clusters k must be fixed in advance and is often chosen arbitrarily, which can lead to suboptimal clustering. To overcome this limitation, Particle Swarm Optimization (PSO) is applied to determine the number of clusters automatically. PSO was selected because it requires few parameters, is easy to implement, converges quickly, and has a low computational cost. This study uses the breast cancer dataset from the UCI Machine Learning Repository, consisting of 699 records and 10 attributes. The proposed PSO–K-Means method was evaluated using the Silhouette Coefficient and the Davies-Bouldin Index. The results show that the optimal number of clusters is k = 2, with a Silhouette Coefficient of 0.92 and a Davies-Bouldin Index of 1.374. Compared with standard K-Means, the PSO–K-Means method yields the optimal number of clusters directly, without repeated trial-and-error experiments over candidate values of k.
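The idea of letting PSO search for the number of clusters can be illustrated with a minimal sketch. This is not the authors' implementation: the fitness function (mean Silhouette Coefficient), the PSO constants `w`, `c1`, `c2`, the search range for k, and the synthetic two-blob data standing in for the UCI dataset are all assumptions made for illustration, and the names `kmeans`, `silhouette`, and `pso_select_k` are hypothetical. The Davies-Bouldin Index is omitted to keep the sketch short.

```python
import math
import random


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's K-Means; returns one cluster label per point."""
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = tuple(sum(x) / len(members)
                                     for x in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean Silhouette Coefficient; singleton clusters contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0  # undefined for one cluster; treat as worst case
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def pso_select_k(points, k_min=2, k_max=6, n_particles=5, n_iters=10, seed=0):
    """One-dimensional PSO over k; position is rounded to an integer k."""
    rng = random.Random(seed)
    cache = {}  # each k is clustered once, so fitness is consistent

    def fitness(x):
        k = min(k_max, max(k_min, round(x)))
        if k not in cache:
            cache[k] = silhouette(points, kmeans(points, k, rng))
        return cache[k]

    w, c1, c2 = 0.7, 1.5, 1.5  # assumed inertia and acceleration constants
    pos = [rng.uniform(k_min, k_max) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest = pos[:]
    pbest_fit = [fitness(x) for x in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g], pbest_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vel[i] = (w * vel[i] + c1 * r1 * (pbest[i] - pos[i])
                      + c2 * r2 * (gbest - pos[i]))
            pos[i] = min(k_max, max(k_min, pos[i] + vel[i]))
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i], f
            if f > gbest_fit:
                gbest, gbest_fit = pos[i], f
    return round(gbest), gbest_fit


# Synthetic stand-in data: two well-separated 2-D blobs (not the UCI data).
data_rng = random.Random(42)
data = [(cx + data_rng.gauss(0, 1), cy + data_rng.gauss(0, 1))
        for cx, cy in [(0, 0), (10, 10)] for _ in range(30)]
best_k, best_score = pso_select_k(data)
print(best_k, round(best_score, 3))
```

Each particle carries a continuous position that is rounded to an integer k before clustering, which is one simple way to apply PSO to a discrete parameter. A swarm this small is not guaranteed to visit every candidate k, so in practice the swarm size and iteration count would need tuning.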
