ResearchSpace

AutoElbow: An automatic elbow detection method for estimating the number of clusters in a dataset

dc.contributor.author Onumanyi, Adeiza J
dc.contributor.author Molokomme, Daisy N
dc.contributor.author Isaac, Sherrin J
dc.contributor.author Abu-Mahfouz, Adnan MI
dc.date.accessioned 2022-08-22T08:19:02Z
dc.date.available 2022-08-22T08:19:02Z
dc.date.issued 2022
dc.identifier.citation Onumanyi, A.J., Molokomme, D.N., Isaac, S.J. & Abu-Mahfouz, A.M. 2022. AutoElbow: An automatic elbow detection method for estimating the number of clusters in a dataset. Applied Sciences-Basel, 12(15). http://hdl.handle.net/10204/12479 en_ZA
dc.identifier.issn 2076-3417
dc.identifier.uri https://doi.org/10.3390/app12157515
dc.identifier.uri http://hdl.handle.net/10204/12479
dc.description.abstract The elbow technique is a well-known method for estimating the number of clusters required as a starting parameter in the K-means algorithm and certain other unsupervised machine-learning algorithms. However, due to the graphical output nature of the method, human assessment is necessary to determine the location of the elbow and, consequently, the number of data clusters. This article presents a simple method for estimating the elbow point, thus enabling the K-means algorithm to be readily automated. First, the elbow-based graph is normalized using the graph’s minimum and maximum values along the ordinate and abscissa coordinates. Then, the distances from each point on the graph to the minimum (i.e., the origin) and maximum reference points and to the “heel” of the graph are calculated. The estimated elbow location is thus the point that maximizes the ratio of these distances, which corresponds to an approximate number of clusters in the dataset. We demonstrate that the strategy is effective, stable, and adaptable over different types of datasets characterized by small and large clusters, different cluster shapes, high dimensionality, and unbalanced distributions. We provide the clustering community with a description of the method and present comparative results against other well-known methods in the prior state of the art. en_US
dc.format Fulltext en_US
dc.language.iso en en_US
dc.relation.uri https://www.mdpi.com/2076-3417/12/15/7515/htm en_US
dc.source Applied Sciences-Basel, 12(15) en_US
dc.subject Clustering en_US
dc.subject Elbow method en_US
dc.subject K-means algorithm en_US
dc.title AutoElbow: An automatic elbow detection method for estimating the number of clusters in a dataset en_US
dc.type Article en_US
dc.description.pages 17 en_US
dc.description.note Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/) en_US
dc.description.cluster Next Generation Enterprises & Institutions en_US
dc.description.impactarea Advanced Internet of Things en_US
dc.description.impactarea EDT4IR Management en_US
dc.identifier.apacitation Onumanyi, A. J., Molokomme, D. N., Isaac, S. J., & Abu-Mahfouz, A. M. (2022). AutoElbow: An automatic elbow detection method for estimating the number of clusters in a dataset. Applied Sciences-Basel, 12(15), http://hdl.handle.net/10204/12479 en_ZA
dc.identifier.chicagocitation Onumanyi, Adeiza J, Daisy N Molokomme, Sherrin J Isaac, and Adnan MI Abu-Mahfouz. "AutoElbow: An automatic elbow detection method for estimating the number of clusters in a dataset." Applied Sciences-Basel, 12(15) (2022). http://hdl.handle.net/10204/12479 en_ZA
dc.identifier.vancouvercitation Onumanyi AJ, Molokomme DN, Isaac SJ, Abu-Mahfouz AM. AutoElbow: An automatic elbow detection method for estimating the number of clusters in a dataset. Applied Sciences-Basel, 12(15). 2022; http://hdl.handle.net/10204/12479. en_ZA
dc.identifier.ris en_ZA
  TY - Article
  AU - Onumanyi, Adeiza J
  AU - Molokomme, Daisy N
  AU - Isaac, Sherrin J
  AU - Abu-Mahfouz, Adnan MI
  AB - The elbow technique is a well-known method for estimating the number of clusters required as a starting parameter in the K-means algorithm and certain other unsupervised machine-learning algorithms. However, due to the graphical output nature of the method, human assessment is necessary to determine the location of the elbow and, consequently, the number of data clusters. This article presents a simple method for estimating the elbow point, thus enabling the K-means algorithm to be readily automated. First, the elbow-based graph is normalized using the graph’s minimum and maximum values along the ordinate and abscissa coordinates. Then, the distances from each point on the graph to the minimum (i.e., the origin) and maximum reference points and to the “heel” of the graph are calculated. The estimated elbow location is thus the point that maximizes the ratio of these distances, which corresponds to an approximate number of clusters in the dataset. We demonstrate that the strategy is effective, stable, and adaptable over different types of datasets characterized by small and large clusters, different cluster shapes, high dimensionality, and unbalanced distributions. We provide the clustering community with a description of the method and present comparative results against other well-known methods in the prior state of the art.
  DA - 2022
  DB - ResearchSpace
  DP - CSIR
  J1 - Applied Sciences-Basel, 12(15)
  KW - Clustering
  KW - Elbow method
  KW - K-means algorithm
  LK - https://researchspace.csir.co.za
  PY - 2022
  SM - 2076-3417
  T1 - AutoElbow: An automatic elbow detection method for estimating the number of clusters in a dataset
  TI - AutoElbow: An automatic elbow detection method for estimating the number of clusters in a dataset
  UR - http://hdl.handle.net/10204/12479
  ER -
dc.identifier.worklist 25952 en_US
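
The procedure outlined in the abstract (normalize the elbow graph, measure each point's distance to the minimum and maximum reference points and to the "heel", then pick the point that maximizes a ratio of those distances) can be illustrated with a minimal Python sketch. The abstract does not state the exact formula, so several details here are assumptions rather than the authors' definitions: the function name `auto_elbow_sketch` is hypothetical, the reference points are taken to be (0, 0) and (1, 1) after min-max normalization, the "heel" term is assumed to be the point's squared height above the abscissa, and the ratio is assumed to be b / (a + c). The inertia values are made up for illustration.

```python
from typing import Sequence


def auto_elbow_sketch(ks: Sequence[float], scores: Sequence[float]) -> float:
    """Return the k whose normalized point maximizes b / (a + c).

    Hypothetical helper: the reference points, the heel term, and the
    ratio are assumptions, not the formula from the article.
    """
    if len(ks) != len(scores) or len(ks) < 3:
        raise ValueError("need at least three (k, score) pairs of equal length")
    k_min, k_max = min(ks), max(ks)
    s_min, s_max = min(scores), max(scores)
    if k_max == k_min or s_max == s_min:
        raise ValueError("curve is flat; no elbow can be located")

    best_k, best_ratio = ks[0], float("-inf")
    for k, s in zip(ks, scores):
        # Min-max normalize both axes to [0, 1].
        x = (k - k_min) / (k_max - k_min)
        y = (s - s_min) / (s_max - s_min)
        a = x * x + y * y                    # squared distance to the origin (0, 0)
        b = (x - 1.0) ** 2 + (y - 1.0) ** 2  # squared distance to the maximum point (1, 1)
        c = y * y                            # assumed heel term: squared height above the abscissa
        ratio = b / (a + c + 1e-12)          # small constant guards against division by zero
        if ratio > best_ratio:
            best_k, best_ratio = k, ratio
    return best_k


# Made-up within-cluster sums of squares for k = 1..8, with a sharp bend at k = 3.
ks = list(range(1, 9))
inertias = [1000.0, 400.0, 100.0, 90.0, 82.0, 75.0, 70.0, 66.0]
estimated_k = auto_elbow_sketch(ks, inertias)
print(estimated_k)
```

Under these assumptions a point scores highly when it lies close to the origin of the normalized graph and far from the (1, 1) corner, which is where the bend of a decreasing elbow curve sits. The published article should be consulted for the actual definitions.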

