Open Access
ARTICLE
ENHANCED DIABETES PREDICTION VIA STACKED ENSEMBLE MACHINE LEARNING
Issue Vol. 1 No. 01 (2024): Volume 01 Issue 01 --- Section Articles --- Published Date: 2024-12-13
Abstract
Diabetes mellitus is a widespread chronic metabolic disorder and a growing global health problem; accurate and timely diagnosis is needed to prevent severe long-term complications. This study investigates a stacked ensemble machine learning approach to diabetes prediction. Using the Pima Indian Diabetes Dataset, the method employs a multi-tiered stacking framework that combines the predictions of diverse base learners, including Logistic Regression, K-Nearest Neighbors, Support Vector Machine, Decision Tree, and Extreme Gradient Boosting. A Logistic Regression model serves as the meta-learner, integrating the predictions of these base models. Evaluated on standard classification metrics (accuracy, precision, recall, F1-score, and Area Under the Receiver Operating Characteristic Curve, AUC-ROC), the stacked ensemble consistently outperformed its individual base learners. It achieved an accuracy of 81.3%, precision of 76.8%, recall of 68.2%, an F1-score of 72.2%, and an AUC-ROC of 0.871. These results indicate that ensemble learning can improve predictive robustness and accuracy in medical diagnostics. The model is a promising practical tool for healthcare professionals: its deployment could support early identification of individuals at elevated risk of developing diabetes, enabling timely intervention and contributing to improved patient management and public health outcomes.
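The stacking arrangement described in the abstract can be sketched with scikit-learn's `StackingClassifier`. This is a minimal illustration under stated assumptions, not the authors' implementation: synthetic data from `make_classification` stands in for the Pima Indian Diabetes Dataset, scikit-learn's `GradientBoostingClassifier` stands in for Extreme Gradient Boosting, and the helper names (`build_stack`, `evaluate`), hyperparameters, scaling steps, 5-fold internal cross-validation, and 80/20 split are all hypothetical choices that the abstract does not specify.

```python
# Sketch only: synthetic data replaces the Pima Indian Diabetes Dataset, and
# every hyperparameter below is an illustrative assumption, not a reported setting.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_stack(random_state=0):
    """Return a stacked ensemble: five base learners, Logistic Regression on top."""
    base_learners = [
        # Scale-sensitive models are wrapped in a pipeline with a scaler.
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=7))),
        ("svm", make_pipeline(StandardScaler(),
                              SVC(probability=True, random_state=random_state))),
        ("dt", DecisionTreeClassifier(max_depth=4, random_state=random_state)),
        # Stand-in for Extreme Gradient Boosting (assumption made for portability).
        ("gb", GradientBoostingClassifier(random_state=random_state)),
    ]
    return StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),  # the meta-learner
        cv=5,                          # out-of-fold predictions train the meta-learner
        stack_method="predict_proba",  # meta-learner sees class probabilities
    )


def evaluate(model, X_test, y_test):
    """Compute the five metrics named in the abstract on held-out data."""
    y_pred = model.predict(X_test)
    y_score = model.predict_proba(X_test)[:, 1]  # probability of the positive class
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "f1": f1_score(y_test, y_pred, zero_division=0),
        "auc_roc": roc_auc_score(y_test, y_score),
    }


# Synthetic binary data: 768 rows, 8 features, roughly 35% positive (assumed shape).
X, y = make_classification(n_samples=768, n_features=8, n_informative=5,
                           weights=[0.65, 0.35], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

stack = build_stack().fit(X_train, y_train)
metrics = evaluate(stack, X_test, y_test)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the data here is synthetic, the printed scores are not comparable to the figures reported above. To approximate the described setup more closely, one would load the actual dataset and, if the `xgboost` package is available, replace the `"gb"` entry with its scikit-learn-compatible classifier.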
Keywords