ENHANCED DIABETES PREDICTION VIA STACKED ENSEMBLE MACHINE LEARNING
- Authors
-
Dr. Caroline M. Walsh
Department of Political Science, University of Utah, Salt Lake City, UT, USA
Dr. Joshua L. Bennett
Department of Media and Communication, Temple University, Philadelphia, PA, USA
- Keywords:
- Diabetes prediction, Ensemble methods, Stacking, Machine learning
- Abstract
-
Diabetes mellitus is a widespread chronic metabolic disorder and a growing global health burden; accurate and timely diagnosis is needed to prevent severe long-term complications. This study investigates a stacked ensemble machine learning approach to diabetes prediction. Using the Pima Indian Diabetes Dataset, our methodology employs a multi-tiered stacking framework that combines the predictive outputs of diverse base learners, including Logistic Regression, K-Nearest Neighbors, Support Vector Machine, Decision Tree, and Extreme Gradient Boosting. A Logistic Regression model serves as the meta-learner, combining the base learners' predictions into a final prediction. Evaluated on standard classification metrics (accuracy, precision, recall, F1-score, and Area Under the Receiver Operating Characteristic Curve, AUC-ROC), the stacked ensemble outperformed each of its individual base learners. It achieved an accuracy of 81.3%, precision of 76.8%, recall of 68.2%, an F1-score of 72.2%, and an AUC-ROC of 0.871. These results indicate that ensemble learning can improve predictive accuracy and robustness in medical diagnostics. The model is a promising practical tool for healthcare professionals: it could support early identification of individuals at elevated risk of developing diabetes, enabling timely intervention and contributing to better patient management and public health outcomes.
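The stacking arrangement described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses scikit-learn's `StackingClassifier`, swaps in scikit-learn's `GradientBoostingClassifier` for Extreme Gradient Boosting so that no extra package is needed, and trains on synthetic data shaped like the Pima dataset (768 rows, 8 features) rather than on the dataset itself. All hyperparameters, the 80/20 split, and the 5-fold setting are placeholders, so the printed scores will not match the reported ones.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in for the Pima data (768 samples, 8 features,
# roughly one third positive). Replace with the real dataset to reproduce.
X, y = make_classification(n_samples=768, n_features=8, n_informative=5,
                           weights=[0.65, 0.35], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Level 0: diverse base learners. Distance- and margin-based models are
# scaled inside a pipeline so scaling is refit within each CV fold.
base_learners = [
    ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
    ("svm", make_pipeline(StandardScaler(), SVC())),
    ("tree", DecisionTreeClassifier(max_depth=4, random_state=0)),
    # ASSUMPTION: stand-in for Extreme Gradient Boosting (XGBoost).
    ("gb", GradientBoostingClassifier(random_state=0)),
]

# Level 1: a Logistic Regression meta-learner trained on out-of-fold
# predictions of the base learners (5-fold cross-validation).
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

y_pred = stack.predict(X_test)
y_prob = stack.predict_proba(X_test)[:, 1]  # positive-class probability
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, y_prob),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for the same rows the base learners were fitted on, keeps it from simply rewarding whichever base model overfits most. As a consistency check on the reported figures, F1 = 2PR / (P + R) with P = 0.768 and R = 0.682 gives approximately 0.722, which agrees with the stated F1-score of 72.2%.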
- Published
- 2024-12-13
- Section
- Articles
- License
-
All articles published by The Parthenon Frontiers and its associated journals are distributed under the terms of the Creative Commons Attribution (CC BY 4.0) International License unless otherwise stated.
Authors retain full copyright of their published work. By submitting their manuscript, authors agree to grant The Parthenon Frontiers a non-exclusive license to publish, archive, and distribute the article worldwide. Authors are free to:
- Share their article on personal websites, institutional repositories, or social media platforms.
- Reuse their content in future works, presentations, or educational materials, provided the original publication is properly cited.