ENHANCED DIABETES PREDICTION VIA STACKED ENSEMBLE MACHINE LEARNING

Dr. Caroline M. Walsh ¹, Dr. Joshua L. Bennett ²

¹ Department of Political Science, University of Utah, Salt Lake City, UT, USA

² Department of Media and Communication, Temple University, Philadelphia, PA, USA

Issue: Vol. 1 No. 01 (2024); Section: Articles; Published: 2024-12-13

Abstract

Diabetes mellitus is a chronic metabolic disorder and a growing global health problem; accurate and timely diagnosis helps prevent severe long-term complications. This study investigates a stacked ensemble machine learning approach to diabetes prediction. Using the Pima Indian Diabetes Dataset, the method builds a stacking framework that combines the predictions of five diverse base learners: Logistic Regression, K-Nearest Neighbors, Support Vector Machine, Decision Tree, and Extreme Gradient Boosting. A Logistic Regression model serves as the meta-learner, which integrates the predictions of the base learners into a final output. Evaluated on standard classification metrics (accuracy, precision, recall, F1-score, and Area Under the Receiver Operating Characteristic Curve, AUC-ROC), the stacked ensemble consistently outperformed its individual base learners. It achieved an accuracy of 81.3%, a precision of 76.8%, a recall of 68.2%, an F1-score of 72.2%, and an AUC-ROC of 0.871. These results indicate that ensemble learning can improve predictive robustness and accuracy in medical diagnostics. The model is a promising, practical tool for healthcare professionals: its deployment could support early identification of individuals at elevated risk of developing diabetes, enabling timely intervention and contributing to better patient management and public health outcomes.
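The reported metrics can be related to one another with a little arithmetic. The sketch below is illustrative only and is not taken from the article: the `ConfusionCounts` class and the helper function names are hypothetical, and the only numbers used are the precision, recall, and F1-score quoted in the abstract. It shows the standard definitions of accuracy, precision, and recall from binary confusion-matrix counts, and checks that the reported F1-score agrees with the harmonic mean of the reported precision and recall.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical helper; positive = diabetic)."""
    tp: int  # diabetic, predicted diabetic
    fp: int  # non-diabetic, predicted diabetic
    fn: int  # diabetic, predicted non-diabetic
    tn: int  # non-diabetic, predicted non-diabetic


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all cases classified correctly."""
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


def precision(c: ConfusionCounts) -> float:
    """Fraction of predicted positives that are truly positive."""
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def recall(c: ConfusionCounts) -> float:
    """Fraction of true positives that the model finds."""
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Values quoted in the abstract for the stacked ensemble.
REPORTED_PRECISION = 0.768
REPORTED_RECALL = 0.682
REPORTED_F1 = 0.722

implied_f1 = f1_score(REPORTED_PRECISION, REPORTED_RECALL)
print(f"F1 implied by reported precision and recall: {implied_f1:.3f}")
print(f"F1 reported in the abstract:                 {REPORTED_F1:.3f}")
```

Both lines print 0.722, so the reported F1-score is consistent with the reported precision and recall. AUC-ROC cannot be recovered from a single confusion matrix, since it depends on the ranking of predicted scores across all decision thresholds.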

Keywords

Diabetes prediction, Ensemble methods, Stacking, Machine learning

How to Cite

ENHANCED DIABETES PREDICTION VIA STACKED ENSEMBLE MACHINE LEARNING. (2024). European Journal of Emerging Cloud and Quantum Computing, 1(01), 16-32. https://parthenonfrontiers.com/index.php/ejecqc/article/view/91
