Performance Assessment of Machine Learning based Models for Diabetes Prediction

Ridhi Deo1, Suranjan Panigrahi

  • 1Purdue University

Details

12:30 - 14:30 | Thu 21 Nov | Upper Foyer Balcony | B1P-E.3

Session: Poster Session - Early Detection of Disease or Toxicity 2

Abstract

Diabetes is a major chronic disease that affects all age groups. Predictive modeling has previously been used to support a prevention-based approach to diabetes. This paper presents a machine learning-based approach for predicting diabetes occurrence in individuals from specific lifestyle and demographic factors. The publicly available continuous NHANES dataset was used. Statistical techniques were applied to address the small effective sample size caused by missing data and the class imbalance in the data. Predictive models were developed in MATLAB. The linear SVM model achieved the highest accuracy, 91% on test data, using 5-fold cross-validation and holdout validation.
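
As a rough illustration of the two validation schemes named in the abstract, the sketch below shows how a holdout test set and 5-fold cross-validation splits might be generated for an imbalanced binary label. It is not the authors' MATLAB code. The function names, the 20% test fraction, the synthetic labels, and the choice to stratify the splits by class are all assumptions made for this example, and the classifier itself (a linear SVM in the paper) is left out.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, test_fraction=0.2, seed=0):
    """Split indices into train/test lists, preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, test = [], []
    for class_indices in by_class.values():
        rng.shuffle(class_indices)
        n_test = round(len(class_indices) * test_fraction)
        test.extend(class_indices[:n_test])
        train.extend(class_indices[n_test:])
    return sorted(train), sorted(test)


def stratified_kfold(labels, indices, k=5, seed=0):
    """Yield (train, validation) index lists for k stratified folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)
    folds = [[] for _ in range(k)]
    for class_indices in by_class.values():
        rng.shuffle(class_indices)
        # Deal each class round-robin so every fold gets a similar class mix.
        for position, idx in enumerate(class_indices):
            folds[position % k].append(idx)
    for i in range(k):
        validation = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, validation


# Synthetic, imbalanced labels (1 = diabetic, 0 = non-diabetic); not NHANES data.
labels = [1] * 20 + [0] * 80

# Hold out a test set first, then cross-validate on the remaining data only.
train_idx, test_idx = stratified_holdout(labels, test_fraction=0.2)
folds = list(stratified_kfold(labels, train_idx, k=5))
```

In a full pipeline, a model would be fitted on each fold's training indices and scored on its validation indices, with the held-out test indices used only once for the final accuracy estimate.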