Multi-Model ML Pipeline with Automated CI/CD
Comparison of four machine learning models on a diabetes progression prediction task
Simple baseline model with linear assumptions
L2 regularization to prevent overfitting
Ensemble of 100 decision trees
Gradient boosting, often the top performer
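A comparison of the four model types described above might look like the sketch below. The estimator classes, the `alpha=1.0` and `random_state=42` settings, the 5-fold cross-validation, and the R^2 metric are all assumptions made for illustration; the pipeline's actual models and evaluation may differ. Only the 100-tree ensemble size comes from the description.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

X, y = load_diabetes(return_X_y=True)

# Assumed estimator choices; names and hyperparameters are illustrative only.
models = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),  # L2-regularized linear model
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=42),
    "gradient_boosting": GradientBoostingRegressor(random_state=42),
}

# Mean R^2 over 5 cross-validation folds for each model.
results = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    results[name] = float(scores.mean())

# Print models from best to worst mean score.
for name, score in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>18}: mean R^2 = {score:.3f}")
```

Ranking by cross-validated score rather than a single train/test split makes the comparison less sensitive to one lucky partition, which matters on a dataset this small.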
Every push to the main branch automatically:
Diabetes Dataset (scikit-learn)
Ten baseline variables: age, sex, BMI, average blood pressure, and six blood serum measurements
Target: Quantitative measure of disease progression one year after baseline
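As a quick check of the dataset described above, the following minimal sketch loads it with scikit-learn's `load_diabetes` and inspects its shape and feature names. The variable names are illustrative.

```python
from sklearn.datasets import load_diabetes

# Load the bundled diabetes dataset as a Bunch with data, target, and metadata.
data = load_diabetes()
X, y = data.data, data.target

print(X.shape)             # (n_samples, n_features)
print(y.shape)             # one progression value per sample
print(data.feature_names)  # names of the ten baseline variables
```

By default the feature columns are returned already mean-centered and scaled, so the raw age or BMI values are not directly visible in `X`.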