Diabetes Prediction
Introduction
This is a Streamlit web application that predicts the likelihood of diabetes based on user input features. The app allows users to input health metrics such as glucose levels, BMI, and age to receive predictions from multiple machine learning models.
Features
- User Input: Collects user health metrics through an interactive interface.
- Data Processing: Runs the inputs through a pre-trained preprocessing pipeline before passing them to the pre-trained models.
- Prediction Results: Uses several machine learning models to predict diabetes outcomes:
  - Logistic Regression
  - Random Forest Classifier
  - AdaBoost Classifier
  - Gradient Boosting Classifier
  - XGBoost Classifier
- Streamlit Interface: Easy-to-use interface for inputting health metrics and viewing predictions.
Getting Started
Prerequisites
To run this project locally, you'll need:
- Python 3.7+
- pip (Python package manager)
Installation
- Create a virtual environment:
- Install the required Python packages:
Run the App
To run the Streamlit app, use the following command:
---OR---
Run the main.py script. This first trains the models and then starts the Streamlit app.
This will launch the app in your web browser.
Directory Structure
How It Works
- User Input: The app prompts the user to enter health metrics such as pregnancies, glucose, blood pressure, etc.
- Data Preprocessing: The input features are processed using the pre-trained preprocessing pipeline.
- Model Prediction: The processed features are passed to each machine learning model, which predicts whether the user is diabetic.
- Result Display: Each model's prediction is shown to the user as either "Diabetic" or "Not Diabetic."
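The four steps above can be sketched in plain Python. This is a minimal illustration and not the app's actual code: `FEATURE_ORDER`, `to_feature_row`, `label_predictions`, and the two threshold "models" are hypothetical stand-ins, and the feature list only covers the metrics named in this README. In the real app, the preprocessing pipeline and trained models would take the place of the stand-in rules.

```python
from typing import Callable, Dict, List

# Hypothetical feature order; the real app's columns may differ.
FEATURE_ORDER = ["pregnancies", "glucose", "blood_pressure", "bmi", "age"]


def to_feature_row(user_input: Dict[str, float]) -> List[float]:
    """Validate the input and return the values in a fixed column order."""
    missing = [name for name in FEATURE_ORDER if name not in user_input]
    if missing:
        raise ValueError(f"Missing metrics: {missing}")
    row = [float(user_input[name]) for name in FEATURE_ORDER]
    if any(value < 0 for value in row):
        raise ValueError("Health metrics must be non-negative")
    return row


def label_predictions(
    row: List[float], models: Dict[str, Callable[[List[float]], int]]
) -> Dict[str, str]:
    """Run every model on the row and map class 1/0 to a display label."""
    return {
        name: "Diabetic" if predict(row) == 1 else "Not Diabetic"
        for name, predict in models.items()
    }


# Stand-in "models": simple threshold rules that only make the sketch
# runnable. They are assumptions for illustration, not medical criteria.
demo_models = {
    "Glucose rule": lambda row: 1 if row[1] >= 140 else 0,
    "BMI rule": lambda row: 1 if row[3] >= 30 else 0,
}

row = to_feature_row(
    {"pregnancies": 2, "glucose": 150, "blood_pressure": 70, "bmi": 24.5, "age": 35}
)
results = label_predictions(row, demo_models)
for name, label in results.items():
    print(f"{name}: {label}")
```

Keeping one result per model, keyed by model name, mirrors the way the app reports each model's prediction separately instead of merging them into a single verdict.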
Models Used
- Logistic Regression: A linear model for binary classification tasks.
- Random Forest Classifier: An ensemble method that combines the votes of many decision trees, each trained on a random sample of the data, to reduce overfitting.
- AdaBoost Classifier: An ensemble technique that trains weak classifiers in sequence, giving more weight to misclassified samples, and combines them into a weighted vote.
- Gradient Boosting Classifier: Builds trees sequentially, with each new tree fitted to the errors of the ensemble so far.
- XGBoost Classifier: An optimized gradient boosting algorithm designed for speed and performance.
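To compare these model families side by side, the sketch below trains the four scikit-learn classifiers on synthetic data and reports test accuracy. The dataset from `make_classification`, the hyperparameters, and the `models` dictionary are assumptions for illustration and are not the project's training script. XGBoost is left out to keep the example limited to scikit-learn; `xgboost.XGBClassifier` exposes the same `fit`/`predict` interface and could be added to the dictionary in the same way.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (8 numeric features, binary target).
# This is NOT the project's dataset.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

models = {
    # Scaling helps the linear model converge; tree ensembles do not need it.
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # mean accuracy on held-out data
    print(f"{name}: test accuracy = {scores[name]:.3f}")
```

Accuracy on synthetic data says nothing about performance on real patient records; the loop only shows that all of these models can be trained and evaluated through one shared interface.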
Example
Input:
Output:
Screenshots
App Interface
Prediction Results
Feature Importance Heatmap
Dependencies
All required dependencies are listed in the requirements.txt file. You can install them with pip install -r requirements.txt.
Key dependencies:
streamlit
pandas
numpy
scikit-learn
imbalanced-learn
scipy
xgboost
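After installing, one way to confirm that the key dependencies are present is to query their installed versions from Python. This is a small sketch using the standard-library `importlib.metadata` module, which requires Python 3.8 or newer; `KEY_DEPENDENCIES` and `installed_versions` are hypothetical names, and the list simply repeats the packages above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as they appear in the list above.
KEY_DEPENDENCIES = [
    "streamlit",
    "pandas",
    "numpy",
    "scikit-learn",
    "imbalanced-learn",
    "scipy",
    "xgboost",
]


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


versions = installed_versions(KEY_DEPENDENCIES)
for name, ver in versions.items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```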
Google Colab Notebook
For a more interactive experience, you can also run the Diabetes Prediction models in a Google Colab notebook. NOTEBOOK LINK
This notebook allows you to experiment with the code and datasets without setting up a local environment.