Student Performance Indicator
End to End Machine Learning Project
Project Overview
Education is a critical factor in shaping an individual's future. The primary objective of this project is to build a model that predicts students' academic performance. The project explores factors that may affect performance, including demographic information, family background, and school-related attributes. Analyzing this data can reveal potential areas for improvement in educational systems.
Dataset Overview
The dataset consists of the following columns:
- gender: Gender of the student (categorical: 'male', 'female')
- race/ethnicity: Ethnic background of the student (categorical: five levels)
- parental level of education: Highest level of education attained by the student's parents (categorical: 'high school', 'some college', 'associate's degree', 'bachelor's degree', 'master's degree')
- lunch: Type of lunch received (categorical: 'standard', 'free/reduced')
- test preparation course: Whether the student completed a test preparation course (categorical: 'none', 'completed')
- math score: Score in mathematics (numerical: 0-100)
- reading score: Score in reading (numerical: 0-100)
- writing score: Score in writing (numerical: 0-100)
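The column list above can be turned into a quick sanity check before any analysis. The sketch below is only an illustration: the `check_schema` helper, the tiny `sample` frame, and its placeholder values are assumptions made for this example and are not taken from the project's code or data.

```python
import pandas as pd

# Column names as listed above, split into categorical and numerical groups.
CATEGORICAL = [
    "gender",
    "race/ethnicity",
    "parental level of education",
    "lunch",
    "test preparation course",
]
NUMERICAL = ["math score", "reading score", "writing score"]


def check_schema(df):
    """Return a list of problems found; an empty list means the frame looks fine."""
    problems = []
    for col in CATEGORICAL + NUMERICAL:
        if col not in df.columns:
            problems.append(f"missing column: {col}")
    for col in NUMERICAL:
        # between() is False for NaN, so missing scores are reported here too.
        if col in df.columns and not df[col].between(0, 100).all():
            problems.append(f"out-of-range values in: {col}")
    return problems


# Two made-up rows for illustration only (NOT real data from the dataset).
sample = pd.DataFrame(
    {
        "gender": ["female", "male"],
        "race/ethnicity": ["level 1", "level 2"],  # placeholder labels
        "parental level of education": ["high school", "some college"],
        "lunch": ["standard", "free/reduced"],
        "test preparation course": ["none", "completed"],
        "math score": [72, 65],
        "reading score": [80, 60],
        "writing score": [78, 58],
    }
)

print(check_schema(sample))
```

With the real data, the same helper could be applied to the frame returned by `pandas.read_csv`; the file path depends on where the dataset is stored in the repository.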
Key Features
- Data Cleaning and Preprocessing: Handling missing values, correcting data types, and preparing the dataset for analysis.
- Exploratory Data Analysis (EDA): Visualizing key variables to understand their distributions and relationships.
- Statistical Analysis: Identifying factors that correlate significantly with student performance using correlation analysis and regression models.
- Visualization: Plotting relationships between variables with charts such as bar charts and scatter plots.
- Model Training: Training several machine learning models and comparing them to find the best performer.
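The preprocessing and model-comparison steps listed above could look roughly like the following sketch. Several things here are assumptions, not the project's actual choices: the target column (`math score`), the two candidate models, the 5-fold cross-validated R² metric, and the `make_synthetic` helper, which generates random stand-in data so the example runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def make_synthetic(n=200, seed=0):
    """Generate random stand-in data (NOT the real dataset) with a few of the columns."""
    rng = np.random.default_rng(seed)
    reading = rng.integers(40, 101, size=n)
    writing = np.clip(reading + rng.integers(-10, 11, size=n), 0, 100)
    math = np.clip(0.5 * reading + 0.4 * writing + rng.normal(0, 5, size=n), 0, 100)
    return pd.DataFrame(
        {
            "gender": rng.choice(["male", "female"], size=n),
            "lunch": rng.choice(["standard", "free/reduced"], size=n),
            "test preparation course": rng.choice(["none", "completed"], size=n),
            "reading score": reading,
            "writing score": writing,
            "math score": math,
        }
    )


def compare_models(df, target="math score"):
    """Return the mean cross-validated R^2 for each candidate model."""
    X = df.drop(columns=[target])
    y = df[target]
    num_cols = X.select_dtypes(include="number").columns.tolist()
    cat_cols = [c for c in X.columns if c not in num_cols]

    # One-hot encode categorical columns and standardize numerical ones.
    preprocess = ColumnTransformer(
        [
            ("cat", OneHotEncoder(handle_unknown="ignore"), cat_cols),
            ("num", StandardScaler(), num_cols),
        ]
    )
    models = {
        "linear_regression": LinearRegression(),
        "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
    }
    scores = {}
    for name, model in models.items():
        pipe = Pipeline([("preprocess", preprocess), ("model", model)])
        scores[name] = cross_val_score(pipe, X, y, cv=5, scoring="r2").mean()
    return scores


scores = compare_models(make_synthetic())
best = max(scores, key=scores.get)
print(scores, "best:", best)
```

Wrapping the preprocessing and the model in one `Pipeline` means the encoder and scaler are refit inside every cross-validation fold, which avoids leaking information from the validation split into training.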
Technologies Used
- Python
- pandas
- NumPy
- seaborn
- matplotlib
- scikit-learn
- Flask
- Jupyter Notebook
Results
- The project provides insights into which factors have a significant impact on student performance. These findings can be utilized to tailor educational strategies and improve academic outcomes.
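One simple way to look for such factors is to combine the correlation analysis mentioned under Key Features with per-group averages. The sketch below assumes this approach; the helper names and the six made-up rows are illustrative only and say nothing about the project's actual findings.

```python
import pandas as pd

SCORE_COLUMNS = ["math score", "reading score", "writing score"]


def score_correlations(df):
    """Pearson correlation matrix of the three score columns."""
    return df[SCORE_COLUMNS].corr()


def mean_score_by(df, factor, score):
    """Average of one score column for each level of a categorical factor."""
    return df.groupby(factor)[score].mean()


# Made-up rows for illustration only (NOT real data or real results).
df = pd.DataFrame(
    {
        "test preparation course": [
            "none", "completed", "none", "completed", "none", "completed",
        ],
        "math score": [50, 70, 55, 80, 60, 75],
        "reading score": [52, 72, 58, 85, 61, 78],
        "writing score": [50, 74, 55, 86, 60, 80],
    }
)

print(score_correlations(df))
print(mean_score_by(df, "test preparation course", "math score"))
```

A difference in group means is only descriptive; whether it is statistically significant would need a separate test or a regression model.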
Installation Steps
Follow these steps to install and set up the project directly from the GitHub repository:
- Clone the Repository
- Create a Virtual Environment (optional but recommended)
- Activate the Virtual Environment (optional)
- Install Dependencies
- Run the Project
  - Start the application by running the following command:
    python app.py
- Access the Project
  - Open a web browser at the address the application prints on startup. Flask's development server listens on http://127.0.0.1:5000 by default, unless app.py sets a different host or port.
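For readers unfamiliar with Flask, the sketch below shows the general shape a prediction endpoint can take. It is not the project's `app.py`: the `/predict` route, the JSON field names, and the `predict_math_score` stub (which just averages two scores in place of a trained model) are all hypothetical.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

REQUIRED_FIELDS = ("reading score", "writing score")


def predict_math_score(features):
    # Hypothetical stand-in for a trained model: averages the two other scores.
    return (float(features["reading score"]) + float(features["writing score"])) / 2


@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(silent=True) or {}
    missing = [key for key in REQUIRED_FIELDS if key not in payload]
    if missing:
        return jsonify({"error": f"missing fields: {missing}"}), 400
    return jsonify({"predicted math score": predict_math_score(payload)})


# Exercise the route with Flask's built-in test client instead of starting a
# server. A real app.py would typically call app.run() under a __main__ guard.
with app.test_client() as client:
    response = client.post(
        "/predict", json={"reading score": 70, "writing score": 80}
    )
    print(response.status_code, response.get_json())
```

Using the test client keeps the example self-contained; once a server is running, the same request could be sent from a browser form or any HTTP client.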