Email Spam Classifier

This project builds a machine learning model that classifies emails as spam or not spam. The model is trained on a dataset from Kaggle and implemented in a Jupyter Notebook using Logistic Regression. A Flask API exposes the trained model for classifying emails. The API can be run locally by starting the Flask server, or accessed through the hosted version at https://bilalm14.pythonanywhere.com/.

You can explore the frontend application and test its functionality by visiting the hosted site here. The source code for the frontend is available in a separate repository, which can be found here.

Dataset

The dataset used for training the model is sourced from Kaggle and can be found in datasets/emails.csv. The model, trained on this dataset, achieves an accuracy of 98.34% on the test data.
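
As a rough illustration of how a classifier of this kind can be trained and scored, the sketch below fits a TF-IDF feature extractor and a logistic regression model with scikit-learn. The inline messages, the 0/1 label encoding, and the choice of `TfidfVectorizer` are assumptions made for illustration; the notebook's actual preprocessing and the column layout of datasets/emails.csv may differ, and the toy data here says nothing about the accuracy reported above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the Kaggle data: 1 = spam, 0 = not spam.
messages = [
    "win a free prize now",
    "click here to claim your reward",
    "free money waiting for you",
    "urgent offer click now",
    "meeting moved to 3pm",
    "please review the attached report",
    "lunch tomorrow at noon",
    "notes from the call today",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Hold out a quarter of the data, keeping both classes in each split.
X_train, X_test, y_train, y_test = train_test_split(
    messages, labels, test_size=0.25, random_state=0, stratify=labels
)

# The feature extractor is fitted on training text only, then reused for test text.
vectorizer = TfidfVectorizer()
X_train_features = vectorizer.fit_transform(X_train)
X_test_features = vectorizer.transform(X_test)

model = LogisticRegression()
model.fit(X_train_features, y_train)

test_accuracy = accuracy_score(y_test, model.predict(X_test_features))
print(f"Test accuracy: {test_accuracy:.2%}")

# Classify a new message with the same fitted vectorizer and model.
input_mail = ["click here to win free prize"]
prediction = model.predict(vectorizer.transform(input_mail))
print("spam" if prediction[0] == 1 else "not spam")
```

Fitting the vectorizer on the training split only keeps test messages from influencing the vocabulary, which is why the same fitted feature extractor has to be saved alongside the model.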

Technologies Used

Python: The core programming language used to develop the machine learning model and the Flask API.
Jupyter Notebook: Used for implementing and testing the machine learning model.
Flask: A lightweight WSGI web application framework used to create the API.
scikit-learn: The machine learning library used to build and train the logistic regression model.
matplotlib: A plotting library used for visualizing data and model performance.
pandas: A data manipulation and analysis library used for handling datasets, including loading, cleaning, and preprocessing data.
numpy: A library for numerical computations, used for handling arrays and performing mathematical operations.

Project Structure

EmailSpamClassifier.ipynb: Jupyter Notebook containing the implementation of the spam classifier model.
models/: Directory where the trained model and feature extractor are saved.
datasets/: Directory where the datasets are stored.
app.py: Flask API for interfacing with the trained model.
requirements.txt: Python package dependencies.

API Interface

Access the API at https://bilalm14.pythonanywhere.com/.

API Endpoints:

GET /predict: Classify an email as spam or not spam.

Request Body:

```
{
  "message": "Your email content here."
}
```

Response:

```
{
  "message": "Your email content here.",
  "prediction": "spam or not spam"
}
```

Example Request:

Request: https://bilalm14.pythonanywhere.com/predict?message=click%20here%20to%20win%20free%20prize
```
{
  "message": "click here to win free prize"
}
```

Response:

```
{
  "message": "click here to win free prize",
  "prediction": "spam"
}
```
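
The example request can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_predict_url` and `classify` are hypothetical, and it assumes the endpoint reads `message` from the query string as in the example URL.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

HOSTED_ROOT = "https://bilalm14.pythonanywhere.com"


def build_predict_url(message, root=HOSTED_ROOT):
    """Return the /predict URL with the message percent-encoded in the query string."""
    # quote_via=quote encodes spaces as %20, matching the example request.
    query = urlencode({"message": message}, quote_via=quote)
    return f"{root.rstrip('/')}/predict?{query}"


def classify(message, root=HOSTED_ROOT, timeout=10):
    """Send the GET request and return the decoded JSON response as a dict."""
    with urlopen(build_predict_url(message, root), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


print(build_predict_url("click here to win free prize"))
```

Calling `classify("click here to win free prize")` performs the actual request and should return a dictionary with `message` and `prediction` keys if the service responds as documented. Passing a different `root` points the same helper at a locally running server.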

Running Locally

Dependencies

Clone the repository.

```
git clone https://github.com/BilalM04/email-spam-classifier.git
```

Navigate to the project directory.
```
cd email-spam-classifier
```
Ensure you have Python and Jupyter Notebook installed.
Install project dependencies.
```
pip install -r requirements.txt
```

Jupyter Notebook

Launch the Jupyter Notebook server by running the following command.
```
jupyter notebook
```
Open EmailSpamClassifier.ipynb in the Jupyter Notebook server.
Edit input_mail to test your own input.
```
input_mail = [""]
```
Execute the code to see the result.

Flask API

Start the Flask server.
```
python app.py
```
Use the same endpoints as described in the API Interface section. For local use, the URL root is http://127.0.0.1:####/, where #### is the port number printed in the terminal when the server starts (Flask defaults to 5000 unless app.py sets another port).
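
For orientation, a Flask route with the documented request and response shape might look like the sketch below. It is not the project's app.py: `classify_message` is a hypothetical keyword-based stand-in for the trained model, and the 400 response for a missing parameter is an assumption rather than documented behaviour. The route is exercised in-process with Flask's test client, so no server or port is needed.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def classify_message(message):
    """Hypothetical stand-in for the trained model: flags a few spam-like words."""
    spam_words = {"free", "prize", "win"}
    return "spam" if spam_words & set(message.lower().split()) else "not spam"


@app.route("/predict", methods=["GET"])
def predict():
    # The message arrives as a query parameter, e.g. /predict?message=hello
    message = request.args.get("message", "")
    if not message:
        # Assumed error handling; the real API may respond differently.
        return jsonify({"error": "missing 'message' query parameter"}), 400
    return jsonify({"message": message, "prediction": classify_message(message)})


# Call the route without starting a server.
with app.test_client() as client:
    response = client.get(
        "/predict", query_string={"message": "click here to win free prize"}
    )
    print(response.status_code, response.get_json())
```

In the real application the stand-in function would be replaced by loading the saved model and feature extractor from models/ and calling them on the incoming message.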

Frontend

The frontend for this project is a web application built using React.js and styled with CSS. It allows users to input email messages and receive a classification of whether the email is spam or not. The frontend communicates with this backend API to utilize the machine learning model for classification. You can explore the frontend application and test its functionality by visiting the hosted site here. The source code for the frontend is available in a separate repository, which can be found here.
