tweet-disaster-detection

Fine-tuned BERT and scikit-learn models for real-time classification of disaster-related tweets, built with TensorFlow, Keras, and Hugging Face Transformers.


🌩️ Tweet Disaster Detection

📘 Introduction

This repository hosts the Tweet Disaster Detection system, an NLP solution designed to identify disaster-related tweets in real time. With the explosion of social media usage, rapidly detecting potential disaster events from user-generated content is crucial for timely intervention and response.


🌟 Libraries and Frameworks

The project leverages several powerful libraries and tools, including:

  • TensorFlow and Keras: Used for implementing and fine-tuning the BERT model.

  • Huggingface Transformers: Provides pre-trained BERT models and utilities for tokenization, model fine-tuning, and other NLP tasks.

  • scikit-learn: Used for traditional machine learning tasks, including implementing the Naive Bayes model and performance evaluation metrics.

  • Matplotlib: Utilized for plotting learning curves, confusion matrices, and other visualizations that help in analyzing model performance.

  • Pandas: Facilitates data manipulation and analysis, making it easier to preprocess the tweet data and prepare it for model training.
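
As a sketch of how the scikit-learn side of the stack fits together, a minimal Naive Bayes baseline might look like the following. The TF-IDF vectorizer, its n-gram range, and the toy tweets are illustrative assumptions, not the repository's exact settings:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy data standing in for the real tweet dataset
tweets = [
    "Forest fire near La Ronge Sask. Canada",
    "Residents asked to shelter in place, wildfire spreading",
    "I love the fire in this new mixtape",
    "My presentation today was a total disaster lol",
]
labels = [1, 1, 0, 0]  # 1 = disaster-related, 0 = not

# TF-IDF features + multinomial Naive Bayes, a common text-classification baseline
nb_baseline = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), MultinomialNB())
nb_baseline.fit(tweets, labels)

print(nb_baseline.predict(["Huge wildfire spreading near town"]))
```

The pipeline keeps vectorization and classification in one object, so the same `predict` call works on raw strings at inference time.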

💡 Project Overview

In the vast sea of tweets generated every second, our system stands out by efficiently distinguishing between tweets that indicate real disasters and those that don't. Leveraging cutting-edge machine learning algorithms and deep learning models, our approach ensures high precision and accuracy in disaster detection.

🧠 Model Fine-Tuning and Training

Our primary model is a fine-tuned version of BERT (Bidirectional Encoder Representations from Transformers), a state-of-the-art transformer model originally developed by Google. BERT's ability to understand context and disambiguate meaning in text makes it particularly suited for this task.

Model Fine-Tuning Process:

  1. Preprocessing:

    • Tweets are tokenized using BERT's tokenizer, converting the text into a format that BERT can process (token IDs, attention masks, and segment IDs).
  2. Model Architecture:

    • The BERT model is fine-tuned with an additional dense layer to classify tweets as either disaster-related or not. The architecture captures the complex semantics of tweets, ensuring robust classification performance.

```python
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# max_seq_length and bert_layer are defined elsewhere in the model class;
# BERT expects three inputs: token IDs, attention mask, and segment IDs.
input_word_ids = Input(shape=(max_seq_length,), dtype=tf.int32, name='input_word_ids')
input_mask = Input(shape=(max_seq_length,), dtype=tf.int32, name='input_mask')
segment_ids = Input(shape=(max_seq_length,), dtype=tf.int32, name='segment_ids')

pooled_output, sequence_output = bert_layer([input_word_ids, input_mask, segment_ids])
clf_output = sequence_output[:, 0, :]  # hidden state of the [CLS] token
out = Dense(1, activation='sigmoid')(clf_output)  # binary disaster probability
model = Model(inputs=[input_word_ids, input_mask, segment_ids], outputs=out)
```

  3. Training Strategy:

    • The model is trained using the SGD optimizer with a learning rate of 0.0001 and momentum of 0.8, ensuring convergence and stability during the fine-tuning process. Multiple epochs are run, and key metrics like accuracy, precision, recall, and F1-score are tracked to monitor performance.
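
The optimizer configuration above can be sketched in Keras as follows. The tiny dense model here is a stand-in for the BERT classifier so that the compile step can be shown end to end; the loss choice and metric objects are illustrative assumptions:

```python
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Stand-in for the fine-tuned BERT classifier: a small dense model over
# precomputed features, used only to demonstrate the training setup.
inputs = Input(shape=(16,), dtype=tf.float32)
out = Dense(1, activation='sigmoid')(inputs)
model = Model(inputs=inputs, outputs=out)

# SGD with the learning rate and momentum quoted above
optimizer = tf.keras.optimizers.SGD(learning_rate=0.0001, momentum=0.8)
model.compile(
    optimizer=optimizer,
    loss='binary_crossentropy',
    metrics=['accuracy', tf.keras.metrics.Precision(), tf.keras.metrics.Recall()],
)

# model.fit(x_train, y_train, epochs=3, validation_split=0.2)
```

Tracking precision and recall alongside accuracy during `fit` is what makes the per-epoch learning curves discussed below possible.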

🚀 Results

| Model       | Precision | Recall | Accuracy | F1-Score |
|-------------|-----------|--------|----------|----------|
| BERT        | 86%       | 84%    | 85%      | 86%      |
| Naive Bayes | 82%       | 70%    | 56%      | 75%      |
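
The metrics in the table can be reproduced with scikit-learn's evaluation utilities; the labels below are toy values for illustration only:

```python
from sklearn.metrics import precision_score, recall_score, accuracy_score, f1_score

# Toy ground-truth and predicted labels (1 = disaster, 0 = not)
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

print(f"Precision: {precision_score(y_true, y_pred):.2f}")  # 0.80
print(f"Recall:    {recall_score(y_true, y_pred):.2f}")     # 0.80
print(f"Accuracy:  {accuracy_score(y_true, y_pred):.2f}")   # 0.75
print(f"F1-Score:  {f1_score(y_true, y_pred):.2f}")         # 0.80
```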

📊 Visualizations and Performance Metrics

Throughout the training process, several visualizations were generated:

  • Learning Curves: These illustrate the model's accuracy, precision, recall, and F1-score across epochs, offering insights into its learning behavior.
  • Confusion Matrix: A detailed confusion matrix for the BERT model highlights its performance in correctly classifying disaster and non-disaster tweets.
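
A confusion matrix like the one described above can be plotted with scikit-learn and Matplotlib; the labels and output filename here are illustrative assumptions:

```python
import matplotlib
matplotlib.use('Agg')  # render off-screen, no display needed
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

# Toy labels standing in for a held-out tweet set (1 = disaster, 0 = not)
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

cm = confusion_matrix(y_true, y_pred)
disp = ConfusionMatrixDisplay(cm, display_labels=['not disaster', 'disaster'])
disp.plot(cmap='Blues')
plt.title('Confusion matrix')
plt.savefig('confusion_matrix.png')
```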

🌍 Use Cases

Our model has several real-world applications that can make a significant impact:

  • Preventing Accidents: By identifying tweets that signal real disasters, our system can alert first responders and relevant authorities, potentially preventing accidents or minimizing damage.

  • Early Warning Systems: The model can provide early warnings of disasters, giving people time to prepare or evacuate to safety.

  • Accurate Disaster Reporting: By filtering out false or irrelevant tweets, our system can improve the accuracy of disaster reporting, ensuring that people receive trustworthy information during crises.

🎯 Conclusion

The Tweet Disaster Detection system demonstrates the powerful application of modern NLP techniques in critical real-world scenarios. With its high accuracy and precision, especially using the fine-tuned BERT model, this project shows great potential in contributing to disaster management and response strategies globally.

We are committed to further refining this system and exploring its applications across different domains to make the world a safer place.

🛠️ How to Use

  1. Clone the Repository:

    git clone https://github.com/deepmancer/tweet-disaster-detection.git
    cd tweet-disaster-detection
    
  2. Install Dependencies:

    pip install -r requirements.txt
    
  3. Run the Jupyter Notebook:

    • Open Advanced_Data_Science_Capstone.ipynb to explore the code and see the results.
  4. Predict Disaster Tweets:

    • Use the trained models to predict new tweets by following the instructions in the notebook.
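
However the trained model is loaded, the final step is turning its sigmoid outputs into class labels. A minimal sketch of that thresholding, where the 0.5 cutoff and the helper name are assumptions rather than what the notebook necessarily uses:

```python
import numpy as np

def probabilities_to_labels(probs, threshold=0.5):
    """Map sigmoid outputs to 0/1 disaster labels (hypothetical helper)."""
    probs = np.asarray(probs)
    return (probs >= threshold).astype(int)

# Example sigmoid outputs, as returned by model.predict(...)
probs = [0.91, 0.12, 0.55, 0.49]
print(probabilities_to_labels(probs))  # [1 0 1 0]
```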