This project focuses on the classification of emotions from text using machine learning and deep learning techniques. The model, tokenizer, and preprocessing steps are provided, enabling users to directly utilize the trained model without retraining.
The primary objective of this project is to detect emotions such as joy, sadness, anger, and more from textual input. A deep learning approach, specifically an LSTM (Long Short-Term Memory) model, is employed for classification. The project also includes a pretrained model and tokenizer for quick deployment.
Pretrained Resources:
- Model Files: emotion_recognizer.h5 and emotion_recognizer.keras
- Tokenizer: tokenizer.pkl
- Usage Script: text_emo_detection.py
The dataset consists of labeled text data representing various emotions. Preprocessing and feature engineering steps ensure compatibility with the LSTM-based deep learning pipeline. You can access the dataset here: Emotions Dataset for NLP.
Libraries Used: numpy, string (part of the Python standard library, so no installation is needed), nltk, scikit-learn, tensorflow, matplotlib, and seaborn. Install the third-party packages with:
pip install numpy nltk scikit-learn tensorflow matplotlib seaborn
- Text Preprocessing:
  - Remove punctuation and special characters using the string library.
  - Tokenize sentences using nltk.tokenize.word_tokenize.
  - Remove stopwords with the help of nltk.corpus.stopwords.
  - Convert tokens to lowercase.
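The preprocessing steps can be sketched as a single function. This is a minimal illustration, not the project's actual code: the function name `preprocess` and its parameters are hypothetical, and the tokenizer and stopword set are passed in so the sketch runs without NLTK data. It lowercases before filtering so that capitalized stopwords are also caught, which may differ from the order used in the repository.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(text, tokenize=str.split, stop_words=frozenset()):
    """Clean one sentence and return its list of tokens.

    `tokenize` and `stop_words` are hypothetical parameters; the project
    itself uses NLTK for both (see the note after this block).
    """
    cleaned = text.translate(_PUNCT_TABLE)      # strip punctuation
    tokens = tokenize(cleaned)                  # split into words
    lowered = [tok.lower() for tok in tokens]   # normalize case
    return [tok for tok in lowered if tok not in stop_words]
```

To follow the steps listed above more closely, pass `nltk.tokenize.word_tokenize` as `tokenize` and `set(nltk.corpus.stopwords.words("english"))` as `stop_words`; both require the corresponding NLTK data to be downloaded first.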
- Feature Engineering:
  - Use Tokenizer from TensorFlow to convert text into sequences.
  - Pad sequences using pad_sequences for uniform input length.
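To make these two steps concrete, here is a plain-Python sketch of the underlying idea: map each word to an integer, then pad every sequence to the same length. It is not the Keras implementation, and the helper names (`build_word_index`, `texts_to_sequences`, `pad`) and the `<OOV>` token are assumptions chosen for illustration.

```python
from collections import Counter


def build_word_index(texts, oov_token="<OOV>"):
    """Assign an integer to each word, most frequent first.

    Index 0 is left free for padding and index 1 is the out-of-vocabulary token.
    """
    counts = Counter(word for text in texts for word in text.split())
    word_index = {oov_token: 1}
    for rank, (word, _) in enumerate(counts.most_common(), start=2):
        word_index[word] = rank
    return word_index


def texts_to_sequences(texts, word_index, oov_token="<OOV>"):
    """Replace every word with its index; unknown words get the OOV index."""
    oov_id = word_index[oov_token]
    return [[word_index.get(w, oov_id) for w in text.split()] for text in texts]


def pad(sequences, max_len, value=0):
    """Truncate or right-pad each sequence to exactly max_len items."""
    return [seq[:max_len] + [value] * (max_len - len(seq)) for seq in sequences]


word_index = build_word_index(["i feel happy", "i feel so sad today"])
padded = pad(texts_to_sequences(["i feel great"], word_index), max_len=5)
print(padded)
```

Note that this sketch pads at the end of each sequence, whereas Keras `pad_sequences` pads at the front by default and exposes a `padding` argument to change that.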
- Model Architecture:
  - Build a Sequential model using TensorFlow's Keras API.
  - Include an Embedding layer for word embeddings.
  - Add an LSTM layer for capturing temporal relationships.
  - Use Dense layers for classification.
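A model with this layer order might look like the sketch below. The function name `build_model` and every size in it (vocabulary size, sequence length, embedding and LSTM widths, number of classes) are placeholder assumptions, not the values used to train the pretrained model.

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Embedding, Input, LSTM


def build_model(vocab_size=10000, max_len=50, num_classes=6):
    """Build an Embedding -> LSTM -> Dense classifier.

    All default sizes are hypothetical and should be replaced with the
    values that match the tokenizer and dataset in use.
    """
    model = Sequential([
        Input(shape=(max_len,)),                        # padded integer sequences
        Embedding(input_dim=vocab_size, output_dim=64), # word index -> dense vector
        LSTM(64),                                       # summarizes the sequence
        Dense(32, activation="relu"),
        Dense(num_classes, activation="softmax"),       # one probability per emotion
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # assumes integer class labels
        metrics=["accuracy"],
    )
    return model
```

Calling `build_model().summary()` prints the layer shapes, which is a quick way to check that the input length matches the padded sequences.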
- Evaluation:
  - Use classification_report and confusion_matrix from sklearn to analyze performance.
  - Visualize results using matplotlib and seaborn.
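As a small illustration of the evaluation step, the snippet below runs both scikit-learn functions on a handful of made-up labels. The label lists are invented for the example and are not results from this project.

```python
from sklearn.metrics import classification_report, confusion_matrix

# Made-up labels for illustration only; not output from the trained model.
labels = ["anger", "joy", "sadness"]
y_true = ["joy", "sadness", "anger", "joy", "sadness", "anger"]
y_pred = ["joy", "sadness", "joy", "joy", "anger", "anger"]

# Per-class precision, recall, and F1 as a formatted string.
report = classification_report(y_true, y_pred, labels=labels, zero_division=0)

# Rows are true classes, columns are predicted classes, both in `labels` order.
matrix = confusion_matrix(y_true, y_pred, labels=labels)

print(report)
print(matrix)
```

To visualize the matrix, it can be passed to `seaborn.heatmap(matrix, annot=True, fmt="d", xticklabels=labels, yticklabels=labels)` and displayed with `matplotlib.pyplot.show()`.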
- Usage:
  - Download emotion_recognizer.h5, emotion_recognizer.keras, and tokenizer.pkl.
  - Use text_emo_detection.py to make predictions by loading these resources directly.
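For readers who want to call the pretrained files from their own code, the sketch below shows one way it could be done. It is not the contents of text_emo_detection.py: the function names are hypothetical, and `labels` and `max_len` are left as required arguments because the label order and sequence length used during training are not stated here. It also assumes tokenizer.pkl holds a pickled Keras tokenizer.

```python
import pickle

import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.sequence import pad_sequences


def load_resources(model_path="emotion_recognizer.keras",
                   tokenizer_path="tokenizer.pkl"):
    """Load the pretrained model and the pickled tokenizer from disk."""
    model = load_model(model_path)
    with open(tokenizer_path, "rb") as fh:
        tokenizer = pickle.load(fh)
    return model, tokenizer


def predict_emotion(text, model, tokenizer, labels, max_len):
    """Return the label with the highest predicted probability.

    `labels` must list the emotions in the order used at training time,
    and `max_len` must equal the padded length the model was trained on.
    """
    sequences = tokenizer.texts_to_sequences([text])
    padded = pad_sequences(sequences, maxlen=max_len)
    probabilities = model.predict(padded, verbose=0)[0]
    return labels[int(np.argmax(probabilities))]
```

A typical call would be `model, tokenizer = load_resources()` followed by `predict_emotion(text, model, tokenizer, labels, max_len)`. The input text should go through the same preprocessing that was applied to the training data.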
We welcome contributions to enhance this project. To contribute:
- Fork the repository.
- Create a feature branch (git checkout -b feature-name).
- Commit your changes (git commit -m "Add feature-name").
- Push to the branch (git push origin feature-name).
- Submit a pull request for review.
This project is licensed under the MIT License. Feel free to use, modify, and distribute the project, provided proper credit is given.
We thank:
- The creators of the dataset used in this project for their valuable contribution: Emotions Dataset for NLP.
- The open-source library developers for providing robust tools to streamline the development process.