Sentiment Analysis of Book Reviews using Recurrent Neural Networks

Overview

This project implements a deep learning model that classifies short text reviews as either positive or negative using a Recurrent Neural Network (RNN). Although the project title refers to book reviews, the training data is a Twitter sentiment analysis dataset sourced from Kaggle. Because social media is an important signal of how the public receives products and services, a model like this could be used to analyze sentiment and help companies understand consumer preferences.

Motivation

The goal of this project is to explore deep learning techniques for Natural Language Processing (NLP), particularly using RNNs for sentiment analysis. The model's ability to parse key words and detect sentiment is useful for companies aiming to analyze customer feedback and improve recommendations.

Dataset

The dataset consists of tweets related to various products and services, with each tweet labeled as either positive or negative. This is a binary classification problem with no class imbalance.
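
A quick way to check the class-balance claim is to count the share of each label. The sketch below is only an illustration: the `labels` list is toy data, and the helper name `class_balance` is hypothetical, not something taken from the project notebook.

```python
from collections import Counter


def class_balance(labels):
    """Return the fraction of examples that carry each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Toy labels for illustration only; the real labels come from the dataset file.
labels = ["positive", "negative", "negative", "positive"]
shares = class_balance(labels)
print(shares)  # {'positive': 0.5, 'negative': 0.5}
```

Shares close to 0.5 for both classes would support treating the problem as balanced.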

Features

Text data: The primary feature in this dataset is the text of each tweet.

Data Preprocessing

To prepare the data for training, the following preprocessing steps are applied:

Convert all text to lowercase
Remove stop words
Apply stemming
Tokenize the text
Convert the tokens into word embeddings using an embedding layer
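
The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the notebook's actual code: the stop-word list is a tiny hand-written sample, `crude_stem` is a toy suffix stripper standing in for a real stemmer such as Porter's, and the helper names and example tweets are hypothetical. The sketch tokenizes before filtering, since stop-word removal and stemming operate on individual tokens. The final step, looking up embeddings, happens inside the model, so the sketch stops at integer token ids.

```python
import re

# Tiny illustrative stop-word list (assumption); a real run would use a fuller list.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "this", "that",
              "it", "i", "to", "of", "and", "so"}


def crude_stem(word):
    """Strip a few common suffixes; a toy stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "ly", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, tokenize, drop stop words, then stem the remaining tokens."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def build_vocab(token_lists):
    """Assign an integer id to every token; 0 is padding, 1 is unknown."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(tokens, vocab, max_len):
    """Map tokens to ids, then truncate or pad to a fixed length."""
    ids = [vocab.get(t, vocab["<unk>"]) for t in tokens][:max_len]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))


# Made-up example tweets, for illustration only.
tweets = ["I loved this game, playing it was amazing!", "The update is so broken"]
token_lists = [preprocess(t) for t in tweets]
vocab = build_vocab(token_lists)
encoded = [encode(toks, vocab, max_len=6) for toks in token_lists]

print(token_lists[0])  # ['lov', 'game', 'play', 'amaz']
print(encoded)         # [[2, 3, 4, 5, 0, 0], [6, 7, 0, 0, 0, 0]]
```

The fixed-length id sequences are what an embedding layer would consume; each id selects one row of the embedding matrix.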

Model Architecture

The model is based on a deep learning approach, utilizing:

An embedding layer that maps each token to a word embedding
Two Long Short-Term Memory (LSTM) layers for sequence processing
A final output layer for classification
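
To make the data flow through these three components concrete, here is a forward-pass-only sketch in pure Python with random, untrained weights. It is an illustration under assumptions, not the project's implementation: the source does not say which framework the notebook uses, and the class names (`LSTMLayer`, `SentimentRNN`), the layer sizes, and the sigmoid output unit are all hypothetical choices. A real model would be built and trained with a deep learning library.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LSTMLayer:
    """One LSTM layer with randomly initialised (untrained) weights."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One weight matrix and bias per gate: input (i), forget (f),
        # candidate (g), output (o). Each acts on [x_t, h_{t-1}].
        self.w = {k: rand_matrix(hidden_size, input_size + hidden_size, rng)
                  for k in "ifgo"}
        self.b = {k: [0.0] * hidden_size for k in "ifgo"}

    def forward(self, sequence):
        """Return the hidden state at every time step."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        outputs = []
        for x in sequence:
            z = x + h  # concatenate the current input with the previous hidden state
            pre = {k: [a + b for a, b in zip(matvec(self.w[k], z), self.b[k])]
                   for k in "ifgo"}
            i_gate = [sigmoid(v) for v in pre["i"]]
            f_gate = [sigmoid(v) for v in pre["f"]]
            cand = [math.tanh(v) for v in pre["g"]]
            o_gate = [sigmoid(v) for v in pre["o"]]
            c = [f * cv + i * g for f, cv, i, g in zip(f_gate, c, i_gate, cand)]
            h = [o * math.tanh(cv) for o, cv in zip(o_gate, c)]
            outputs.append(h)
        return outputs


class SentimentRNN:
    """Embedding -> LSTM -> LSTM -> single sigmoid output unit."""

    def __init__(self, vocab_size, embed_dim=8, hidden_size=6, seed=0):
        rng = random.Random(seed)
        self.embedding = rand_matrix(vocab_size, embed_dim, rng)
        self.lstm1 = LSTMLayer(embed_dim, hidden_size, rng)
        self.lstm2 = LSTMLayer(hidden_size, hidden_size, rng)
        self.out_w = [rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
        self.out_b = 0.0

    def predict(self, token_ids):
        embedded = [self.embedding[t] for t in token_ids]  # token id -> vector
        seq1 = self.lstm1.forward(embedded)  # first LSTM passes the full sequence on
        seq2 = self.lstm2.forward(seq1)      # second LSTM reads that sequence
        last = seq2[-1]                      # final hidden state summarises the text
        logit = sum(w * h for w, h in zip(self.out_w, last)) + self.out_b
        return sigmoid(logit)                # probability of the positive class


model = SentimentRNN(vocab_size=20)
prob = model.predict([3, 7, 1, 12])
print(0.0 < prob < 1.0)  # True
```

The first LSTM layer hands its full sequence of hidden states to the second, and only the second layer's final hidden state reaches the output unit. Because the weights are untrained, the printed probability carries no meaning beyond showing that the shapes line up.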

Model Training and Evaluation

The model is trained using a combination of feed-forward and recurrent architectures.
Performance is evaluated using AUC (area under the ROC curve), which measures how well the model ranks positive examples above negative ones, independently of any classification threshold.
If the RNN proves too computationally expensive, an RNN trained on a downsampled dataset will be compared against a feed-forward network trained on the full dataset.
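
The threshold-independence of AUC can be seen from its pairwise definition: it is the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The sketch below assumes the ROC AUC and uses made-up labels and scores; the function name `roc_auc` is hypothetical, and in practice a library routine such as scikit-learn's `roc_auc_score` would normally be used instead of this quadratic-time loop.

```python
def roc_auc(labels, scores):
    """ROC AUC as the share of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = positive) and model scores, for illustration only.
labels = [1, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.6]
auc = roc_auc(labels, scores)
print(auc)  # 0.75

# Squaring the scores changes every value but not their order,
# so the AUC is unchanged: no threshold is involved.
squashed = [s ** 2 for s in scores]
print(roc_auc(labels, squashed))  # 0.75
```

Three of the four positive/negative pairs are ordered correctly here, which gives 0.75. A perfect ranking gives 1.0 and a random one about 0.5.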

Potential Improvements

Experimenting with different neural network architectures (e.g., GRUs, Transformers)
Optimizing hyperparameters for better performance
Utilizing pre-trained word embeddings such as Word2Vec or GloVe
Expanding the model to multi-class classification
