README

toxic comments classification

The aim of this project is to detect toxic comments that contain hate speech or other forms of discriminatory or insulting content.

To do so, the data, which consists of raw Wikipedia comments, is cleaned and converted into a vocabulary-based representation. The data was obtained from the Toxic Comment Classification Challenge on Kaggle.
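The cleaning and vocabulary steps could look roughly like the sketch below. It is a minimal illustration, not the project's actual preprocessing: the function names (`clean`, `build_vocab`, `encode`), the reserved padding and unknown-word indices, the regular expression, and the `max_words` and `max_len` defaults are all assumptions made for the example.

```python
import re
from collections import Counter

# Reserved indices for padding and out-of-vocabulary words (assumed convention).
PAD, UNK = 0, 1


def clean(text):
    """Lowercase the comment, replace anything that is not a letter, digit,
    or apostrophe with a space, and split on whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9']+", " ", text)
    return text.split()


def build_vocab(token_lists, max_words=20000):
    """Map the most frequent words to integer indices, starting after PAD and UNK."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    vocab = {"<pad>": PAD, "<unk>": UNK}
    for word, _ in counts.most_common(max_words):
        vocab[word] = len(vocab)
    return vocab


def encode(tokens, vocab, max_len=100):
    """Turn tokens into a fixed-length list of indices (truncate, then pad)."""
    ids = [vocab.get(tok, UNK) for tok in tokens][:max_len]
    return ids + [PAD] * (max_len - len(ids))


# Toy comments standing in for the raw Wikipedia comments.
comments = ["You are AWFUL!!!", "Thanks, you are great."]
tokens = [clean(c) for c in comments]
vocab = build_vocab(tokens)
encoded = [encode(t, vocab, max_len=6) for t in tokens]
```

Each comment ends up as a fixed-length sequence of integers, which is the form an embedding layer expects as input.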

For the classification task, a convolutional neural network (CNN) implemented in Keras is used. Three approaches to obtaining word embeddings are tested while the CNN architecture stays the same. The first model initializes all word embeddings randomly (CNNrand). The second uses the word2vec word vectors and keeps them fixed during training (CNNfix). The third also starts from the word2vec vectors but allows them to be refined during training (CNNtrain).
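The difference between the three variants comes down to how the embedding matrix is initialized and whether it is updated during training. The sketch below illustrates this with NumPy only. It is not the project's code: `embedding_setup` is a hypothetical helper, `pretrained` is a plain dictionary standing in for a word2vec lookup, and the uniform initialization range, the random fallback for words missing from word2vec, and the choice to train the CNNrand embeddings are assumptions.

```python
import numpy as np


def embedding_setup(variant, vocab, pretrained, dim, seed=0):
    """Return (weight_matrix, trainable) for CNNrand, CNNfix, or CNNtrain.

    vocab:      dict mapping word -> row index
    pretrained: dict mapping word -> vector (stand-in for word2vec)
    """
    if variant not in ("CNNrand", "CNNfix", "CNNtrain"):
        raise ValueError(f"unknown variant: {variant}")

    # Start from small random vectors (assumed initialization range).
    rng = np.random.default_rng(seed)
    matrix = rng.uniform(-0.25, 0.25, size=(len(vocab), dim)).astype("float32")

    if variant == "CNNrand":
        # Random vectors only; assumed to be learned during training.
        return matrix, True

    # CNNfix and CNNtrain: copy in pretrained vectors where available.
    # Words without a pretrained vector keep their random row (assumption).
    for word, idx in vocab.items():
        vec = pretrained.get(word)
        if vec is not None:
            matrix[idx] = vec

    # CNNfix freezes the vectors, CNNtrain lets training refine them.
    return matrix, variant == "CNNtrain"


# Toy vocabulary and vectors for illustration only.
vocab = {"<pad>": 0, "<unk>": 1, "good": 2, "bad": 3}
pretrained = {
    "good": np.array([0.1, 0.2, 0.3]),
    "bad": np.array([-0.1, -0.2, -0.3]),
}

for name in ("CNNrand", "CNNfix", "CNNtrain"):
    weights, trainable = embedding_setup(name, vocab, pretrained, dim=3)
    print(name, weights.shape, "trainable:", trainable)
```

In a Keras model, the returned matrix would typically be supplied as the initial weights of an `Embedding` layer and the flag as its `trainable` argument, with the rest of the CNN left unchanged across the three variants.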

babettebuehler/toxic_comments_classification
