roisinmcl/NLPFinalProject

NLPFinalProject

data/raw/ contains the data_stage_one files from the convote dataset, split into train/, test/, and dev/. The files are separated by speaker, and the speaker's political party is the 'P' in the file name: ([0-9]\_)+P\w\w.txt, where 'R' represents the Republican party and 'D' represents the Democratic party.
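As an illustration of the naming convention above, here is a minimal sketch of reading the party from a file name. It assumes the party letter is the first character of a three-character final field that follows the last underscore. The names `party_from_filename`, `PARTY_FIELD`, and `PARTY_NAMES` are hypothetical and are not part of this repository.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: the file name ends in "_Pxx.txt", where P is the party letter
# and xx are two further word characters.
PARTY_FIELD = re.compile(r"_([A-Za-z])\w\w\.txt$")

# Only the two parties named in this README are mapped.
PARTY_NAMES = {"R": "Republican", "D": "Democratic"}


def party_from_filename(path: str) -> Optional[str]:
    """Return the party name encoded in a file name, or None if unknown."""
    match = PARTY_FIELD.search(Path(path).name)
    if match is None:
        return None
    return PARTY_NAMES.get(match.group(1).upper())
```

Any other party letter, or a name that does not fit the pattern, yields `None` rather than a guess.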

The data can also be downloaded here.

Running Project

The project requires nltk and spacy. Install the dependencies with:

pip install -r requirements.txt

Tag the data with:

python tag_raw_dataset.py --tagger <tagger_name>

Valid tagger names include hmm, perceptron, and spacy.
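For readers unfamiliar with this style of flag, the sketch below shows one way a `--tagger` option restricted to these three names could be declared with `argparse`. It is an assumption about how such a flag might be parsed, not a copy of tag_raw_dataset.py, and `build_parser` is a hypothetical name.

```python
import argparse

# The three tagger names listed in this README.
TAGGER_NAMES = ("hmm", "perceptron", "spacy")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts only the documented tagger names."""
    parser = argparse.ArgumentParser(description="Tag the raw dataset.")
    parser.add_argument(
        "--tagger",
        choices=TAGGER_NAMES,
        required=True,
        help="which part-of-speech tagger to use",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"Selected tagger: {args.tagger}")
```

With `choices` set, argparse rejects any other tagger name with a usage error before the tagging code runs.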

To run the perceptron on the untagged data, run:

python perceptron.py <num_iterations>
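To show what the iteration count controls, here is a minimal sketch of a binary perceptron over bag-of-words counts. It is a generic textbook version and not the code in perceptron.py. The function names, the 'R'/'D' label encoding, and the toy training examples are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Example = Tuple[List[str], str]  # (tokens, label), label is "R" or "D"


def train_perceptron(examples: List[Example], num_iterations: int) -> Dict[str, float]:
    """Train token weights; a positive score predicts "R", otherwise "D"."""
    weights: Dict[str, float] = defaultdict(float)
    for _ in range(num_iterations):  # one full pass over the data per iteration
        for tokens, label in examples:
            features = Counter(tokens)
            score = sum(weights[tok] * count for tok, count in features.items())
            predicted = "R" if score > 0 else "D"
            if predicted != label:
                # Move the weights toward the correct label on a mistake.
                step = 1.0 if label == "R" else -1.0
                for tok, count in features.items():
                    weights[tok] += step * count
    return dict(weights)


def predict(weights: Dict[str, float], tokens: List[str]) -> str:
    """Classify a token list with trained weights."""
    score = sum(weights.get(tok, 0.0) * count for tok, count in Counter(tokens).items())
    return "R" if score > 0 else "D"


# Hypothetical toy data, not taken from the convote dataset.
toy_examples = [(["tax", "cut"], "R"), (["health", "care"], "D")]
toy_weights = train_perceptron(toy_examples, num_iterations=3)
```

More iterations give the weights more passes over the training data to correct mistakes. On linearly separable data the updates stop once every example is classified correctly.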
