controversy detection project for Social Computing course in IIIT Hyderabad. 2021.
The script used for scraping is present in the folder dataset_scraper. Detailed instructions on running the script and its outpout can be found in dataset_scraper/README.md
After downloading the dataset in the data folder, we need to preprocess the data. The data can be preprocessed by using the code in the preprocessor directory. Change the file_name in the notebook to point to your dataset. Run that code to preprocess the data. It will automatically save the file in a pickle file as a dataframe. Change the variable output_name to point to the name of the pickle file.
We then need extract perspective API scores on the comments of the extracted dataset. Run perspective-experiments.ipnb on the relevant dataset to get the perspective-API scores. It will be stored as perspective.pickle in the data directory.
- For running the baseline models, you should run the notebook within the baselines folder
- For running the baseline model on the perspective scores, you should run
perspective_models.ipynbwithin the perspective-api folder - For the running out final model, you should run the
transformer-cl.ipynbwithin the models folder. Set thesentvariable to True within the params, if you want to include the sentiment scores or False otherwise. Same applies for thepersvariable within the params. Change theoutput_pathwithin the params to direct to where to store the models. It will store the models after every epoch.
- Choose the best performing model for inference. Only the models using the perspective scores and transformer embeddings can be used for inference. Change the
modelvariable in theflask_inference.pyto point to the saved model. This will start the REST API at port 3000(possibly). You can use ngrok to expose the port to internet. You can now send POST requests to the url in format below to see of the tweet is controversial or not:
text = {
'root': "Root tweet",
'comments': [
'Comment 1',
'Comment 2',
'Comment 3',
....
]
}
- First navigate to the manage extensions page in your chrome browser by typing
chrome://extensions/into the address bar - Click on load unpacked and navigate and select the folder
controfound in this project codebase
Now the browser extension set up is done
- Execute command
pip3 install uvicorn fastapi cd fastapi-backendand then enter the commanduvicorn receiver:app --reloadinto the terminal