This repository contains the code for SF Crimes Statistics with Spark Streaming, the final project of the Udacity Data Streaming Nanodegree program.
All source code is located in the src folder.
All data used in the project is located in the data folder.
All configuration files (for both ZooKeeper and Kafka) are located in the config folder.
All screenshots are located in the screenshots.zip file. The first screenshot shows the console output of the Kafka consumer. The second shows the output of the count aggregation from the data_stream.py Spark job. The third shows the Spark Web UI for the data_stream.py Spark job.