80 commits
d4a1249
Add files via upload
nortonlyr Apr 3, 2020
c5d26f2
Add files via upload
nortonlyr Apr 3, 2020
d6ca992
Update README.md
nortonlyr Apr 3, 2020
a9edee3
Updated readme, project detail included
Apr 3, 2020
568451b
Test
Apr 3, 2020
41e96fe
Test
Apr 3, 2020
e59fda2
test2
Apr 3, 2020
f1c46f9
Delete README.md
nortonlyr Apr 3, 2020
8b11e5f
Delete airbnb_zillow_project.png
nortonlyr Apr 3, 2020
0bbc906
Test
Apr 3, 2020
a774e5e
Updated Readme with image
Apr 3, 2020
e9ef8ba
Hello DAG file
Apr 3, 2020
428d8b8
requests nyc housing data
Apr 4, 2020
4b63418
2nd datasets added
Apr 4, 2020
c9e60cf
3rd datasets added, done for open_nyc_data
Apr 4, 2020
b89ece4
abb_data added
Apr 4, 2020
96e4b91
get restaurant from yelp-api, need omre work to save these data as da…
Apr 4, 2020
f40e40f
yelp_api request, done, next, will be json to dataframe
Apr 4, 2020
54b7d7c
yelp_api json->pickle, json->cvs, completed
Apr 5, 2020
f5d18c5
data upload
Apr 5, 2020
a5c9ce1
get_data_url upload, all csv url included
Apr 5, 2020
737ed66
nyc housing data first data analysis add
Apr 5, 2020
a8b2060
data-etl, download -> cleaning -> mySql
Apr 6, 2020
ae0c5b8
airbnb housing data first analysis
Apr 6, 2020
b056801
moredataset analysis updated
Apr 6, 2020
6d4edf2
Add files via upload
nortonlyr Apr 6, 2020
3ca640f
Update README.md
nortonlyr Apr 6, 2020
f52f417
Update README.md
nortonlyr Apr 6, 2020
2214111
more data analaysis updated
nortonlyr Apr 6, 2020
8a48d12
updated
nortonlyr Apr 6, 2020
6198157
more updated
nortonlyr Apr 6, 2020
dd4f00c
five etl py files are done in the dags files, are tests passed
nortonlyr Apr 7, 2020
58ec784
updated some notebooks
nortonlyr Apr 8, 2020
34a0d9e
updated etl py, papermill report step is added to file
nortonlyr Apr 10, 2020
d4c9ecc
join all the elt together
nortonlyr Apr 15, 2020
177b1cb
combined all the ELT data together
nortonlyr Apr 15, 2020
f656f29
Merge pull request #1 from nortonlyr/dev
nortonlyr Apr 15, 2020
5133d9e
updated dags and airbnb_analysis_notebook
nortonlyr Apr 20, 2020
993b864
Merge pull request #2 from nortonlyr/dev
nortonlyr Apr 20, 2020
cd37dcb
Cleanup and reorganized profile
nortonlyr Apr 22, 2020
4e5c792
Merge pull request #3 from nortonlyr/dev
nortonlyr Apr 22, 2020
2e841cf
updated notbooks
nortonlyr Apr 25, 2020
32b180b
Merge pull request #4 from nortonlyr/dev
nortonlyr Apr 25, 2020
1db1631
data cleaning and reload
nortonlyr Apr 25, 2020
62a17d2
Merge pull request #5 from nortonlyr/dev
nortonlyr Apr 25, 2020
4be09a0
data cleaning and reload
nortonlyr Apr 25, 2020
24213e3
Merge pull request #6 from nortonlyr/dev
nortonlyr Apr 25, 2020
d32ccdc
Update README.md
nortonlyr Apr 25, 2020
8e7fafc
updated airflow dags, all_in_one file
nortonlyr May 6, 2020
eb830b2
Merge pull request #7 from nortonlyr/dev
nortonlyr May 6, 2020
28c21d6
clean datasets, updated flowchart
nortonlyr May 6, 2020
d79ee1b
Merge pull request #8 from nortonlyr/dev
nortonlyr May 6, 2020
353e6e1
Update README.md
nortonlyr May 6, 2020
584f5a9
Update README.md
nortonlyr May 6, 2020
7ad1286
Update README.md
nortonlyr May 6, 2020
1e33c34
Update README.md
nortonlyr May 6, 2020
598bf00
Update README.md
nortonlyr May 6, 2020
e571112
Updated dags with amzon aws code
nortonlyr May 6, 2020
a62b1a5
Merge pull request #9 from nortonlyr/dev
nortonlyr May 6, 2020
3c80403
Delete nyc_data_all_in_one_aws.py
nortonlyr May 6, 2020
4d92400
Delete nyc_data_all_in_one_local.py
nortonlyr May 6, 2020
94d4787
update
nortonlyr May 6, 2020
12e946c
Merge pull request #12 from nortonlyr/dev2
nortonlyr May 6, 2020
1efefd2
updated dags file, localhost
nortonlyr May 8, 2020
ca2ed6b
Merge pull request #13 from nortonlyr/dev2
nortonlyr May 8, 2020
cb67824
cleaning working tree, updated data
nortonlyr May 8, 2020
762873f
Updated Readme
nortonlyr May 25, 2020
ce6bcff
updated flowchart
nortonlyr May 31, 2020
21019b7
updated readme
nortonlyr Jul 5, 2020
6bdd8ca
Merge pull request #14 from nortonlyr/master
nortonlyr Jul 5, 2020
ac47808
updated
nortonlyr Nov 19, 2022
5bf1a48
updated sql part
nortonlyr Nov 20, 2022
94be0c2
updated
nortonlyr Nov 20, 2022
f04c555
updated
nortonlyr Nov 20, 2022
1cd47b2
updated
nortonlyr Nov 20, 2022
cc6c011
updated sql part
nortonlyr Nov 22, 2022
ed9f6d4
updated wrap a function
nortonlyr Nov 22, 2022
8a3979d
updated with sqlite trick
nortonlyr Nov 23, 2022
9ce036d
updated for testing
nortonlyr Nov 23, 2022
acf1e0a
Merge branch 'master' into dev2
nortonlyr Nov 23, 2022
Binary file added Airflow_project_Updated050520.png
23 changes: 23 additions & 0 deletions README.md
@@ -1 +1,24 @@
# Airflow Project

![description_if_image_fails_to_load](https://github.com/nortonlyr/DataEngineering.Labs.AirflowProject/blob/master/Airflow_project_Updated050520.png)

Question:
- How do you choose a good-value Airbnb home when traveling to NYC?

Goal:
- Apply Apache Airflow directed acyclic graphs (DAGs) to build data pipelines on NYC Open Data (park, shooting, hot_spot, hotel, public housing) and Airbnb housing data, followed by data manipulation, analysis, and visualization.


Flowchart
- Original sources: NYC Open Data, Airbnb dataset (from Inside Airbnb)

- Send GET requests and download the sources

- Preliminary data cleaning and manipulation

- Import to SQL database (MySQL/PostgreSQL)

- Load data from the database and use Jupyter notebooks for analysis and visualization (run both on localhost and on AWS (EC2, RDS, S3))
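
The steps above (download, clean, import to SQL, load for analysis) can be sketched as plain Python functions. This is only an illustration, not the project's actual DAG code: the sample CSV, the `listings` table, and all function and column names are hypothetical, and an in-memory SQLite database stands in for MySQL/PostgreSQL so the sketch runs without a server or network access.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for a downloaded Airbnb listings CSV.
RAW_CSV = """id,neighbourhood,price
1,Harlem,$120.00
2,Chelsea,
3,Astoria,$85.50
"""


def extract(raw_text):
    """Parse CSV text into a list of dict rows (stand-in for the download step)."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def clean(rows):
    """Drop rows with no price and convert strings like '$120.00' to floats."""
    cleaned = []
    for row in rows:
        price = row["price"].strip().lstrip("$").replace(",", "")
        if not price:
            continue  # skip listings with a missing price
        cleaned.append((int(row["id"]), row["neighbourhood"].strip(), float(price)))
    return cleaned


def load(conn, records):
    """Create the (hypothetical) listings table and insert the cleaned records."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS listings "
        "(id INTEGER PRIMARY KEY, neighbourhood TEXT, price REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO listings VALUES (?, ?, ?)", records)
    conn.commit()


def average_price_by_neighbourhood(conn):
    """Query the database the way an analysis notebook might."""
    cur = conn.execute(
        "SELECT neighbourhood, AVG(price) FROM listings "
        "GROUP BY neighbourhood ORDER BY neighbourhood"
    )
    return cur.fetchall()


def run_pipeline(raw_text=RAW_CSV):
    """Run extract -> clean -> load and return the open connection."""
    conn = sqlite3.connect(":memory:")  # SQLite used here only as a stand-in
    load(conn, clean(extract(raw_text)))
    return conn
```

In an Airflow setup, each of these functions would typically become its own task, with the DAG defining the order extract, clean, load, report.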


![description_if_image_fails_to_load](https://github.com/nortonlyr/DataEngineering.Labs.AirflowProject/blob/master/airflow_flow_chart.png)
Binary file added airflow_flow_chart.png