1 change: 1 addition & 0 deletions scripts/us_cdc/500_places/README.md
@@ -37,6 +37,7 @@ The data imported in this effort is from the CDC's [500 Places project](https://
To refresh the data for the CDC500 import, manually search the website for the latest release files across all geo levels and add the required configuration to the [JSON file](gs://datcom-csv/cdc500_places/download_config.json) in the GCP bucket. The config file is also available locally as [download_config.json](https://github.com/datacommonsorg/data/blob/master/scripts/us_cdc/500_places/download_config.json), which can be used to generate the output as well.

NOTE: If you change the local config, make the same change in the config file in GCP, and vice versa. Always keep both config files in sync.
Here is the path for download_config.json in bucket : gs://datcom-csv/cdc500_places/download_config.json

medium

This line is a bit redundant as the GCS path is already linked in the paragraph above. It also contains a typo (double space after 'for'). If the goal is to make the path easily copy-pastable, consider rephrasing for clarity and formatting it as a code block.

Suggested change
Here is the path for download_config.json in bucket : gs://datcom-csv/cdc500_places/download_config.json
The GCS path for `download_config.json` is: `gs://datcom-csv/cdc500_places/download_config.json`
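
Since the README asks for the local and GCS copies of `download_config.json` to stay in sync, a small check could report which releases have drifted. The following is only a sketch: the function name `differing_release_years` is hypothetical, and it assumes the config is a JSON list of objects keyed by `release_year`, which is how the file appears in this diff.

```python
import json


def differing_release_years(local_text, remote_text):
    """Return the sorted release years whose entries differ between two configs.

    Assumes each config is a JSON list of objects with a "release_year" key,
    as the download_config.json in this diff appears to be.
    """
    local = {entry["release_year"]: entry for entry in json.loads(local_text)}
    remote = {entry["release_year"]: entry for entry in json.loads(remote_text)}
    all_years = set(local) | set(remote)
    # A year counts as different when it is missing on one side
    # or when its parsed content is not equal.
    return sorted(year for year in all_years if local.get(year) != remote.get(year))
```

The two arguments are the raw file contents. One way to obtain the remote copy would be `gsutil cat gs://datcom-csv/cdc500_places/download_config.json`, assuming read access to the bucket. Comparing parsed JSON rather than raw text means differences in whitespace or key order are not reported as drift.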


Please fill in the JSON file for the latest release data in the format below:

28 changes: 24 additions & 4 deletions scripts/us_cdc/500_places/download_config.json
@@ -53,24 +53,44 @@
"release_year": 2024,
"parameter": [
{
"URL": "https://data.cdc.gov/api/views/swc5-untb/rows.csv?accessType=DOWNLOAD",
"URL": "https://data.cdc.gov/api/views/fu4u-a9bh/rows.csv?accessType=DOWNLOAD",
"FILE_TYPE": "County",
"FILE_NAME": "county_raw_data_2024.csv"
},
{
"URL": "https://data.cdc.gov/api/views/eav7-hnsx/rows.csv?accessType=DOWNLOAD",
"URL": "https://data.cdc.gov/api/views/sd8v-uq83/rows.csv?accessType=DOWNLOAD",
"FILE_TYPE": "City",
"FILE_NAME": "city_raw_data_2024.csv"
},
{
"URL": "https://data.cdc.gov/api/views/cwsq-ngmh/rows.csv?accessType=DOWNLOAD",
"URL": "https://data.cdc.gov/api/views/ai6z-tcin/rows.csv?accessType=DOWNLOAD",
"FILE_TYPE": "CensusTract",
"FILE_NAME": "censustract_raw_data_2024.csv"
}
Comment on lines 68 to +69

high

The ZipCode (ZCTA) data for the 2024 release seems to be missing from this configuration. According to the CDC PLACES data portal, the 2024 release includes ZCTA data. I've suggested adding it back with the correct URL for the 2024 release to ensure data completeness.

                "FILE_NAME": "censustract_raw_data_2024.csv"
            },
            {
                "URL": "https://data.cdc.gov/api/views/t2d6-nre4/rows.csv?accessType=DOWNLOAD",
                "FILE_TYPE": "ZipCode",
                "FILE_NAME": "zipcode_raw_data_2024.csv"
            }
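
A missing geo level like this could also be caught mechanically before the import runs. The sketch below is an illustration, not part of the PR: `missing_file_types` is a hypothetical helper, and the expected set of `FILE_TYPE` values is an assumption taken from the four values visible in this diff.

```python
# Assumed set of geo levels, taken from the FILE_TYPE values seen in this diff.
EXPECTED_FILE_TYPES = {"County", "City", "CensusTract", "ZipCode"}


def missing_file_types(config):
    """Map each release_year to the sorted FILE_TYPE values it lacks."""
    gaps = {}
    for release in config:
        present = {entry["FILE_TYPE"] for entry in release["parameter"]}
        missing = EXPECTED_FILE_TYPES - present
        if missing:
            gaps[release["release_year"]] = sorted(missing)
    return gaps


# Shape of the 2024 block as it stands in this change (URLs omitted for brevity).
sample = [
    {
        "release_year": 2024,
        "parameter": [
            {"FILE_TYPE": "County", "FILE_NAME": "county_raw_data_2024.csv"},
            {"FILE_TYPE": "City", "FILE_NAME": "city_raw_data_2024.csv"},
            {"FILE_TYPE": "CensusTract", "FILE_NAME": "censustract_raw_data_2024.csv"},
        ],
    }
]
print(missing_file_types(sample))  # {2024: ['ZipCode']}
```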

]
},
{
"release_year": 2025,
"parameter": [
{
"URL": "https://data.cdc.gov/api/views/swc5-untb/rows.csv?accessType=DOWNLOAD",
"FILE_TYPE": "County",
"FILE_NAME": "county_raw_data_2025.csv"
},
{
"URL": "https://data.cdc.gov/api/views/eav7-hnsx/rows.csv?accessType=DOWNLOAD",
"FILE_TYPE": "City",
"FILE_NAME": "city_raw_data_2025.csv"
},
{
"URL": "https://data.cdc.gov/api/views/cwsq-ngmh/rows.csv?accessType=DOWNLOAD",
"FILE_TYPE": "CensusTract",
"FILE_NAME": "censustract_raw_data_2025.csv"
},
{
"URL": "https://data.cdc.gov/api/views/qnzd-25i4/rows.csv?accessType=DOWNLOAD",
"FILE_TYPE": "ZipCode",
"FILE_NAME": "zipcode_raw_data_2024.csv"
"FILE_NAME": "zipcode_raw_data_2025.csv"
}
]
}
Comment on lines +72 to 96

critical

This new configuration block for release_year: 2025 appears to be incorrect. The URLs provided (e.g., swc5-untb, eav7-hnsx) correspond to the CDC PLACES 2023 release data, not 2025. This will cause the script to download and process 2023 data as if it were from 2025, leading to data correctness issues. This block was likely added by mistake and should be removed.
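
Two related symptoms can be checked without contacting the CDC portal: a URL that appears under more than one release, and a `FILE_NAME` whose year suffix disagrees with its `release_year`. The sketch below uses a hypothetical helper, `find_config_issues`, and assumes file names end in a four-digit year followed by `.csv`, as they do in this diff. It cannot tell which release a dataset ID actually belongs to; that still needs a manual check against the portal.

```python
import re
from collections import defaultdict


def find_config_issues(config):
    """Return human-readable warnings for suspicious config entries."""
    issues = []
    years_by_url = defaultdict(set)
    for release in config:
        year = release["release_year"]
        for entry in release["parameter"]:
            years_by_url[entry["URL"]].add(year)
            # Assumption: FILE_NAME ends in "<four-digit year>.csv".
            match = re.search(r"(\d{4})\.csv$", entry["FILE_NAME"])
            if match and int(match.group(1)) != year:
                issues.append(
                    f"{entry['FILE_NAME']}: year does not match release_year {year}"
                )
    # Report any URL that is listed under more than one release.
    for url, years in sorted(years_by_url.items()):
        if len(years) > 1:
            issues.append(f"{url} is used by releases {sorted(years)}")
    return issues
```

Running this over the parsed `download_config.json` in CI would flag copy-paste mistakes such as a block duplicated under a new `release_year` without updated URLs or file names.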
