@niveditasing

No description provided.

@gemini-code-assist
Contributor

Summary of Changes

Hello @niveditasing, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request updates and expands the data configuration for the CDC 500 Places project. It refreshes the data source links for the existing 2024 release and adds a new configuration for the 2025 release covering County, City, Census Tract, and Zip Code data. A minor README change also documents the location of the download_config.json configuration file.

Highlights

  • Data Source Updates for 2024 Release: The download URLs for County, City, and Census Tract data within the 2024 CDC 500 Places configuration have been updated to new endpoints.
  • New 2025 Release Configuration Added: A new configuration block for the 2025 release has been introduced, including download URLs for County, City, Census Tract, and Zip Code data. Notably, the URLs for the 2025 release's County, City, and Census Tract data appear to be the previous 2024 URLs.
  • Documentation Improvement: The README.md file has been updated to explicitly state the GCP bucket path for the download_config.json file, improving clarity for data refresh procedures.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request updates the configuration for the CDC 500 Places data import to include the 2024 release. My review focuses on the correctness of these configuration changes and an update to the documentation.

I've found a few issues:

  • The README.md update is slightly redundant and has a typo. I've suggested a clearer phrasing.
  • In download_config.json, a new block for release_year: 2025 has been added, but it incorrectly uses data URLs from the 2023 release. This is a critical issue that will lead to incorrect data ingestion, so the block should be removed.
  • The configuration for the 2024 release is missing the ZipCode (ZCTA) data. I've suggested adding it back with the correct URL to ensure data completeness.

Overall, the intent to update to the 2024 release is good, but the configuration needs to be corrected to avoid data issues.

Comment on lines +72 to 96
    {
        "release_year": 2025,
        "parameter": [
            {
                "URL": "https://data.cdc.gov/api/views/swc5-untb/rows.csv?accessType=DOWNLOAD",
                "FILE_TYPE": "County",
                "FILE_NAME": "county_raw_data_2025.csv"
            },
            {
                "URL": "https://data.cdc.gov/api/views/eav7-hnsx/rows.csv?accessType=DOWNLOAD",
                "FILE_TYPE": "City",
                "FILE_NAME": "city_raw_data_2025.csv"
            },
            {
                "URL": "https://data.cdc.gov/api/views/cwsq-ngmh/rows.csv?accessType=DOWNLOAD",
                "FILE_TYPE": "CensusTract",
                "FILE_NAME": "censustract_raw_data_2025.csv"
            },
            {
                "URL": "https://data.cdc.gov/api/views/qnzd-25i4/rows.csv?accessType=DOWNLOAD",
                "FILE_TYPE": "ZipCode",
-               "FILE_NAME": "zipcode_raw_data_2024.csv"
+               "FILE_NAME": "zipcode_raw_data_2025.csv"
            }
        ]
    }

critical

This new configuration block for release_year: 2025 appears to be incorrect. The URLs provided (e.g., swc5-untb, eav7-hnsx) correspond to the CDC PLACES 2023 release data, not 2025. This will cause the script to download and process 2023 data as if it were from 2025, leading to data correctness issues. This block was likely added by mistake and should be removed.
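
A check along these lines might catch this kind of problem before merge. This is only a sketch: it assumes the config's top level is a list of release blocks shaped like the one above, the function names are hypothetical, and the sample URLs are placeholders rather than real CDC endpoints. It can flag a URL shared by more than one release year and a FILE_NAME whose embedded year differs from release_year, but it cannot tell which release a dataset ID actually belongs to.

```python
import re
from collections import defaultdict


def find_reused_urls(releases):
    """Return {url: sorted release years} for URLs used by more than one release."""
    years_by_url = defaultdict(set)
    for release in releases:
        for entry in release["parameter"]:
            years_by_url[entry["URL"]].add(release["release_year"])
    return {url: sorted(years) for url, years in years_by_url.items() if len(years) > 1}


def find_filename_year_mismatches(releases):
    """Return (release_year, FILE_NAME) pairs where the name embeds a different year."""
    mismatches = []
    for release in releases:
        for entry in release["parameter"]:
            match = re.search(r"\d{4}", entry["FILE_NAME"])
            if match and int(match.group(0)) != release["release_year"]:
                mismatches.append((release["release_year"], entry["FILE_NAME"]))
    return mismatches


# Hypothetical sample data: placeholder URLs, not real CDC dataset IDs.
sample = [
    {"release_year": 2024, "parameter": [
        {"URL": "https://example.invalid/aaaa-1111/rows.csv",
         "FILE_TYPE": "County", "FILE_NAME": "county_raw_data_2024.csv"},
    ]},
    {"release_year": 2025, "parameter": [
        {"URL": "https://example.invalid/aaaa-1111/rows.csv",
         "FILE_TYPE": "County", "FILE_NAME": "county_raw_data_2025.csv"},
        {"URL": "https://example.invalid/bbbb-2222/rows.csv",
         "FILE_TYPE": "ZipCode", "FILE_NAME": "zipcode_raw_data_2024.csv"},
    ]},
]

print(find_reused_urls(sample))
print(find_filename_year_mismatches(sample))
```

A URL reported by find_reused_urls is not necessarily wrong, but it is worth confirming against the CDC PLACES portal before the block is merged.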

Comment on lines 68 to +69
"FILE_NAME": "censustract_raw_data_2024.csv"
}

high

The ZipCode (ZCTA) data for the 2024 release seems to be missing from this configuration. According to the CDC PLACES data portal, the 2024 release includes ZCTA data. I've suggested adding it back with the correct URL for the 2024 release to ensure data completeness.

                "FILE_NAME": "censustract_raw_data_2024.csv"
            },
            {
                "URL": "https://data.cdc.gov/api/views/t2d6-nre4/rows.csv?accessType=DOWNLOAD",
                "FILE_TYPE": "ZipCode",
                "FILE_NAME": "zipcode_raw_data_2024.csv"
            }
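
A completeness check could guard against a geo level being dropped again. The sketch below assumes the four FILE_TYPE values seen in this pull request (County, City, CensusTract, ZipCode) are the full expected set, which is an assumption rather than something the import scripts are known to enforce. The helper name is hypothetical.

```python
# Assumed full set of geo levels, taken from the FILE_TYPE values in this PR.
EXPECTED_FILE_TYPES = {"County", "City", "CensusTract", "ZipCode"}


def missing_file_types(release, expected=EXPECTED_FILE_TYPES):
    """Return the expected FILE_TYPE values absent from one release block, sorted."""
    present = {entry["FILE_TYPE"] for entry in release["parameter"]}
    return sorted(expected - present)


# Hypothetical release block mirroring the 2024 config without its ZipCode entry.
release_2024 = {
    "release_year": 2024,
    "parameter": [
        {"FILE_TYPE": "County", "FILE_NAME": "county_raw_data_2024.csv"},
        {"FILE_TYPE": "City", "FILE_NAME": "city_raw_data_2024.csv"},
        {"FILE_TYPE": "CensusTract", "FILE_NAME": "censustract_raw_data_2024.csv"},
    ],
}

print(missing_file_types(release_2024))
```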

For data refresh for CDC500 import we need to manually search in the website for the latest release files across all geo levels and add the required configuration in [Json file](gs://datcom-csv/cdc500_places/download_config.json) present in the GCP Bucket Location. The config file is present locally as well [download_config.json](https://github.com/datacommonsorg/data/blob/master/scripts/us_cdc/500_places/download_config.json) we can use this file as well to generate the output.

NOTE: If any changes made in local config update same changes in config file present in GCP as well vice versa. We should always keep both config file in sync.
Here is the path for download_config.json in bucket : gs://datcom-csv/cdc500_places/download_config.json

medium

This line is a bit redundant as the GCS path is already linked in the paragraph above. It also contains a typo (double space after 'for'). If the goal is to make the path easily copy-pastable, consider rephrasing for clarity and formatting it as a code block.

Suggested change
- Here is the path for download_config.json in bucket : gs://datcom-csv/cdc500_places/download_config.json
+ The GCS path for `download_config.json` is: `gs://datcom-csv/cdc500_places/download_config.json`
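
Since the README asks that the local and bucket copies of the config stay in sync, a small comparison could make that easier to verify. This sketch assumes the bucket copy has already been downloaded to a local path (for example with gsutil cp); both function names are hypothetical. It compares parsed JSON rather than raw text, so whitespace and key order do not matter, but the order of list entries does.

```python
import json


def load_config(path):
    """Parse a JSON config file from a local path."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def configs_in_sync(local_path, bucket_copy_path):
    """Return True when both files parse to the same JSON value."""
    return load_config(local_path) == load_config(bucket_copy_path)
```

For example, configs_in_sync("download_config.json", "/tmp/download_config_from_gcs.json") would return False if either copy has a release block the other lacks. Both paths here are illustrative.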
