@gmechali gmechali commented Dec 10, 2025

This PR refactors the versioning strategy and release workflow to be automated and maintainable.

🚀 Key Changes

  • Dynamic Versioning (Single Source of Truth)

    • Moved version source of truth from datacommons_mcp/version.py to pyproject.toml
    • Dynamic Loading: Updated version.py to use importlib.metadata to retrieve the installed version at runtime, ensuring code and metadata are always in sync.
    • Pipeline Updates: autopush, staging, and production release pipelines now patch pyproject.toml directly during builds.
  • Release Workflow Overhaul

    • Split Workflow: Decoupled "Version Bumping" from "Release Deployment"
      • New Tool: Added scripts/create_release_pr.py to interactively calculate the next version (Major/Minor/Patch) and trigger the bump.
      • New Pipeline: Added deploy/bump_version.yaml to handle the automated creation of version bump PRs via Cloud Build.
      • Simplified Release: Removed the PR creation step from deploy/release.yaml. It now solely focuses on building and deploying the tagged release.
  • CI/CD Hardening & Security

    • SSL Verification: Re-enabled strict SSL verification in all Cloud Build Python scripts using certifi (fixing CERTIFICATE_VERIFY_FAILED errors).
    • Docker Optimization: Merged pip install layers in Dockerfile and added fastapi pre-install step to prevent TestPyPI squatting attacks.
    • TestPyPI Robustness: Updated scripts/get_next_version.py to correctly handle version normalization (e.g., .dev1).
  • Documentation

    • Updated deploy/README.md to reflect the new 2-step release process (Prepare -> Release) and the dual-tagging strategy (Immutable Tags + Mutable :latest pointers).

@gemini-code-assist

Summary of Changes

Hello @gmechali, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a Dockerfile to containerize the application, making it easily deployable and manageable in containerized environments. The Dockerfile sets up a Python 3.11 slim environment, installs the necessary package, configures logging and port exposure, and includes a health check to ensure the application's readiness.

Highlights

  • New Dockerfile: A new Dockerfile has been added to the repository, enabling the containerization of the application.
  • Base Image: The Dockerfile uses python:3.11-slim as its base image, providing a lightweight Python environment.
  • Application Installation: The datacommons-mcp package is installed using pip with --no-cache-dir to optimize image size.
  • Environment Configuration: Environment variables PYTHONUNBUFFERED=1 and PORT=8080 are set for improved logging and port exposure.
  • Health Check: A HEALTHCHECK instruction is included, using curl to verify the application's responsiveness on port 8080, which is beneficial for platforms like Cloud Run.
  • Application Command: The CMD instruction is configured to run datacommons-mcp serve http binding to 0.0.0.0 and the exposed port 8080.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a Dockerfile to containerize the application. While this is a great addition, the current implementation has several areas for improvement regarding security, correctness, and Docker best practices. My review provides suggestions to build the image from local source code instead of PyPI, fix a critical issue with the HEALTHCHECK command, run the container as a non-root user for better security, and ensure the application port is configurable via environment variables as is standard for containerized applications.

@gmechali gmechali changed the title from "Add dockerfile" to "[DO NOT SUBMIT] Add dockerfile" Dec 10, 2025

gmechali commented Jan 5, 2026

/gemini review

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a comprehensive CI/CD pipeline for Google Cloud Build, establishing autopush, staging, and production deployment tiers. The changes include new Cloud Build configurations (autopush.yaml, staging.yaml, release.yaml) that automate publishing to PyPI (TestPyPI for dev/RC, official for production), building and pushing Docker images, and deploying to Cloud Run. A new Dockerfile is added to define the build environment, including pre-installing fastapi and installing the main package. Helper scripts (create_staging_tag.py, get_next_version.py, setup_apigee_sa.sh, setup_iam.sh) are introduced to manage versioning, tag creation, and IAM setup for deployment. The pyproject.toml is updated to specify a minimum fastapi version and remove uv from build requirements. Review comments highlight critical security concerns regarding disabled SSL verification in get_next_version.py, suggest optimizing the Dockerfile by using --no-install-recommends for apt-get and combining pip install commands to reduce image layers, recommend using a custom Cloud Build builder image with pre-installed tools for the release pipeline, and advise against hardcoding the Cloud Build service account email in setup_iam.sh for better reusability.
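
For reference, the SSL concern raised above is usually addressed by building a verifying context instead of disabling verification. The sketch below is an assumption about how that might look, not the code in get_next_version.py: the function names are hypothetical, and certifi is treated as optional so the example also runs where it is not installed.

```python
import ssl
import urllib.request


def make_verified_context() -> ssl.SSLContext:
    """Build an SSL context that verifies certificates and hostnames."""
    try:
        import certifi  # third-party CA bundle, if available

        cafile = certifi.where()
    except ImportError:
        # Fall back to the system trust store.
        cafile = None
    # create_default_context enables CERT_REQUIRED and hostname checking.
    return ssl.create_default_context(cafile=cafile)


def fetch_bytes(url: str, timeout: float = 10.0) -> bytes:
    """Fetch a URL over HTTPS using the verifying context (hypothetical helper)."""
    with urllib.request.urlopen(
        url, context=make_verified_context(), timeout=timeout
    ) as resp:
        return resp.read()
```

Pointing the context at certifi's bundle is a common way to resolve CERTIFICATE_VERIFY_FAILED errors in minimal build images without turning verification off.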

gmechali commented Jan 5, 2026

/gemini review

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a comprehensive CI/CD setup using Google Cloud Build and Docker, along with several helper scripts to automate development, staging, and production release workflows. The changes are extensive and well-structured, moving towards a more robust and automated deployment process. My review focuses on ensuring the correctness and security of the new Dockerfile and CI/CD pipelines. I've identified a couple of critical issues in the production release pipeline that need to be addressed, as well as several medium-severity suggestions to improve maintainability and security best practices across the new files.


set -e

# TODO(gmechali): Decide whether we want to delete this as it is for a one time setup only.

medium

This script appears to be for a one-time setup. Once it has been used to configure the necessary environments, consider whether it should be removed from the repository to avoid clutter and prevent accidental re-execution. If it's intended for repeated use, adding more guards or making it fully idempotent would be beneficial.

@gmechali gmechali changed the title from "[DO NOT SUBMIT] Add dockerfile" to "[Deployment] Dynamic Versioning, Release Workflow Split, and CI/CD Hardening" Jan 5, 2026
# 1. Pre-install fastapi from PyPI to avoid TestPyPI squatting
# 2. Install main package
# Note: We must explicitly unset PIP_EXTRA_INDEX_URL for the first command to force PyPI usage.
RUN PIP_EXTRA_INDEX_URL="" pip install --no-cache-dir "fastapi>=0.115.0" --index-url https://pypi.org/simple/ && \
Contributor Author

The fastapi change here is needed because, for autopush and staging, pip ended up picking up a broken fastapi build from TestPyPI.

I can look for a better long-term solution as a follow-up.

Contributor

I asked Gemini about this, and this is what it suggested:

Use --index-url for the official PyPI and only use --extra-index-url for your specific private/test index. To be even safer, use the --only-binary flag or specify the index per-package in a requirements.txt if you are using a tool like pip-compile.

Have you tried using --index-url only?

Contributor Author

Yeah, basically without this override on fastapi, pip kept finding some broken fastapi packages in TestPyPI:
https://pantheon.corp.google.com/cloud-build/builds;region=global/4c87aff2-af83-45f8-a285-3e951cc2fd8a;step=2?e=13803378&invt=AcGOtQ&mods=-monitoring_api_staging&project=datcom-ci
So it required this explicit override. I couldn't find another way to circumvent it...
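
To illustrate why the override matters: pip does not give the primary index priority over an extra index. It pools the candidates from every configured index and picks the best matching version. The sketch below is a simplified simulation of that behavior, not pip's real resolver; the function names and all version numbers are hypothetical.

```python
from __future__ import annotations


def parse(v: str) -> tuple[int, ...]:
    """Parse a plain dotted version such as '0.115.6' into a comparable tuple."""
    return tuple(int(part) for part in v.split("."))


def pick_candidate(indexes: dict[str, list[str]], minimum: str) -> tuple[str, str]:
    """Return (index, version) of the highest version >= minimum across all indexes."""
    pool = [
        (index, v)
        for index, versions in indexes.items()
        for v in versions
        if parse(v) >= parse(minimum)
    ]
    if not pool:
        raise LookupError(f"no candidate satisfies >={minimum}")
    # The index a candidate came from plays no role in the choice.
    return max(pool, key=lambda item: parse(item[1]))


# Hypothetical data: a stray high-numbered upload on TestPyPI.
both = {"pypi": ["0.115.0", "0.115.6"], "testpypi": ["0.999.0"]}
pypi_only = {"pypi": ["0.115.0", "0.115.6"]}

print(pick_candidate(both, "0.115.0"))       # ('testpypi', '0.999.0')
print(pick_candidate(pypi_only, "0.115.0"))  # ('pypi', '0.115.6')
```

Installing fastapi first with only the official index configured means the requirement is already satisfied by the time the main package is installed with TestPyPI in the mix.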

@gmechali gmechali requested review from clincoln8 and keyurva January 5, 2026 19:55

@keyurva keyurva left a comment

Thanks Gabe! I've partially reviewed the PR but sending comments now so you can see them sooner.

@gmechali gmechali requested a review from keyurva January 6, 2026 21:23

gmechali commented Jan 6, 2026

Thanks Keyur, responded to your comments!

I noticed there was one major change I had to add (see the bottom of the server.py file), but I've now also confirmed that the server responds with all the MCP tools, so I'm far more confident in the deployment!


steps:
# 1. Verify and Publish
- name: python:3.12-slim
Contributor

There is a fair amount of duplication between the 3 release yamls. Not sure if this is possible, but is there a way the common parts can be shared / reused?

Contributor Author

I already extracted the wait_for_pypi script to avoid duplication there.
I could possibly extract step 1 as well, but I think it's nice to see the code directly in the Cloud Build YAML. I don't think the duplication is severe enough to warrant breaking out more.

Steps 3 and 4 are very straightforward: just docker build + push + Cloud Run deploy.
I think extracting them into shared scripts would just lead to hyper-parameterized scripts that are hard to read. I'd recommend sticking with this, but let me know if you have something else in mind!

@gmechali gmechali requested a review from keyurva January 7, 2026 20:01

@keyurva keyurva left a comment

Thanks for the updates!

@gmechali gmechali merged commit bb8cbf4 into datacommonsorg:main Jan 8, 2026
8 of 10 checks passed
@gmechali gmechali deleted the docker branch January 8, 2026 01:33