doc(trainer): add architecture diagrams to local execution guides by sh4shv4t · Pull Request #4301 · kubeflow/website

sh4shv4t · 2026-02-04T12:00:15Z

Description of Changes

This PR adds detailed Architecture sections and Mermaid diagrams to the Local Execution guides for the Kubeflow Trainer. These additions visualize the workflow and component interactions for the three local backends:

Docker Backend: Visualizes the interaction between the SDK, Docker Daemon, and container networking.
Podman Backend: Details the flow for rootless container execution and process isolation.
Local Process Backend: Visualizes the creation of virtual environments and native process management.

These diagrams help users better understand the internal mechanics of how TrainerClient orchestrates jobs locally before scaling to a cluster.

Related Issues

Closes: #4231

Checklist

You have [signed off your commits](https://www.kubeflow.org/docs/about/contributing/#sign-off-your-commits)
Ensure you follow best practices from our [contributing guide](https://github.com/kubeflow/website/blob/master/content/en/docs/about/contributing.md).
(for big changes) I will post screenshots of the changes in a PR comment

Signed-off-by: sh4shv4t <shashvat.k.singh.16@gmail.com>

google-oss-prow · 2026-02-04T12:00:24Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign andreyvelich for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

content/en/docs/components/trainer/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

google-oss-prow · 2026-02-04T12:00:26Z

Hi @sh4shv4t. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

github-actions · 2026-02-04T12:00:41Z

🚫 This command cannot be processed. Only organization members or owners can use the commands.

sh4shv4t · 2026-02-04T12:02:06Z

Here are the rendered screenshots of the new architecture diagrams for verification:

Arhell

/ok-to-test

Fiona-Waters

This is great, thanks. Just some notes and comments.

The Docker diagram shows logs from node-0, but we can actually stream logs from all nodes.
Auto-remove is conditional/optional.
The Podman diagram should be updated to be more in line with the Docker one (with network creation and multi-node, for example).
The local process diagram has a few discrepancies: the flow is not quite right (create, extract, and install all happen in one script). We can show the bash script generation step and make it clear that the other steps happen within a single subprocess via the generated bash script.
I hope that makes sense. Please let me know if you need me to clarify anything.
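
To make the single-script point concrete, here is a minimal sketch of the pattern: build one bash script in Python, then hand it to one subprocess. It is not the SDK's actual code. `build_job_script`, `run_job`, the venv path, and the package list are all hypothetical names chosen for illustration, and the "extract" step is left out because this thread does not spell out what it does.

```python
import shlex
import subprocess
from typing import List


def build_job_script(venv_dir: str, packages: List[str], train_cmd: List[str]) -> str:
    """Return one bash script that creates a venv, installs packages, and runs training.

    Hypothetical helper for illustration; not part of the Kubeflow SDK.
    """
    lines = [
        "set -euo pipefail",
        f"python3 -m venv {shlex.quote(venv_dir)}",
        f"source {shlex.quote(venv_dir + '/bin/activate')}",
    ]
    if packages:
        lines.append("pip install " + " ".join(shlex.quote(p) for p in packages))
    # The training command is the last step of the same script.
    lines.append(shlex.join(train_cmd))
    return "\n".join(lines) + "\n"


def run_job(script: str) -> int:
    """Execute every step of the generated script inside a single subprocess."""
    return subprocess.run(["bash", "-c", script], check=False).returncode


script = build_job_script(
    "/tmp/demo-venv", ["numpy"], ["python", "train.py", "--epochs", "1"]
)
print(script)
```

Calling `run_job(script)` would run the whole sequence as one child process, which is the behavior the diagram needs to show: the caller sees a single subprocess, and the individual steps live inside the script.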

content/en/docs/components/trainer/user-guides/local-execution-mode/local_process.md

content/en/docs/components/trainer/user-guides/local-execution-mode/podman.md

sh4shv4t · 2026-02-06T21:05:59Z

Hi @Fiona-Waters, thank you so much for the review! I apologize for the terminology mix-up and the technical discrepancies in the diagrams. I’ll take all your suggestions into account and push the updated changes shortly. Thanks for the guidance!

- docker.md: show logs streaming from all nodes, clarify conditional cleanup
- podman.md: correct architecture text (Docker→Podman), align diagram with implementation, remove incorrect workflow details
- local_process.md: update diagram to reflect bash script generation and single subprocess execution

These changes address reviewer feedback and align documentation with actual SDK implementation.

Signed-off-by: sh4shv4t <shashvat.k.singh.16@gmail.com>

sh4shv4t · 2026-02-06T23:43:55Z

Hi @Fiona-Waters, thanks again for the detailed feedback! I have updated the diagrams and documentation to accurately reflect the internal logic this time (hopefully!). Specifically, I’ve made the following changes:

Docker Backend:
Log Streaming: Updated the diagram to show that logs can be streamed from all nodes, not just node-0.
Auto-remove: Clarified that container removal is conditional/optional based on the job configuration.
Podman Backend:
Consistency: Refactored the Podman diagram to align with the Docker version (including network creation and multi-node setup).
Terminology: Fixed the descriptions in podman.md to remove "Docker" references and corrected the pull_policy wording to be less misleading.
Local Process Backend:
Flow Correction: Updated the diagram to show the correct sequence: the SDK generates a bash script, which then handles the environment extraction and installation in a single subprocess.
Clarity: Made it clear that these operations happen within the generated script execution rather than as separate SDK-managed steps.
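
For the Docker and Podman items above, the ordering the diagrams aim to convey can be sketched as a plain list of steps. This is only an illustrative model, assuming a hypothetical `plan_container_job` helper and made-up naming for networks and containers. It calls no container runtime and does not mirror the SDK's real function names.

```python
from typing import List


def plan_container_job(job: str, num_nodes: int, auto_remove: bool) -> List[str]:
    """List the lifecycle steps for a multi-node job on a container backend.

    Hypothetical model for illustration; names and step wording are assumptions.
    """
    steps = [f"create network {job}-net"]
    nodes = [f"{job}-node-{i}" for i in range(num_nodes)]
    steps += [f"start container {n} on {job}-net" for n in nodes]
    # Logs can be streamed from every node, not only node-0.
    steps += [f"stream logs from {n}" for n in nodes]
    if auto_remove:
        # Container removal is conditional on the job configuration.
        steps += [f"remove container {n}" for n in nodes]
    return steps


for step in plan_container_job("demo", num_nodes=2, auto_remove=False):
    print(step)
```

With `auto_remove=False` the plan ends after log streaming, which is the case the original diagram missed by always showing removal.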

The new rendered diagrams are:

I believe these changes resolve the discrepancies mentioned. Please let me know if any further adjustments are needed!

Fiona-Waters

/lgtm thanks @sh4shv4t !
/assign @kramaranya @andreyvelich please review

google-oss-prow · 2026-02-11T14:31:28Z

@Fiona-Waters: GitHub didn't allow me to assign the following users: please, review.

Note that only kubeflow members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

Details

In response to this:

/lgtm thanks @sh4shv4t !
/assign @kramaranya @andreyvelich please review

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

doc(trainer): add architecture diagrams to local execution guides

72015bc

Signed-off-by: sh4shv4t <shashvat.k.singh.16@gmail.com>

google-oss-prow bot added the area/trainer label (AREA: Kubeflow Trainer / Kubeflow Training Operator) Feb 4, 2026

google-oss-prow bot requested a review from ChanYiLin February 4, 2026 12:00

google-oss-prow bot requested a review from Jeffwan February 4, 2026 12:00

google-oss-prow bot added needs-ok-to-test size/M labels Feb 4, 2026

sh4shv4t mentioned this pull request Feb 4, 2026

chore(trainer): Add architecture section and diagram to Execute TrainJobs Locally section #4231

Open

Arhell reviewed Feb 5, 2026

google-oss-prow bot added ok-to-test and removed needs-ok-to-test labels Feb 5, 2026

Fiona-Waters reviewed Feb 6, 2026

sh4shv4t force-pushed the fix-trainer-docs-architecture branch from 2b04b76 to 11700f7 Compare February 6, 2026 23:13

Fiona-Waters reviewed Feb 11, 2026

google-oss-prow bot assigned andreyvelich and kramaranya Feb 11, 2026

sh4shv4t wants to merge 2 commits into kubeflow:master from sh4shv4t:fix-trainer-docs-architecture

5 participants
