doc(trainer): add architecture diagrams to local execution guides #4301
sh4shv4t wants to merge 2 commits into kubeflow:master
Signed-off-by: sh4shv4t <shashvat.k.singh.16@gmail.com>
Fiona-Waters
left a comment
This is great, thanks. Just some notes and comments.
- The Docker diagram shows logs from node-0 only, but logs can actually be streamed from all nodes.
- Auto-remove is conditional/optional.
- The Podman diagram should be updated to be more in line with the Docker one (with network creation and multi-node, for example).
- The local process diagram has a few discrepancies; the flow is not quite right (create, extract, and install all happen in one script). We can show the bash script generation step and make it clear that the other steps happen within a single subprocess via the generated bash script.
I hope that makes sense. Please let me know if you need me to clarify anything.
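The single-subprocess flow described in the last bullet can be sketched roughly as follows. This is an illustration only, not the SDK's implementation: `build_script`, `run_in_single_subprocess`, the step names, and the placeholder `echo` commands are all hypothetical, chosen so the example runs offline.

```python
import shlex
import subprocess


def build_script(steps):
    """Join named steps into one bash script; `set -e` aborts at the first failure."""
    lines = ["set -e"]
    for name, command in steps:
        # Print a label before each step so the order is visible in the output.
        lines.append(f"echo {shlex.quote('[' + name + ']')}")
        lines.append(command)
    return "\n".join(lines) + "\n"


def run_in_single_subprocess(script):
    """Run the whole generated script in ONE bash subprocess and return its stdout."""
    result = subprocess.run(
        ["bash", "-c", script], capture_output=True, text=True, check=True
    )
    return result.stdout


# Hypothetical steps: each `echo` stands in for the real command it names.
STEPS = [
    ("create venv", "echo python -m venv .venv"),
    ("extract training function", "echo write train.py"),
    ("install dependencies", "echo pip install torch"),
    ("run training", "echo python train.py"),
]

if __name__ == "__main__":
    print(run_in_single_subprocess(build_script(STEPS)))
```

The point of the sketch is that the Python side only generates the script and launches one process; every other step happens inside that process, which is what the diagram needs to convey.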
content/en/docs/components/trainer/user-guides/local-execution-mode/local_process.md (resolved)
content/en/docs/components/trainer/user-guides/local-execution-mode/podman.md (outdated, resolved)
content/en/docs/components/trainer/user-guides/local-execution-mode/podman.md (outdated, resolved)
content/en/docs/components/trainer/user-guides/local-execution-mode/podman.md (outdated, resolved)
Hi @Fiona-Waters, thank you so much for the review! I apologize for the terminology mix-up and the technical discrepancies in the diagrams. I’ll take all your suggestions into account and push the updated changes shortly. Thanks for the guidance!
- docker.md: show logs streaming from all nodes, clarify conditional cleanup
- podman.md: correct architecture text (Docker→Podman), align diagram with implementation, remove incorrect workflow details
- local_process.md: update diagram to reflect bash script generation and single subprocess execution

These changes address reviewer feedback and align documentation with actual SDK implementation.

Signed-off-by: sh4shv4t <shashvat.k.singh.16@gmail.com>
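As a rough illustration of the docker.md points (logs streamed from every node, cleanup only when requested), here is a minimal sketch. It does not use the Docker API: nodes are simulated by plain lists of log lines, and `stream_logs`, `run_job`, and the `auto_remove` flag are hypothetical names for this example only.

```python
import queue
import threading


def stream_logs(node_logs):
    """Yield log lines from all nodes as they arrive, prefixed with the node name."""
    merged = queue.Queue()
    done = object()  # sentinel marking the end of one node's stream

    def pump(node, lines):
        for line in lines:
            merged.put(f"[{node}] {line}")
        merged.put(done)

    for node, lines in node_logs.items():
        threading.Thread(target=pump, args=(node, lines), daemon=True).start()

    remaining = len(node_logs)
    while remaining:
        item = merged.get()
        if item is done:
            remaining -= 1
        else:
            yield item


def run_job(node_logs, auto_remove=False):
    """Collect logs from every node; report nodes as removed only if auto_remove is set."""
    lines = list(stream_logs(node_logs))
    removed = sorted(node_logs) if auto_remove else []
    return lines, removed


if __name__ == "__main__":
    logs = {"node-0": ["epoch 1", "epoch 2"], "node-1": ["epoch 1"]}
    for line in stream_logs(logs):
        print(line)
```

Lines from different nodes may interleave in any order, while each node's own lines stay in order.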
2b04b76 to 11700f7
Hi @Fiona-Waters, thanks again for the detailed feedback! I have updated the diagrams and documentation to accurately reflect the internal logic this time (hopefully!). Specifically, I’ve made the following changes:
The new rendered diagrams are: I believe these changes resolve the discrepancies mentioned. Please let me know if any further adjustments are needed!
Fiona-Waters
left a comment
/lgtm thanks @sh4shv4t!
/assign @kramaranya @andreyvelich please review






Description of Changes
This PR adds detailed Architecture sections and Mermaid diagrams to the Local Execution guides for the Kubeflow Trainer. These additions visualize the workflow and component interactions for the three local backends: Local Process, Docker, and Podman.
These diagrams help users better understand the internal mechanics of how `TrainerClient` orchestrates jobs locally before scaling to a cluster.
Related Issues
Closes: #4231
Checklist