Skip to content

ISSUE-559: Add cell types in ReadMe#620

Open
ada333 wants to merge 3 commits intokubeflow:mainfrom
ada333:ISSUE-559
Open

ISSUE-559: Add cell types in ReadMe#620
ada333 wants to merge 3 commits intokubeflow:mainfrom
ada333:ISSUE-559

Conversation

@ada333
Copy link
Collaborator

@ada333 ada333 commented Feb 19, 2026

This PR adds clear information about cell types in ReadMe mainly for first-time user. I think it is important that we mention somewhere that imports work only in imports cell.

@ada333 ada333 changed the title ISSUE-556: Add cell types in ReadMe ISSUE-559: Add cell types in ReadMe Feb 19, 2026
Copy link
Member

@ederign ederign left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some minor comments.

README.md Outdated
| Cell Type | Status | Description |
|-----------|--------|-------------|
| **Imports** | ✅ Works | Use for all import statements. **All imports must be placed in cells tagged as `imports`.** Importing libraries (pandas, tensorflow, etc.) in other cell types will cause pipeline execution errors. |
| **Functions** | ✅ Works | Use for function and class definitions only. **Do not include** logic, print statements, or import statements in these cells. Only pure definitions should be here. |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function cells can contain logic inside function bodies right? I think what you mean is that function cells should only contain definitions (function/class defs), not top-level
executable statements

README.md Outdated
### Important Guidelines

> [!WARNING]
> **Import statements must be in `imports` cells only!**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A more accurate statement: imports outside imports cells won't be detected for automatic dependency installation,
which causes ImportError at runtime if the package isn't pre-installed in the container image.

If the package is on the container image works isn't?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, if the package is on the container image it will work. I will add it to Readme.

@google-oss-prow
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from ederign. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ada333 ada333 requested a review from ederign February 23, 2026 12:10
Copy link
Collaborator

@hmtosi hmtosi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ada333 overall looks good, just a few clarifying comments. Thanks!

README.md Outdated
|-----------|--------|-------------|
| **Imports** | ✅ Works | The code in this cell will be pre-pended to every step of the pipeline. Used for all import statements. **All imports must be placed in cells tagged as `imports`.** Importing libraries (pandas, tensorflow, etc.) in other cell types will cause pipeline execution errors. |
| **Functions** | ✅ Works |The code in this cell will be pre-pended to every step of the pipeline, after **imports**. Used for function and class definitions only. **Do not include** top-level executable statements |
| **Pipeline Parameters** | ✅ Works | Define variables that will become pipeline parameters. |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some testing on the Pipeline Parameters label showed that if more than on e Pipeline Parameters cell exists, and a parameter is defined in each cell, only the final value will be taken. For example, parameter cell 1: num_cats = 3 and parameter cell 2: num_cats = 5 will result in num_cats = 5 for the whole pipeline.

Please update this section to say that if the same parameter is defined in mulitple Pipeline Parameter cells, only the value in the last cell will be used.

README.md Outdated
### Important Guidelines

> [!WARNING]
> **Imports outside imports cells won't be detected for automatic dependency installation, which causes ImportError at runtime if the package isn't pre-installed in the container image.**
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe write this as "Imports outside Imports cells..." to be slightly clearer

README.md Outdated

**Best Practices:**
- Place all imports at the beginning of your notebook in cells tagged as `imports`
- Keep function definitions pure - no side effects, prints, or imports
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you mean by side effects in this context?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant no changing of global variables / parameters.

Signed-off-by: Adam Maly <amaly@redhat.com>
Signed-off-by: Adam Maly <amaly@redhat.com>
Signed-off-by: Adam Maly <amaly@redhat.com>
@ederign
Copy link
Member

ederign commented Feb 25, 2026

@hmtosi would mind taking a second look here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants