Skip to content

Conversation

@nojaf
Copy link
Contributor

@nojaf nojaf commented Dec 18, 2025

📝 Summary

🔍 Description of Changes

📋 Checklist

  • I have read the contributor guidelines.
  • For large changes, or changes that affect the public API: this change was discussed or approved through an issue, on Discord, or the community discussions (Please provide a link if applicable).
  • I have added tests for the changes made.
  • I have run the code and verified that it works as expected.

Hi @mscolnick , I'm trying to add support for heatmaps in Plotly.
This is working for me, but I'm not quite sure if this is how you would expect it to behave.
If it makes sense, let me know what needs polish to get merged in.

@vercel
Copy link

vercel bot commented Dec 18, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Review Updated (UTC)
marimo-docs Ready Ready Preview, Comment Dec 23, 2025 4:44pm

@nojaf nojaf changed the title Initial extract heatmap cells Reactive Plotly heatmap Dec 18, 2025
Copy link
Contributor

@Light2Dark Light2Dark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a concern on the performance? since there is a nested nested for loop.

Co-authored-by: Shahmir Varqha <Sham9871@gmail.com>
@nojaf
Copy link
Contributor Author

nojaf commented Dec 18, 2025

@Light2Dark potentially, I suppose we can use more async parallel stuf in Python?
Is there any library you use for that right now?

@akshayka
Copy link
Contributor

@Light2Dark potentially, I suppose we can use more async parallel stuf in Python? Is there any library you use for that right now?

If the performance feels fine on examples that are important to you, @nojaf, I wouldn't worry about making optimizations at this point. We can improve performance later if need be.

@nojaf
Copy link
Contributor Author

nojaf commented Dec 22, 2025

If the performance feels fine on examples that are important to you,

Yes, this will bite us sooner than later I believe.
Could someone please give me a pointer on how to refactor these loops into something more parallel. I'm genuinely asking, as I'm no Python expert.

Copy link
Contributor

@Light2Dark Light2Dark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My personal approach is you could write some comprehensive tests to make sure logic is correct. And some performance tests to check if logic < threshold_time.

Then, you can get claude/ai to improve your function. The thing that comes to mind for optimizing math calculations is using numpy (vectorized functions) but I am also unsure on implementation.

result = self.points
return result

def _extract_heatmap_cells_from_range(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you'd want to share some of this logic to _add_selection so that initial selections are also supported.

`@staticmethod. Add initial selection support for Heatmaps
@nojaf nojaf marked this pull request as ready for review December 23, 2025 17:00
@nojaf nojaf requested a review from akshayka as a code owner December 23, 2025 17:00
@nojaf
Copy link
Contributor Author

nojaf commented Jan 5, 2026

Could some re-run the failing build, that test seems unrelated and passed for me locally.

@mscolnick
Copy link
Contributor

@nojaf failing tests look unrelated

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds reactive selection support for Plotly heatmaps to the mo.ui.plotly component, enabling users to select cells in a heatmap and retrieve the selected data in Python.

Key changes:

  • Added _extract_heatmap_cells_from_range method to extract heatmap cells within a selection range, handling both numeric and categorical axes
  • Modified _convert_value and __init__ methods to detect and process heatmap selections
  • Added comprehensive test coverage for various heatmap selection scenarios

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 12 comments.

File Description
marimo/_plugins/ui/_impl/plotly.py Core implementation of heatmap selection logic, including cell extraction for both numeric and categorical axes
tests/_plugins/ui/_impl/test_plotly.py Comprehensive test suite covering basic heatmap creation, numeric/categorical/mixed axes, edge cases, and multi-trace plots
examples/third_party/plotly/heatmap.py Example demonstrating how to create and use reactive heatmaps with categorical axes

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +343 to +344
if isinstance(y_val, (int, float)):
y_in_range = y_min <= y_val <= y_max
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as with x_val: the code uses isinstance(y_val, (int, float)) which won't properly handle numpy numeric types. This should use the same robust numeric type checking as suggested for the x_val check.

Copilot uses AI. Check for mistakes.
This function currently only supports scatter plots, treemaps charts,
and sunbursts charts.
sunbursts charts and heatmaps.
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation string needs to be updated to reflect the correct list. The current text says "scatter plots, treemaps charts, sunbursts charts and heatmaps" but is missing the word "and" before "sunbursts". It should read either "scatter plots, treemaps charts, sunbursts charts, and heatmaps" (with Oxford comma) or "scatter plots, treemap charts, sunburst charts and heatmaps".

Suggested change
sunbursts charts and heatmaps.
sunbursts charts, and heatmaps.

Copilot uses AI. Check for mistakes.
Comment on lines +336 to +352
# Include cell if selection range overlaps with cell bounds
cell_x_min = j - 0.5
cell_x_max = j + 0.5
x_in_range = not (
x_max < cell_x_min or x_min > cell_x_max
)

if isinstance(y_val, (int, float)):
y_in_range = y_min <= y_val <= y_max
else:
# Categorical - each cell spans from (index - 0.5) to (index + 0.5)
# Include cell if selection range overlaps with cell bounds
cell_y_min = i - 0.5
cell_y_max = i + 0.5
y_in_range = not (
y_max < cell_y_min or y_min > cell_y_max
)
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The heatmap cell extraction logic has a subtle issue with how it handles categorical axes. For categorical axes, the code uses not (x_max < cell_x_min or x_min > cell_x_max) to check overlap. However, this includes cells where the selection range only touches the boundary (e.g., x_max == cell_x_min). This might be intentional for edge cases, but it's inconsistent with the numeric case which uses x_min <= x_val <= x_max (inclusive on both ends). Consider whether the categorical case should also be strict about what counts as "within range" or if boundary touching is desired behavior.

Suggested change
# Include cell if selection range overlaps with cell bounds
cell_x_min = j - 0.5
cell_x_max = j + 0.5
x_in_range = not (
x_max < cell_x_min or x_min > cell_x_max
)
if isinstance(y_val, (int, float)):
y_in_range = y_min <= y_val <= y_max
else:
# Categorical - each cell spans from (index - 0.5) to (index + 0.5)
# Include cell if selection range overlaps with cell bounds
cell_y_min = i - 0.5
cell_y_max = i + 0.5
y_in_range = not (
y_max < cell_y_min or y_min > cell_y_max
)
# Include cell if selection range has a strict overlap with cell bounds
cell_x_min = j - 0.5
cell_x_max = j + 0.5
x_in_range = (x_min < cell_x_max) and (x_max > cell_x_min)
if isinstance(y_val, (int, float)):
y_in_range = y_min <= y_val <= y_max
else:
# Categorical - each cell spans from (index - 0.5) to (index + 0.5)
# Include cell if selection range has a strict overlap with cell bounds
cell_y_min = i - 0.5
cell_y_max = i + 0.5
y_in_range = (y_min < cell_y_max) and (y_max > cell_y_min)

Copilot uses AI. Check for mistakes.
Comment on lines +355 to +359
selected_cells.append(
{
"x": x_val,
"y": y_val,
"z": z_data[i][j],
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code assumes z_data[i][j] is always indexable, but it doesn't verify that z_data[i] exists or has sufficient length. If the heatmap data is malformed (e.g., ragged array, or fewer rows/cols than x/y labels), this could raise an IndexError. Consider adding bounds checking or try-except to handle malformed data gracefully.

Suggested change
selected_cells.append(
{
"x": x_val,
"y": y_val,
"z": z_data[i][j],
try:
z_value = z_data[i][j]
except (IndexError, TypeError):
# Skip cells where z_data is malformed or ragged
continue
selected_cells.append(
{
"x": x_val,
"y": y_val,
"z": z_value,

Copilot uses AI. Check for mistakes.
Comment on lines +206 to +208
initial_value["points"] = heatmap_cells
initial_value["indices"] = list(
range(len(heatmap_cells))
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a figure contains both scatter and heatmap traces, the initial selection logic first processes scatter points (lines 169-192), then checks for heatmaps (lines 194-209). If heatmap cells are found, they completely overwrite the scatter points stored in initial_value["points"] and initial_value["indices"]. This means scatter point selections are lost in mixed plots. Consider either: (1) appending heatmap cells to existing points instead of replacing them, or (2) documenting that mixed plot types will prioritize heatmap selections.

Suggested change
initial_value["points"] = heatmap_cells
initial_value["indices"] = list(
range(len(heatmap_cells))
# Merge any existing scatter selections with heatmap cells
existing_points = cast(
list[dict[str, Any]],
initial_value.get("points") or [],
)
combined_points = existing_points + heatmap_cells
initial_value["points"] = combined_points
# Indices refer to positions within the combined points list
initial_value["indices"] = list(
range(len(combined_points))

Copilot uses AI. Check for mistakes.
Comment on lines +324 to +362
# Iterate through the heatmap cells
for i, y_val in enumerate(y_data):
for j, x_val in enumerate(x_data):
# Check if cell is within selection range
# Handle both numeric and categorical data
x_in_range = False
y_in_range = False

if isinstance(x_val, (int, float)):
x_in_range = x_min <= x_val <= x_max
else:
# Categorical - each cell spans from (index - 0.5) to (index + 0.5)
# Include cell if selection range overlaps with cell bounds
cell_x_min = j - 0.5
cell_x_max = j + 0.5
x_in_range = not (
x_max < cell_x_min or x_min > cell_x_max
)

if isinstance(y_val, (int, float)):
y_in_range = y_min <= y_val <= y_max
else:
# Categorical - each cell spans from (index - 0.5) to (index + 0.5)
# Include cell if selection range overlaps with cell bounds
cell_y_min = i - 0.5
cell_y_max = i + 0.5
y_in_range = not (
y_max < cell_y_min or y_min > cell_y_max
)

if x_in_range and y_in_range:
selected_cells.append(
{
"x": x_val,
"y": y_val,
"z": z_data[i][j],
"curveNumber": trace_idx,
}
)
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nested loop iterates through all heatmap cells (O(n*m) where n=rows, m=cols) even when the selection range is small. For large heatmaps, this could be inefficient. Consider optimizing by first determining the min/max indices to check based on the selection range, especially for numeric axes where you can calculate which cells to check rather than iterating through all cells. The PERF comment on line 295 acknowledges this, but the current implementation doesn't implement any optimization.

Copilot uses AI. Check for mistakes.
Comment on lines +562 to +597
def test_heatmap_initial_selection() -> None:
"""Test that initial selection works with heatmaps."""
fig = go.Figure(
data=go.Heatmap(
z=[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
x=["A", "B", "C"],
y=["X", "Y", "Z"],
)
)

# Add an initial selection
fig.add_selection(x0=0.5, x1=1.5, y0=0.5, y1=1.5, xref="x", yref="y")

plot = plotly(fig)

# Check that initial value contains the selection
initial_value = plot._args.initial_value
assert "range" in initial_value
assert initial_value["range"]["x"] == [0.5, 1.5]
assert initial_value["range"]["y"] == [0.5, 1.5]

# For heatmap, should extract cells, not scatter points
assert "points" in initial_value
assert len(initial_value["points"]) > 0

# Should have x, y, z values (heatmap cells)
for point in initial_value["points"]:
assert "x" in point
assert "y" in point
assert "z" in point

# Should extract cell at index (1, 1) which is ("B", "Y", 5)
assert any(
p["x"] == "B" and p["y"] == "Y" and p["z"] == 5
for p in initial_value["points"]
)
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no test coverage for edge cases like empty heatmaps (z=[], x=[], y=[]) or single-cell heatmaps (z=[[1]], x=["A"], y=["X"]). These edge cases could expose bugs in the extraction logic. Consider adding tests for these scenarios to ensure robust handling.

Copilot uses AI. Check for mistakes.
Comment on lines +101 to +102
_figure: go.Figure
_selection_data: PlotlySelection
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The instance variables _figure and _selection_data should be declared with proper type hints. Currently they are declared at the class level but initialized in __init__. They should either be declared in __init__ with self._figure: go.Figure = figure or use proper type annotation at class level (which would make them class variables). Consider declaring them only in __init__ to make it clear they are instance variables.

Suggested change
_figure: go.Figure
_selection_data: PlotlySelection
_figure: Optional["go.Figure"] = None
_selection_data: Optional[PlotlySelection] = None

Copilot uses AI. Check for mistakes.
Comment on lines +286 to +289
self._selection_data["points"] = heatmap_cells
# Update indices to match the heatmap cells
self._selection_data["indices"] = list(
range(len(heatmap_cells))
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the issue in __init__, when processing selections in mixed plots (scatter + heatmap), if heatmap cells are found, they completely replace any existing points in self._selection_data["points"]. This could lead to loss of scatter point selections. The same fix should be applied here - either append to existing points or clearly document the behavior.

Suggested change
self._selection_data["points"] = heatmap_cells
# Update indices to match the heatmap cells
self._selection_data["indices"] = list(
range(len(heatmap_cells))
# Append heatmap cells to any existing selection points
existing_points = self._selection_data.get("points") or []
if not isinstance(existing_points, list):
existing_points = []
combined_points = list(existing_points) + list(heatmap_cells)
self._selection_data["points"] = combined_points
# Update indices to match all selected points
self._selection_data["indices"] = list(
range(len(combined_points))

Copilot uses AI. Check for mistakes.
Comment on lines +332 to +333
if isinstance(x_val, (int, float)):
x_in_range = x_min <= x_val <= x_max
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For numeric heatmap axes, the code uses isinstance(x_val, (int, float)) to determine if an axis is numeric. However, this won't handle numpy numeric types (e.g., np.int64, np.float32) which are commonly used with heatmaps. Consider using a more robust check like isinstance(x_val, (int, float, np.integer, np.floating)) or checking with np.isscalar and numeric type checking to properly handle numpy arrays.

Copilot uses AI. Check for mistakes.
@mscolnick
Copy link
Contributor

@nojaf i had Copilot take a pass which can be noisy, but good at finding some more nuanced bugs. no need to take action on all its comments, but some of them look pretty valid.

dont worry about comments in tests/examples. and for numpy, this is an optional dependency, so you would need to check if DependencyManager.numpy.imported()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants