Multi session support by SimonHeybrock · Pull Request #673 · scipp/esslivedata

SimonHeybrock · 2026-01-23T08:41:27Z

Summary

Adds proper multi-session support to the dashboard so multiple browser tabs/users can view plots simultaneously without interference.

Fixes #609.
Fixes #672.

Architecture changes

Two-stage plotter pattern: Plotters now separate compute() (shared, runs in background thread) from present() (per-session, creates Pipe/DynamicMap). This avoids sharing Panel/HoloViews streaming objects across sessions.
Explicit layer state machine: LayerStateMachine tracks each plot layer through well-defined states (WAITING_FOR_JOB → WAITING_FOR_DATA → READY → STOPPED/ERROR) with validated transitions.
Session infrastructure: SessionRegistry tracks active browser sessions with heartbeat-based cleanup (but see Add session status display to System Status tab and fix stale-session detection #685 for fixes for this mechanism); SessionUpdater handles per-session periodic updates.
PlotDataService: Central service for plot layer state that coordinates between background data processing and per-session presentation.

Key learnings

pn.state.notification must be used within a session context. Calling it from background threads does not work.
Sharing Pipe and DynamicMap between sessions does work if Pipe.send is called from a background thread, but see next point.
pn.io.hold() must be used within a session context. If a background thread uses Pipe.send, all sessions see updates but there is no way to update all plots within a session in a single batched step.
Therefore, each session needs its own periodic callback that pulls data from the background thread, uses pn.io.hold(), and calls Pipe.send on its own Pipe and DynamicMap.

Reduced from ~1300 lines to ~280 lines by: - Removing redundant explanations (Plotter/Presenter was explained 3x) - Replacing full implementations with interface signatures - Cutting verbose justifications (thread safety, why subscriptions are bad) - Removing the Architecture Summary section that repeated earlier diagrams Retained the essential content: - Problem statement and root cause - Critical deferred setup constraint (key insight) - Architecture diagrams (current vs proposed) - Key interfaces (signatures only) - Phased implementation plan Prompt: Please carefully review docs/developer/plans/multi-session-architecture.md - we need to hand this off to an implementer and need to ensure it is both complete and concise. Are there unnecessary details that just add context pollution and would be easier to figure out during implementation?

Fix the periodic callback bug and add infrastructure for multi-session support. Phase 1: Fix periodic callback bug in DashboardBase - Remove shared `_callback` attribute that caused only last tab to work - Each session now gets its own independent periodic callback - Register cleanup via pn.state.on_session_destroyed() Phase 1.5: Implement SessionRegistry - Track active browser sessions with heartbeat-based cleanup - Defense-in-depth for when on_session_destroyed fails to fire - Thread-safe with configurable stale timeout Phase 2: Implement state stores - WidgetStateStore: Key-value store with global versioning - NotificationQueue: Cursor-based queue for per-session notifications - Enables polling model instead of direct subscription callbacks Phase 4: Implement PlotDataService - Stores computed plot state with version tracking - Enables each session to poll for updates independently Related: Issue #609 - Multi-session support for plots Prompt: Implement multi-session architecture per docs/developer/plans/multi-session-architecture.md Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Refactor Plotter to support the two-stage compute/present architecture: - Add `compute()` method as Stage 1 (data -> HoloViews elements) - Add `Presenter` protocol for Stage 2 (pipe -> DynamicMap) - Add `DefaultPresenter` that wraps plotter's compute() in DynamicMap - Add `create_presenter()` factory method to Plotter base class - Keep `__call__` as backwards-compatible alias for `compute()` This enables the multi-session pattern where: - Shared orchestrator calls plotter.compute() to create shareable PlotState - Each session creates its own Presenter, Pipe, and DynamicMap - Session-bound components are created in correct session context Related: Issue #609 - Multi-session support for plots Prompt: Implement multi-session architecture per docs/developer/plans/multi-session-architecture.md Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Implement SessionUpdater and integrate with dashboard infrastructure: - Add SessionUpdater class for per-session update management - Polls shared state stores (WidgetStateStore, PlotDataService) - Processes notification queue for per-session notifications - Handles widget and plot update handlers with error isolation - Provides batched UI updates via pn.io.hold() - Integrate SessionUpdater with DashboardBase - Create SessionUpdater per session in start_periodic_updates() - Store session updaters keyed by session ID - Clean up SessionUpdater on session destruction - Add shared state stores to DashboardServices - WidgetStateStore for cross-session widget state - PlotDataService for version-tracked plot data - NotificationQueue for per-session notifications The infrastructure is now complete for gradually migrating widgets to use the polling model instead of direct subscription callbacks. Related: Issue #609 - Multi-session support for plots Prompt: Implement multi-session architecture per docs/developer/plans/multi-session-architecture.md Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Implements Phase 5 (SessionUpdater integration) and Phase 2 (notification migration) of the multi-session architecture plan. Plot data flow: - Extended PlotLayerState with plotter, initial_data, and shared_pipe fields - PlotOrchestrator stores plot state in PlotDataService instead of creating session-bound DynamicMaps directly - PlotGridTabs uses deferred setup pattern: callbacks only record pending setups, actual DynamicMap creation happens in session's periodic callback - Added data forwarding from shared pipes to per-session pipes - Added late subscriber support: new sessions scan PlotDataService for existing layers on initialization Notification migration: - JobOrchestrator pushes command notifications to NotificationQueue instead of calling widget callbacks directly - Removed on_command_success/on_command_error from WidgetLifecycleCallbacks - Removed direct pn.state.notifications calls from WorkflowStatusWidget - SessionUpdater polls NotificationQueue and shows notifications in correct session context SessionUpdater enhancements: - Added register_custom_handler() for session-specific periodic processing - Custom handlers run in correct session context during periodic_update() This enables multiple browser tabs to: - Each receive their own plot updates independently - All see command notifications (start/stop/reset success/error) - Join late and still see existing plots Prompt: Implement the following plan: Multi-Session Architecture: Wire Up Infrastructure

Extract session-specific plot management from PlotGridTabs into a dedicated SessionPlotManager class. This reduces PlotGridTabs complexity by ~40% and provides a clean, testable abstraction for per-session components. Key changes: - Add SessionPlotManager to handle per-session Pipes, Presenters, DynamicMaps - Add ROIPresenter class for proper Presenter pattern with ROI plotters - Add PlottingController.create_plotter() for dependency injection - Remove legacy path from PlotOrchestrator (PlotDataService now required) - Use version-based updates instead of object identity checks Update tests to reflect that LayerState.plot is now None (deferred to per-session creation via SessionPlotManager). Prompt: Please read docs/developer/plans/multi-session-architecture.md - the implementer handed us the changeset. Tested this in the UI, and apart from minor issues the key requirements are fulfilled. However, if we look at the concrete implementation changes certain things are too complicated. Please analyze and think carefully if this can be simplified. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The previous commit broke plot updates because SessionPlotManager used PlotDataService.get_updates_since() which only tracks initial setup, not ongoing streaming data. Restore the original data forwarding mechanism: - Track shared_pipe references when setting up layers - Forward data using object identity checks on shared_pipe.data - This is how streaming updates propagate to each browser session Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Ensure compute() is never called from a session since it could modify shared state (like autoscalers). Instead, compute() runs once in shared context, and per-session Presenter.present() creates DynamicMaps that render from pre-computed state. Key changes: - Add SlicerState dataclass to hold computed data and pre-computed clim - SlicerPlotter.compute() returns SlicerState with global color limits - SlicerPlotter.render_slice() renders per-session slices without modifying shared state - Add SlicerPresenter for per-session kdims and DynamicMap creation - DefaultPresenter is now simple pass-through (receives pre-computed HoloViews elements) - Remove kdims property and initialize_from_data() from base Plotter - Add on_data_update callback to data pipeline for triggering recompute

Replace the two-phase callback system (on_first_data/on_data_update) with a single on_data callback. This removes unnecessary complexity from the multi-session architecture: - DataSubscriber: Single on_data callback, remove vestigial Pipe protocol - StreamManager: Remove pipe_factory, simplify make_stream() signature - PlottingController: setup_pipeline() uses single on_data callback - PlotOrchestrator: Create plotter eagerly when workflow commits, not when data arrives. Remove the layer_plotter closure hack. - DashboardServices: Remove pipe_factory parameter Key behavioral changes: - Plotter creation happens when workflow commits (eager), not on first data - on_cell_updated notifications no longer fire on data arrival - sessions poll PlotDataService directly instead - Source name filtering is handled by the data pipeline (on_data only fires when subscribed keys have data) Update tests to reflect new semantics: eager plotter creation, changed notification counts, and simplified interfaces. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Previously, each browser session's periodic callback called orchestrator.update(), but BackgroundMessageSource drains the entire message queue on get_messages(). This meant only whichever session called first would actually process messages, while others got empty queues. Now DashboardServices spawns a dedicated background thread that runs orchestrator.update() every 0.5s, separate from per-session callbacks. Each session's callback only polls for UI changes via SessionUpdater. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Custom handlers like _poll_for_plot_updates update plot pipes and were running outside the pn.io.hold() block, causing staggered rendering. The original code only entered the hold block when there were polled changes, but custom handlers need to run every update cycle regardless. Now all UI updates (polled changes and custom handler work) are batched together in a single hold block. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The orchestrator update loop previously waited a fixed time AFTER each update, causing drift when processing took variable time. This led to irregular perceived update intervals (e.g., 1.5s, 0.3s) due to poor alignment with session polling. Now uses fixed-rate scheduling that accounts for processing time, maintaining consistent 200ms intervals. If processing takes longer than the interval, the schedule resets to avoid permanent backlog. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Instead of always sleeping for a fixed interval, only sleep when update() completes quickly (< 10ms), indicating no real work was done. When there's actual data to process, loop immediately to drain the queue. This avoids busy-waiting when idle while being responsive when data is flowing. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Move session cleanup from per-session periodic callbacks to the shared background thread that drives the orchestrator. This fixes two issues: - Redundancy: cleanup_stale_sessions was called N times per cycle (once per browser tab) instead of once - No cleanup after last session: when all sessions closed cleanly, the background thread continued but cleanup stopped Changes: - SessionRegistry now stores SessionUpdater instances and handles their cleanup automatically on unregister or stale detection - DashboardServices owns SessionRegistry and calls cleanup from _update_loop - DashboardBase simplified: removed _session_updaters dict, _on_session_cleanup callback, and duplicate cleanup logic Prompt: I wonder why we call cleanup_stale_sessions from every session's periodic callback. This seems (a) redundant and (b) will stop cleanup once the last session is destroyed. Shouldn't it be called from the background thread that also drives the orchestrator? Follow-up: yes, move it there; related, I wonder why we store self._session_updaters and have both _cleanup_session and _on_session_cleanup; does this belong more into SessionRegistry? Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

SessionUpdater now registers itself with the SessionRegistry in __init__, removing the need for a separate register() call. This makes the setup flow clearer: creating a SessionUpdater = starting a session. Prompt: The session setup code in start_periodic_updates feels odd. We first pass the registry into the updater and then register the updater in the registry. Should _get_session_id move into the registry and the setup into a factory method on the registry that can auto-register? Or can you think of a way to make this cleaner and more obvious? Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Flip the lifecycle order so SessionUpdater is created before widgets. This allows widgets to receive session_updater in their constructor and register handlers directly, eliminating post-hoc wiring. Changes: - DashboardBase: create_main_content now receives session_updater - DashboardBase: extracted _create_session_updater() and _start_periodic_callback() from the old start_periodic_updates() - PlotGridTabs: now requires plot_data_service and session_updater in constructor, registers handler in __init__ - Removed PlotGridTabs.set_session_services() - no longer needed - Removed hasattr check for _plot_grid_tabs in base class This design makes dependencies explicit: widgets declare what they need in their constructor. Adding new session-aware widgets is now simple - just add session_updater parameter, no base class changes needed. Prompt: If we assume that there may be more widgets that need to hook into the session updater in the future, what would you recommend? Think carefully, we want to get this right as it is central to our architecture. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

WidgetStateStore was planned for polling-based widget state synchronization but was never wired up in production code. The existing WidgetLifecycleCallbacks mechanism works correctly for widget synchronization, so we keep the simpler callback approach. Removed: - WidgetStateStore, StateKey, and VersionedState from state_stores.py - widget_state_store creation from DashboardServices - Widget state polling from SessionUpdater - Related tests The dual mechanism (NotificationQueue for toasts, callbacks for widgets) is intentional: toasts must be triggered from session context via polling, while widget callbacks work correctly with Panel's session handling. --- Prompt: Remove unused WidgetStateStore and update review notes

Separate notification queue into its own file for better organization: - Create notification_queue.py with NotificationQueue, NotificationEvent, NotificationType - Update state_stores.py to only contain PlotDataService and PlotLayerState - Create notification_queue_test.py with notification tests - Update imports in all consuming modules --- Prompt: Move notification queue to its own file (same for tests)

Simplify the plot state architecture by: - Merging LayerState into PlotLayerState (single source of truth) - Routing static plots through PlotDataService like streaming plots - Simplifying CellUpdatedCallback to just signal "config changed" Key changes: PlotLayerState (state_stores.py): - Add error/stopped fields for lifecycle state - Add helper methods: create_entry(), set_error(), set_stopped() PlotOrchestrator: - Remove LayerState class entirely - Rename get_cell_state() to get_cell() (returns just config) - Remove _compose_cell_plot() (widgets compose via session DMaps) - Store static plots in PlotDataService via StaticPresenter CellUpdatedCallback: - Remove layer_states and plot parameters - Callback now just signals config changed, subscribers query PlotDataService for runtime state Static plots: - Add StaticPresenter class that returns hv.Element directly - Add create_presenter() method to StaticPlotter base class PlotGridTabs: - Add _get_layer_states() helper to query PlotDataService - Update all methods to get state from PlotDataService - Update type hints to include hv.Element for static plots SessionPlotManager: - Update type hints to allow hv.Element return from get_dmap() Prompt: Implement the following plan: <plan for unifying LayerState and PlotDataService> Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add check for state.state being None before attempting to set up a layer. This handles the case where an error occurs after data was previously received - the plotter is preserved but state is cleared. Without this check, the session would try to create a Pipe with None data. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Previously there were two different LayerId types: - plot_data_service.LayerId = NewType('LayerId', str) - plot_orchestrator.LayerId = NewType('LayerId', UUID) This required importing as StateLayerId and constant boilerplate conversions: StateLayerId(str(layer_id)) Now there is a single LayerId type in plot_data_service based on UUID, which is imported by plot_orchestrator and other modules. Prompt: Find out why we have two different LayerId types, one often imported as StateLayerId. Follow-up: unify them to LayerId (UUID) and commit

Extract SlicerState, SlicerPresenter, and SlicerPlotter from plots.py into a new slicer.py module. Update imports in plotting.py and tests. Prompt: Please move SlicerPresenter and SlicerPlotter to their own file.

@AbstractMethod

- Remove @AbstractMethod from Plotter.plot(), provide default NotImplementedError - Remove ABC base class from Plotter (no longer has abstract methods) - Move render_slice, _slice_data, _flatten_outer_dims from SlicerPlotter to SlicerPresenter - SlicerPresenter now takes base_opts dict instead of plotter reference - Pre-prepare 3D data in compute() (dtype conversion + log masking) - Update tests to use presenter for render_slice testing This improves separation of concerns: - SlicerPlotter (shared): config setup + compute() for data preparation - SlicerPresenter (per-session): render_slice + kdims + DynamicMap Prompt: Consider slicer.py and the split between the plotter (shared compute stage) and presenter (per-session second stage). We need to think about whether more work that is performed in render_slice (called per-session) could be performed in the shared compute phase.

SlicerPresenter was only receiving base_opts but not sizing_opts, which caused aspect ratio and responsive settings to not be applied. Now passes both base_opts and sizing_opts to the presenter.

When changing plot settings (e.g., scale from LINEAR to LOG), the plot would freeze in its previous state and stop updating. Sometimes switching windows back and forth would "unfreeze" it, and there were cases where the state was out of sync. Root cause: SessionPlotManager caches DynamicMaps/Pipes/Presenters keyed by layer_id. When config changed, the same layer_id was reused, so the cached old presenter (with old params) continued to be used. Solution: Treat config changes as remove+add with a NEW layer_id. When update_layer_config() is called: 1. Old layer_id is fully cleaned up (removed from PlotDataService) 2. A fresh layer_id is generated for the updated layer 3. Session caches for old layer_id become orphaned and are cleaned up 4. New layer_id has no cache, so fresh components are created naturally This approach is simpler than tracking config versions or plotter identity because cache invalidation happens automatically via key expiration. DataService triggers subscribers immediately with existing data on registration, so the plot updates immediately without waiting for new stream data. Prompt: Please investigate the following bug: When changing the settings of an existing plot (which should recreate it with new config), the plot just freezes in previous state and stops updating. Follow-up: Simplified from plotter identity tracking to new layer_id approach per discussion that treating config changes as remove+add is cleaner and more fool-proof. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Take snapshots of job_statuses dictionary values before iterating to prevent "dictionary changed size during iteration" errors when background Kafka threads modify the dictionary concurrently. Prompt: I got a one-off exception in the dashboard logs: RuntimeError: dictionary changed size during iteration Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

When using 'nearest' mode in _make_lookup, the code tried to convert the axis dimension coordinate to float64, but scipp's astype('float64') doesn't support datetime64 dtypes. This caused sporadic exceptions when the axis dimension was 'time' with datetime64 values. Skip the float64 conversion for datetime64 coordinates since they are accepted as-is by scipp's lookup in nearest mode. The conversion was only needed for int coords (rejected since scipp v25.11.0). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Restore pn.state.execute() deferral in _poll_for_plot_updates() to prevent race condition between Bokeh model updates and widget insertion. The issue occurred during tab switching with workflow stop/start cycles: 1. update_pipes() calls pipe.send() triggering DynamicMap/Bokeh model updates 2. Widget insertion happened immediately, racing with model processing 3. Panel tried to access models that were being removed, causing KeyError Deferring insertion to the next event loop iteration gives Bokeh time to process pending updates before widget replacement. This reverts the problematic part of commit 18fd8a6. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

When a workflow restarts, PlotOrchestrator creates a new DataSubscriber but the old one remained in DataService._subscribers, causing a memory leak. Old subscribers wouldn't receive data (keys don't match new job_number) but accumulated over time. Changes: - Add DataService.unregister_subscriber() to remove subscribers - Update StreamManager.make_stream() to return the subscriber - Update PlottingController.setup_pipeline() to return the subscriber - Track data subscriptions per layer in PlotOrchestrator._data_subscriptions - Clean up old subscription in _on_all_jobs_ready() before creating new one - Clean up subscription in _unsubscribe_layer() when layer is removed Prompt: 9451e56 should have fixed job-transition issues, but I am still seeing issues in some cases. Can you investigate and think if the mechanism has any other holes? Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Converting WeakSet to list before iteration prevents RuntimeError that can occur when the garbage collector removes dead references during iteration. Prompt: fix the WeakSet issue

This refactors the plot layer lifecycle management from implicit state (derived from multiple fields) to an explicit state machine with well-defined states and transitions: - WAITING_FOR_JOB: Layer configured, workflow not running - WAITING_FOR_DATA: Job started, plotter exists, no data yet - READY: Has data, can render plot - STOPPED: Workflow stopped (may have stale data) - ERROR: Error occurred Key changes: - Add LayerState enum and LayerStateMachine class with validated transitions: job_started(), data_arrived(), job_stopped(), error_occurred() - Add version counter that increments on every state transition, enabling polling-based change detection - Update PlotOrchestrator to use state machine methods instead of set_plotter/set_error/set_stopped - Remove callback-based _notify_cell_updated calls for state changes (stopped, error). UI now detects state changes via polling. Keep callbacks only for structural changes (new cells). - Update PlotGridTabs to detect version changes in polling loop, replacing the callback-based _on_cell_updated for state changes - Update placeholder content to use match statement on state enum for clearer status display logic - Update SessionPlotManager._setup_layer to use has_displayable_plot() which checks for READY or STOPPED state with cached data This fixes the bug where "Workflow ended" briefly appeared during workflow restart, because job_started() now properly increments version and polling detects the state change. --- Prompt: Implement the following plan: [Explicit State Machine for Plot Layer Lifecycle]

Simplify the state machine implementation by removing the backward-compatible PlotLayerState facade class. PlotDataService now stores LayerStateMachine instances directly. Changes: - Remove PlotLayerState dataclass facade - Remove unused ensure_layer method - Remove all logging from plot_data_service.py - Update tests to use new API (job_started, error_occurred, job_stopped) - Use .error_message instead of .error, check state enum instead of .stopped Prompt: "Keeping PlotLayerState as a facade is not what we want, no backward compat needed!" followed by "ensure_layer looks unused" and "Please also remove all the logging from plot_data_service.py, we will add this on a per-need basis." Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Forgot to include in previous commit - updates imports and type annotations from PlotLayerState to LayerStateMachine, and .error to .error_message. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Remove redundant DMap transition check in _poll_for_plot_updates (version tracking now handles all state change detection) - Add logging warnings for invalid state transitions in LayerStateMachine - Make READY case in placeholder content defensive with warning log Original prompt: Help me understand the PlotOrchestrator/PlotGridTabs mechanism and critical review of state machine implementation (+400 lines question). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The on_data() callback in plot_orchestrator is invoked on every data update, calling data_arrived() each time. However, the state machine only needs to transition once from WAITING_FOR_DATA to READY. After that, subsequent data arrivals are handled by plotter.compute() which updates the plot state directly. Make data_arrived() a silent no-op when already in READY state, since this is expected behavior for continuous data streams. Prompt: The recent commits refactor the plot update mechanism to a state machine. I get warnings in the logs (even though the plot updates just fine): Invalid transition: data_arrived() called in state READY (expected WAITING_FOR_DATA) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

When adding a new plot to an already-running workflow, the plot would appear briefly and then be recreated, causing a visible flicker. Root cause: _on_cell_updated creates the widget but doesn't record the layer versions in _last_seen_versions. On the next poll cycle (~100ms), _poll_for_plot_updates sees last_version=None for the new layers and triggers a redundant rebuild. Fix: Record current versions for all layers in _on_cell_updated after creating the widget, preventing the polling loop from triggering duplicate rebuilds. Prompt: I see another issue: When adding a new plot where the job is already streaming data, the plot appears for a fraction of a second, and is then instantly recreated a second time (afterwards it is stable). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Two issues prevented static overlays from displaying: 1. Missing UI notification: _subscribe_layer_to_workflow returned early for static layers without calling _notify_cell_updated, so sessions were never informed to display the overlay. 2. Missing cached state: create_static_plot() returned the element but didn't store it. has_displayable_plot() returned False because _cached_state was None, so SessionPlotManager skipped setup. Fixes: - Add _notify_cell_updated call after creating static layer plot - Capture create_static_plot() return value and call _set_cached_state() Prompt: Next problem: Static overlays are not showing up any more. Follow-up: This is not working. Note that the toolbar (header) for the static overlay appears just fine, but it is not shown in the composed plot. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Layers were previously added to the plot grid config before PlotDataService knew about them, requiring `state is None` checks throughout widget code. Now layers are registered in PlotDataService before widgets are notified: - Add layer_added() for dynamic layers (WAITING_FOR_JOB state) - Add static_layer_ready() for static overlays (READY state) - Call these methods in _subscribe_layer before _notify_cell_updated() This means WAITING_FOR_JOB state is now actually used (was dead code before), and widget code can rely on state always being present: - Type signatures simplified from LayerStateMachine | None to LayerStateMachine - _get_layer_states() raises RuntimeError if state missing (catches bugs) - Removed state is None checks from _create_layer_toolbars and _create_placeholder_content Also made plot_data_service required in PlotOrchestrator (was optional with None default, but docstring already said "required for dashboard to function"). --- Prompt: Help me think through _create_layer_toolbars. Why do we need to handle `state is None` and why does that correspond to `Waiting for data...`? Follow-up: User agreed that STATIC==READY makes sense and that registering layers before notifying widgets is cleaner.

Clarify that version is needed to detect plotter replacements when job_started() is called while already in WAITING_FOR_DATA state - the state doesn't change but UI still needs to rebuild. Prompt: "Do we need version in LayerStateMachine or can PlotGridTabs store latest_state instead of latest_version? What about layer reconfig, does it remove and create a new layer? Are there other 'self' transitions where we still want to increment the version to effect a widget recreation?" Follow-up: "Great analysis, can you add some clarifying info in the docstring (or code comment)?"

Resolved conflicts in: - plot_orchestrator.py: Keep PlotDataService-based state management, remove _compose_cell_plot (composition now happens in widget layer) - plot_grid_tabs.py: Combine imports (PRIMARY + PlotDataService types) - plot_grid_tabs_test.py: Remove redundant fixtures, use data_service

PlotterT was defined but never used anywhere in the codebase. Prompt: Why do we define PlotterT in plots.py? Can we not use Plotter directly? Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Replace the separate SessionPlotManager class with a simpler SessionLayer dataclass managed directly by PlotGridTabs. This eliminates parallel collections that required synchronization logic spread across two classes. Before: - PlotGridTabs._last_seen_versions: dict[LayerId, int] - SessionPlotManager._presenters: dict[LayerId, Presenter] - SessionPlotManager._pipes: dict[LayerId, Pipe] - SessionPlotManager._dmaps: dict[LayerId, DynamicMap] After: - PlotGridTabs._session_layers: dict[LayerId, SessionLayer] (each SessionLayer holds presenter, pipe, dmap, last_seen_version) Benefits: - Single collection instead of parallel collections - Version tracking colocated with session components - Orphan cleanup and plotter replacement detection in one place - Each SessionLayer is a self-contained unit that knows how to update itself Original prompt: Help me think through the PlotDataService and SessionPlotManager. I am trying to understand if there is unnecessary complexity in trying to keep them in sync. Would a lot of the management disappear if each session could simply hook into PlotDataService, with all the per-session objects managed by the service?

Simplify _poll_for_plot_updates() by merging the two iteration phases into a single pass over orchestrator layers. Previously: - Phase 1: Iterate _session_layers to push data and clean up invalids - Phase 2: Iterate orchestrator layers to detect version changes Now a single pass handles all three concerns: - Push data updates to existing session pipes - Clean up invalid session layers (plotter replaced) - Detect version changes requiring cell rebuilds Orphan cleanup (layers removed from orchestrator) happens after the main loop by checking which layer_ids were seen. Also remove PlotDataService.static_layer_ready() which was dead code: _create_static_layer_plot() already registers the layer via job_started() and data_arrived() (or error_occurred() on failure), so the subsequent static_layer_ready() call was always a no-op. Original prompt: Does this refactor the door for further simplications? Anything in the mechanism that can now be done simpler?

The bug: Commit 57e0d61 moved version tracking into SessionLayer, but SessionLayer was only created when displayable data was available. For layers waiting for data, no SessionLayer existed, so `last_seen_version` was always None, triggering rebuilds on every poll cycle. The fix: Create SessionLayer eagerly for version tracking, with optional render components that are added lazily when data becomes available. Refactor SessionLayer to separate concerns: - SessionComponents: groups presenter/pipe/dmap (always created together) - SessionLayer: holds layer_id, last_seen_version, and optional components This keeps the "single collection" benefit of the original refactor while properly handling layers that don't have displayable data yet. Original prompt: I think 57e0d61 broke something: The widgets are now constantly beeing recreating (in every update?).

The comment claimed circular import issues, but there were none: plots.py doesn't import session_layer.py or plot_data_service.py. The same file already used Plotter in type annotations elsewhere. Clean up by importing Plotter/Presenter directly and using an assertion to help the type checker narrow the Optional type. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add StaticPlotter.compute() method that accepts (and ignores) a data dict, enabling interface compatibility with dynamic plotters. This keeps caching logic encapsulated within the plotter instead of requiring external calls to private methods. Extract two helper methods in PlotOrchestrator: - _create_and_register_plotter: creates plotter, calls job_started(), handles errors - _run_compute: calls plotter.compute(), calls data_arrived(), handles errors Both static and dynamic paths now use these shared helpers, eliminating code duplication and the need for _create_static_layer_plot. Also removes unused cell_id and grid_id parameters from _on_all_jobs_ready. Prompt: Help me understand why we have `create_static_plot` instead of simply implementing `StaticPlotter.compute`? Wouldn't this avoid the manual caching and private method call in _create_static_layer_plot? Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The version-based change detection mechanism already ensures rebuilds happen when plotters change. The explicit is_valid_for check was redundant because: - job_started(plotter) always increments version (invariant guaranteed) - Version change detection triggers cell rebuilds - ensure_components handles plotter validity internally during rebuild Also removes the now-unused SessionLayer.is_valid_for() method. Adds comprehensive tests for the version invariant in LayerStateMachine and the version-based rebuild mechanism in PlotGridTabs. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Four files introduced on this branch were using logging.getLogger() instead of structlog.get_logger(). This fixes them to be consistent with the rest of the codebase after the structlog migration. Files fixed: - dashboard/session_registry.py - dashboard/session_updater.py - dashboard/plot_data_service.py - dashboard/widgets/plot_grid_tabs.py Prompt: commit these fixes Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

SimonHeybrock and others added 30 commits January 22, 2026 06:02

Design doc v1

2b34900

v2

6deda3c

Rename file to match class name

7969a2a

Remove references to temporary file

25f5633

Remove logging from NotificationQueue

365df89

Move SlicerPresenter and SlicerPlotter to their own file

f83b0a6

Extract SlicerState, SlicerPresenter, and SlicerPlotter from plots.py into a new slicer.py module. Update imports in plotting.py and tests. Prompt: Please move SlicerPresenter and SlicerPlotter to their own file.

Fix missing sizing_opts in SlicerPresenter

5088ec6

SlicerPresenter was only receiving base_opts but not sizing_opts, which caused aspect ratio and responsive settings to not be applied. Now passes both base_opts and sizing_opts to the presenter.

SimonHeybrock and others added 13 commits January 28, 2026 09:13

Fix WeakSet iteration race condition in presenter dirty marking

9824bcc

Converting WeakSet to list before iteration prevents RuntimeError that can occur when the garbage collector removes dead references during iteration. Prompt: fix the WeakSet issue

Update plot_grid_tabs.py for LayerStateMachine changes

2033f6c

Forgot to include in previous commit - updates imports and type annotations from PlotLayerState to LayerStateMachine, and .error to .error_message. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Base automatically changed from improved-logging to main January 29, 2026 10:04

SimonHeybrock and others added 12 commits January 29, 2026 10:32

Rename

f6acc9a

Remove unused PlotterT TypeVar from plots.py

33bdba5

PlotterT was defined but never used anywhere in the codebase. Prompt: Why do we define PlotterT in plots.py? Can we not use Plotter directly? Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Cleanup

87bd4ac

SimonHeybrock marked this pull request as ready for review February 2, 2026 13:25

Merge branch 'main' into multi-session

6e615c3

SimonHeybrock marked this pull request as draft February 4, 2026 10:59

SimonHeybrock mentioned this pull request Feb 6, 2026

Per-session dashboard services (alternative to #673) #688

Draft

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Multi session support#673

Multi session support#673
SimonHeybrock wants to merge 77 commits intomainfrom
multi-session

SimonHeybrock commented Jan 23, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

SimonHeybrock commented Jan 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Architecture changes

Key learnings

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

SimonHeybrock commented Jan 23, 2026 •

edited

Loading