Add `PiecewiseITS` experiment for known interruption dates by drbenvincent · Pull Request #614 · pymc-labs/CausalPy

drbenvincent · 2025-12-24T13:09:24Z

Closes #613

This pull request introduces support for Piecewise Interrupted Time Series (ITS) analysis in the codebase. The main changes include adding a new experiment class, stateful patsy transforms for specifying level and slope changes at multiple intervention points, a simulation utility for generating piecewise ITS data, and updates to the package API and documentation to expose these new features.

Piecewise ITS support and API exposure:

Added the PiecewiseITS experiment class to the codebase and included it in the main package API (__init__.py) and experiments API (causalpy/experiments/__init__.py). This enables users to import and use PiecewiseITS directly.

Patsy transforms for segmented regression:

Introduced a new module causalpy/transforms.py providing stateful patsy transforms: step for level changes and ramp for slope changes at arbitrary intervention points. These can be used in regression formulas for flexible piecewise ITS modeling, supporting both numeric and datetime time variables.
Exposed step and ramp transforms in the main package API (__init__.py) to allow easy access.

Data simulation utilities:

Added the generate_piecewise_its_data function to causalpy/data/simulate_data.py for simulating time series data with multiple interventions, customizable level and slope changes, and ground truth counterfactuals for testing and demonstration.
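As a rough illustration of what such a simulator produces, the following standard-library sketch builds a series with a linear baseline, a level and slope change at each interruption, and the ground-truth counterfactual. The function name `simulate_piecewise_its`, its parameters, and the row layout are hypothetical and are not the signature of `generate_piecewise_its_data`.

```python
import random


def simulate_piecewise_its(n=100, interruptions=(50,), level_changes=(5.0,),
                           slope_changes=(0.2,), intercept=10.0, slope=0.1,
                           noise_sd=1.0, seed=0):
    """Hypothetical simulator; not the signature of generate_piecewise_its_data."""
    rng = random.Random(seed)
    rows = []
    for t in range(n):
        # Baseline trend that would have continued without any intervention.
        counterfactual = intercept + slope * t
        effect = 0.0
        for t_k, level, dslope in zip(interruptions, level_changes, slope_changes):
            if t >= t_k:
                # A step adds a constant shift; a ramp adds a slope change from t_k on.
                effect += level + dslope * (t - t_k)
        y = counterfactual + effect + rng.gauss(0.0, noise_sd)
        rows.append({"t": t, "y": y, "counterfactual": counterfactual, "effect": effect})
    return rows


rows = simulate_piecewise_its(
    n=60, interruptions=(20, 40), level_changes=(5.0, -2.0),
    slope_changes=(0.5, 0.0), noise_sd=0.0,
)
```

Keeping the counterfactual and the cumulative effect as separate columns makes it straightforward to compare a fitted model's estimates against the known truth.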

Documentation and notebook updates:

Added a reference to a new notebook piecewise_its_pymc.ipynb in the documentation index to demonstrate piecewise ITS analysis.

Pre-commit configuration:

Updated the pre-commit configuration to exclude the new notebook piecewise_its_pymc.ipynb from large file checks, ensuring a smoother development workflow.

📚 Documentation preview 📚: https://causalpy--614.org.readthedocs.build/en/614/

drbenvincent · 2025-12-24T13:12:36Z

bugbot run

cursor · 2025-12-24T13:12:41Z

PR Summary

Adds a new segmented-regression ITS workflow with explicit level/slope changes at known interruptions.

New PiecewiseITS experiment: builds step/ramp design matrix, supports PyMC and OLS, computes counterfactual/effects, plotting, summaries, and plot-data extraction
New simulator generate_piecewise_its_data for synthetic piecewise ITS datasets
Exposes PiecewiseITS via causalpy/__init__.py and experiments/__init__.py
Extensive unit/integration tests covering validation, OLS/PyMC paths, plotting, controls, and datetime time columns
Docs: add piecewise_its_pymc.ipynb to notebooks index; pre-commit excludes that notebook; update interrogate badge

Written by Cursor Bugbot for commit c14f15c. This will update automatically on new commits.

codecov · 2025-12-24T13:18:27Z

Codecov Report

❌ Patch coverage is 93.63958% with 54 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.74%. Comparing base (560b4e9) to head (0f86191).
⚠️ Report is 3 commits behind head on main.

Files with missing lines	Patch %	Lines
causalpy/experiments/piecewise_its.py	89.25%	11 Missing and 15 partials ⚠️
causalpy/tests/test_piecewise_its.py	96.23%	18 Missing and 1 partial ⚠️
causalpy/transforms.py	88.57%	4 Missing and 4 partials ⚠️
causalpy/data/simulate_data.py	96.66%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@           Coverage Diff            @@
##             main     #614    +/-   ##
========================================
  Coverage   93.74%   93.74%            
========================================
  Files          41       44     +3     
  Lines        6827     7676   +849     
  Branches      458      517    +59     
========================================
+ Hits         6400     7196   +796     
- Misses        267      300    +33     
- Partials      160      180    +20

causalpy/experiments/piecewise_its.py

Added detailed explanations comparing Piecewise ITS to Regression Discontinuity and Regression Kink designs. Introduced new real-world scenarios for level and slope changes, multiple interventions, and level-only models. Enhanced example code and output to illustrate these cases, improving clarity and practical guidance for users.

Improved clarity and conciseness throughout the Piecewise Interrupted Time Series (ITS) notebook. Rewrote several sections for better readability, combined and streamlined example scenarios, and clarified distinctions between level and slope changes, as well as the relationship to regression discontinuity and regression kink designs.

Refactors the PiecewiseITS experiment to use flexible patsy formulas with new stateful step() and ramp() transforms for specifying level and slope changes at interventions. Adds the causalpy.transforms module with robust, datetime-aware step/ramp transforms, updates tests to cover new formula interface and transform behavior, and improves documentation and error handling. This enables more flexible modeling of multiple interventions and supports both numeric and datetime time columns.

drbenvincent · 2025-12-24T19:34:48Z

bca3699 adds the most amazing patsy-based API for segmented/piecewise regression!

Added a new section describing the formula-based API for PiecewiseITS, including explanations of the custom step() and ramp() transforms, usage examples, and clarification on how the counterfactual is computed. This improves documentation clarity and helps users understand flexible model specification.

Implemented creation of post_impact, datapost, and post_pred attributes in PiecewiseITS for compatibility with effect_summary() from BaseExperiment. Added tests to verify effect_summary works for both OLS and PyMC models and that the new attributes are correctly created.

Added mathematical definitions for step and ramp functions using LaTeX for clarity, and moved import/setup code to the top of the notebook for better organization. Improved explanations of function arguments and removed duplicate import cell.

The introductory markdown in the piecewise_its_pymc.ipynb notebook has been significantly expanded and reorganized. The new content provides clearer explanations of when to use Piecewise ITS, the distinction between level and slope changes, the mathematical model, and its relationship to regression discontinuity and regression kink designs. Redundant sections were removed and a more structured, didactic flow was introduced.

Expanded explanations of level and slope changes in piecewise ITS, referencing a new illustrative figure. Added a code cell to display the figure, and clarified the description of multiple interventions for improved instructional clarity.

Inserted a markdown cell with a table summarizing model formulas for single and two intervention cases, covering level, slope, and combined effects. This provides clearer guidance on specifying models for each panel in the notebook.

Condenses and reorganizes introductory explanations for piecewise interrupted time series (ITS), splitting out key concepts, model details, and comparisons to related methods into clearer, more focused sections. Adds collapsible dropdowns and card formatting for scenario examples, and improves clarity and flow for users learning the model and its API.

Adds a comprehensive suite of tests for the PiecewiseITS class, including class and instance attribute checks, formula parsing, plotting, PyMC integration, counterfactuals, data generation, and error handling. Also updates the interrogate badge to reflect increased coverage.

Added detailed references and in-text citations to the piecewise_its_pymc.ipynb notebook to support methodological explanations. Updated the references.bib file with key literature on segmented regression and interrupted time series analysis. Improved clarity on model parameterization and corrected the references section to use the Sphinx bibliography directive.

drbenvincent · 2025-12-25T19:47:16Z

Tagging @tomicapretto in case you are interested in the stateful transforms (transforms.py) added in this PR. If I understand correctly, then if these were implemented in formulae then Bambi could be a great way for users to explore piecewise regression models?

drbenvincent · 2026-01-08T03:59:17Z

TODO: does this API work when we have a datetime rather than an integer time index?

JeanVanDyk

Hi! I’ve done a brief review of the changes, and I must say the notebook is extensive and really interesting—the variety of examples makes the functionality very clear. I’ve also run the notebook and the tests locally: everything works and passes as expected.

However, I noticed a few points that might need addressing before we merge:

Data Structure & Types: There are a few places where date/threshold handling is quite defensive (using multiple try/except and isinstance checks). I suspect we could simplify the entire class by standardizing these to pd.Timestamp or numeric types at the initial extraction point. This would also allow us to remove the redundant _convert_threshold_for_plotting helper.

Missing Method: It looks like the effect_summary method is currently missing. Since the global refactor, this seems to have been dropped or overlooked, but it's quite central to the experiment's output.

Overall, great work on the documentation and examples! Let me know what you think about streamlining the type handling and re-adding the summary method.

JeanVanDyk · 2026-01-12T18:32:06Z

causalpy/experiments/piecewise_its.py

+        if matches:
+            return matches[0]
+        # Fallback: try to find a time-like column
+        return "t"


I noticed that the current logic merges all thresholds into a single list, regardless of the variable name (for example, step(t, 10) and step(month, 5) would result in thresholds = [10, 5]). This loses the context of which threshold applies to which variable.

Is it intended to support multiple tracking variables within a single formula?

If yes: We should consider storing these in a dictionary (e.g., {"t": [10], "month": [5]}) to ensure the thresholds are applied to the correct variables later in the execution.

If no: It might be safer to check the number of unique variables found and raise a ValueError if more than one is detected. This would prevent unexpected behavior if a user provides a complex formula.
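Both options above can be sketched with a small parser. This is a minimal illustration using a regular expression rather than the PR's actual extraction logic; the helper names `thresholds_by_variable` and `single_time_variable` are hypothetical, and the pattern only handles simple `step(var, value)` and `ramp(var, value)` calls.

```python
import re
from collections import defaultdict

# Matches calls such as step(t, 10) or ramp(month, 5.5). Quoted thresholds
# like step(date, '2020-06-01') are captured as raw strings.
_TRANSFORM_CALL = re.compile(r"\b(step|ramp)\(\s*(\w+)\s*,\s*([^)]+?)\s*\)")


def thresholds_by_variable(formula):
    """Group step/ramp thresholds by the time variable they refer to."""
    grouped = defaultdict(list)
    for _name, variable, raw in _TRANSFORM_CALL.findall(formula):
        value = raw.strip("'\"")
        try:
            value = float(value)
        except ValueError:
            pass  # leave non-numeric thresholds (e.g. dates) as strings
        if value not in grouped[variable]:
            grouped[variable].append(value)
    return dict(grouped)


def single_time_variable(formula):
    """Return (variable, thresholds); raise if there is not exactly one variable."""
    grouped = thresholds_by_variable(formula)
    if len(grouped) != 1:
        raise ValueError(
            "Expected step/ramp terms on exactly one time variable, "
            f"found: {sorted(grouped)}"
        )
    variable, thresholds = next(iter(grouped.items()))
    return variable, thresholds
```

With the dictionary form, `step(t, 10)` and `step(month, 5)` stay attached to their own variables; with the strict form, the mixed formula is rejected before any model is built.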

JeanVanDyk · 2026-01-12T18:41:33Z

causalpy/experiments/piecewise_its.py

+        else:
+            # Numeric threshold
+            post_mask = self.data[time_col] >= first_interruption
+


I see we handle str to Timestamp conversion with a fallback to direct comparison. Is there a specific case where we need to compare raw strings that aren't timestamps?

If we are primarily dealing with dates or numbers, I wonder if it wouldn't be safer to convert everything to the proper type (using pd.to_datetime) right at the beginning of the pipeline?

My thinking is that it might allow us to "fail fast" if a user provides an invalid date, and it would simplify the final comparison to a single line: self.data[time_col] >= first_interruption.

I might be overlooking a specific scenario where this late-stage conversion is necessary, so I'd love to hear your thoughts on the intent here!
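The fail-fast idea can be sketched as a single normalization step at the entry point. To stay dependency-free, this sketch uses the standard library's `datetime` as a stand-in for `pd.Timestamp` and `pd.to_datetime`; the helper name `normalize_threshold` and the `time_is_datetime` flag are assumptions for illustration, not part of the PR.

```python
from datetime import date, datetime
from numbers import Real


def normalize_threshold(threshold, time_is_datetime):
    """Coerce a threshold to datetime or float once, raising on bad input."""
    if time_is_datetime:
        if isinstance(threshold, datetime):
            return threshold
        if isinstance(threshold, date):
            return datetime(threshold.year, threshold.month, threshold.day)
        if isinstance(threshold, str):
            # fromisoformat raises ValueError for strings that are not ISO dates.
            return datetime.fromisoformat(threshold)
        raise TypeError(f"Cannot use {threshold!r} as a datetime threshold")
    if isinstance(threshold, Real) and not isinstance(threshold, bool):
        return float(threshold)
    raise TypeError(f"Cannot use {threshold!r} as a numeric threshold")


# Once the threshold is normalized, the comparison is a single expression.
times = [datetime(2020, 1, day) for day in range(1, 11)]
cut = normalize_threshold("2020-01-06", time_is_datetime=True)
post_mask = [t >= cut for t in times]
```

An invalid date string now fails at construction time instead of surfacing later as a silent string comparison.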

JeanVanDyk · 2026-01-12T18:46:41Z

causalpy/experiments/piecewise_its.py

+                return pd.Timestamp(threshold)
+            except Exception:
+                return threshold  # type: ignore[return-value]
+        return threshold


If we standardize the threshold types to pd.Timestamp or numeric values immediately upon extraction, this method likely becomes redundant as the data would already be in its final, usable form. I might be overlooking a specific edge case, but it seems that ensuring clean types at the entry point would allow us to simplify the class by removing this defensive logic and the repetitive try/except checks.

drbenvincent · 2026-01-12T19:04:09Z

Thanks @JeanVanDyk. I'll take these comments into account and ping you whenever I've got an improved version.

tomicapretto · 2026-01-13T21:25:21Z

@drbenvincent, thanks for tagging me here.

We have had this issue open for a long time.
If I understand correctly, what I create with truncate in that snippet is similar to what you would create with RampTransform.

In Bambi (but also with any formula-based modeling interface), I think one could do:

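def step(x, threshold):
    # 1.0 at or after the threshold, 0.0 before it (ensures numeric output)
    return 1.0 * (x >= threshold)

def ramp(x, threshold):
    # 0 before the threshold, then the distance past it
    return (x - threshold) * (x >= threshold)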

Then you would write

formula = "y ~ x + step(x, 10)"  # level (aka intercept) changes when x>=10
formula = "y ~ x + ramp(x, 10)"  # slope changes when x>=10
formula = "y ~ x + step(x, 10) + ramp(x, 10)" # both level and slope change when x>=10

With that said, I’m not sure whether those examples provide enough motivation for a stateful transformation. A stateful transformation is useful when some aspect of the transformation (e.g., a threshold) depends on the initial (training) dataset. If the value is hardcoded in the formula, there is no need for a stateful transformation (although using one would not cause any harm).
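To make that distinction concrete, here is a small sketch contrasting the two cases. The class follows the method names of patsy's stateful-transform protocol (`memorize_chunk`, `memorize_finish`, `transform`) but is written in plain Python and is not registered with patsy; `CenterAtTrainingMean` is a hypothetical example and is not part of this PR.

```python
class CenterAtTrainingMean:
    """Stateful: the centering value is learned from the training data."""

    def __init__(self):
        self._total = 0.0
        self._count = 0
        self._mean = None

    def memorize_chunk(self, x):
        # Called on the training data, possibly in several chunks.
        self._total += sum(x)
        self._count += len(x)

    def memorize_finish(self):
        self._mean = self._total / self._count

    def transform(self, x):
        # Reuses the stored training mean, including for new data.
        return [xi - self._mean for xi in x]


def step(x, threshold):
    """Stateless: the threshold is an explicit argument, so nothing is memorized."""
    return [1.0 if xi >= threshold else 0.0 for xi in x]


center = CenterAtTrainingMean()
center.memorize_chunk([0, 10, 20])
center.memorize_finish()
new_centered = center.transform([40])   # uses the training mean of 10
new_step = step([5, 10, 15], 10)        # depends only on its arguments
```

The centering transform would give a different answer if it recomputed the mean on new data, which is why it needs state; `step` with a hardcoded threshold has no such dependency.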

While adding the examples, I just realized one could achieve the same via the interaction operator in combination with the special identity function I, which is useful for escaping expressions:

f = "y ~ x + I(x >= 10)" # step
f = "y ~ x + x:I(x >= 10)" # ramp
f = "y ~ x + I(x >= 10) + x:I(x >= 10)" # step + ramp

@drbenvincent, just let me know if you want to have a further chat about this

drbenvincent · 2026-01-14T03:09:54Z

Thanks for this! I think you are right, there's no need for this to be a stateful transform. It can just be a regular function.

I think this specific example

f = "y ~ x + x:I(x >= 10)" # ramp

might be problematic. Because x is not zero at 10 it creates a step and a ramp. I have memories of seeing a paper which flagged that this interaction approach could be problematic. And I guess the size of the step will vary with x which could make the interpretation of the coefficient a bit tricky. Would it have to be this?

f = "y ~ x + I(x-10):I(x >= 10)" # ramp

(Typing this at 3am after a toddler wake-up, so we'll see if this makes sense in the morning 🤣)
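The difference between the two ramp specifications can be checked numerically. This sketch computes the column each formula term would conceptually contribute, using plain Python lists; the function names are illustrative only.

```python
def interaction_ramp(x, threshold):
    """Column corresponding to x:I(x >= threshold)."""
    return [xi * (xi >= threshold) for xi in x]


def centered_ramp(x, threshold):
    """Column corresponding to I(x - threshold):I(x >= threshold)."""
    return [(xi - threshold) * (xi >= threshold) for xi in x]


x = [8, 9, 10, 11, 12]
print(interaction_ramp(x, 10))  # [0, 0, 10, 11, 12]
print(centered_ramp(x, 10))     # [0, 0, 0, 1, 2]
```

The uncentered column jumps from 0 to 10 at the threshold, so its coefficient mixes a level change with a slope change. The centered column starts at 0 and grows by one per unit of x, so its coefficient is a pure slope change.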

tomicapretto · 2026-01-14T13:17:37Z

@drbenvincent: yes, you are right about the problem and the solution, I missed it.

Anyway, I think specific keywords such as step and ramp are quite self-explanatory, and that is great for users.

initial commit - MVP

c14f15c

drbenvincent added the enhancement and major labels Dec 24, 2025

drbenvincent mentioned this pull request Dec 24, 2025

[meta issue] Additional quasi-experimental procedures tracker #607

Open

14 tasks

cursor bot reviewed Dec 24, 2025

drbenvincent added 4 commits December 24, 2025 15:44

Merge branch 'main' into piecewise-its

8225a02

drbenvincent added 10 commits December 24, 2025 19:41

Clarify usage of step and ramp transforms in docs

b7cdf66

Add model formula table to piecewise ITS notebook

4fa4a59

drbenvincent requested review from NathanielF and juanitorduz December 25, 2025 19:41

drbenvincent marked this pull request as ready for review December 25, 2025 20:12

drbenvincent requested a review from JeanVanDyk December 26, 2025 20:55

Merge branch 'main' into piecewise-its

7e10a52

run pre-commit checks

0f86191

JeanVanDyk reviewed Jan 12, 2026

drbenvincent mentioned this pull request Feb 12, 2026

Multi-channel ITS for lift test evaluation in MMMs #708

Open
