Add Event Study (aka Dynamic Difference in Differences) functionality #584

drbenvincent wants to merge 36 commits into main
Conversation
Check out this pull request on ReviewNB to see visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB.
Codecov Report

❌ Patch coverage is

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #584      +/-   ##
==========================================
+ Coverage   94.19%   95.24%   +1.04%
==========================================
  Files          43       46       +3
  Lines        7316     8383    +1067
  Branches      455      547      +92
==========================================
+ Hits         6891     7984    +1093
+ Misses        262      217      -45
- Partials      163      182      +19
```

☔ View full report in Codecov by Sentry.
The EventStudy class now requires a patsy-style formula to specify the outcome and fixed effects, removing the separate outcome_col argument. Design matrix construction uses patsy, and event-time dummies are appended. Input validation checks for formula presence, and tests and documentation are updated to reflect the new API and output format.
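For orientation, a minimal sketch of the new formula-based API, following the doctest quoted later in this thread; the toy data frame is invented, and any further constructor arguments (e.g. a PyMC or sklearn model) are omitted:

```python
import numpy as np
import pandas as pd
import causalpy as cp

# Toy panel: unit "B" is treated at time 3; unit "A" is a never-treated
# control, marked with NaN in treat_time (np.inf is also accepted per the PR)
df = pd.DataFrame(
    {
        "unit": ["A"] * 6 + ["B"] * 6,
        "time": list(range(6)) * 2,
        "treat_time": [np.nan] * 6 + [3.0] * 6,
        "y": [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 2.0, 2.1, 2.2, 3.0, 3.2, 3.4],
    }
)

# Outcome and fixed effects come from the patsy formula; event-time
# dummies are appended internally by the class (sketch only; other
# required arguments such as the model are not shown)
result = cp.EventStudy(df, formula="y ~ C(unit) + C(time)")
```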
@cursor review
Expanded documentation to explain the patsy formula syntax, the role of unit and time fixed effects, and how the event-time dummies (whose coefficients are the $\beta_k$) are automatically constructed by the EventStudy class. Added details on the event window and reference event time parameters for clearer guidance.
Added a warning in the EventStudy class and documentation that the implementation only supports simultaneous treatment timing and does not support staggered adoption. Introduced a validation to raise a DataException if treated units have different treatment times. Added a corresponding test to ensure staggered adoption raises an error, and updated the notebook to clarify estimator limitations.
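As a standalone sketch of that check (hypothetical helper; the PR raises a `DataException`, for which `ValueError` stands in here):

```python
import numpy as np
import pandas as pd

def check_simultaneous_treatment(df: pd.DataFrame) -> None:
    """Raise if treated units (finite treat_time) adopt at different times."""
    treated_times = df.loc[np.isfinite(df["treat_time"]), "treat_time"]
    if treated_times.nunique() > 1:
        # The PR raises a DataException with a similar message
        raise ValueError(
            "EventStudy only supports simultaneous treatment timing; "
            f"found treatment times {sorted(treated_times.unique())}"
        )
```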
Enhanced the _bayesian_plot and _ols_plot methods in EventStudy to support configurable figure size and HDI probability. Updated docstrings to document new parameters and improved plot labeling for clarity.
Introduces event study support to effect_summary(), including parallel trends check and dynamic effect reporting. Updates event study class to allow HDI probability customization and reporting, and extends documentation with effect summary usage and interpretation.
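Continuing the sketch above, a hypothetical call; the `table` and `text` attributes are mentioned in the integration tests below, while the `hdi_prob` keyword name is an assumption based on the HDI customization described here:

```python
# Summarise dynamic effects, including the parallel trends check
summary = result.effect_summary(hdi_prob=0.94)  # keyword name assumed
print(summary.text)  # prose report
summary.table        # tabular report of effects by event time
```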
The `generate_event_study_data` function now supports optional time-varying predictors generated as AR(1) processes, controlled by new parameters: `predictor_effects`, `ar_phi`, and `ar_scale`. Added the `generate_ar1_series` utility function. Updated docstrings and examples to reflect these changes. The event study PyMC notebook was updated with additional analysis and improved section headings.
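A sketch of what such a generator can look like (the actual `generate_ar1_series` utility may differ in signature and initialisation):

```python
import numpy as np

def generate_ar1_series(n_periods, phi=0.8, scale=1.0, rng=None):
    """Simulate x_t = phi * x_{t-1} + eps_t, with eps_t ~ Normal(0, scale)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.zeros(n_periods)
    for t in range(1, n_periods):
        x[t] = phi * x[t - 1] + rng.normal(0.0, scale)
    return x
```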
Introduces integration tests for the EventStudy.effect_summary method using both PyMC and sklearn models. Tests verify the returned EffectSummary object, its table and text attributes, and key output elements.
I will take a look in the next few days :)

@cursor review
PR Summary: Adds an Event Studies section with the new `EventStudy` functionality.

Written by Cursor Bugbot for commit 786a5f8. This will update automatically on new commits.
bugbot run |
Added Event Study to the experiment support table in reporting_statistics.md and updated AGENTS.md to instruct updating the table when adding new experiment types.
This commit adds extensive new tests to test_reporting.py and test_synthetic_data.py, covering error handling, experiment type detection, OLS statistics edge cases, prose and table generation for various models, and all synthetic data generation utilities. These tests improve coverage and robustness for reporting and data simulation functions.
Expanded documentation in both the EventStudy class and the event study PyMC notebook to explain the equivalence between indicator functions and dummy variables. Added details on how dummy variables are constructed for each event time, the omission of the reference period to avoid multicollinearity, and the interpretation of regression coefficients as ATT at each event time.
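To make the equivalence concrete, a standalone pandas demo (the `event_time_k` column naming follows the convention referenced later in this thread; this is not the class internals):

```python
import pandas as pd

# Event times -2..2, with k = -1 as the reference period
event_time = pd.Series([-2, -1, 0, 1, 2], name="event_time")

# One indicator column per event time; drop the reference column to
# avoid multicollinearity with the intercept and fixed effects
dummies = pd.get_dummies(event_time, prefix="event_time").drop(
    columns="event_time_-1"
)
print(dummies.astype(int))
```

The coefficient on each retained column is then interpreted as the ATT at that event time, relative to k = -1.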
View / edit / reply to this conversation on ReviewNB

juanitorduz commented on 2025-12-15T12:05:27Z: What about adding https://arxiv.org/pdf/2503.13323 as a reference?

drbenvincent commented on 2025-12-23T15:32:57Z: Done, along with some other citations.

View / edit / reply to this conversation on ReviewNB

juanitorduz commented on 2025-12-15T12:05:27Z: Can we say something about the inferred parameters (e.g., a trace plot of some parameters)? I am asking because high R-hats

View / edit / reply to this conversation on ReviewNB

juanitorduz commented on 2025-12-15T12:05:28Z: Again here: shall we inspect the model sampling?

View / edit / reply to this conversation on ReviewNB

juanitorduz commented on 2025-12-15T12:05:29Z: Can we also plot the real true effects here?
juanitorduz left a comment:

Very cool! This is super helpful in marketing (the effect of a marketing campaign depends on time!). We should try to find other real datasets to showcase this.

Minor comments.

Suggestion: add more tests for the individual methods and address @BugBot comments.
```python
).astype(float)

# Combine patsy design matrix with event-time dummies
X_df = pd.concat([X_df, event_time_dummies], axis=1)
```
@drbenvincent, if the number of units gets very big, then fitting all these dummies will become hard. What about adding the option to "demean" the data instead, e.g. using https://github.com/py-econometrics/pyfixest? We can still use Bayesian methods to do inference after the demeaning procedure (just an idea).
Good idea, but a) not sure if we need an external package for de-meaning? b) let's have this as an iteration on top of the first MVP (i.e. yes, but not in this PR).
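For reference, a two-way within transformation needs only pandas; a sketch (exact for balanced panels, an approximation for unbalanced ones; assumes `unit` and `time` columns):

```python
def within_transform(df, col):
    """Demean a column by unit and by time without fitting dummies."""
    return (
        df[col]
        - df.groupby("unit")[col].transform("mean")
        - df.groupby("time")[col].transform("mean")
        + df[col].mean()
    )

df["y_demeaned"] = within_transform(df, "y")
```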
We could also consider Mundlak variations, which is just the groupby mean per group:

```python
import numpy as np

# Mundlak device: add group-mean regressors instead of full dummy sets
sim_df['unit_mean'] = sim_df.groupby('unit')['treat'].transform(np.mean)
sim_df['state_mean'] = sim_df.groupby('state')['treat'].transform(np.mean)
sim_df['cohort_mean'] = sim_df.groupby('cohort_year')['treat'].transform(np.mean)
sim_df['time_mean'] = sim_df.groupby('year')['treat'].transform(np.mean)
```
Control units marked with np.inf or np.nan in the treat_time column are now correctly excluded from treated unit checks in EventStudy. Adds tests to ensure control units with np.inf, np.nan, or mixed markers are handled properly, preventing false staggered adoption errors.
Improved rounding logic in EventStudy.get_event_time_summary for PyMC and sklearn models, ensuring consistent application of the round_to parameter. Refactored and expanded test_event_study.py to cover more edge cases, input validation, and integration scenarios, including new tests for rounding, event window handling, and control units. Cleaned up and reorganized tests for clarity and maintainability.
Updates all experiment classes to filter out rows with NaN values after patsy design matrix construction, ensuring consistent shapes between data and design matrices. Adds comprehensive tests for NaN handling across all experiment classes to verify correct filtering and prevent shape mismatch errors.
Enhanced the EventStudy class to emit warnings for unbalanced panels, gaps in time periods, and excessive data loss due to NaN filtering. Added comprehensive unit tests to cover these edge cases and verify warning behavior. Updated interrogate badge to reflect increased coverage.
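Standalone sketches of the kinds of checks described (the class's internal versions may differ):

```python
import warnings

# Unbalanced panel: units observed for differing numbers of periods
periods_per_unit = df.groupby("unit")["time"].nunique()
if periods_per_unit.nunique() > 1:
    warnings.warn("Unbalanced panel: units have differing numbers of periods.")

# Gaps in the time index: periods missing between min and max time
expected = set(range(int(df["time"].min()), int(df["time"].max()) + 1))
if expected - set(df["time"].astype(int)):
    warnings.warn("Gaps detected in the panel's time periods.")
```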
Done, along with some other citations.
Replaced direct 'pip' calls with 'python -m pip' in CONTRIBUTING.md to ensure the correct pip is used within the conda environment. Added a note explaining the reason for this change.
```python
... )
>>> result = cp.EventStudy(
...     df,
...     formula="y ~ C(unit) + C(time)",
```
I think you should be able to encode the formula fully in patsy. Something like `"y ~ C(unit) + C(rel_time, Treatment(reference=-1))"`. It just requires that you pre-process the `df['time']` column before running patsy. Something like

```python
df = df.sort_values(["unit", "time"])
df["t_idx"] = df.groupby("unit").cumcount()
df["event_idx"] = (
    df[df["treated"] == 1]
    .groupby("unit")["t_idx"]
    .transform("min")
)
# Broadcast each unit's event index to all of its rows; the transform
# above only fills rows where treated == 1, leaving other rows NaN
df["event_idx"] = df.groupby("unit")["event_idx"].transform("first")
df["rel_time"] = df["t_idx"] - df["event_idx"]
```
Sorry for the AI evaluation, but some major concerns with that proposal.

The Suggestion

Use patsy's categorical handling to simplify the event-time dummy creation: `"y ~ C(unit) + C(rel_time, Treatment(reference=-1))"`, with pre-processing to compute `rel_time` before calling patsy.
Benefits
- Simpler code (removes ~50 lines of manual dummy creation)
- More idiomatic use of patsy
- Reference category handled automatically via `Treatment(reference=-1)`
Concerns
1. Control Units Would Be Dropped
This is the most significant issue. Control units have NaN for treatment time, so rel_time would be NaN. Patsy drops rows with NaN values entirely.
Current behavior: Control units are kept in the estimation. Their event-time dummies are all 0, but they contribute to the time fixed effects estimation.
Proposed behavior: Control units would be excluded, fundamentally changing the model and breaking the parallel trends comparison.
2. Loss of Calendar Time Fixed Effects
The proposed formula replaces calendar time FEs with event-time dummies. But a proper TWFE event study needs both:
$$Y_{it} = \alpha_i + \lambda_t + \sum_k \beta_k \cdot \mathbf{1}\{E_{it} = k\} + \varepsilon_{it}$$

where $\alpha_i$ are unit fixed effects, $\lambda_t$ are calendar time fixed effects, and $\beta_k$ are the event-time coefficients.

With only `C(rel_time)`, we lose $\lambda_t$ (calendar time effects). This changes the identifying assumptions of the model.
3. Event Window Handling
Currently, only event times within the window (e.g., -5 to +5) get dummies. Observations outside the window are kept but don't contribute to event-time indicators.
With patsy handling everything, we'd need to either:
- Filter out observations with `rel_time` outside the window (loses data)
- Let patsy create dummies for ALL unique `rel_time` values (a different model, many more parameters)
- Bin/clip values outside the window to special categories
4. Minor Issues
- Coefficient naming: changes from `event_time_0` to `C(rel_time, Treatment(reference=-1))[T.0]`
- Dynamic reference: would need to build formula strings dynamically to inject `reference_event_time`
- Test updates: existing tests check the current coefficient naming
Possible Middle Ground
Keep the current model structure (which correctly handles control units and maintains both time FEs and event-time dummies) but simplify the internal implementation where possible.
Questions
- Am I missing something that would address the control unit issue?
- Is the loss of calendar time FEs intentional in your suggestion, or should both be included?
- Any thoughts on handling the event window bounds?
Happy to implement if we can resolve these concerns!
Initial Concerns
The naive approach has some issues:
- Control units dropped: NaN `rel_time` causes patsy to drop entire rows
- Loss of calendar time FEs: replacing `C(time)` with `C(rel_time)` loses $\lambda_t$
- Event window handling: how to handle observations outside the window?
The Elegant Solution
Key insight: In patsy's Treatment coding, the reference category gets 0 for all dummies. This is exactly what we want for:
- The reference period (k=-1)
- Control units
- Observations outside the event window
Solution: Map all these cases to the reference category!
```python
# Step 1: Compute relative time as usual
df["rel_time"] = df["time"] - df["treat_time"]

# Step 2: Create a categorical version where:
#   - control units            → reference_event_time
#   - outside the event window → reference_event_time
#   - inside the window        → actual rel_time
df["rel_time_cat"] = df["rel_time"].copy()

# Control units (NaN rel_time) get the reference category
df.loc[df["rel_time"].isna(), "rel_time_cat"] = reference_event_time

# Observations outside the window also get the reference category
outside_window = (df["rel_time"] < event_window[0]) | (df["rel_time"] > event_window[1])
df.loc[outside_window & df["rel_time"].notna(), "rel_time_cat"] = reference_event_time

# Step 3: Use patsy with BOTH time FEs and event-time dummies
formula = f"y ~ C(unit) + C(time) + C(rel_time_cat, Treatment(reference={reference_event_time}))"
```

Why This Works
| Observation Type | `rel_time_cat` value | Result |
|---|---|---|
| Treated, k = -1 (reference) | -1 | 0 for all event dummies ✓ |
| Treated, k in window | actual k | Gets its own dummy ✓ |
| Treated, k outside window | -1 (reference) | 0 for all event dummies ✓ |
| Control unit | -1 (reference) | 0 for all event dummies ✓ |
Benefits

- Calendar time FEs preserved: `C(time)` stays in the formula
- Control units handled correctly: they contribute to FE estimation and get 0 for event dummies
- Event window handled naturally: outside-window observations map to the reference
- Much simpler code: patsy creates all the dummies
- No rows dropped: no NaN values in the categorical column
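As a quick standalone check that patsy produces the intended design (toy frame; assumes `rel_time_cat` is built as in the snippet above, with reference k = -1):

```python
import pandas as pd
from patsy import dmatrices

toy = pd.DataFrame(
    {
        "y": [1.0, 1.2, 2.0, 2.5],
        "unit": ["A", "A", "B", "B"],
        "time": [0, 1, 0, 1],
        "rel_time_cat": [-1, -1, -1, 0],  # control rows mapped to the reference
    }
)
y, X = dmatrices(
    "y ~ C(unit) + C(time) + C(rel_time_cat, Treatment(reference=-1))",
    toy,
    return_type="dataframe",
)
print(X.columns.tolist())  # no column for the reference level -1
```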
Implementation Approach

The user could still provide a simple formula, and the class would augment it internally:

```python
# User provides:
formula = "y ~ C(unit) + C(time)"

# Class internally transforms to:
augmented_formula = f"y ~ C(unit) + C(time) + C(rel_time_cat, Treatment(reference={ref}))"
```

This significantly simplifies `_build_design_matrix()` while preserving all the correct model behavior.
What do you think @NathanielF? Maybe we merge as is, create an issue for further work, then come back to it in a second iteration?
This pull request adds support for event study analysis to the CausalPy package. The main changes include introducing the new `EventStudy` class, updating the package exports to include it, and providing a utility function for generating synthetic panel data suitable for event study and dynamic DiD analyses.

Consider adding the time-relative-to-treatment column in the notebook, not hidden in the experiment class.

📚 Documentation preview 📚: https://causalpy--584.org.readthedocs.build/en/584/