Add automated notebook testing with Papermill by drbenvincent · Pull Request #602 · pymc-labs/CausalPy

drbenvincent · 2025-12-20T21:52:30Z

Summary

guard nbclient widget output to avoid display_id assertion errors
clear notebook outputs before injection and use unique injection ids
add optional --parallel execution flag and document runner behavior
update interrogate badge from pre-commit
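
As a rough illustration of the optional `--parallel` flag, the sketch below wires a boolean flag into a sequential-or-concurrent dispatch. The names `run_one`, `run_all`, and `build_parser`, and the choice of a thread pool, are assumptions made for this example and are not taken from `runner.py`.

```python
import argparse
from concurrent.futures import ThreadPoolExecutor


def run_one(path):
    """Hypothetical stand-in for executing a single notebook."""
    return f"ok: {path}"


def run_all(paths, parallel=False, max_workers=4):
    """Run notebooks one by one, or concurrently when parallel is True."""
    if not parallel:
        return [run_one(p) for p in paths]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with paths.
        return list(pool.map(run_one, paths))


def build_parser():
    """Build a parser exposing only the flag discussed here."""
    parser = argparse.ArgumentParser(description="Run notebooks (sketch).")
    parser.add_argument(
        "--parallel",
        action="store_true",
        help="execute notebooks concurrently",
    )
    return parser


args = build_parser().parse_args(["--parallel"])
results = run_all(["a.ipynb", "b.ipynb"], parallel=args.parallel)
```

Keeping the flag opt-in means the default sequential path stays the easiest one to debug when a notebook fails.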

Test Plan

pre-commit run --all-files

📚 Documentation preview 📚: https://causalpy--602.org.readthedocs.build/en/602/

Introduces a GitHub Actions workflow to run and validate Jupyter notebooks in CI using a new runner script. Adds scripts to mock PyMC sampling for faster execution, updates test dependencies to include papermill, and documents the notebook runner usage. Also updates the interrogate badge to reflect new coverage.

Copilot

Pull request overview

This PR introduces automated testing for Jupyter notebooks in CI using Papermill. The implementation includes a runner script that mocks PyMC's MCMC sampling with faster prior predictive sampling to validate notebooks execute without errors.
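
The mocking idea can be sketched as follows. To keep the example runnable without PyMC installed, `pm` here is a stand-in namespace with fake samplers, not the real library; `mock_sample`, `MIN_DRAWS`, and the dictionary return values are likewise assumptions for illustration, not the contents of `injected.py`.

```python
from types import SimpleNamespace
from unittest import mock

MIN_DRAWS = 10  # assumed; mirrors the "10 draws" mentioned in this overview

calls = []  # records which sampler was actually invoked


def fake_sample_prior_predictive(draws):
    """Stand-in for a cheap prior predictive sampler."""
    calls.append(("prior", draws))
    return {"prior": list(range(draws))}


def fake_sample(draws=1000, **kwargs):
    """Stand-in for an expensive MCMC sampler."""
    calls.append(("mcmc", draws))
    return {"posterior": list(range(draws))}


# Stand-in for the `pymc` module (hypothetical, for this sketch only).
pm = SimpleNamespace(
    sample=fake_sample,
    sample_prior_predictive=fake_sample_prior_predictive,
)


def mock_sample(*args, **kwargs):
    """Ignore MCMC arguments and return prior draws under 'posterior'."""
    prior = pm.sample_prior_predictive(MIN_DRAWS)
    return {"posterior": prior["prior"]}


# While the patch is active, notebook code calling pm.sample gets the mock.
with mock.patch.object(pm, "sample", mock_sample):
    result = pm.sample(draws=2000, tune=1000)
```

Because the replacement exposes its draws under the same key the notebook expects, downstream cells keep running even though no MCMC took place.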

Key changes:

New notebook runner script with filtering capabilities for different notebook types
Mock PyMC sampling implementation that replaces expensive MCMC with prior predictive sampling (10 draws)
GitHub Actions workflow that runs notebooks in parallel across three categories (PyMC, sklearn, and other notebooks)
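
One way to split notebooks into the three categories is by file name. The sketch below assumes name-based matching on `pymc` and `skl`; that rule, the helper names, and the example paths are hypothetical and may differ from what `runner.py` does.

```python
from pathlib import Path


def categorize(path):
    """Assign a notebook to 'pymc', 'sklearn', or 'other' by file name.

    Name-based matching is an assumption made for this sketch.
    """
    name = Path(path).stem.lower()
    if "pymc" in name:
        return "pymc"
    if "skl" in name:
        return "sklearn"
    return "other"


def select(paths, category):
    """Keep only the notebooks that fall into the requested category."""
    return [p for p in paths if categorize(p) == category]


# Hypothetical notebook paths used only to exercise the helpers.
notebooks = ["docs/its_pymc.ipynb", "docs/did_skl.ipynb", "docs/intro.ipynb"]
pymc_notebooks = select(notebooks, "pymc")
```

Each CI job can then call `select` with its own category so the three groups run independently.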

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| scripts/run_notebooks/runner.py | Main script for executing notebooks with Papermill, includes filtering and logging |
| scripts/run_notebooks/injected.py | Mock implementation of pm.sample that uses prior predictive sampling |
| scripts/run_notebooks/README.md | Documentation for the notebook runner usage and CI integration |
| .github/workflows/test_notebook.yml | GitHub Actions workflow for parallel notebook testing |
| pyproject.toml | Adds papermill to test dependencies |
| docs/source/_static/interrogate_badge.svg | Updates documentation coverage badge from 96.3% to 96.0% |

codecov · 2025-12-20T22:00:54Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.35%. Comparing base (fed2b5c) to head (dede24d).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #602   +/-   ##
=======================================
  Coverage   94.35%   94.35%           
=======================================
  Files          44       44           
  Lines        7517     7517           
  Branches      456      456           
=======================================
  Hits         7093     7093           
  Misses        262      262           
  Partials      162      162

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Updated the mock for pm.sample() to use 100 draws instead of 50 for prior predictive sampling, as reflected in both the injected script and documentation. This change aims to provide more robust validation during notebook execution.

Updated the mock for pm.sample to use 500 draws instead of 100 to ensure compatibility with notebook code that iterates over posterior samples, such as plot_ate which defaults to 500 draws. Adjusted documentation and injected.py accordingly.

Introduces skip_notebooks.yml to specify notebooks incompatible with prior predictive sampling mock. Updates runner.py to filter out these notebooks and reduces MIN_DRAWS from 500 to 100 for faster execution.
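
Filtering against a skip list might look like the sketch below. The skip entries are hard-coded here to avoid a YAML dependency; the two file names come from later commits in this PR, while `filter_skipped` and the example paths are hypothetical.

```python
from pathlib import Path

# Assumed skip entries; in the PR the list lives in skip_notebooks.yml.
SKIP = {"iv_pymc.ipynb", "iv_weak_instruments.ipynb"}


def filter_skipped(paths, skip=SKIP):
    """Drop any notebook whose file name appears in the skip set."""
    return [p for p in paths if Path(p).name not in skip]


# Hypothetical candidate paths used only to exercise the helper.
candidates = ["docs/iv_pymc.ipynb", "docs/example.ipynb"]
to_run = filter_skipped(candidates)
```

Matching on the file name rather than the full path keeps the skip list valid if the notebooks directory moves.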

Replaces import of LinearRegression from causalpy.skl_models with sklearn's LinearRegression and removes execution count from the first code cell.

review-notebook-app · 2025-12-24T08:04:24Z

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter notebooks.

Replaces hardcoded sample size with dynamic calculation based on the length of 'uncertainty' to prevent errors when fewer than 500 samples are available. Also resets execution count to null for the affected notebook cell.
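
The fix described above amounts to capping the sample size at the number of available draws. A minimal sketch, assuming `uncertainty` is a sequence of draws; the helper name `pick_draw_indices` and its parameters are hypothetical:

```python
import random


def pick_draw_indices(uncertainty, max_samples=500, seed=0):
    """Choose up to max_samples distinct indices into `uncertainty`."""
    # Cap the request so sampling never asks for more draws than exist.
    n = min(max_samples, len(uncertainty))
    rng = random.Random(seed)
    return rng.sample(range(len(uncertainty)), n)


uncertainty = [0.1] * 100  # only 100 draws available, fewer than 500
indices = pick_draw_indices(uncertainty)
```

With a hardcoded 500, the same call on 100 draws would raise an error because more items are requested than exist.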

Expanded the skip_notebooks.yml file to include iv_pymc.ipynb, which requires JAX not available in the CI test environment. Updated comments to clarify reasons for skipping each notebook.

Installs Graphviz as a system dependency in the test_notebook GitHub Actions workflow to support notebooks or tests that require it.

drbenvincent · 2025-12-24T09:13:56Z

passing!

@NathanielF See skip_notebooks.yml. I had to bypass testing some of yours for reasons explained in the file. Those reasons are probably fixable in a follow-up PR.

drbenvincent · 2025-12-24T09:14:47Z

bugbot review

cursor · 2025-12-24T09:14:51Z

PR Summary

Introduces CI to validate docs notebooks execute without errors.

Adds .github/workflows/test_notebook.yml to run notebooks in parallel splits on Python 3.12
New scripts/run_notebooks/ utilities: runner.py (Papermill execution with temporary notebooks), injected.py (mocks pm.sample with prior draws and minimal sample_stats), skip_notebooks.yml (notebooks excluded from CI), and a brief README.md
Updates docs/source/_static/interrogate_badge.svg from 96.3% to 96.0%

Written by Cursor Bugbot for commit c15a229. This will update automatically on new commits.

cursor

✅ Bugbot reviewed your changes and found no bugs!

drbenvincent · 2026-01-17T04:34:23Z

~~TODO: don't use papermill. Use nbmake or nbconvert~~

Papermill has good support for the trick where we change sampling behaviour to sample from the prior for ultra-fast evaluation.

Add nbclient widget output guards, clear notebook outputs before injection, and support optional parallel execution to reduce notebook flakiness in CI.
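
Clearing outputs and injecting a cell with a unique id can be illustrated on the plain notebook JSON structure. This is a sketch under assumptions: the helper names, the `injected-` id prefix, and the example notebook are hypothetical, and the real runner may use nbformat or Papermill utilities instead.

```python
import copy
import uuid


def clear_outputs(nb):
    """Return a copy of the notebook dict with code-cell outputs removed."""
    nb = copy.deepcopy(nb)
    for cell in nb["cells"]:
        if cell["cell_type"] == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return nb


def inject_cell(nb, source):
    """Return a copy with a new code cell, carrying a fresh id, at the top."""
    cell = {
        "cell_type": "code",
        "id": f"injected-{uuid.uuid4().hex[:8]}",  # unique per injection
        "metadata": {},
        "execution_count": None,
        "outputs": [],
        "source": source,
    }
    nb = copy.deepcopy(nb)
    nb["cells"].insert(0, cell)
    return nb


# Hypothetical two-cell notebook used only to exercise the helpers.
notebook = {
    "cells": [
        {"cell_type": "markdown", "id": "intro", "metadata": {},
         "source": "# Demo"},
        {"cell_type": "code", "id": "c1", "metadata": {},
         "execution_count": 3,
         "outputs": [{"output_type": "stream", "name": "stdout",
                      "text": "hi\n"}],
         "source": "print('hi')"},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}
prepared = inject_cell(clear_outputs(notebook), "MIN_DRAWS = 100")
```

Both helpers work on copies, so the notebook on disk is never modified by the test run.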

Use the requested draw count when provided and fall back to the minimum to avoid indexing errors in notebooks that iterate over many draws.
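
One reading of this behaviour is sketched below: honour an explicit draw count, otherwise use the minimum. The function name `resolve_draws` is hypothetical, and `MIN_DRAWS = 100` assumes the value after the reduction from 500 mentioned earlier in this PR.

```python
MIN_DRAWS = 100  # assumed; the PR reduces this constant from 500 to 100


def resolve_draws(draws=None, minimum=MIN_DRAWS):
    """Use the caller's draw count if given, otherwise the minimum."""
    return minimum if draws is None else draws


default_draws = resolve_draws()         # notebook did not ask for a count
explicit_draws = resolve_draws(1000)    # notebook asked for 1000 draws
```

Honouring the requested count matters for notebooks that later index into the draws by position, since a shorter result would make those lookups fail.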

drbenvincent requested review from Copilot and williambdean December 20, 2025 21:52

drbenvincent added the documentation and devops labels Dec 20, 2025

Copilot started reviewing on behalf of drbenvincent December 20, 2025 21:52

Copilot AI reviewed Dec 20, 2025

drbenvincent and others added 10 commits December 24, 2025 04:13

Update scripts/run_notebooks/README.md

8340526

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update scripts/run_notebooks/injected.py

a58bd62

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update scripts/run_notebooks/injected.py

73481eb

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update scripts/run_notebooks/runner.py

ecd690a

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

attempt to fix failing remote notebook execution test

555b4ad

fix pre-commit checks

702089e

Add skip list for incompatible notebooks

2d68170

Update imports in iv_pymc notebook

4bf53ae

drbenvincent added 4 commits December 24, 2025 08:08

Fix sampling bug in uncertainty plot

d05002d

Update skipped notebooks list for CI environment

c91c5d8

Add iv_weak_instruments.ipynb to skipped notebooks list

79323dc

Add Graphviz installation to CI workflow

c15a229

drbenvincent requested review from NathanielF and juanitorduz December 24, 2025 09:14

cursor bot reviewed Dec 24, 2025

Merge branch 'main' into notebook-testing

f2e7b46

update pre-commit checks

5e8005c

drbenvincent mentioned this pull request Jan 7, 2026

iv_vs_priors.ipynb fails: az.plot_energy() incompatible with numpyro sampler #634

Closed

drbenvincent added 5 commits January 8, 2026 20:49

Merge branch 'main' into notebook-testing

1181a76

run pre-commit

52364ea

Merge branch 'main' into notebook-testing

6f28bb7

Update interrogate badge to 96.4% coverage

2277d2a

Remove inv_prop_pymc.ipynb from skip list

da4e06c

drbenvincent added 3 commits January 17, 2026 19:57

Merge branch 'main' into notebook-testing

0049406

run pre-commit

7aeabd3

Improve notebook runner robustness

97db952

drbenvincent mentioned this pull request Jan 17, 2026

Add full notebook re-execution command #672

Open

drbenvincent added 4 commits January 17, 2026 21:13

Respect requested draws in notebook mock sampling

3e21285

update readme explaining local usage

51ead43

remove inv_prop_latent.ipynb from being skipped

dcf5361

remove watermark stuff from notebook

dede24d

drbenvincent wants to merge 29 commits into main from notebook-testing
