Avoid mutating sklearn estimators in DiD experiments #693
Open
drbenvincent wants to merge 4 commits into main
Conversation
Files changed (4 in total): 📝 4 modified | ➕ 0 added | ➖ 0 deleted
Codecov Report ✅ All modified and coverable lines are covered by tests.

@@ Coverage Diff @@
## main #693 +/- ##
==========================================
+ Coverage 94.35% 94.39% +0.03%
==========================================
Files 44 44
Lines 7517 7544 +27
Branches 456 456
==========================================
+ Hits 7093 7121 +28
Misses 262 262
+ Partials 162 161 -1
…tion tests

Move the `_ensure_sklearn_fit_intercept_false()` call from inside the `elif isinstance(self.model, RegressorMixin)` block to before the model-type dispatch in both DiD and StaggeredDiD. This makes the isinstance early-return guard (line 70 in base.py) naturally reachable by existing PyMC integration tests, fixing the codecov/patch failure. Also add a `test_did_sklearn_fit_intercept_false` integration test covering the "no warning needed" path when `fit_intercept` is already `False`.

Co-authored-by: Cursor <cursoragent@cursor.com>
Summary
- Clone the user's sklearn estimator instead of mutating it in place when `fit_intercept=False` is required
- Emit a `UserWarning` when the clone occurs so users are aware and can fix their code
- Centralise the `fit_intercept` handling in `BaseExperiment._ensure_sklearn_fit_intercept_false()`
- Fix the `codecov/patch` failure by ensuring full branch coverage of the new method

Details
DiD experiments require `fit_intercept=False` because the design matrix already contains an explicit intercept column (`~ 1 + ...`). Previously, the experiment constructor mutated the user's model in place (`model.fit_intercept = False`), which was a hidden side effect. Now the method clones the estimator via `sklearn.base.clone()`, sets the parameter on the copy, and issues a warning explaining why.
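As a rough illustration, the cloning logic could look something like the sketch below. The method name, the `RegressorMixin` guard, and the use of `sklearn.base.clone()` come from this PR; the warning text, the `getattr` check, and the surrounding class scaffolding are assumptions, not the actual implementation.

```python
import warnings

from sklearn.base import RegressorMixin, clone


class BaseExperiment:
    """Simplified stand-in for the real base class; only the relevant method is sketched."""

    def __init__(self, model):
        self.model = model

    def _ensure_sklearn_fit_intercept_false(self) -> None:
        # Defensive guard: do nothing for non-sklearn (e.g. PyMC) models.
        if not isinstance(self.model, RegressorMixin):
            return
        # Only intervene if the user's estimator would fit its own intercept.
        if getattr(self.model, "fit_intercept", False):
            warnings.warn(
                "Setting fit_intercept=False on a clone of your model because the "
                "design matrix already contains an explicit intercept column.",
                UserWarning,
            )
            # Clone rather than mutate the user's estimator in place.
            self.model = clone(self.model)
            self.model.fit_intercept = False
```

Because the guard lives inside the method itself, callers never need to check the model type first, which is what makes the unconditional call described below safe.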
Coverage fix

The `_ensure_sklearn_fit_intercept_false()` method in `base.py` has a defensive isinstance guard that returns early for non-sklearn models (line 70). Originally, the method was called from inside the `elif isinstance(self.model, RegressorMixin):` blocks in both `diff_in_diff.py` and `staggered_did.py`, which made the guard unreachable: the caller had already confirmed the model was sklearn, so the isinstance check could never be `True`. This caused the `codecov/patch` check to fail.
The fix moves the `_ensure_sklearn_fit_intercept_false()` call to before the model-type dispatch (the `if PyMCModel / elif RegressorMixin` block) in both experiment classes. Since the method already has a built-in guard for non-sklearn models, calling it unconditionally is safe and semantically cleaner. Now any PyMC integration test naturally exercises the early-return path, providing full branch coverage without artificial unit tests.
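To make the change concrete, here is a hedged sketch contrasting the old and new call sites, continuing the `BaseExperiment` sketch above. The `PyMCModel` / `RegressorMixin` dispatch and the moved call are described in this PR; the constructor signature, import path, and the fitting logic behind the `...` placeholders are simplified assumptions.

```python
from sklearn.base import RegressorMixin

from causalpy.pymc_models import PyMCModel  # import path assumed


class DifferenceInDifferences(BaseExperiment):
    def __init__(self, data, formula, model, **kwargs):
        super().__init__(model)

        # Before the fix (dead guard): the call sat inside the sklearn branch,
        # so the isinstance early-return in the helper could never trigger.
        #
        #     if isinstance(self.model, PyMCModel):
        #         ...
        #     elif isinstance(self.model, RegressorMixin):
        #         self._ensure_sklearn_fit_intercept_false()
        #         ...

        # After the fix: call unconditionally, before the model-type dispatch.
        # The helper's own guard makes this a no-op for PyMC models, so the
        # existing PyMC integration tests now exercise the early-return branch.
        self._ensure_sklearn_fit_intercept_false()

        if isinstance(self.model, PyMCModel):
            ...  # Bayesian fitting path
        elif isinstance(self.model, RegressorMixin):
            ...  # sklearn (OLS) fitting path
```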
A new integration test, `test_did_sklearn_fit_intercept_false`, was also added to cover the "no warning needed" path (when the model already has `fit_intercept=False`).

Testing
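For instance, the new test could look roughly like this. The test name comes from this PR; the dataset, formula, and keyword arguments are assumptions based on the CausalPy DiD examples, so the details may differ from the actual test.

```python
import warnings

from sklearn.linear_model import LinearRegression

import causalpy as cp


def test_did_sklearn_fit_intercept_false():
    """No UserWarning should be emitted when fit_intercept is already False."""
    df = cp.load_data("did")  # dataset name assumed
    with warnings.catch_warnings():
        warnings.simplefilter("error", UserWarning)  # fail on any UserWarning
        result = cp.DifferenceInDifferences(
            df,
            formula="y ~ 1 + group*post_treatment",
            time_variable_name="t",
            group_variable_name="group",
            model=LinearRegression(fit_intercept=False),
        )
    assert result.model.fit_intercept is False
```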
Issues