adding weights to regressions with up to 2 FE dimensions#12
adding weights to regressions with up to 2 FE dimensions#12jamesbrandecon wants to merge 5 commits intomainfrom
Conversation
|
@grantmcdermott if you can, it'd help to confirm that weighted regs match feols on your end. Next task is to figure out if I can refactor the dbplyr stuff to avoid writing temp tables, which I think will be hard but probably necessary. This is also the second step of #11, at least for TWFE. More work for HDFE. |
|
Awesome. I think the next step is to set up a test suite and CI so that we can iterate more systemically and securely. I've got a couple of things on today, but will try to do that when I get sec. Then we can merge into this branch and add weighted tests too before diving into the code. |
|
I agree! Already it was unwieldy to make sure I haven't broken anything. Happy to defer to you on test structure/content. |
|
Tests and CI added. If you merge from main locally and then push, they should auto propagate. (You can also add additional tests as part of this PR. They need to be saved as |
|
Thanks! Any attachment to the current test structure? Locally, I'm simulating one small-ish dataset and then running regs with many different options and comparing to feols as source of truth. So |
|
No attachment, but it does need to follow the |
|
Just a quick note: the CI will fail because you've got new undeclared dependencies |
|
To clarify: Just add (It doesn't need to be in |
|
I'll try harder on testing later today -- sorry for the CI breaking emails. Need to learn local testing in R! |
|
No worries! I actually don't get the breaking emails (since it's your PR). The workflow is quite nice once you get used to it. I might add a Makefile when I get a chance, but two tips for local testing in the meantime:
|
|
Thanks! Now I get the flow. I hadn't realized we weren't importing dplyr yet -- is that something you want to avoid? I was relying heavily on dplyr/dbplyr and pipes to get alternating projections to work. |
I'd prefer not to add it as a dependency if we can avoid it. At the same time, I obviously don't want to make our lives harder than it needs be. Regardless, maybe it's a good idea to touch base over a call at this point? The weights functionality sounds very useful regardless of where we land on HDFE. I don't want to lose that through too much integration with alternating projects functionality yet.... Pure SQL alternating projectionsSpeaking of which, here is my largely vibe-coded "pure SQL" implementation of alternating projections. Right now, this is just a standalone function that does the AP (using temp tables) and then calls
|
|
Ok yeah let's talk live! We can email to schedule. Fwiw, three quick points:
|
|
@grantmcdermott for now, can you unblock me on adding dplyr as a dependency and let me build this stuff out (weights and AP)? Always able to go back and take other approaches if I make it modular |
126e86d to
d1a1092
Compare
d1a1092 to
f18f12b
Compare
|
@grantmcdermott I am still making sure the speed is ok, but would be interested in any comments you have here. I needed weights for a couple of the things I want to add next, so I went back and had codex undo all the dplyr stuff. Hopefully that lets us merge this finally |
|
Thanks @jamesbrandecon. Realistically, I won't be able to review under the weekend (at least). QQ, though: what do you mean "alternating projections for demean strategy" in commits above? Is this full blown, FWL style iterative AP? Or something else? |
|
@grantmcdermott If I am thinking correctly (I independently came to this conclusion in the previous version above, which I mostly discarded here), the demeaning with num_FEs>2 is best done via alternating projections. I guess I've never looked deeply into AP algos so maybe I am missing something but I just loop over each FE dimension and take out the mean, repeating until convergence. |
First step of #11
I am running local tests of every combination of no FEs, 1 FE, weights, and no weights, and the output matches feols exactly.