[kotlin2cpg] Lambda capture fix (#8544)#5739
Conversation
...cli/frontends/kotlin2cpg/src/main/scala/io/joern/kotlin2cpg/ast/AstForFunctionsCreator.scala
Show resolved
Hide resolved
...cli/frontends/kotlin2cpg/src/main/scala/io/joern/kotlin2cpg/ast/AstForFunctionsCreator.scala
Outdated
Show resolved
Hide resolved
48622f7 to
28fe7b9
Compare
|
I have changed the logic, as the previous approach did not properly handle nested lambdas. Now the logic traverses the CompilerAPI AST directly to find captures instead of trying to first create the CPG then trying to deduce closure bindings from that. |
28fe7b9 to
baa0e6a
Compare
|
Could you please add a test for nested lambdas confirming that capture locals and the corresponding closure bindings are created for each intermediate step (if this does not already exist outside this PR)? For example in java import java.util.function.Supplier;
class Test {
Supplier<Integer> foo(int capturedParam) {
return () -> {
Supplier<Integer> sp = () -> capturedParam;
return sp.get();
};
}
}the outer lambda would need a local and closure binding for the I recently had to fix this for javasrc2cpg: #5795 |
|
I have added assertions to an existing test. Let me know if you think that would be sufficient? |
908ce17 to
839bcc1
Compare
|
|
||
| cb._refOut.size shouldBe 1 | ||
| "should only capture the used local" in { | ||
| cpg.methodRef.outE.collectAll[Capture].size shouldBe 1 |
There was a problem hiding this comment.
Wherever possible avoid going to the edge level in tests and use steps instead. Here _closureBindingViaCaptureOut.
| cpg.closureBinding._localViaRefOut.name.toSet shouldBe Set("baz") | ||
| cpg.closureBinding._methodParameterInViaRefOut.name.toSet shouldBe empty |
There was a problem hiding this comment.
Instead of starting with new cpg.... queries it would be better to make a variable for the method reference and then expand from there. Like it is now you do not test that the closure bindings are associated with the right methodRef.
| case node: NewMethodParameterIn => NodeContext(node, node.name, node.typeFullName) | ||
| case node: NewLocal => NodeContext(node, node.name, node.typeFullName) | ||
| } | ||
| .filter(capturedNodeContext => hasDestructuringParam || referencedNames.contains(capturedNodeContext.name)) |
There was a problem hiding this comment.
Why are all capturedNodeContext taken if there is even a single destructuring parameter?
Are there tests for this?
There was a problem hiding this comment.
Thinking a bit more about it, this is a bit of a hack. The idea is that a lambda with at least one destructuring parameter bypasses usage filtering and captures all visible outer locals/params as a compatibility workaround for destructuring. The OSS dataflow engine seems very particular about these sorts of things. I'll try to think of a better method to solve this issue.
Ideally true usage-based capturing would be possible, but unfortunately this heuristic based method is the best that I found possible with my current understanding of the kotlin compiler API and internals. I'll do some more reading into the internals and reading more into how other frontends handle this for a potential cleaner fix.
There was a problem hiding this comment.
This test covers this case indirectly, but I'll add a test to cover this behavouir directly.
| val generatedName = s"${Constants.DestructedParamNamePrefix}${destructedParamKeyPool.next}" | ||
| paramAsts.append(astForParameter(param, valueParamStartIndex + idx, Some(generatedName))) | ||
| destructedParamAsts.appendAll(astsForDestructuring(param, Some(generatedName))) | ||
| } else { | ||
| paramAsts.append(astForParameter(param, valueParamStartIndex + idx)) |
There was a problem hiding this comment.
Seems to be an unrelated fix for the naming for destructuring parameters. Cant find a corresponding change in the tests. What am I missing?
There was a problem hiding this comment.
The generated destructuring parameter name is passed into astsForDestructuring(...) so the synthetic it initializer (tmp = it) can be wired to the real lambda parameter node instead of a standalone local, which restores correct closure/dataflow edges (at least for the OSS dataflow engine).
That was needed because, after tightening capture to used variables, the old destructuring path lost the parameter-to-destructured-value link and caused the lambda dataflow test to go empty, so this change preserves the expected flow.
I'll add a test to test the parameter name generation. The Kotlin2cpg frontend does have a lot of missing tests for edge cases like this. I'll make a note for future work to also check that tests are sufficient to test edge cases like this.
The test that caught this issue was this:
|
I have pushed another commit with some small changes, but still need to think through #5739 (comment) and #5739 (comment) |
Resolves: https://github.com/ShiftLeftSecurity/codescience/issues/8544
This commit solves an issue where lambdas incorrectly capture variables that aren't used within the lambda. As an example, for the given code, the method parameter x and Bar.this are captured into the lambda. Only baz should be captured into the lambda.