[kotlin2cpg] Add support for KtCallableReferenceExpression#5775
[kotlin2cpg] Add support for KtCallableReferenceExpression#5775SuperUserDone wants to merge 11 commits intomasterfrom
Conversation
...rontends/kotlin2cpg/src/test/scala/io/joern/kotlin2cpg/querying/CallableReferenceTests.scala
Outdated
Show resolved
Hide resolved
...rontends/kotlin2cpg/src/test/scala/io/joern/kotlin2cpg/querying/CallableReferenceTests.scala
Outdated
Show resolved
Hide resolved
...rontends/kotlin2cpg/src/test/scala/io/joern/kotlin2cpg/querying/CallableReferenceTests.scala
Outdated
Show resolved
Hide resolved
...rontends/kotlin2cpg/src/test/scala/io/joern/kotlin2cpg/querying/CallableReferenceTests.scala
Outdated
Show resolved
Hide resolved
8d8282f to
47108a6
Compare
47108a6 to
ee3e39e
Compare
|
The latest set of changes are major overhaul of the methodology of this PR. This now closely follows what the kotlin compiler does. For unbound references to a global function For bound references to a companion object This does not work around kotlin compiler bug https://youtrack.jetbrains.com/issue/KT-54316, so the behavior is identical to the Kotlin compiler in the documented case. This only affects the companion case for assigned references. This affects cases like this. This code won't compile because the Utils::validate ref has a broken signature of (Utils, Int) -> Boolean. class Utils {
companion object {
fun validate(x: Int): Boolean = x > 0
}
}
val ref: (Int) -> Boolean = Utils::validateThis case is fine tough and types are resolved correctly class Utils {
companion object {
fun validate(x: Int): Boolean = x > 0
}
}
fun bar(ref: (Int) -> Boolean) {}
bar(Utils::validate) |
johannescoetzee
left a comment
There was a problem hiding this comment.
I've only checked the tests so far, but have left a few comments about the higher-level design and some nitpicks. I'll probably only get to reviewing the implementation tomorrow.
| val forwardedArgs = processCall.argument.isIdentifier.l | ||
| forwardedArgs.size shouldBe 2 | ||
| forwardedArgs.map(_.typeFullName).toSet shouldBe Set("int", "java.lang.String") |
There was a problem hiding this comment.
With the way this is written, it's possible to miss non-identifier arguments or to have the correct arguments in the wrong order without the test catching it. It's a nitpick, but in my opinion it is better to test explicitly, so something like
processCall.argument.map(_.typeFullName).toList shouldBe List(<something>, "int", "java.lang.String")
or
inside(processCall.argument.l) { case List(<something>, intArg: Identifier, stringArg: Identifier) =>
intArg.typeFullName shouldBe "int"
...
}
It's more cumbersome to write and can be harder to read, but is also occasionally useful for catching weird bugs
There was a problem hiding this comment.
Please check that REF edges from the process arguments to the invoke parameters are being created correctly as well (in this method, but also wherever an Identifier node is added to the CPG). A lot of dataflow issues in the past were caused by missing REF edges (or REF edges referring to the wrong local/param) and testing it in the frontend AST tests is much faster than figuring it out through dataflow debugging
There was a problem hiding this comment.
Have adjusted some of the tests to use this pattern where I thought it made sense. In most of the tests also added checks for the ref edges to the identifier nodes. There was a bug that I caught through that and resolved.
...rontends/kotlin2cpg/src/test/scala/io/joern/kotlin2cpg/querying/CallableReferenceTests.scala
Show resolved
Hide resolved
...rontends/kotlin2cpg/src/test/scala/io/joern/kotlin2cpg/querying/CallableReferenceTests.scala
Outdated
Show resolved
Hide resolved
...rontends/kotlin2cpg/src/test/scala/io/joern/kotlin2cpg/querying/CallableReferenceTests.scala
Outdated
Show resolved
Hide resolved
| } | ||
|
|
||
| "create a constructor call for the synthetic type with receiver as parameter" in { | ||
| val ctorCalls = cpg.call.nameExact("<init>").methodFullName(".*Function2Impl.*<init>.*").l |
There was a problem hiding this comment.
| val ctorCalls = cpg.call.nameExact("<init>").methodFullName(".*Function2Impl.*<init>.*").l | |
| val ctorCalls = cpg.call.methodFullName(".*Function2Impl.*<init>.*").l |
should return the same, unless there's a type decl with <init> in the name. You could also use methodFullNameExact with the exact method full name to avoid any conflicts.
There was a problem hiding this comment.
This is perhaps a stylistic thing, but I also prefer to write these in a way that gives more information about where the call occurs, for example cpg.method("test").call.... How far to take this depends on the complexity of the code in question though and here just finding the <init> directly is probably fine
...rontends/kotlin2cpg/src/test/scala/io/joern/kotlin2cpg/querying/CallableReferenceTests.scala
Outdated
Show resolved
Hide resolved
...rontends/kotlin2cpg/src/test/scala/io/joern/kotlin2cpg/querying/CallableReferenceTests.scala
Show resolved
Hide resolved
...rontends/kotlin2cpg/src/test/scala/io/joern/kotlin2cpg/querying/CallableReferenceTests.scala
Outdated
Show resolved
Hide resolved
...rontends/kotlin2cpg/src/test/scala/io/joern/kotlin2cpg/querying/CallableReferenceTests.scala
Outdated
Show resolved
Hide resolved
...rontends/kotlin2cpg/src/test/scala/io/joern/kotlin2cpg/querying/CallableReferenceTests.scala
Show resolved
Hide resolved
…lib CallableReference implementation
ml86
left a comment
There was a problem hiding this comment.
So far I only reviewed the non test code. Once the comments are incorporated I will take a look at the tests.
| None | ||
| } | ||
| } | ||
| .flatMap(paramDesc => Some(paramDesc.getType)) |
There was a problem hiding this comment.
Extra flatMap seems superfluous. Just do the .getType in the above flatMap.
| protected def createUnboundSamTypeDecl(samInfo: UnboundSamInfo): Ast = { | ||
| val samTypeDecl = typeDeclNode( | ||
| samInfo.expr, | ||
| samInfo.samImplClass.split('.').last, |
There was a problem hiding this comment.
Is this quaranteed to always have a '.'?
There was a problem hiding this comment.
No, its not, but does String.split(..) not contain the original string as the only element of the array if so? Am I missing something?
There was a problem hiding this comment.
You are right. For whatever reason I was thinking that split returns an empty array if the character is not contained in the string.
|
|
||
| val samMethodFullName = s"${samInfo.methodRefName}:${samInfo.signature}" | ||
|
|
||
| val methodRefBinding = |
| protected def createBoundSamTypeDecl(samInfo: BoundSamInfo): Ast = { | ||
| val samTypeDecl = typeDeclNode( | ||
| samInfo.expr, | ||
| samInfo.samImplClass.split('.').last, |
| ): Unit = { | ||
| val binding = bindingNode(methodName, signature, methodFullName) | ||
| bindingInfoQueue.prepend( | ||
| BindingInfo(binding, Seq((typeDecl, binding, EdgeTypes.BINDS), (binding, methodNode, EdgeTypes.REF))) |
There was a problem hiding this comment.
The edge meta data for the REF edge is superfluous as long as it points to the same method as indicated by methodFullName. The backend will create the edge.
There was a problem hiding this comment.
Good to know. Was one of the pain points as it was not possible to get a ref to the method node here, hence the duplication here: #5775 (comment)
| val identNode = | ||
| identifierNode(samInfo.expr, paramName, paramName, registerType(paramType)).argumentIndex(idx + 1) | ||
| paramAst.root | ||
| .collect { case p: NewMethodParameterIn => |
There was a problem hiding this comment.
Avoid single letter variable names.
| val identNode = | ||
| identifierNode(samInfo.expr, paramName, paramName, registerType(paramType)).argumentIndex(idx + 1) | ||
| paramAst.root | ||
| .collect { case p: NewMethodParameterIn => |
There was a problem hiding this comment.
Best to have valueParameterAsts be not yet Ast wrapped NewMethodParameterIn nodes. That way you can avoid the collect.
| samMethodName: String, | ||
| samMethodSig: String, | ||
| samGenericMethodSig: Option[String], | ||
| receiverParam: (Ast, String), |
There was a problem hiding this comment.
The second tuple value seems never to be used.
| parameterInNode(samInfo.expr, "this", "this", 0, false, EvaluationStrategies.BY_SHARING, samInfo.samImplClass) | ||
| .dynamicTypeHintFullName(IndexedSeq(samInfo.samImplClass)) | ||
| val receiverParamName = | ||
| samInfo.receiverParam._1.root.collect { case expr: ExpressionNew => expr.code }.getOrElse("receiver") |
There was a problem hiding this comment.
If expr is e.g. a call expression this will result in strange looking parameter names. I suggest you just always name the parameter receiver.
| } | ||
| } | ||
|
|
||
| private def extractReturnTypeFromSignature(signature: String): String = { |
There was a problem hiding this comment.
The method signature you pass into this function was at some point composed from a return type and the parameter types and now you parse it in order to extract the return type again. Please avoid that and propagated the not yet combined return and parameter types.
| samInfo.expr, | ||
| receiverParamName, | ||
| receiverParamName, | ||
| "receiver", |
There was a problem hiding this comment.
Use constant for "receiver" string.
| .getContributedDescriptors(DescriptorKindFilter.FUNCTIONS, _ => true) | ||
| .asScala | ||
| .collectFirst { | ||
| case functionDescriptor: FunctionDescriptor if functionDescriptor.getModality.toString == "ABSTRACT" => |
There was a problem hiding this comment.
I suspect that we can avoid using .toString and instead check the modality directly?
There was a problem hiding this comment.
That is correct. Will change that. I cannot remember why I wrote it like this.
| val syntheticTypeDecl = cpg.typeDecl.fullName(".*Function2Impl.*").head | ||
| val invokeMethod = syntheticTypeDecl.method.name("invoke").head |
There was a problem hiding this comment.
All code outside of an actual concrete test that references cpg must be evaluated lazily because it will otherwise trigger the CPG generation even if none of theses tests are selected. The cpg value itself is an exception and does not need to be lazily evaluated because the CPG is only created open first us of cpg. For details look into the KotlinCode2CpgFixture and its super classes.
| receiverAccess.ast.isIdentifier | ||
| .name("this") | ||
| .head | ||
| .typeFullName shouldBe "kotlin.jvm.internal.CallableReference" |
There was a problem hiding this comment.
Static type of this should be the name of the generated wrapper class.
In order to test the field identifier and the this identifier better use receiverAccess.argument(2).asInstanceOf[FieldIdentifier]... and receiverAccess.argument(1).asInstanceOf[Identifier].... That way you encode the expected positions and check that these are really direct AST children of the call instead of being somewhere in the AST below it.
| ctorCalls.size shouldBe 1 | ||
|
|
||
| val ctorCall = ctorCalls.head | ||
| ctorCall.methodFullName shouldBe "com.test.Handler.process$kotlin.jvm.functions.Function2Impl.invoke:boolean(int,java.lang.String).<init>:void(com.test.Handler)" |
There was a problem hiding this comment.
Fullname seems longer than necessary. Seems like .invoke:boolean(int,java.langString) can be omitted without the danger of a naming collisions.
There was a problem hiding this comment.
If com.test.Handler.process is overloaded, it might cause a collision, so at the very least the .invoke can be omitted, but I think the signature should stay.
| "have constructor body that assigns receiver field" in { | ||
| val constructor = syntheticTypeDecl.method.name("<init>").head | ||
|
|
||
| constructor.ast.isCall.name("<init>").methodFullName(".*CallableReference.*").l shouldBe List.empty |
There was a problem hiding this comment.
What is this supposed to test?
There was a problem hiding this comment.
Sorry, was a leftover from #5775 (comment). Will remove.
| val syntheticTypeDecl = cpg.typeDecl.fullName(".*Function1Impl.*").head | ||
| val invokeMethod = syntheticTypeDecl.method.name("invoke").head |
There was a problem hiding this comment.
Again make these lazily evaluated.
| val syntheticTypeDecl = cpg.typeDecl.fullName(".*Function1Impl.*").head | ||
| val invokeMethod = syntheticTypeDecl.method.name("invoke").head | ||
|
|
||
| "use STATIC_DISPATCH without receiver access" in { |
There was a problem hiding this comment.
Validation of static call target is missing. Check call.methodFullName.
There was a problem hiding this comment.
I would also like to see validation that the companion object is given as argument 0 to the validate call.
There was a problem hiding this comment.
Will also explicitly set the argumentIndex in the implementation code to ensure this.
| .l | ||
| .filter(_.bindingTypeDecl.fullName.contains("globalFunction")) | ||
| .sortBy(_.methodFullName) | ||
| bindings.size shouldBe 2 |
There was a problem hiding this comment.
Better to select the type decl you are interested in and than use .methodBinding.
| class CallableReferenceTests extends KotlinCode2CpgFixture(withOssDataflow = false) { | ||
|
|
||
| "CPG for code with simple callback usage" should { | ||
| "SAM implementation for bound instance method reference" should { |
There was a problem hiding this comment.
A binding table test for the generated type decl seems to be missing.
There was a problem hiding this comment.
Yes, only for the unbound case currently exist. Will add for the bound case.
No description provided.