Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
@FlorianKothmeier thank you for your contribution! Would you please confirm that you intend to sign the CLA? Unfortunately we are unable to accept contributions without the CLA. Assuming that you sign the CLA, I'll be able to give your PR a thorough review early next week - I look forward to taking a closer look!
@mike-hunhoff I thought I had already signed the CLA. At least the CLA check in the CI passes now. Can you point me in the right direction on what I'm missing?
Yes you are correct, sorry for missing that. CLA check is passing now so you're good to go, thank you! |
Resolves #5
This should work for most exposed Java functionality. Getting code completion for Python is more complicated due to dynamic typing.
I first tried to port the existing Jython code from the Ghidra repository, but this ultimately failed because the Ghidra Python 2 code just didn't work even after porting it to valid Python 3. Furthermore, the Java introspection side relied on Jython features, which were also hard to port over.
So I gave up and wrote a new implementation. IMHO this code is more readable and robust, because it achieves most of its work with the Python standard library, uses no Python magic, and doesn't use `eval()` everywhere. It mostly traverses the AST generated by Python. The only case this code uses `eval` for is getting properties from local scope.

Still, a lot of the boilerplate on the Java side was copied over (and adapted) from the Ghidra implementation, such as the code completion colors.
How does this work? (Not a complete overview; only the important points)
The code is split into two parts. A Python side, `complete.py`:

- `ast.parse` to turn the input into an AST
- `eval`, used only to get properties from local scope
- `inspect.getmembers` for a list of all members, to find the one we want
- `PythonCodeCompletionFactory.getReturnType`, which returns a faked Java object of the return type (more on that later)
- `inspect.getmembers` again, to turn the members into code completions. If we have a member typed out (e.g. `currentProgram().getM`), we filter this list by a prefix.

The Java side mostly revolves around implementing the "faked" objects returned by `PythonCodeCompletionFactory.getReturnType`.

We use a private inner class `InspectableJavaObject<T>`, which holds a reference to the `java.lang.Class` of the object we want to fake. We can then get a property list via `getProperties()`, which returns either more `InspectableJavaObject`s for fields or another type, `InspectableJavaMethod`, for functions defined in that class. The `InspectableJavaMethod` needs a special case in the `getReturnType` implementation that resolves the Java class of the object type we wanted to fake instead.

Additionally, due to a Jep limitation, an interpreter may only be used from the same thread it was created on. Therefore I had to move the interpreter to its own Java thread. Any action on the interpreter can now be run by submitting it as a Future and waiting for its completion.
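The thread-confinement pattern described above lives on the Java side in this PR. The following is only a Python analogy built on `concurrent.futures`, with a hypothetical `ConfinedInterpreter` class, to show the idea of funnelling every action through one dedicated thread and waiting on the returned future.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ConfinedInterpreter:
    """Hypothetical stand-in: runs every action on one dedicated worker thread."""

    def __init__(self):
        # A single worker guarantees that all submitted actions share one thread.
        self._executor = ThreadPoolExecutor(max_workers=1)

    def run(self, action, *args):
        """Submit `action` and block until its result is available."""
        return self._executor.submit(action, *args).result()

    def close(self):
        self._executor.shutdown()


interp = ConfinedInterpreter()
first = interp.run(threading.get_ident)
second = interp.run(threading.get_ident)
interp.close()

# Both actions ran on the same worker thread, not on the caller's thread.
print(first == second, first != threading.get_ident())
```

Callers on any thread can use `run`, while the wrapped resource is only ever touched by the worker thread.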
I also got rid of the `exec` call in `jepwrappers.py`, as this made it hard to get the type information into the signature of the wrapper functions.

Limitations
`complete.py` uses a lot of match statements for traversing the AST. Unfortunately, match statements are a relatively new addition to Python, having only been added in 3.10. However, typing this out with `if ... else` chains is a nightmare. If this is a problem, we could fall back to no autocompletion in case the module cannot be loaded.

Furthermore, there is a special case for the `str` type. Jep converts between some Java and Python types automatically. This is mostly observed on `java.lang.String` <-> `str`.

So when the Java side reports that a function returns `java.lang.String`, the Python side has to fix the type to `str`, or else the completions are wrong. While this isn't pretty, I'm not sure if there is a better solution.

The same problem applies to other types as well. However, I fear the other cases are not as straightforward as this one. Additionally, the string problem is the most common one, so this should work fine for most cases.
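A minimal sketch of such a type fix-up might look like the following. The table `JAVA_TO_PYTHON` and the function `fix_return_type` are hypothetical names for illustration. Only the `java.lang.String` entry is described in this PR, and the actual code may handle the special case differently.

```python
# Hypothetical mapping; only the java.lang.String entry comes from the PR text.
JAVA_TO_PYTHON = {"java.lang.String": str}


def fix_return_type(java_class_name, fallback):
    """Return the Python type to complete on for a reported Java return type.

    `fallback` is whatever object would be used when no conversion applies.
    """
    return JAVA_TO_PYTHON.get(java_class_name, fallback)


# Completions are then computed on `str` instead of the faked Java object.
target = fix_return_type("java.lang.String", object)
print([name for name in dir(target) if name.startswith("upp")])  # ['upper']
```

Further conversions could be added as extra entries in the table, if they turn out to be as direct as the string case.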
The Python `locals()` function cannot be evaluated directly from the Java side, as it must be run from inside Python. Unfortunately, this is needed to get the local variables. The current workaround is to eval it and assign the result to a variable that can be retrieved later, which is ugly because it pollutes the script environment (and needs a special case so it does not break the autocompletion).

As we are trying to parse the command character by character, the completion runs in `O(n^2)` time, which could be a problem for longer inputs. I tested completion on the string `"foo("*500 + "currentProgram().getM"` and the results still appeared instantly, so this hopefully shouldn't be a problem. This string is already too long for the interpreter console anyway.

Completions on the Python side are limited mostly by the signatures of Python functions. Therefore it is quite important that the return values of the builtin wrapper functions are correctly typed, or else code completion will not work on them.
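To show why the return annotations matter, here is a small sketch of how a completer could read them with `inspect.signature`. The helper `return_type_of` and both wrapper functions are hypothetical and are not the wrappers from `jepwrappers.py`.

```python
import inspect


def return_type_of(func):
    """Return the annotated return type of `func`, or None if it has none."""
    annotation = inspect.signature(func).return_annotation
    if annotation is inspect.Signature.empty:
        return None
    return annotation


def typed_wrapper() -> str:  # hypothetical wrapper with a return annotation
    return "example"


def untyped_wrapper(*args):  # hypothetical wrapper without one
    return "example"


print(return_type_of(typed_wrapper))    # <class 'str'>
print(return_type_of(untyped_wrapper))  # None
```

Without an annotation there is no type to inspect short of actually calling the function, so the completer has nothing to offer for the result.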
Additionally, this code does not give any hints on the argument types or the number of arguments. This may be nontrivial though, especially considering the wrapper functions have no sensible function signature arguments, but accept any args.
Furthermore, this only covers the basic completion cases. More challenging cases such as array subscripts and import autocompletions are not covered at all.