Drop attempts to shadow lexical definitions in LLM generated code #437

@kiranandcode

Since we allow the LLM access to the lexical context of the enclosing function (as implemented by @datvo06's #434), one common problem I've noticed is that LLMs often ignore instructions NOT to redefine the types they are passed.

For example:

@Template.define
def make_double_counter(char: str) -> Callable[[str], int]:
    """Create a function that counts occurrences of '{char}' and doubles the result.
    Use the double_count helper function.

    Raise {NotHandled} in the case of ill-formed input.
    """
    raise NotImplementedError

class ProgramSynthesisInspectCode(ProgramSynthesis):
    def _parse_and_eval[T](self, t: type[T], content: str) -> T:
        print(content)
        return super()._parse_and_eval(t, content)

with handler(LiteLLMProvider()), handler(ProgramSynthesisInspectCode()):
    f = make_double_counter('p')
    print(f('pineapple'))
    print(inspect.getsource(f))

produces output:

<code>
from collections.abc import Callable

# Helper function to double the count
def double_count(count: int) -> int:
    return count * 2

# Custom exception class
class NotHandled(Exception):
    """Raised by an operation when the operation should remain unhandled."""
    pass

# Main function definition
def count_and_double(input_str: str) -> int:
    if not isinstance(input_str, str):
        raise NotHandled("Input is not a valid string.")
    
    count_p = input_str.count('p')
    return double_count(count_p)
</code>

Namely, the LLM has redefined the NotHandled exception. Worse, the redefinition lives in a transient module, so there is no way for a caller to catch this specific NotHandled class: an `except NotHandled` written against the original class will not match it, and the generated code will behave in a way that is confusing to the user.
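To make the failure mode concrete, here is a minimal, library-free sketch. It assumes the generated code is evaluated with `exec` against a namespace seeded with the lexical context, which may not match how `_parse_and_eval` actually works; the `NotHandled` class and the `call_and_classify` helper below are stand-ins made up for illustration.

```python
# Stand-in for the library's NotHandled; an assumption for illustration only.
class NotHandled(Exception):
    """Raised by an operation when the operation should remain unhandled."""


# Abbreviated version of the LLM output shown above.
generated_source = '''
class NotHandled(Exception):
    pass

def count_and_double(input_str):
    if not isinstance(input_str, str):
        raise NotHandled("Input is not a valid string.")
    return input_str.count("p") * 2
'''

# Seed the namespace with the lexical context, then evaluate the generated code.
# The class statement in the generated code rebinds "NotHandled" in this namespace.
namespace = {"NotHandled": NotHandled}
exec(generated_source, namespace)
f = namespace["count_and_double"]


def call_and_classify(fn, arg):
    """Hypothetical helper: report how a caller experiences the call."""
    try:
        return fn(arg)
    except NotHandled:  # refers to the ORIGINAL class defined at the top
        return "caught original NotHandled"
    except Exception as exc:
        return f"escaped as unrelated {type(exc).__name__}"


result_ok = call_and_classify(f, "pineapple")
result_bad = call_and_classify(f, 42)
print(result_ok)   # 6
print(result_bad)  # escaped as unrelated NotHandled
```

The second result is the confusing part: the exception is named NotHandled, but it is a different class object, so the caller's `except NotHandled` clause never fires.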

One option is, when parsing the generated code, to drop any attempts to redefine the supplied classes. Alternatively, we could raise an exception if any symbol defined by a code block overlaps with the lexical context.
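Both options could be prototyped with the standard `ast` module. The following is only a rough sketch, not a proposal for the actual implementation: the function names (`shadowed_definitions`, `drop_shadowing_definitions`, `reject_shadowing_definitions`) are hypothetical, the lexical context is assumed to be available as a plain dict, and only top-level `def` and `class` statements are considered.

```python
import ast

# Top-level statement kinds that bind a name via a definition.
_DEF_NODES = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)


def shadowed_definitions(source: str, context: dict) -> list[str]:
    """Names of top-level defs/classes in `source` that already exist in `context`."""
    tree = ast.parse(source)
    return [
        node.name
        for node in tree.body
        if isinstance(node, _DEF_NODES) and node.name in context
    ]


def drop_shadowing_definitions(source: str, context: dict) -> str:
    """Option 1: silently remove top-level redefinitions of context names."""
    tree = ast.parse(source)
    tree.body = [
        node
        for node in tree.body
        if not (isinstance(node, _DEF_NODES) and node.name in context)
    ]
    return ast.unparse(tree)


def reject_shadowing_definitions(source: str, context: dict) -> None:
    """Option 2: raise if the generated code redefines any context name."""
    clashes = shadowed_definitions(source, context)
    if clashes:
        raise ValueError(f"generated code redefines: {', '.join(clashes)}")


# Stand-ins for the lexical context from the example above (assumptions).
class NotHandled(Exception):
    pass


def double_count(count: int) -> int:
    return count * 2


context = {"NotHandled": NotHandled, "double_count": double_count}

generated_source = '''
def double_count(count: int) -> int:
    return count * 2

class NotHandled(Exception):
    pass

def count_and_double(input_str: str) -> int:
    if not isinstance(input_str, str):
        raise NotHandled("Input is not a valid string.")
    return double_count(input_str.count("p"))
'''

clashes = shadowed_definitions(generated_source, context)
cleaned = drop_shadowing_definitions(generated_source, context)

# Evaluate the cleaned code against a copy of the context.
namespace = dict(context)
exec(cleaned, namespace)
f = namespace["count_and_double"]

try:
    f(42)
    raised_original = False
except NotHandled:  # now matches, because the redefinition was dropped
    raised_original = True
```

Caveats of this sketch: `ast.unparse` requires Python 3.9+ and discards comments and formatting, and names rebound through imports, assignments, or nested scopes are not detected. This is part of the fragility concern: every extra case is more introspection on the generated code.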

Opening an issue because this may need some discussion: adding more introspection on the generated code makes things more fragile.
