[Lldb-commits] [PATCH] D68671: Add the ability to pass extra args to a Python breakpoint command function

Jim Ingham via Phabricator via lldb-commits lldb-commits at lists.llvm.org
Wed Oct 16 10:46:52 PDT 2019


jingham marked an inline comment as done.
jingham added inline comments.


================
Comment at: lldb/include/lldb/Interpreter/ScriptInterpreter.h:469
+  
+  virtual int GetNumArgumentsForCallable(const char *callable_name) { 
+    return -1;
----------------
labath wrote:
> jingham wrote:
> > labath wrote:
> > > jingham wrote:
> > > > jingham wrote:
> > > > > labath wrote:
> > > > > > labath wrote:
> > > > > > > lawrence_danna wrote:
> > > > > > > > jingham wrote:
> > > > > > > > > lawrence_danna wrote:
> > > > > > > > > > labath wrote:
> > > > > > > > > > > jingham wrote:
> > > > > > > > > > > > lawrence_danna wrote:
> > > > > > > > > > > > > labath wrote:
> > > > > > > > > > > > > > In light of varargs functions (`*args, **kwargs`), which are fairly popular in python, the concept of "number of arguments of a callable" does not seem that well defined. The current implementation seems to return the number of fixed arguments, which might be fine, but I think this behavior should be documented. Also, it would be good to modernize this function signature -- have it take a StringRef, and return a `Expected<unsigned (?)>` -- ongoing work by @lawrence_danna will make it possible to return errors from the python interpreter, and this will make it possible to display those, instead of just guessing that this is because the callable was not found (it could in fact be because the named thing is not a callable, of because resolving the name produced an exception, ...).
> > > > > > > > > > > > > I just took a look at PythonCallable::GetNumArguments() and it's horribly broken.  
> > > > > > > > > > > > > 
> > > > > > > > > > > > > It doesn't even work for the simplest test case I could think of.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > ```  
> > > > > > > > > > > > > auto builtins = PythonModule::Import("builtins");
> > > > > > > > > > > > > ASSERT_THAT_EXPECTED(builtins, llvm::Succeeded());
> > > > > > > > > > > > > auto hex = As<PythonCallable>(builtins.get().GetAttribute("hex"));
> > > > > > > > > > > > > ASSERT_THAT_EXPECTED(hex, llvm::Succeeded());  
> > > > > > > > > > > > > EXPECT_EQ(hex.get().GetNumArguments().count, 1u);
> > > > > > > > > > > > > ```
> > > > > > > > > > > > > 
> > > > > > > > > > > > > we should really re-write it to use inspect.signature.
> > > > > > > > > > > > Interesting.  We use it for free functions (what you pass to the -F option of `breakpoint command add`) and for the __init__ and __call__ method in the little classes you can make up for scripted thread plans and for the class version of Python implemented command-line commands.  We have tests for telling 3 vrs. 4 vrs. not enough or too many, and they all pass.  So it seems to work in the cases we currently need it to work for...
> > > > > > > > > > > > 
> > > > > > > > > > > > "inspect.signature" is python 3 only, and the python 2 equivalent is deprecated.  So it will take a little fancy footwork to use it in the transition period.
> > > > > > > > > > > lol :)
> > > > > > > > > > > 
> > > > > > > > > > > I would actually say that we should try not to use this function(ality) wherever possible. Making decisions based on the number of arguments the thing you're about to call takes sounds weird. I don't want to get too involved in this, but I was designing this, I'd just say that if one tries to pass arguments to the callback then the callback MUST take three arguments (or we'll abort processing the breakpoint command). If he wants his function to be called both with arguments and without, he can just add a default value to the third argument. (And if his function takes two arguments, but he still tries to pass something... well, it's his own fault).
> > > > > > > > > > > 
> > > > > > > > > > > Anyway, feel free to ignore this comment, but I felt like I had to say something. :)
> > > > > > > > > > I completely agree with Pavel.    Inspecting a function signature before calling it is a big code smell in python.    If there's a way to avoid doing that introspection, that would be better.
> > > > > > > > > Unfortunately, we originally designed this interface to take three arguments, the frame pointer, the breakpoint location and the Python session dict.  Then it became clear that it would be better to add this extra args argument (and in the case of Python based commands the ExecutionContext pointer).  At that point we had three choices, abandon the improvement; switch to unconditionally passing the extra arguments, and break everybody's extant uses; or switch off of the number of arguments to decide whether the user had provided the old or new forms.
> > > > > > > > > 
> > > > > > > > > My feeling about lldb Python scripts/commands etc. that people use in the debugger is that a lot of users don't know how they work at all, they just got them from somebody else; and many more figured out how to write them for some specific purpose, and then pretty much forgot how they worked.  So suddenly breaking all these bits of functionality will result in folks just deciding that this affordance is not reliable and not worth their time, which would be a shame.
> > > > > > > > > 
> > > > > > > > > So instead we accommodate both forms, which requires that we know which one the user provided.  If you see a better way to do this, (and are willing to implement it because so far as I can see this method is going to work just fine) dig in, I'm not wedded to the particular approach.  But I am not excited about penalizing our users because we didn't get the API design right the first time through.
> > > > > > > > makes sense.   
> > > > > > > > 
> > > > > > > > The only other way I can think of to solve it would be to have some indication in the `break com add` command of what signature it expects from the function.   But that's really ugly too because now you're asking users to understand yet another option.
> > > > > > > > 
> > > > > > > > I put up https://reviews.llvm.org/D68995 this morning which adds `inspect.signature` support for pythons that have it.
> > > > > > > > 
> > > > > > > > Currently we have really gnarly ArgInfo tests like this:
> > > > > > > > ```
> > > > > > > >     if (argc.count == 5 || argc.is_bound_method || argc.has_varargs)
> > > > > > > >         pfunc(debugger_arg, PythonString(args), exe_ctx_arg, cmd_retobj_arg, dict);
> > > > > > > >     else
> > > > > > > >         pfunc(debugger_arg, PythonString(args), cmd_retobj_arg, dict);
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > 😖
> > > > > > > > 
> > > > > > > > I think if we replace `count`  with `max_positional_args` we should be able to replace that kindof test with just 
> > > > > > > > ```
> > > > > > > > if (argc.max_positional_args < 5)
> > > > > > > >    old_version
> > > > > > > > else
> > > > > > > >    new_version
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > 
> > > > > > > So, how about this: Put extra_args as the last argument, instead of inserting it in the middle (so the new signature becomes `frame, bp_loc, dict, extra_args` instead of `frame, bp_loc, extra_args, dict`. Then instead of
> > > > > > > ```
> > > > > > >   if (arg_info.count == 3)
> > > > > > >       result = pfunc(frame_arg, bp_loc_arg, dict);
> > > > > > >     else if (arg_info.count == 4) {
> > > > > > >         lldb::SBStructuredData *args_value = new lldb::SBStructuredData(args_impl);
> > > > > > >         PythonObject args_arg(PyRefType::Owned, SBTypeToSWIGWrapper(args_value));
> > > > > > >         result = pfunc(frame_arg, bp_loc_arg, args_arg, dict);
> > > > > > > ```
> > > > > > > we do:
> > > > > > > ```
> > > > > > > if (args_impl.was_specified())
> > > > > > >   pfunc(frame_arg, bp_loc_arg, dict, args_arg)
> > > > > > > else
> > > > > > >   pfunc(frame_arg, bp_loc_arg, dict);
> > > > > > > ```
> > > > > > > All existing scripts will not specify the extra arguments, so they will work as usual. New scripts which do pass extra arguments will have to use the new signature. New scripts can also put `args = None` in their python signature, so that they are callable both with and without arguments, should they so desire. (If we don't want to support the `=None` use case then we can even keep the arguments in the same order as in this patch.)
> > > > > > > 
> > > > > > > Is there some reason why that would not work?
> > > > > > > The only other way I can think of to solve it would be to have some indication in the break com add command of what signature it expects from the function.
> > > > > > 
> > > > > > That's kind of what I'm getting at. I am hoping that the presence of the `--key`, `--value` options can be used as an indicator of that signature. (Though maybe I am misunderstanding something and I should shut up.)
> > > > > The is_bound_method test is cheesy.  We didn't offer a "class with __call__" for Python based commands until after we added the ExecutionContext argument, so this check knows that if you are providing a class method, then you are probably also providing the correct number of arguments.  max_positional_args seems a more explicit approach.
> > > > > 
> > > > > I think the varargs check is to allow you to write a command that takes the old three arguments and the ExecutionContext as a vararg, so the same Python function could work with an older lldb that didn't send the exe_ctx but take advantage of the better interface if it was present.  After all, the fallback of using the currently selected "target/process/thread/frame" inside the function will mostly work.
> > > > > 
> > > > > For the affordances taking the "extra_args", like breakpoint commands and scripted breakpoints and the like, it's hard to see how you could have a reasonable fallback to "you didn't tell me what function to look for..."  So for these I want to count the fixed arguments, I don't think it is necessary to allow them to be passed as varargs.
> > > > > 
> > > > > Thanks for fixing up the PythonObjects code, BTW!
> > > > I did consider switching off of whether the user provided key & value arguments.  But that would mean that you could not write Python breakpoint command functions that have default behaviors, but will do special things if the user provided `--key` and `--value` arguments.  That seemed like a reasonable thing to support, so I switched to doing everything based on the actual signature we were provided.
> > > Why not? If the last argument has a default value (`def my_fancy_callback(frame, bp_loc, dict, extra_args = None):`), then it should be callable both with three and four arguments, should it not? I think the `=None` part actually nicely documents that the callback is usable both with extra arguments, but also has a reasonable default behavior.
> > It's already weird that we have to tell people to  pass the session dict.  Now we're saying you have to pass a defaulted arg after the dict; that seems even weirder.  I don't actually think lots of people are going to want to write commands that take advantage of being passed extra arguments but want them to work with older lldb's.  The case I want to support is that you are always going to get this extra args SBStructuredData, but it might be empty depending on whether the user added --key and --value, and your command can do what it wants based on the contents of the extra_args.  That seems straightforwardly supported by requiring the arguments.
> > 
> > I'm not sure we are solving any actual problem here?
> What I find weird is that the meaning of a positional argument changes depending on what comes after it. So if I define a function like `def foo(a,b,c)`, then `c` will be a session dictionary. But if I define it as `def foo(a,b,c,d):` then the dict becomes `d`. I don't find the `=None` confusing, particularly as that means we could just have one function signature we advertise, and if the users write it that way then it will "happen" to work on older lldb's too. But then again, I've never written any breakpoint callbacks, so if you think that these concerns are unfounded, then feel free to go ahead.
Yeah, I don't think that is what you are told to do.  For breakpoint commands, you are told to write:

def foo(frame, bp_no, dict):

in the old form, or if you want to receive extra args, you would write:

def foo(frame, bp_no, extra_args, dict)

and in either case the last - "dict" - argument is an internal detail.  That seems straightforward.

Also, it is the way command implementation functions (with the added exe_ctx argument), and scripted thread plans (with a similar extra_args argument) work.  It is also how the scripted breakpoint resolvers work, except there was never a version w/o the extra_args, because that's when I realized that was a much better way to go...  So I do think I'm going to leave it this way.



Repository:
  rLLDB LLDB

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68671/new/

https://reviews.llvm.org/D68671





More information about the lldb-commits mailing list