[Lldb-commits] [lldb] Add the ability to define a Python based command that uses CommandObjectParsed (PR #70734)

Mon Oct 30 15:02:06 PDT 2023

llvmbot wrote:




@llvm/pr-subscribers-lldb

Author: None (jimingham)

<details>
<summary>Changes</summary>

This allows you to specify options and arguments and their definitions and then have lldb handle the completions, help, etc. in the same way that lldb does for its parsed commands internally.

This feature has some design considerations as well as the code, so I've also set up an RFC, but I did this one first and will put the RFC address in here once I've pushed it...

Note, the lldb "ParsedCommand interface" doesn't actually do all the work that it should.  For instance, saying the type of an option that has a completer doesn't automatically hook up the completer, and ditto for argument values.  We also do almost no work to verify that the arguments match their definition, or do auto-completion for them.  This patch allows you to make a command that's bug-for-bug compatible with built-in ones, but I didn't want to stall it on getting the auto-command checking to work all the way correctly.

As an overall design note, my primary goal here was to make an interface that worked well in the script language.  For that I needed, for instance, to have a property-based way to get all the option values that were specified.  It was much more convenient to do that by making a fairly bare-bones C interface to define the options and arguments of a command, and set their values, and then wrap that in a Python class (installed along with the other bits of the lldb python module) which you can then derive from to make your new command.  This approach will also make it easier to experiment.

See the file test_commands.py in the test case for examples of how this works.

---

Patch is 76.74 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/70734.diff


15 Files Affected:

- (modified) lldb/bindings/python/CMakeLists.txt (+2-1) 
- (modified) lldb/bindings/python/python-wrapper.swig (+31) 
- (added) lldb/examples/python/parsed_cmd.py (+315) 
- (modified) lldb/include/lldb/Interpreter/CommandObject.h (+4-1) 
- (modified) lldb/include/lldb/Interpreter/ScriptInterpreter.h (+29) 
- (modified) lldb/source/Commands/CommandObjectCommands.cpp (+693-4) 
- (modified) lldb/source/Commands/Options.td (+14-8) 
- (modified) lldb/source/Interpreter/CommandObject.cpp (+16) 
- (modified) lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.h (+2) 
- (modified) lldb/source/Plugins/ScriptInterpreter/Python/SWIGPythonBridge.h (+7) 
- (modified) lldb/source/Plugins/ScriptInterpreter/Python/ScriptInterpreterPython.cpp (+251-1) 
- (modified) lldb/source/Plugins/ScriptInterpreter/Python/ScriptInterpreterPythonImpl.h (+21) 
- (added) lldb/test/API/commands/command/script/add/TestAddParsedCommand.py (+146) 
- (added) lldb/test/API/commands/command/script/add/test_commands.py (+175) 
- (modified) lldb/unittests/ScriptInterpreter/Python/PythonTestSuite.cpp (+8) 


``````````diff

diff --git a/lldb/bindings/python/CMakeLists.txt b/lldb/bindings/python/CMakeLists.txt
index c941f764dfc92ac..657fdd2c9590066 100644
--- a/lldb/bindings/python/CMakeLists.txt
+++ b/lldb/bindings/python/CMakeLists.txt
@@ -96,7 +96,8 @@ function(finish_swig_python swig_target lldb_python_bindings_dir lldb_python_tar
     ${lldb_python_target_dir}
     "utils"
     FILES "${LLDB_SOURCE_DIR}/examples/python/in_call_stack.py"
-          "${LLDB_SOURCE_DIR}/examples/python/symbolication.py")
+          "${LLDB_SOURCE_DIR}/examples/python/symbolication.py"
+          "${LLDB_SOURCE_DIR}/examples/python/parsed_cmd.py")
 
   create_python_package(
     ${swig_target}
diff --git a/lldb/bindings/python/python-wrapper.swig b/lldb/bindings/python/python-wrapper.swig
index 17bc7b1f2198709..47887491d0c552a 100644
--- a/lldb/bindings/python/python-wrapper.swig
+++ b/lldb/bindings/python/python-wrapper.swig
@@ -831,6 +831,37 @@ bool lldb_private::python::SWIGBridge::LLDBSwigPythonCallCommandObject(
   return true;
 }
 
+bool lldb_private::python::SWIGBridge::LLDBSwigPythonCallParsedCommandObject(
+    PyObject *implementor, lldb::DebuggerSP debugger, lldb_private::StructuredDataImpl &args_impl,
+    lldb_private::CommandReturnObject &cmd_retobj,
+    lldb::ExecutionContextRefSP exe_ctx_ref_sp) {
+
+  PyErr_Cleaner py_err_cleaner(true);
+
+  PythonObject self(PyRefType::Borrowed, implementor);
+  auto pfunc = self.ResolveName<PythonCallable>("__call__");
+
+  if (!pfunc.IsAllocated())
+    return false;
+
+  auto cmd_retobj_arg = SWIGBridge::ToSWIGWrapper(cmd_retobj);
+
+  // FIXME:
+  // I wanted to do something like:
+  // size_t num_elem = args.size();
+  // PythonList my_list(num_elem);
+  // for (const char *elem : args)
+  //   my_list.append(PythonString(elem);
+  //
+  // and then pass my_list to the pfunc, but that crashes somewhere
+  // deep in Python for reasons that aren't clear to me.
+  
+  pfunc(SWIGBridge::ToSWIGWrapper(std::move(debugger)), SWIGBridge::ToSWIGWrapper(args_impl),
+        SWIGBridge::ToSWIGWrapper(exe_ctx_ref_sp), cmd_retobj_arg.obj());
+
+  return true;
+}
+
 PythonObject lldb_private::python::SWIGBridge::LLDBSWIGPythonCreateOSPlugin(
     const char *python_class_name, const char *session_dictionary_name,
     const lldb::ProcessSP &process_sp) {
diff --git a/lldb/examples/python/parsed_cmd.py b/lldb/examples/python/parsed_cmd.py
new file mode 100644
index 000000000000000..7ee9e077d49ab52
--- /dev/null
+++ b/lldb/examples/python/parsed_cmd.py
@@ -0,0 +1,315 @@
+"""
+This module implements a couple of utility classes to make writing
+lldb parsed commands more Pythonic.
+The way to use it is to make a class for you command that inherits from ParsedCommandBase.
+That will make an LLDBOVParser which you will use for your
+option definition, and to fetch option values for the current invocation
+of your command.  Access to the OV parser is through:
+
+ParsedCommandBase.get_parser()
+
+Next, implement setup_command_definition in your new command class, and call:
+
+  self.get_parser().add_option
+
+to add all your options.  The order doesn't matter for options, lldb will sort them
+alphabetically for you when it prints help.
+
+Similarly you can define the arguments with:
+
+  self.get_parser.add_argument
+
+at present, lldb doesn't do as much work as it should verifying arguments, it pretty
+much only checks that commands that take no arguments don't get passed arguments.
+
+Then implement the execute function for your command as:
+
+    def __call__(self, debugger, args_array, exe_ctx, result):
+
+The arguments will be in a python array as strings.  
+
+You can access the option values using varname you passed in when defining the option.  
+If you need to know whether a given option was set by the user or not, you can retrieve 
+the option definition array with:
+
+  self.get_options_definition()
+
+look up your element by varname and check the "_value_set" element.
+
+There are example commands in the lldb testsuite at:
+
+llvm-project/lldb/test/API/commands/command/script/add/test_commands.py
+
+FIXME: I should make a convenient wrapper for that. 
+"""
+import inspect
+import lldb
+import sys
+
+class LLDBOVParser:
+    def __init__(self):
+        self.options_array = []
+        self.args_array = []
+
+    # Some methods to translate common value types.  Should return a
+    # tuple of the value and an error value (True => error) if the
+    # type can't be converted.
+    # FIXME: Need a way to push the conversion error string back to lldb.
+    @staticmethod
+    def to_bool(in_value):
+        error = True
+        value = False
+        low_in = in_value.lower()
+        if low_in == "yes" or low_in == "true" or low_in == "1":
+            value = True
+            error = False
+            
+        if not value and low_in == "no" or low_in == "false" or low_in == "0":
+            value = False
+            error = False
+
+        return (value, error)
+
+    @staticmethod
+    def to_int(in_value):
+        #FIXME: Not doing errors yet...
+        return (int(in_value), False)
+
+    def to_unsigned(in_value):
+        # FIXME: find an unsigned converter...
+        # And handle errors.
+        return (int(in_value), False)
+
+    translators = {
+        lldb.eArgTypeBoolean : to_bool,
+        lldb.eArgTypeBreakpointID : to_unsigned,
+        lldb.eArgTypeByteSize : to_unsigned,
+        lldb.eArgTypeCount : to_unsigned,
+        lldb.eArgTypeFrameIndex : to_unsigned,
+        lldb.eArgTypeIndex : to_unsigned,
+        lldb.eArgTypeLineNum : to_unsigned,
+        lldb.eArgTypeNumLines : to_unsigned,
+        lldb.eArgTypeNumberPerLine : to_unsigned,
+        lldb.eArgTypeOffset : to_int,
+        lldb.eArgTypeThreadIndex : to_unsigned,
+        lldb.eArgTypeUnsignedInteger : to_unsigned,
+        lldb.eArgTypeWatchpointID : to_unsigned,
+        lldb.eArgTypeColumnNum : to_unsigned,
+        lldb.eArgTypeRecognizerID : to_unsigned,
+        lldb.eArgTypeTargetID : to_unsigned,
+        lldb.eArgTypeStopHookID : to_unsigned
+    }
+
+    @classmethod
+    def translate_value(cls, value_type, value):
+        error = False
+        try:
+            return cls.translators[value_type](value)
+        except KeyError:
+            # If we don't have a translator, return the string value.
+            return (value, False)
+
+    # FIXME: would this be better done on the C++ side?
+    # The common completers are missing some useful ones.
+    # For instance there really should be a common Type completer
+    # And an "lldb command name" completer.
+    completion_table = {
+        lldb.eArgTypeAddressOrExpression : lldb.eVariablePathCompletion,
+        lldb.eArgTypeArchitecture : lldb.eArchitectureCompletion,
+        lldb.eArgTypeBreakpointID : lldb.eBreakpointCompletion,
+        lldb.eArgTypeBreakpointIDRange : lldb.eBreakpointCompletion,
+        lldb.eArgTypeBreakpointName : lldb.eBreakpointNameCompletion,
+        lldb.eArgTypeClassName : lldb.eSymbolCompletion,
+        lldb.eArgTypeDirectoryName : lldb.eDiskDirectoryCompletion,
+        lldb.eArgTypeExpression : lldb.eVariablePathCompletion,
+        lldb.eArgTypeExpressionPath : lldb.eVariablePathCompletion,
+        lldb.eArgTypeFilename : lldb.eDiskFileCompletion,
+        lldb.eArgTypeFrameIndex : lldb.eFrameIndexCompletion,
+        lldb.eArgTypeFunctionName : lldb.eSymbolCompletion,
+        lldb.eArgTypeFunctionOrSymbol : lldb.eSymbolCompletion,
+        lldb.eArgTypeLanguage : lldb.eTypeLanguageCompletion,
+        lldb.eArgTypePath : lldb.eDiskFileCompletion,
+        lldb.eArgTypePid : lldb.eProcessIDCompletion,
+        lldb.eArgTypeProcessName : lldb.eProcessNameCompletion,
+        lldb.eArgTypeRegisterName : lldb.eRegisterCompletion,
+        lldb.eArgTypeRunArgs : lldb.eDiskFileCompletion,
+        lldb.eArgTypeShlibName : lldb.eModuleCompletion,
+        lldb.eArgTypeSourceFile : lldb.eSourceFileCompletion,
+        lldb.eArgTypeSymbol : lldb.eSymbolCompletion,
+        lldb.eArgTypeThreadIndex : lldb.eThreadIndexCompletion,
+        lldb.eArgTypeVarName : lldb.eVariablePathCompletion,
+        lldb.eArgTypePlatform : lldb.ePlatformPluginCompletion,
+        lldb.eArgTypeWatchpointID : lldb.eWatchpointIDCompletion,
+        lldb.eArgTypeWatchpointIDRange : lldb.eWatchpointIDCompletion,
+        lldb.eArgTypeModuleUUID : lldb.eModuleUUIDCompletion,
+        lldb.eArgTypeStopHookID : lldb.eStopHookIDCompletion
+    }
+
+    @classmethod
+    def determine_completion(cls, arg_type):
+        try:
+            return cls.completion_table[arg_type]
+        except KeyError:
+            return lldb.eNoCompletion
+
+    def get_option_element(self, long_name):
+        # Fixme: Is it worth making a long_option dict holding the rest of
+        # the options dict so this lookup is faster?
+        for item in self.options_array:
+            if item["long_option"] == long_name:
+                return item
+
+        return None
+            
+    def option_parsing_started(self):
+        # This makes the ivars for all the varnames in the array and gives them
+        # their default values.
+        for elem in self.options_array:
+            elem['_value_set'] = False       
+            try:
+                object.__setattr__(self, elem["varname"], elem["default"])
+            except AttributeError:
+            # It isn't an error not to have a target, you'll just have to set and
+            # get this option value on your own.
+                continue
+
+    def set_enum_value(self, enum_values, input):
+        candidates = []
+        for candidate in enum_values:
+            # The enum_values are a duple of value & help string.
+            value = candidate[0]
+            if value.startswith(input):
+                candidates.append(value)
+
+        if len(candidates) == 1:
+            return (candidates[0], False)
+        else:
+            return (input, True)
+        
+    def set_option_value(self, exe_ctx, opt_name, opt_value):
+        elem = self.get_option_element(opt_name)
+        if not elem:
+            return False
+        
+        if "enum_values" in elem:
+            (value, error) = self.set_enum_value(elem["enum_values"], opt_value)
+        else:
+            (value, error)  = __class__.translate_value(elem["value_type"], opt_value)
+
+        if not error:
+            object.__setattr__(self, elem["varname"], value)
+            elem["_value_set"] = True
+            return True
+        return False
+
+    def was_set(self, opt_name):
+        elem = self.get_option_element(opt_name)
+        if not elem:
+            return False
+        try:
+            return elem["_value_set"]
+        except AttributeError:
+            return False
+
+    def is_enum_opt(self, opt_name):
+        elem = self.get_option_element(opt_name)
+        if not elem:
+            return False
+        return "enum_values" in elem
+
+    def add_option(self, short_option, long_option, usage, default,
+                   varname = None, required=False, groups = None,
+                   value_type=lldb.eArgTypeNone, completion_type=None,
+                   enum_values=None):
+        """
+        short_option: one character, must be unique, not required
+        long_option: no spaces, must be unique, required
+        usage: a usage string for this option, will print in the command help
+        default: the initial value for this option (if it has a value)
+        varname: the name of the property that gives you access to the value for
+                 this value.  Defaults to the long option if not provided.
+        required: if true, this option must be provided or the command will error out
+        groups: Which "option groups" does this option belong to
+        value_type: one of the lldb.eArgType enum values.  Some of the common arg
+                    types also have default completers, which will be applied automatically.
+        completion_type: currently these are values form the lldb.CompletionType enum, I
+                         haven't done custom completions yet.
+        enum_values: An array of duples: ["element_name", "element_help"].  If provided,
+                     only one of the enum elements is allowed.  The value will be the 
+                     element_name for the chosen enum element as a string. 
+        """
+        if not varname:
+            varname = long_option
+
+        if not completion_type:
+            completion_type = self.determine_completion(value_type)
+            
+        dict = {"short_option" : short_option,
+                "long_option" : long_option,
+                "required" : required,
+                "usage" : usage,
+                "value_type" : value_type,
+                "completion_type" : completion_type,
+                "varname" : varname,
+                "default" : default}
+
+        if enum_values:
+            dict["enum_values"] = enum_values
+        if groups:
+            dict["groups"] = groups
+
+        self.options_array.append(dict)
+
+    def make_argument_element(self, arg_type, repeat = "optional", groups = None):
+        element = {"arg_type" : arg_type, "repeat" : repeat}
+        if groups:
+            element["groups"] = groups
+        return element
+
+    def add_argument_set(self, arguments):
+        self.args_array.append(arguments)
+
+class ParsedCommandBase:
+    def __init__(self, debugger, unused):
+        self.debugger = debugger
+        self.ov_parser = LLDBOVParser()
+        self.setup_command_definition()
+        
+    def get_parser(self):
+        return self.ov_parser
+
+    def get_options_definition(self):
+        return self.get_parser().options_array
+
+    def get_flags(self):
+        return 0
+
+    def get_args_definition(self):
+        return self.get_parser().args_array
+
+    def option_parsing_started(self):
+        self.get_parser().option_parsing_started()
+
+    def set_option_value(self, exe_ctx, opt_name, opt_value):
+        return self.get_parser().set_option_value(exe_ctx, opt_name, opt_value)
+
+    # These are the two "pure virtual" methods:
+    def __call__(self, debugger, args_array, exe_ctx, result):
+        raise NotImplementedError()
+
+    def setup_command_definition(self):
+        raise NotImplementedError()
+
+    @staticmethod
+    def do_register_cmd(cls, debugger, module_name):
+        # Add any commands contained in this module to LLDB
+        command = "command script add -o -p -c %s.%s %s" % (
+            module_name,
+            cls.__name__,
+            cls.program,
+        )
+        debugger.HandleCommand(command)
+        print(
+            'The "{0}" command has been installed, type "help {0}"'
+            'for detailed help.'.format(cls.program)
+        )
diff --git a/lldb/include/lldb/Interpreter/CommandObject.h b/lldb/include/lldb/Interpreter/CommandObject.h
index 004f5d42f1e44ee..c2f17f374fe5828 100644
--- a/lldb/include/lldb/Interpreter/CommandObject.h
+++ b/lldb/include/lldb/Interpreter/CommandObject.h
@@ -224,7 +224,10 @@ class CommandObject : public std::enable_shared_from_this<CommandObject> {
   void GetFormattedCommandArguments(Stream &str,
                                     uint32_t opt_set_mask = LLDB_OPT_SET_ALL);
 
-  bool IsPairType(ArgumentRepetitionType arg_repeat_type);
+  static bool IsPairType(ArgumentRepetitionType arg_repeat_type);
+
+  static std::optional<ArgumentRepetitionType> 
+    ArgRepetitionFromString(llvm::StringRef string);
 
   bool ParseOptions(Args &args, CommandReturnObject &result);
 
diff --git a/lldb/include/lldb/Interpreter/ScriptInterpreter.h b/lldb/include/lldb/Interpreter/ScriptInterpreter.h
index 0146eeb86262003..4dce5f40328d8dd 100644
--- a/lldb/include/lldb/Interpreter/ScriptInterpreter.h
+++ b/lldb/include/lldb/Interpreter/ScriptInterpreter.h
@@ -476,6 +476,14 @@ class ScriptInterpreter : public PluginInterface {
     return false;
   }
 
+  virtual bool RunScriptBasedParsedCommand(
+      StructuredData::GenericSP impl_obj_sp, Args& args,
+      ScriptedCommandSynchronicity synchronicity,
+      lldb_private::CommandReturnObject &cmd_retobj, Status &error,
+      const lldb_private::ExecutionContext &exe_ctx) {
+    return false;
+  }
+
   virtual bool RunScriptFormatKeyword(const char *impl_function,
                                       Process *process, std::string &output,
                                       Status &error) {
@@ -520,6 +528,27 @@ class ScriptInterpreter : public PluginInterface {
     dest.clear();
     return false;
   }
+  
+  virtual StructuredData::ObjectSP
+  GetOptionsForCommandObject(StructuredData::GenericSP cmd_obj_sp) {
+    return {};
+  }
+
+  virtual StructuredData::ObjectSP
+  GetArgumentsForCommandObject(StructuredData::GenericSP cmd_obj_sp) {
+    return {};
+  }
+  
+  virtual bool SetOptionValueForCommandObject(
+      StructuredData::GenericSP cmd_obj_sp, ExecutionContext *exe_ctx, 
+      llvm::StringRef long_option, llvm::StringRef value) {
+    return false;
+  }
+
+  virtual void OptionParsingStartedForCommandObject(
+      StructuredData::GenericSP cmd_obj_sp) {
+    return;
+  }
 
   virtual uint32_t
   GetFlagsForCommandObject(StructuredData::GenericSP cmd_obj_sp) {
diff --git a/lldb/source/Commands/CommandObjectCommands.cpp b/lldb/source/Commands/CommandObjectCommands.cpp
index 656ace223b5f157..1e7227767eb982a 100644
--- a/lldb/source/Commands/CommandObjectCommands.cpp
+++ b/lldb/source/Commands/CommandObjectCommands.cpp
@@ -1158,6 +1158,9 @@ class CommandObjectPythonFunction : public CommandObjectRaw {
   CompletionType m_completion_type = eNoCompletion;
 };
 
+/// This class implements a "raw" scripted command.  lldb does no parsing of the
+/// command line, instead passing the line unaltered (except for backtick
+/// substitution).
 class CommandObjectScriptingObject : public CommandObjectRaw {
 public:
   CommandObjectScriptingObject(CommandInterpreter &interpreter,
@@ -1255,6 +1258,676 @@ class CommandObjectScriptingObject : public CommandObjectRaw {
   CompletionType m_completion_type = eNoCompletion;
 };
 
+
+/// This command implements a lldb parsed scripted command.  The command
+/// provides a definition of the options and arguments, and a option value
+/// setting callback, and then the command's execution function gets passed
+/// just the parsed arguments.
+/// Note, implementing a command in Python using these base interfaces is a bit
+/// of a pain, but it is much easier to export this low level interface, and
+/// then make it nicer on the Python side, than to try to do that in a
+/// script language neutral way.
+/// So I've also added a base class in Python that provides a table-driven
+/// way of defining the options and arguments, which automatically fills the
+/// option values, making them available as properties in Python.
+/// 
+class CommandObjectScriptingObjectParsed : public CommandObjectParsed {
+private: 
+  class CommandOptions : public Options {
+  public:
+    CommandOptions(CommandInterpreter &interpreter, 
+        StructuredData::GenericSP cmd_obj_sp) : m_interpreter(interpreter), 
+            m_cmd_obj_sp(cmd_obj_sp) {}
+
+    ~CommandOptions() override = default;
+
+    Status SetOptionValue(uint32_t option_idx, llvm::StringRef option_arg,
+                          ExecutionContext *execution_context) override {
+      Status error;
+      ScriptInterpreter *scripter = 
+        m_interpreter.GetDebugger().GetScriptInterpreter();
+      if (!scripter) {
+        error.SetErrorString("No script interpreter for SetOptionValue.");
+        return error;
+      }
+      if (!m_cmd_obj_sp) {
+        error.SetErrorString("SetOptionValue called with empty cmd_obj.");
+        return error;
+      }
+      if (!m_options_definition_up) {
+        error.SetErrorString("SetOptionValue called before options definitions "
+                             "were created.");
+        return error;
+      }
+      // Pass the long option, since you aren't actually required to have a
+      // short_option, and for those options the index or short option character
+      // aren't meaningful on the python side.
+      const char * long_option = 
+        m_options_definition_up.get()[option_idx].long_option;
+      bool success = scripter->SetOptionValueForCommandObject(m_cmd_obj_sp, 
+        execution_context, long_option, option_arg);
+      if (!success)...
[truncated]

``````````

</details>


https://github.com/llvm/llvm-project/pull/70734