[llvm-dev] Binary utilities: switch command line parsing from llvm::cl to OptTable (byproduct: drop -long-option?)

Fri Jul 2 16:13:08 PDT 2021

Looks like our email crossed :)

On Fri, Jul 2, 2021 at 4:07 PM Fāng-ruì Sòng <maskray at google.com> wrote:

> On Fri, Jul 2, 2021 at 3:26 PM Fāng-ruì Sòng <maskray at google.com> wrote:
>
>> On Fri, Jul 2, 2021 at 3:21 PM Mehdi AMINI <joker.eph at gmail.com> wrote:
>>
>>>
>>>
>>> On Fri, Jul 2, 2021 at 2:09 PM Fāng-ruì Sòng <maskray at google.com> wrote:
>>>
>>>> On Fri, Jul 2, 2021 at 2:03 PM Mehdi AMINI <joker.eph at gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Fri, Jul 2, 2021 at 1:27 PM Fāng-ruì Sòng <maskray at google.com>
>>>>> wrote:
>>>>>
>>>>>> On Fri, Jul 2, 2021 at 1:10 PM Mehdi AMINI <joker.eph at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Jul 2, 2021 at 12:49 PM Alexandre Ganea <
>>>>>>> alexandre.ganea at ubisoft.com> wrote:
>>>>>>>
>>>>>>>> The API benefits sound nice, though presumably some could be
>>>>>>>> retrofitted to cl::opt if that was the only goal. Side benefits in addition
>>>>>>>> to removing global ctors are nice to have.
>>>>>>>>
>>>>>>>>
>>>>>>>> The drawback is some initial boilerplate (e.g. llvm-tblgen
>>>>>>>> -gen-opt-parser-defs in CMakeLists.txt, class NmOptTable in code).
>>>>>>>>
>>>>>>>> The handling of comma separated options -arch=x86_64,arm64 doesn't
>>>>>>>> have direct OptTable support. llvm::SplitString is needed (just search for
>>>>>>>> SplitString in https://reviews.llvm.org/D105330)
>>>>>>>>
>>>>>>>> But this doesn't tend to increase complexity because the
>>>>>>>> cl::list<std::string> will need per-value verification anyway.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> One potential one (though I don't recall it being discussed
>>>>>>>> recently) would be that maybe this addresses the issue of global ctors in
>>>>>>>> cl::opt? Does OptTable avoid/not use global constructors? That would be
>>>>>>>> nice - it's an ongoing issue that LLVM library users pay for command line
>>>>>>>> argument support they have no need for in the form of global ctor execution
>>>>>>>> time.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> OptTable is used as a local variable. So yes, it avoids global
>>>>>>>> constructors,
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Nice :)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Note that MLIR is using cl::opt without global ctor (we build with
>>>>>>>> `-Werror=global-constructors`).
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> The pattern we use to write a tool with cl::opt and avoid global
>>>>>>>> ctor (and can be used to avoid collision) looks like:
>>>>>>>> https://github.com/llvm/llvm-project/blob/main/mlir/lib/IR/MLIRContext.cpp#L57-L83
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> The tool that wants to expose the MLIRContext options to the
>>>>>>>> command line calls registerMLIRContextCLOptions() before parsing the
>>>>>>>> command line.
>>>>>>>>
>>>>>>>> Wouldn't this translate directly to LLVM tools as well with some
>>>>>>>> minor refactoring?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> The same applies to all of the infrastructure in MLIR, passes are
>>>>>>>> registered explicitly, etc. This decouples the "is this code linked in"
>>>>>>>> from "options are loaded" annoying part of the global constructors.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> Mehdi
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *[Alexandre Ganea] *I think one other issue with cl::opt is that
>>>>>>>> it aggregates the “command-line argument definition” and the “runtime
>>>>>>>> parameter” *de facto* in a single object (unless cl::location is
>>>>>>>> manually specified to every cl::opt). What MLIR does solves the issue
>>>>>>>> mentioned by David, the fact that every tool pulls/initializes every
>>>>>>>> cl::opt out there. However OptTable solves both problems, and makes the
>>>>>>>> entry point thread-safe.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> I agree that removing the global state would be great!
>>>>>>> Right now what I see proposed with OptTable (like
>>>>>>> https://reviews.llvm.org/D104889) seems to just address the
>>>>>>> tools-specific options, and the value isn't clear to me for these cases,
>>>>>>> since these options aren't exposed through library entry points.
>>>>>>> I don't quite get right now how OptTable would compose at the LLVM
>>>>>>> scale? Are there examples of libraries exposing pluggable hooks for a tool
>>>>>>> to aggregate multiple libraries' options and expose them on the command
>>>>>>> line?
>>>>>>>
>>>>>>>
>>>>>> The first message listed:
>>>>>>
>>>>>> * -t=d is removed (equal sign after a short option). Use -t d instead.
>>>>>> * --demangle=0 (=0 to disable a boolean option) is removed. Omit the
>>>>>> option or use --no-demangle instead.
>>>>>> * To support boolean options (e.g. --demangle --no-demangle), we
>>>>>> don't need to compare their positions (if (NoDemangle.getPosition() >
>>>>>> Demangle.getPosition()) , see llvm-nm.cpp)
>>>>>> * grouped short options can be specified with one line
>>>>>> `setGroupedShortOptions`, instead of adding cl::Grouping to every short
>>>>>> options.
>>>>>> * We don't need to add cl::cat to every option and call
>>>>>> `HideUnrelatedOptions` to hide unrelated options from --help. The issue
>>>>>> would happen with cl::opt tools if linker garbage collection is disabled or
>>>>>> libLLVM-13git.so is used. (See https://reviews.llvm.org/D104363)
>>>>>>
>>>>>> To me *these points are all usability issues of cl::opt*. I care
>>>>>> about not exposing unnecessary interfaces so cl::opt accepting the weird
>>>>>> -t=d looks a downside to me.
>>>>>>
>>>>>> --demangle=0 is weird and some llvm/llvm/test tests do use cl::opt
>>>>>> options this way, so we cannot just remove this usage. As a workaround, we
>>>>>> could add a cl::foobar_toggle to a cl::opt to disallow =0.
>>>>>> We would end with more customization for one option, cl::cat (for
>>>>>> hiding unrelated options), cl::foobar_toggle (for disallowing =0), and
>>>>>> potentially others for other ad-hoc tasks.
>>>>>>
>>>>>> I can highlight another thing about the global state of cl::opt => *library
>>>>>> cl::opt and binary utility cl::opt share the same namespace*.
>>>>>> So cl::opt options (usually for debugging or testing) in library code
>>>>>> can end up in a tool's list of command line options.
>>>>>> This is usually undesired (e.g. llvm-objdump --x86-asm-syntax in
>>>>>> https://reviews.llvm.org/D100433).
>>>>>> People may not notice this if they always
>>>>>> use -DLLVM_LINK_LLVM_DYLIB=off and don't use linker garbage collection.
>>>>>>
>>>>>
>>>>> You're not answering my question here, are you? Are you answering to
>>>>> what I mentioned 3 emails before in an answer to David when I wrote
>>>>> "Indeed: it isn't clear to me that these are outright "benefits"."?
>>>>>
>>>>
>>>> OptTable doesn't have the listed usability issues. Not having the
>>>> issues is a large benefit to me.
>>>>
>>>
>>> Maybe, these are subjective though.
>>> (And the confusion above came from the fact that  you just answered the
>>> wrong email)
>>>
>>>
>>>>
>>>>
>>>>> Because I still don't see clearly how to build something like `opt`
>>>>> with all the pass and the options with OptTable, how does it all compose?
>>>>>
>>>>
>>>> The proposed changes are specific to binary utilities:
>>>> llvm-{ar,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}.
>>>> 'opt', 'llc', etc are not in the scope.
>>>> (I guess I should have named the utilities more specifically to not
>>>> cause confusion.)
>>>>
>>>> For
>>>> llvm-{ar,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}, we
>>>> don't want a random cl::opt in lib/Transforms, lib/MC, orlib/LTO to be
>>>> accidentally specifiable on the command line.
>>>> (In the rare debugging case where such functionality is needed, it
>>>> needs a -mllvm prefix like what ld.lld does.)
>>>>
>>>
>>> Have you looked at what I mentioned with MLIR on how to use cl::opt
>>> without global constructor? This has exactly the behavior you're looking
>>> for.
>>>
>>>
>> I have read
>> https://github.com/llvm/llvm-project/blob/main/mlir/lib/IR/MLIRContext.cpp#L57-L83
>>
>> Unless it does something more specially, I don't think it can avoid the
>> issue I mentioned in a previous message:
>>
>> > I can highlight another thing about the global state of cl::opt => *library
>> cl::opt and binary utility cl::opt share the same namespace*.
>>
>> This can be worked around with linker garbage collection by discarding
>> unreferenced cl::opt.
>>
>
> I re-read these messages. I think you probably meant something more
> generic - how to design decentralized command line option registry where
> every library can register some options.
>

Yes indeed!

I'm not against using OptTable for the sole purpose of binary tools, but
that does not seem to provide us with a path forward. So I am afraid that
this is a local optima that is short sighted.

> The binary utilities command line parsing issue I intended to address is
> fairly different:
> We want a single registry of all options and don't want to inherit options
> from llvm/lib/A just because the tool happens to depend on llvm/lib/A
> directly or indirectly.
>

I agree, what I pointed at in MLIR was an attempt to achieve this, the tool
has to explicitly call a function at runtime to make the options available.

> E.g. llvm-nm needs to depend on bitcode reader because it needs to handle
> LLVM bitcode files, however, I don't want a random cl::opt in bitcode
> reader to appear in llvm-nm's command line option list.
>
> So I just built mlir-opt and inspected its --help output. It has exactly
> the problem I called out in my first message:
>

> * We don't need to add cl::cat to every option and call
>
> `HideUnrelatedOptions` to hide unrelated options from --help. The issue
> would happen with cl::opt tools if linker garbage collection is disabled or
> libLLVM-13git.so is used. (See https://reviews.llvm.org/D104363)
>
> There are many unrelated options like --amdgpu-bypass-slow-div which is from llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
>
>
I think you're missing the point here: mlir-opt does not have this problem *for
mlir components.* It pays the price of the LLVM components using global
constructors, but I pointed to the solution we use for mlir components that
aren't using global constructors and avoid this problem.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210702/9ca63876/attachment.html>