[llvm-dev] Binary utilities: switch command line parsing from llvm::cl to OptTable (byproduct: drop -long-option?)

Fāng-ruì Sòng via llvm-dev llvm-dev at lists.llvm.org
Fri Jul 2 16:07:12 PDT 2021


On Fri, Jul 2, 2021 at 3:26 PM Fāng-ruì Sòng <maskray at google.com> wrote:

> On Fri, Jul 2, 2021 at 3:21 PM Mehdi AMINI <joker.eph at gmail.com> wrote:
>
>>
>>
>> On Fri, Jul 2, 2021 at 2:09 PM Fāng-ruì Sòng <maskray at google.com> wrote:
>>
>>> On Fri, Jul 2, 2021 at 2:03 PM Mehdi AMINI <joker.eph at gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Jul 2, 2021 at 1:27 PM Fāng-ruì Sòng <maskray at google.com>
>>>> wrote:
>>>>
>>>>> On Fri, Jul 2, 2021 at 1:10 PM Mehdi AMINI <joker.eph at gmail.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Jul 2, 2021 at 12:49 PM Alexandre Ganea <
>>>>>> alexandre.ganea at ubisoft.com> wrote:
>>>>>>
>>>>>>> The API benefits sound nice, though presumably some could be
>>>>>>> retrofitted to cl::opt if that was the only goal. Side benefits in addition
>>>>>>> to removing global ctors are nice to have.
>>>>>>>
>>>>>>>
>>>>>>> The drawback is some initial boilerplate (e.g. llvm-tblgen
>>>>>>> -gen-opt-parser-defs in CMakeLists.txt, class NmOptTable in code).
>>>>>>>
>>>>>>> The handling of comma separated options -arch=x86_64,arm64 doesn't
>>>>>>> have direct OptTable support. llvm::SplitString is needed (just search for
>>>>>>> SplitString in https://reviews.llvm.org/D105330)
>>>>>>>
>>>>>>> But this doesn't tend to increase complexity because the
>>>>>>> cl::list<std::string> will need per-value verification anyway.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> One potential one (though I don't recall it being discussed
>>>>>>> recently) would be that maybe this addresses the issue of global ctors in
>>>>>>> cl::opt? Does OptTable avoid/not use global constructors? That would be
>>>>>>> nice - it's an ongoing issue that LLVM library users pay for command line
>>>>>>> argument support they have no need for in the form of global ctor execution
>>>>>>> time.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> OptTable is used as a local variable. So yes, it avoids global
>>>>>>> constructors,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Nice :)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Note that MLIR is using cl::opt without global ctor (we build with
>>>>>>> `-Werror=global-constructors`).
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The pattern we use to write a tool with cl::opt and avoid global
>>>>>>> ctor (and can be used to avoid collision) looks like:
>>>>>>> https://github.com/llvm/llvm-project/blob/main/mlir/lib/IR/MLIRContext.cpp#L57-L83
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The tool that wants to expose the MLIRContext options to the command
>>>>>>> line calls registerMLIRContextCLOptions() before parsing the command line.
>>>>>>>
>>>>>>> Wouldn't this translate directly to LLVM tools as well with some
>>>>>>> minor refactoring?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The same applies to all of the infrastructure in MLIR, passes are
>>>>>>> registered explicitly, etc. This decouples the "is this code linked in"
>>>>>>> from "options are loaded" annoying part of the global constructors.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Mehdi
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *[Alexandre Ganea] *I think one other issue with cl::opt is that it
>>>>>>> aggregates the “command-line argument definition” and the “runtime
>>>>>>> parameter” *de facto* in a single object (unless cl::location is
>>>>>>> manually specified to every cl::opt). What MLIR does solves the issue
>>>>>>> mentioned by David, the fact that every tool pulls/initializes every
>>>>>>> cl::opt out there. However OptTable solves both problems, and makes the
>>>>>>> entry point thread-safe.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> I agree that removing the global state would be great!
>>>>>> Right now what I see proposed with OptTable (like
>>>>>> https://reviews.llvm.org/D104889) seems to just address the
>>>>>> tools-specific options, and the value isn't clear to me for these cases,
>>>>>> since these options aren't exposed through library entry points.
>>>>>> I don't quite get right now how OptTable would compose at the LLVM
>>>>>> scale? Are there examples of libraries exposing pluggable hooks for a tool
>>>>>> to aggregate multiple libraries' options and expose them on the command
>>>>>> line?
>>>>>>
>>>>>>
>>>>> The first message listed:
>>>>>
>>>>> * -t=d is removed (equal sign after a short option). Use -t d instead.
>>>>> * --demangle=0 (=0 to disable a boolean option) is removed. Omit the
>>>>> option or use --no-demangle instead.
>>>>> * To support boolean options (e.g. --demangle --no-demangle), we don't
>>>>> need to compare their positions (if (NoDemangle.getPosition() >
>>>>> Demangle.getPosition()) , see llvm-nm.cpp)
>>>>> * grouped short options can be specified with one line
>>>>> `setGroupedShortOptions`, instead of adding cl::Grouping to every short
>>>>> options.
>>>>> * We don't need to add cl::cat to every option and call
>>>>> `HideUnrelatedOptions` to hide unrelated options from --help. The issue
>>>>> would happen with cl::opt tools if linker garbage collection is disabled or
>>>>> libLLVM-13git.so is used. (See https://reviews.llvm.org/D104363)
>>>>>
>>>>> To me *these points are all usability issues of cl::opt*. I care
>>>>> about not exposing unnecessary interfaces so cl::opt accepting the weird
>>>>> -t=d looks a downside to me.
>>>>>
>>>>> --demangle=0 is weird and some llvm/llvm/test tests do use cl::opt
>>>>> options this way, so we cannot just remove this usage. As a workaround, we
>>>>> could add a cl::foobar_toggle to a cl::opt to disallow =0.
>>>>> We would end with more customization for one option, cl::cat (for
>>>>> hiding unrelated options), cl::foobar_toggle (for disallowing =0), and
>>>>> potentially others for other ad-hoc tasks.
>>>>>
>>>>> I can highlight another thing about the global state of cl::opt => *library
>>>>> cl::opt and binary utility cl::opt share the same namespace*.
>>>>> So cl::opt options (usually for debugging or testing) in library code
>>>>> can end up in a tool's list of command line options.
>>>>> This is usually undesired (e.g. llvm-objdump --x86-asm-syntax in
>>>>> https://reviews.llvm.org/D100433).
>>>>> People may not notice this if they always
>>>>> use -DLLVM_LINK_LLVM_DYLIB=off and don't use linker garbage collection.
>>>>>
>>>>
>>>> You're not answering my question here, are you? Are you answering to
>>>> what I mentioned 3 emails before in an answer to David when I wrote
>>>> "Indeed: it isn't clear to me that these are outright "benefits"."?
>>>>
>>>
>>> OptTable doesn't have the listed usability issues. Not having the issues
>>> is a large benefit to me.
>>>
>>
>> Maybe, these are subjective though.
>> (And the confusion above came from the fact that  you just answered the
>> wrong email)
>>
>>
>>>
>>>
>>>> Because I still don't see clearly how to build something like `opt`
>>>> with all the pass and the options with OptTable, how does it all compose?
>>>>
>>>
>>> The proposed changes are specific to binary utilities:
>>> llvm-{ar,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}.
>>> 'opt', 'llc', etc are not in the scope.
>>> (I guess I should have named the utilities more specifically to not
>>> cause confusion.)
>>>
>>> For
>>> llvm-{ar,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}, we
>>> don't want a random cl::opt in lib/Transforms, lib/MC, orlib/LTO to be
>>> accidentally specifiable on the command line.
>>> (In the rare debugging case where such functionality is needed, it needs
>>> a -mllvm prefix like what ld.lld does.)
>>>
>>
>> Have you looked at what I mentioned with MLIR on how to use cl::opt
>> without global constructor? This has exactly the behavior you're looking
>> for.
>>
>>
> I have read
> https://github.com/llvm/llvm-project/blob/main/mlir/lib/IR/MLIRContext.cpp#L57-L83
>
> Unless it does something more specially, I don't think it can avoid the
> issue I mentioned in a previous message:
>
> > I can highlight another thing about the global state of cl::opt => *library
> cl::opt and binary utility cl::opt share the same namespace*.
>
> This can be worked around with linker garbage collection by discarding
> unreferenced cl::opt.
>

I re-read these messages. I think you probably meant something more generic
- how to design decentralized command line option registry where every
library can register some options.
The binary utilities command line parsing issue I intended to address is
fairly different:
We want a single registry of all options and don't want to inherit options
from llvm/lib/A just because the tool happens to depend on llvm/lib/A
directly or indirectly.
E.g. llvm-nm needs to depend on bitcode reader because it needs to handle
LLVM bitcode files, however, I don't want a random cl::opt in bitcode
reader to appear in llvm-nm's command line option list.

So I just built mlir-opt and inspected its --help output. It has exactly
the problem I called out in my first message:

* We don't need to add cl::cat to every option and call

`HideUnrelatedOptions` to hide unrelated options from --help. The issue
would happen with cl::opt tools if linker garbage collection is disabled or
libLLVM-13git.so is used. (See https://reviews.llvm.org/D104363)

There are many unrelated options like --amdgpu-bypass-slow-div which
is from llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210702/9aa8866a/attachment.html>


More information about the llvm-dev mailing list