[llvm-dev] Binary utilities: switch command line parsing from llvm::cl to OptTable (byproduct: drop -long-option?)

Mehdi AMINI via llvm-dev llvm-dev at lists.llvm.org
Fri Jul 2 16:07:58 PDT 2021


On Fri, Jul 2, 2021 at 3:26 PM Fāng-ruì Sòng <maskray at google.com> wrote:

> On Fri, Jul 2, 2021 at 3:21 PM Mehdi AMINI <joker.eph at gmail.com> wrote:
>
>>
>>
>> On Fri, Jul 2, 2021 at 2:09 PM Fāng-ruì Sòng <maskray at google.com> wrote:
>>
>>> On Fri, Jul 2, 2021 at 2:03 PM Mehdi AMINI <joker.eph at gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Jul 2, 2021 at 1:27 PM Fāng-ruì Sòng <maskray at google.com>
>>>> wrote:
>>>>
>>>>> On Fri, Jul 2, 2021 at 1:10 PM Mehdi AMINI <joker.eph at gmail.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Jul 2, 2021 at 12:49 PM Alexandre Ganea <
>>>>>> alexandre.ganea at ubisoft.com> wrote:
>>>>>>
>>>>>>> The API benefits sound nice, though presumably some could be
>>>>>>> retrofitted to cl::opt if that was the only goal. Side benefits in addition
>>>>>>> to removing global ctors are nice to have.
>>>>>>>
>>>>>>>
>>>>>>> The drawback is some initial boilerplate (e.g. llvm-tblgen
>>>>>>> -gen-opt-parser-defs in CMakeLists.txt, class NmOptTable in code).
>>>>>>>
>>>>>>> The handling of comma separated options -arch=x86_64,arm64 doesn't
>>>>>>> have direct OptTable support. llvm::SplitString is needed (just search for
>>>>>>> SplitString in https://reviews.llvm.org/D105330)
>>>>>>>
>>>>>>> But this doesn't tend to increase complexity because the
>>>>>>> cl::list<std::string> will need per-value verification anyway.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> One potential one (though I don't recall it being discussed
>>>>>>> recently) would be that maybe this addresses the issue of global ctors in
>>>>>>> cl::opt? Does OptTable avoid/not use global constructors? That would be
>>>>>>> nice - it's an ongoing issue that LLVM library users pay for command line
>>>>>>> argument support they have no need for in the form of global ctor execution
>>>>>>> time.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> OptTable is used as a local variable. So yes, it avoids global
>>>>>>> constructors,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Nice :)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Note that MLIR is using cl::opt without global ctor (we build with
>>>>>>> `-Werror=global-constructors`).
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The pattern we use to write a tool with cl::opt and avoid global
>>>>>>> ctor (and can be used to avoid collision) looks like:
>>>>>>> https://github.com/llvm/llvm-project/blob/main/mlir/lib/IR/MLIRContext.cpp#L57-L83
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The tool that wants to expose the MLIRContext options to the command
>>>>>>> line calls registerMLIRContextCLOptions() before parsing the command line.
>>>>>>>
>>>>>>> Wouldn't this translate directly to LLVM tools as well with some
>>>>>>> minor refactoring?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The same applies to all of the infrastructure in MLIR, passes are
>>>>>>> registered explicitly, etc. This decouples the "is this code linked in"
>>>>>>> from "options are loaded" annoying part of the global constructors.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Mehdi
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *[Alexandre Ganea] *I think one other issue with cl::opt is that it
>>>>>>> aggregates the “command-line argument definition” and the “runtime
>>>>>>> parameter” *de facto* in a single object (unless cl::location is
>>>>>>> manually specified to every cl::opt). What MLIR does solves the issue
>>>>>>> mentioned by David, the fact that every tool pulls/initializes every
>>>>>>> cl::opt out there. However OptTable solves both problems, and makes the
>>>>>>> entry point thread-safe.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> I agree that removing the global state would be great!
>>>>>> Right now what I see proposed with OptTable (like
>>>>>> https://reviews.llvm.org/D104889) seems to just address the
>>>>>> tools-specific options, and the value isn't clear to me for these cases,
>>>>>> since these options aren't exposed through library entry points.
>>>>>> I don't quite get right now how OptTable would compose at the LLVM
>>>>>> scale? Are there examples of libraries exposing pluggable hooks for a tool
>>>>>> to aggregate multiple libraries' options and expose them on the command
>>>>>> line?
>>>>>>
>>>>>>
>>>>> The first message listed:
>>>>>
>>>>> * -t=d is removed (equal sign after a short option). Use -t d instead.
>>>>> * --demangle=0 (=0 to disable a boolean option) is removed. Omit the
>>>>> option or use --no-demangle instead.
>>>>> * To support boolean options (e.g. --demangle --no-demangle), we don't
>>>>> need to compare their positions (if (NoDemangle.getPosition() >
>>>>> Demangle.getPosition()) , see llvm-nm.cpp)
>>>>> * grouped short options can be specified with one line
>>>>> `setGroupedShortOptions`, instead of adding cl::Grouping to every short
>>>>> options.
>>>>> * We don't need to add cl::cat to every option and call
>>>>> `HideUnrelatedOptions` to hide unrelated options from --help. The issue
>>>>> would happen with cl::opt tools if linker garbage collection is disabled or
>>>>> libLLVM-13git.so is used. (See https://reviews.llvm.org/D104363)
>>>>>
>>>>> To me *these points are all usability issues of cl::opt*. I care
>>>>> about not exposing unnecessary interfaces so cl::opt accepting the weird
>>>>> -t=d looks a downside to me.
>>>>>
>>>>> --demangle=0 is weird and some llvm/llvm/test tests do use cl::opt
>>>>> options this way, so we cannot just remove this usage. As a workaround, we
>>>>> could add a cl::foobar_toggle to a cl::opt to disallow =0.
>>>>> We would end with more customization for one option, cl::cat (for
>>>>> hiding unrelated options), cl::foobar_toggle (for disallowing =0), and
>>>>> potentially others for other ad-hoc tasks.
>>>>>
>>>>> I can highlight another thing about the global state of cl::opt => *library
>>>>> cl::opt and binary utility cl::opt share the same namespace*.
>>>>> So cl::opt options (usually for debugging or testing) in library code
>>>>> can end up in a tool's list of command line options.
>>>>> This is usually undesired (e.g. llvm-objdump --x86-asm-syntax in
>>>>> https://reviews.llvm.org/D100433).
>>>>> People may not notice this if they always
>>>>> use -DLLVM_LINK_LLVM_DYLIB=off and don't use linker garbage collection.
>>>>>
>>>>
>>>> You're not answering my question here, are you? Are you answering to
>>>> what I mentioned 3 emails before in an answer to David when I wrote
>>>> "Indeed: it isn't clear to me that these are outright "benefits"."?
>>>>
>>>
>>> OptTable doesn't have the listed usability issues. Not having the issues
>>> is a large benefit to me.
>>>
>>
>> Maybe, these are subjective though.
>> (And the confusion above came from the fact that  you just answered the
>> wrong email)
>>
>>
>>>
>>>
>>>> Because I still don't see clearly how to build something like `opt`
>>>> with all the pass and the options with OptTable, how does it all compose?
>>>>
>>>
>>> The proposed changes are specific to binary utilities:
>>> llvm-{ar,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}.
>>> 'opt', 'llc', etc are not in the scope.
>>> (I guess I should have named the utilities more specifically to not
>>> cause confusion.)
>>>
>>> For
>>> llvm-{ar,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}, we
>>> don't want a random cl::opt in lib/Transforms, lib/MC, orlib/LTO to be
>>> accidentally specifiable on the command line.
>>> (In the rare debugging case where such functionality is needed, it needs
>>> a -mllvm prefix like what ld.lld does.)
>>>
>>
>> Have you looked at what I mentioned with MLIR on how to use cl::opt
>> without global constructor? This has exactly the behavior you're looking
>> for.
>>
>>
> I have read
> https://github.com/llvm/llvm-project/blob/main/mlir/lib/IR/MLIRContext.cpp#L57-L83
>
> Unless it does something more specially, I don't think it can avoid the
> issue I mentioned in a previous message:
>
> > I can highlight another thing about the global state of cl::opt => *library
> cl::opt and binary utility cl::opt share the same namespace*.
>

Sure, but that isn't the problem you were raising that I was answering
to, let's not move the goal post here.

How do you solve this "namespacing" issue though? This was the sense of my
question earlier in this thread: is OptTable providing me a solution to
build library options that can be re-exported through tools command line
interface? (bonus point if it manages some sort of namespacing).
The intent of cl::opt in libraries is that they can be exposed on the
command line. It seems to me that this namespacing issue is quite intrinsic
to the flat command line interface itself.
The way we work around this to avoid collision in mlir is through
convention, making sure passes are prefixed according to their library (or
dialects in MLIR).

Another thing we have in MLIR is cl::opt wrapped into external storage and
custom parser, they aren't exposed in the global namespace.
For example, a pass class will define a member:

  ::mlir::Pass::Option<uint64_t> fastMemoryCapacity{*this,
"fast-mem-capacity", ::llvm::cl::desc("Set fast memory space capacity in
KiB (default: unlimited)"),
::llvm::cl::init(std::numeric_limits<uint64_t>::max())};

Where mlir::Pass::Option is a inheriting from cl::opt here:
https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Pass/PassOptions.h#L111

In `mlir-opt --help` it shows up as:

  Compiler passes to run
    --pass-pipeline                                     -   A textual
description of a pass pipeline to run
    Passes:
      --affine-data-copy-generate                       -   Generate
explicit copying for affine memory operations
        --fast-mem-capacity=<ulong>                     - Set fast memory
space capacity in KiB (default: unlimited)

Note how the  "fast-mem-capacity" is nested under the
"affine-data-copy-generate" (which is the pass name).
On the command line it won't be present at the top-level and you end up
invoking it this way:

// RUN: mlir-opt %s -split-input-file
-affine-data-copy-generate="generate-dma=false fast-mem-space=0
skip-non-unit-stride-loops" | FileCheck %s

It also has the advantage that you can invoke the same pass twice in the
pipeline with a different value for the same cl::opt since the storage is
private to a pass instance.




>
> This can be worked around with linker garbage collection by discarding
> unreferenced cl::opt.
>

I don't understand how this works actually: cl::opt that rely on global
constructors aren't referenced ever, as long as the file is linked in they
will be involved. This is an incredibly clunky situation to play with
linker semantics here, we end relying on the way files are organized in
static archives (and it breaks when you build libLLVM.so as you mentioned
before).

-- 
Mejdi
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210702/989b5148/attachment.html>


More information about the llvm-dev mailing list