[llvm-dev] Binary utilities: switch command line parsing from llvm::cl to OptTable (byproduct: drop -long-option?)

Fri Jul 2 17:21:52 PDT 2021

On Fri, Jul 2, 2021 at 4:49 PM Fāng-ruì Sòng <maskray at google.com> wrote:

> [This topic of this subthread has shifted from binary utilities to generic
> library options]
>
> > Sure, but that isn't the problem you were raising that I was answering
> > to, let's not move the goal post here.
> >
> > How do you solve this "namespacing" issue though? This was the sense of
> my
> > question earlier in this thread: is OptTable providing me a solution to
> > build library options that can be re-exported through tools command line
> > interface? (bonus point if it manages some sort of namespacing).
>
> No, OptTable options are not composable.
>
> > The intent of cl::opt in libraries is that they can be exposed on the
> > command line. It seems to me that this namespacing issue is quite
> intrinsic
> > to the flat command line interface itself.
>
> Yes.
>
> > The way we work around this to avoid collision in mlir is through
> > convention, making sure passes are prefixed according to their library
> (or
> > dialects in MLIR).
>
> Yes. llvm-readobj has a similar pattern: calling a function which
> defines cl::opt local variables.
>
>
> OptTable is for tools like clang, flang, lld where
>
> * they want high configurability, e.g. -long-option for some options while
> --long-option for others.
> * they don't want library cl::opt options.[1]
> * they want to avoid cl::opt's loosing behavior (-t=d and -demangle=0 are
> two examples I raised; hey, -enable-new-pm=0:) it is convenient internally
> but externally we'd better use -no- instead)
>
> [1]: As you may have implied, this
> is alternatively solvable by reorganizing llvm/lib/* options. This is
> a huge task and the solution isn't particular clear yet.
> --riscv-no-aliases is an example that cl::opt in library code can affect
> multiple tools: llc/llvm-mc/llvm-objdump (llvm-mc/llvm-objdump usage was
> unintentional). Someone signing up for the work needs to be careful on
> letting utilities call the relevant register functions.
>
> My proposal is like: binary utilities should move in the direction of
> clang/lld as well.
>

> > Another thing we have in MLIR is cl::opt wrapped into external storage
> and
> > custom parser, they aren't exposed in the global namespace.
> > For example, a pass class will define a member:
> >
> >   ::mlir::Pass::Option<uint64_t> fastMemoryCapacity{*this,
> > "fast-mem-capacity", ::llvm::cl::desc("Set fast memory space capacity in
> > KiB (default: unlimited)"),
> > ::llvm::cl::init(std::numeric_limits<uint64_t>::max())};
> >
> > Where mlir::Pass::Option is a inheriting from cl::opt here:
> >
> https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Pass/PassOptions.h#L111
> >
> > In `mlir-opt --help` it shows up as:
> >
> >   Compiler passes to run
> >     --pass-pipeline                                     -   A textual
> > description of a pass pipeline to run
> >     Passes:
> >       --affine-data-copy-generate                       -   Generate
> > explicit copying for affine memory operations
> >         --fast-mem-capacity=<ulong>                     - Set fast memory
> > space capacity in KiB (default: unlimited)
> >
> > Note how the  "fast-mem-capacity" is nested under the
> > "affine-data-copy-generate" (which is the pass name).
> > On the command line it won't be present at the top-level and you end up
> > invoking it this way:
> >
> > // RUN: mlir-opt %s -split-input-file
> > -affine-data-copy-generate="generate-dma=false fast-mem-space=0
> > skip-non-unit-stride-loops" | FileCheck %s
> >
> > It also has the advantage that you can invoke the same pass twice in the
> > pipeline with a different value for the same cl::opt since the storage is
> > private to a pass instance.
> >
> >
> >
> >
> > >
> > > This can be worked around with linker garbage collection by discarding
> > > unreferenced cl::opt.
> > >
> >
> > I don't understand how this works actually: cl::opt that rely on global
> > constructors aren't referenced ever, as long as the file is linked in
> they
> > will be involved. This is an incredibly clunky situation to play with
> > linker semantics here, we end relying on the way files are organized in
> > static archives (and it breaks when you build libLLVM.so as you mentioned
> > before).
>
> Yep, the reliance on linker garbage collection makes hiding unrelated
> options.
>
> On 2021-07-02, Mehdi AMINI wrote:
> >Looks like our email crossed :)
> >
> >On Fri, Jul 2, 2021 at 4:07 PM Fāng-ruì Sòng <maskray at google.com> wrote:
> >
> >> On Fri, Jul 2, 2021 at 3:26 PM Fāng-ruì Sòng <maskray at google.com>
> wrote:
> >>
> >>> On Fri, Jul 2, 2021 at 3:21 PM Mehdi AMINI <joker.eph at gmail.com>
> wrote:
> >>>
> >>>>
> >>>>
> >>>> On Fri, Jul 2, 2021 at 2:09 PM Fāng-ruì Sòng <maskray at google.com>
> wrote:
> >>>>
> >>>>> On Fri, Jul 2, 2021 at 2:03 PM Mehdi AMINI <joker.eph at gmail.com>
> wrote:
> >>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Fri, Jul 2, 2021 at 1:27 PM Fāng-ruì Sòng <maskray at google.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> On Fri, Jul 2, 2021 at 1:10 PM Mehdi AMINI <joker.eph at gmail.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Fri, Jul 2, 2021 at 12:49 PM Alexandre Ganea <
> >>>>>>>> alexandre.ganea at ubisoft.com> wrote:
> >>>>>>>>
> >>>>>>>>> The API benefits sound nice, though presumably some could be
> >>>>>>>>> retrofitted to cl::opt if that was the only goal. Side benefits
> in addition
> >>>>>>>>> to removing global ctors are nice to have.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> The drawback is some initial boilerplate (e.g. llvm-tblgen
> >>>>>>>>> -gen-opt-parser-defs in CMakeLists.txt, class NmOptTable in
> code).
> >>>>>>>>>
> >>>>>>>>> The handling of comma separated options -arch=x86_64,arm64
> doesn't
> >>>>>>>>> have direct OptTable support. llvm::SplitString is needed (just
> search for
> >>>>>>>>> SplitString in https://reviews.llvm.org/D105330)
> >>>>>>>>>
> >>>>>>>>> But this doesn't tend to increase complexity because the
> >>>>>>>>> cl::list<std::string> will need per-value verification anyway.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> One potential one (though I don't recall it being discussed
> >>>>>>>>> recently) would be that maybe this addresses the issue of global
> ctors in
> >>>>>>>>> cl::opt? Does OptTable avoid/not use global constructors? That
> would be
> >>>>>>>>> nice - it's an ongoing issue that LLVM library users pay for
> command line
> >>>>>>>>> argument support they have no need for in the form of global
> ctor execution
> >>>>>>>>> time.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> OptTable is used as a local variable. So yes, it avoids global
> >>>>>>>>> constructors,
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Nice :)
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Note that MLIR is using cl::opt without global ctor (we build
> with
> >>>>>>>>> `-Werror=global-constructors`).
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> The pattern we use to write a tool with cl::opt and avoid global
> >>>>>>>>> ctor (and can be used to avoid collision) looks like:
> >>>>>>>>>
> https://github.com/llvm/llvm-project/blob/main/mlir/lib/IR/MLIRContext.cpp#L57-L83
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> The tool that wants to expose the MLIRContext options to the
> >>>>>>>>> command line calls registerMLIRContextCLOptions() before parsing
> the
> >>>>>>>>> command line.
> >>>>>>>>>
> >>>>>>>>> Wouldn't this translate directly to LLVM tools as well with some
> >>>>>>>>> minor refactoring?
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> The same applies to all of the infrastructure in MLIR, passes are
> >>>>>>>>> registered explicitly, etc. This decouples the "is this code
> linked in"
> >>>>>>>>> from "options are loaded" annoying part of the global
> constructors.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>>
> >>>>>>>>> Mehdi
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> *[Alexandre Ganea] *I think one other issue with cl::opt is that
> >>>>>>>>> it aggregates the “command-line argument definition” and the
> “runtime
> >>>>>>>>> parameter” *de facto* in a single object (unless cl::location is
> >>>>>>>>> manually specified to every cl::opt). What MLIR does solves the
> issue
> >>>>>>>>> mentioned by David, the fact that every tool pulls/initializes
> every
> >>>>>>>>> cl::opt out there. However OptTable solves both problems, and
> makes the
> >>>>>>>>> entry point thread-safe.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> I agree that removing the global state would be great!
> >>>>>>>> Right now what I see proposed with OptTable (like
> >>>>>>>> https://reviews.llvm.org/D104889) seems to just address the
> >>>>>>>> tools-specific options, and the value isn't clear to me for these
> cases,
> >>>>>>>> since these options aren't exposed through library entry points.
> >>>>>>>> I don't quite get right now how OptTable would compose at the LLVM
> >>>>>>>> scale? Are there examples of libraries exposing pluggable hooks
> for a tool
> >>>>>>>> to aggregate multiple libraries' options and expose them on the
> command
> >>>>>>>> line?
> >>>>>>>>
> >>>>>>>>
> >>>>>>> The first message listed:
> >>>>>>>
> >>>>>>> * -t=d is removed (equal sign after a short option). Use -t d
> instead.
> >>>>>>> * --demangle=0 (=0 to disable a boolean option) is removed. Omit
> the
> >>>>>>> option or use --no-demangle instead.
> >>>>>>> * To support boolean options (e.g. --demangle --no-demangle), we
> >>>>>>> don't need to compare their positions (if
> (NoDemangle.getPosition() >
> >>>>>>> Demangle.getPosition()) , see llvm-nm.cpp)
> >>>>>>> * grouped short options can be specified with one line
> >>>>>>> `setGroupedShortOptions`, instead of adding cl::Grouping to every
> short
> >>>>>>> options.
> >>>>>>> * We don't need to add cl::cat to every option and call
> >>>>>>> `HideUnrelatedOptions` to hide unrelated options from --help. The
> issue
> >>>>>>> would happen with cl::opt tools if linker garbage collection is
> disabled or
> >>>>>>> libLLVM-13git.so is used. (See https://reviews.llvm.org/D104363)
> >>>>>>>
> >>>>>>> To me *these points are all usability issues of cl::opt*. I care
> >>>>>>> about not exposing unnecessary interfaces so cl::opt accepting the
> weird
> >>>>>>> -t=d looks a downside to me.
> >>>>>>>
> >>>>>>> --demangle=0 is weird and some llvm/llvm/test tests do use cl::opt
> >>>>>>> options this way, so we cannot just remove this usage. As a
> workaround, we
> >>>>>>> could add a cl::foobar_toggle to a cl::opt to disallow =0.
> >>>>>>> We would end with more customization for one option, cl::cat (for
> >>>>>>> hiding unrelated options), cl::foobar_toggle (for disallowing =0),
> and
> >>>>>>> potentially others for other ad-hoc tasks.
> >>>>>>>
> >>>>>>> I can highlight another thing about the global state of cl::opt =>
> *library
> >>>>>>> cl::opt and binary utility cl::opt share the same namespace*.
> >>>>>>> So cl::opt options (usually for debugging or testing) in library
> code
> >>>>>>> can end up in a tool's list of command line options.
> >>>>>>> This is usually undesired (e.g. llvm-objdump --x86-asm-syntax in
> >>>>>>> https://reviews.llvm.org/D100433).
> >>>>>>> People may not notice this if they always
> >>>>>>> use -DLLVM_LINK_LLVM_DYLIB=off and don't use linker garbage
> collection.
> >>>>>>>
> >>>>>>
> >>>>>> You're not answering my question here, are you? Are you answering to
> >>>>>> what I mentioned 3 emails before in an answer to David when I wrote
> >>>>>> "Indeed: it isn't clear to me that these are outright "benefits"."?
> >>>>>>
> >>>>>
> >>>>> OptTable doesn't have the listed usability issues. Not having the
> >>>>> issues is a large benefit to me.
> >>>>>
> >>>>
> >>>> Maybe, these are subjective though.
> >>>> (And the confusion above came from the fact that  you just answered
> the
> >>>> wrong email)
> >>>>
> >>>>
> >>>>>
> >>>>>
> >>>>>> Because I still don't see clearly how to build something like `opt`
> >>>>>> with all the pass and the options with OptTable, how does it all
> compose?
> >>>>>>
> >>>>>
> >>>>> The proposed changes are specific to binary utilities:
> >>>>> llvm-{ar,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}.
> >>>>> 'opt', 'llc', etc are not in the scope.
> >>>>> (I guess I should have named the utilities more specifically to not
> >>>>> cause confusion.)
> >>>>>
> >>>>> For
> >>>>>
> llvm-{ar,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}, we
> >>>>> don't want a random cl::opt in lib/Transforms, lib/MC, orlib/LTO to
> be
> >>>>> accidentally specifiable on the command line.
> >>>>> (In the rare debugging case where such functionality is needed, it
> >>>>> needs a -mllvm prefix like what ld.lld does.)
> >>>>>
> >>>>
> >>>> Have you looked at what I mentioned with MLIR on how to use cl::opt
> >>>> without global constructor? This has exactly the behavior you're
> looking
> >>>> for.
> >>>>
> >>>>
> >>> I have read
> >>>
> https://github.com/llvm/llvm-project/blob/main/mlir/lib/IR/MLIRContext.cpp#L57-L83
> >>>
> >>> Unless it does something more specially, I don't think it can avoid the
> >>> issue I mentioned in a previous message:
> >>>
> >>> > I can highlight another thing about the global state of cl::opt =>
> *library
> >>> cl::opt and binary utility cl::opt share the same namespace*.
> >>>
> >>> This can be worked around with linker garbage collection by discarding
> >>> unreferenced cl::opt.
> >>>
> >>
> >> I re-read these messages. I think you probably meant something more
> >> generic - how to design decentralized command line option registry where
> >> every library can register some options.
> >>
> >
> >Yes indeed!
> >
> >I'm not against using OptTable for the sole purpose of binary tools, but
> >that does not seem to provide us with a path forward. So I am afraid that
> >this is a local optima that is short sighted.
>
> Mentioned previously, I don't try to alter current cl::opt usage in
> library code and non-binary-utility tools (e.g. opt,llc).
>
> Though I am not sure I agree with clang/lld/llvm-symbolizer/llvm-objdump
> are using a local optima. I think cl::opt just don't fit for their use
> cases.
> OptTable is just the appropriate solution for them.
>

I think part of the confusion on my side in this thread is that when I read
"binary utilities" I thought first and foremost about `opt` and `lld`,
while you're using "binary utilities" to refer to what I would call
"end-user tools".
I agree with you that tools like clang and lld are in a different category
than `opt`.

cl::opt as it may not be suitable as-is, but OptTable being not composable
and not offering any facility to someone building a tool to re-expose
library options is also quite limited. It seems to me that we need such a
solution, and it also seems that if we had such solutions it would be
suitable for all the tools we intend to build using LLVM-based libraries.
Without any plan to build this though, I'm not trying to block progress on
your cleanup/improvement of these end-user tools :)

Cheers,

-- 
Mehdi

>
> Bogus: since options are data with the OptTable approach, we can re-use
> something like clang-tblgen -gen-opt-docs to help ensure the documentation
> doesn't diverge from the reality.
>
> >> The binary utilities command line parsing issue I intended to address is
> >> fairly different:
> >> We want a single registry of all options and don't want to inherit
> options
> >> from llvm/lib/A just because the tool happens to depend on llvm/lib/A
> >> directly or indirectly.
> >>
> >
> >I agree, what I pointed at in MLIR was an attempt to achieve this, the
> tool
> >has to explicitly call a function at runtime to make the options
> available.
> >
> >
> >> E.g. llvm-nm needs to depend on bitcode reader because it needs to
> handle
> >> LLVM bitcode files, however, I don't want a random cl::opt in bitcode
> >> reader to appear in llvm-nm's command line option list.
> >>
> >> So I just built mlir-opt and inspected its --help output. It has exactly
> >> the problem I called out in my first message:
> >>
> >
> >> * We don't need to add cl::cat to every option and call
> >>
> >> `HideUnrelatedOptions` to hide unrelated options from --help. The issue
> >> would happen with cl::opt tools if linker garbage collection is
> disabled or
> >> libLLVM-13git.so is used. (See https://reviews.llvm.org/D104363)
> >>
> >> There are many unrelated options like --amdgpu-bypass-slow-div which is
> from llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
> >>
> >>
> >I think you're missing the point here: mlir-opt does not have this
> problem *for
> >mlir components.* It pays the price of the LLVM components using global
> >constructors, but I pointed to the solution we use for mlir components
> that
> >aren't using global constructors and avoid this problem.
>
> See [1] above
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210702/3bec9d27/attachment.html>