[PATCH] D83530: [llvm-symbolizer] Switch command line parsing from llvm::cl to OptTable

Tue Jul 28 02:01:56 PDT 2020

jhenderson added a comment.

Sorry, I've been off for the past two working days, and had a bucket load of other reviews to look at too.

In D83530#2159094 <https://reviews.llvm.org/D83530#2159094>, @MaskRay wrote:

> In D83530#2146686 <https://reviews.llvm.org/D83530#2146686>, @jhenderson wrote:
>
>> Honestly, I'm not convinced by the change, looking at this patch, at least not without much more stuff being added to common library code. In particular, I don't like how significant code needs adding in this patch to handle a number of things that were previously handled by the CommandLine code, for example help handling, type validation, and unknown argument detection.
>
> I think the library complexity is OK - 50 lines in OptTable.cpp which can be shared with other utilities.

Right, I wasn't referring to the library complexity, I was referring to the fact that this change significantly increases the amount of code in llvm-symbolizer.cpp to parse the command-line, for only minimal benefit.

>> In high-level terms, I don't think I really care either way which command line processing option is used, if it is straightforward to use, so I'm not opposed to a switchover, but at the moment it seems to be significantly complicating things, not simplifying, for relatively little gain.
>
> `cl::ParseCommandLineOptions` has a baked-in interface, with a bunch of customizations the user may or may not want to accept. For example, `-bool=0` & `-bool=false` may feel exotic for users coming from GNU getopt_long. `--flag` & `--no-flag` requires explicit position comparison.
>
> OptTable appears to be a lighter interface with less customization. It does not call `Tbl.findNearest`, so the tool needs to do it by itself. Other than the few boilerplate lines I don't find the complexity is significant. We could move spelling checking to OptTable if more utilities want to use OptTable.

You seem to be throwing out all the good things to simplify a couple of things though. Here is a (possibly non-exhaustive) list of things that were previously handled by the library, and now llvm-symbolizer has to do itself:

1. Response file usage.
2. Spell-checking of option names.
3. Argument type checking and conversion (e.g. to integer, enum etc).
4. Integer argument signedness checking.
5. Unknown argument handling.
6. Options specified via environment variables.
7. Help text printing.

Most, if not all of these are things that would be useful in other tools too, and should be therefore in the library, ideally on by default. For example, there are multiple places that want to use response files (LLD, llvm-objcopy etc) which have very similar, or even identical code to achieve that. I accept that there are some things in that list, where the default behaviour may need to differ (e.g. -h should not print help in llvm-readelf), but I reckon most of these are fairly rare.

It seems like there are two main things that you want to change. 1) To not support =0/=false style arguments, and 2) to not have to explicitly compare positions of --[no-]option examples. Would a different approach resolve this much more simply: add a new type `cl::flag` or similar, which implicitly adds a "no" prefix version, and turns off the =0 etc parsing, and handles the last-one-wins approach itself? If you don't think this is useful, then I'd expect to see the OptTable code expanded to address my above points.

>> (P.S. Don't forget to update documentation).
>
> After the switchover, the interface very closely matches the previous behavior. I find that there is not a need for a change to `docs/CommandGuide/llvm-symbolizer.rst`
>
> `-untag-addresses={0,false}` is not documented (this looks to me a debug option). We can do it in a separate change.

I actually asked to get that documented some time after the original review landed (see D65769 <https://reviews.llvm.org/D65769>). I wasn't referring to this though, I was referring to the new --no-inlines option, at the time (that's been added now). Also, there are several inline examples in the doc which use `-i=0`, which would still need updating.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83530/new/

https://reviews.llvm.org/D83530