[PATCH] D94472: [WIP][clang][cli] Command line round-trip for HeaderSearch options

Jan Svoboda via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Jan 25 06:47:22 PST 2021


jansvoboda11 added a comment.

In D94472#2508018 <https://reviews.llvm.org/D94472#2508018>, @dexonsmith wrote:

> I have three concerns with the approach:
>
> 1. Churn. This adds a lot of code that will eventually be removed / refactored away. One example is that shifting the `NDEBUG` logic to the driver requires threading a Boolean (or tristate/enum) for round-trip mode through a few layers; once we're finished, we'll end up stripping that argument back out. A second example is that this seems to require splitting up `OPTIONS_WITH_MARSHALLING` for each option. Once this work is finished, someone will need to clean them up / unify them, or maybe they'll unnecessarily stay split / duplicated.

I think the driver command line options could be useful even after we're finished, so I'd consider not removing them. For example when working on new Clang feature that requires new command line option, one could implement the parsing first, temporarily disable round-tripping, implement the feature and worry about generation and round-tripping later. I can imagine a situation when downstream projects would prefer to prevent round-tripping from the driver rather than from within `CompilerInvocation`.

Yes, this requires splitting `OPTIONS_WITH_MARSHALLING`. I don't think it's a big deal, because we already do that for `DiagnosticOpts`, `LangOpts` and `CodegenOpts` anyway (D84673 <https://reviews.llvm.org/D84673>, D94682 <https://reviews.llvm.org/D94682>). I think it makes more sense to have one parsing function per `*Opts`, rather than keeping it split up into `Parse*Opts` and `ParseSimpleOpts`.

> 2. Boilerplate. Somewhat related to churn; there's a fair bit of additional boilerplate required in this approach. This makes it harder to read / understand / modify the code.

Boilerplate was added mainly to `ArgList`, but we can delete that without even showing up in `git blame`. Another instance is the lambda setup in `ParseHeaderSearchArgs`, but I think that's pretty isolated as well. I agree the code may be confusing for people not familiar with this effort, but it will be me deleting the code few weeks from now, so I'm not sure how bad this is. Do you worry this will cause confusion during merge conflicts down stream?

> 3. Correctness. I'm not sure ResetHS is quite right. It'll probably work for a normal command-line `-cc1` invocation, but perhaps not for all tooling applications, since it's changing the pointer identity of `HeaderSearchOptions`.

Nice catch! Some clients make changes to `AnalyzerOpts` before calling `CreateFromArgs`, and we would discard those changes. I've updated this patch with a solution: we save the original part of `CompilerInvocation`, run the first parse on a clean instance, generate arguments and use the original part for the second parse.

> On the other hand, one thing I really like about your approach, which I don't see a way of getting with a top-level check, is the ability to turn on "strict" mode for subsets of the command-line, which helps to hold the line in the face of new options being added (I think this is the feature you're (rightly) calling out). I'm not sure how important it is in practice, as long as we still think getting them all is going to happen in a reasonable time frame. There aren't that many new options being added: filtering out `[cli]` (your patches), `[flang]`, `[[D]driver]`, reverts, and relandings, there are 39 commits to Options.td in the last three months (3 / week). Some of those are deleting options or changing values, and not important to catch. Even for new options, I imagine most people copy/paste nearby options. I think it's critical to hold the line once we have them all, but until then I'm not sure.

That's indeed not too many changes in the area we're working in. Though the biggest reason I'm inclined to stick with this solution is that we can verify the manual generation as soon as we finish one `Generate*Args`. When committing each `Generate*Args` separately, this ensures the code really works.

If we go only with the "relaxed" checks, we'd be committing code that might not work as expected and we'd find out only when enabling strict mode for the whole `-cc1` interface. Finding a bug in that situation might be harder than with the incremental approach.

> `strict` mode additionally uses the `GeneratedArgs1` to fill CompilerInvocation, indirectly checking both directions by requiring tests to pass both with and without this round-trip. However, during development people generally only run the tests one way and the failure mode won't be ideal.

So people build without assertions during development? In that case, I agree that erroring out on `GeneratedArgs1 != GeneratedArgs2` (in all kinds of builds) would improve the experience. I don't think there's anything preventing us to incorporate this into the current patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94472/new/

https://reviews.llvm.org/D94472



More information about the cfe-commits mailing list