[PATCH] D51321: [Tooling] Improve handling of CL-style options

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Aug 28 11:21:13 PDT 2018


sammccall added a comment.

Thanks for fixing this!
Mostly just picky style comments.
In particular, I know that some of the other maintainers here are just as ignorant of the cl driver as I am, and I want to make sure that it's still possible to follow the logic and debug unrelated problems without needing to know too much about it.

If some of the style bits is too annoying or unclear, happy to do some of it myself as a followup, let me know!



================
Comment at: lib/Tooling/InterpolatingCompilationDatabase.cpp:156-169
+    // Transform the command line to an llvm ArgList.
+    // Make sure that OldArgs lives for at least as long as this variable as the
+    // arg list contains pointers to the OldArgs strings.
+    llvm::opt::InputArgList ArgList;
+    {
+      // Unfortunately InputArgList requires an array of c-strings whereas we
+      // have an array of std::string; we'll need an intermediate vector.
----------------
hamzasood wrote:
> rnk wrote:
> > I think the old for loop was nicer. I don't think this code needs to change, you should be able to use ParseArgs with the extra flag args.
> The problem with the old loop is that we lose aliasing information (e.g. `/W3` becomes `-Wall`). While `ParseOneArg` also performs alias conversion, it gives us indices for each individual argument. The last line of the new loop copies arguments by using that index information to read from `OldArgs`, which gives us the original spelling.
Makes sense, please add a comment summarizing. ("We don't use ParseArgs, as we must pass through the exact spelling of uninteresting args. Re-rendering is lossy for clang-cl options, e.g. /W3 -> -Wall")


================
Comment at: lib/Tooling/InterpolatingCompilationDatabase.cpp:124
+// Determine whether the given file path is the clang-cl executable.
+static bool checkIsCLMode(const std::vector<std::string> &CmdLine,
+                          const llvm::opt::OptTable &OptTable) {
----------------
nit: a two-value enum would be clearer and allow for terser variable names (`Mode`)


================
Comment at: lib/Tooling/InterpolatingCompilationDatabase.cpp:156
+
+    // Transform the command line to an llvm ArgList.
+    // Make sure that OldArgs lives for at least as long as this variable as the
----------------
would you mind reverting this change and just wrapping the old Argv in an InputArgList?
I'm not sure the lifetime comments and std::transform aid readability here.


================
Comment at: lib/Tooling/InterpolatingCompilationDatabase.cpp:175
+    Cmd.CommandLine.emplace_back(OldArgs.front());
+    for (unsigned Start = 1, End = Start; End < OldArgs.size(); Start = End) {
+      std::unique_ptr<llvm::opt::Arg> Arg;
----------------
these names are somewhat confusing - "End" neither marks the end of the loop nor the end of the current item (as it's initialized to Start).
What about:
```
for (unsigned Pos = 1; Pos < OldArgs.size();) {
  ...
  unsigned OldPos = Pos;
  Arg = ParseOneArg(..., Pos);
  ...
  NewArgs.insert(NewArgs.end(), &OldArgs[OldPos], &OldArgs[Pos]);
}
```


================
Comment at: lib/Tooling/InterpolatingCompilationDatabase.cpp:178
+      {
+        unsigned Included, Excluded;
+        if (IsCLMode) {
----------------
This seems a bit verbose.

First, do things actually break if we just use the default 0/0 masks? We're not trying to interpret all the flags, just a few and pass the rest through.

Failing that, it seems clearer to just use a ternary to initialize Included/Excluded, or inline them completely.
(Please also drop the extra scope here).


================
Comment at: lib/Tooling/InterpolatingCompilationDatabase.cpp:186
+        }
+        Arg = std::unique_ptr<llvm::opt::Arg>(
+                  OptTable->ParseOneArg(ArgList, End, Included, Excluded));
----------------
Arg.reset(OptTable->ParseOneArg...)


================
Comment at: lib/Tooling/InterpolatingCompilationDatabase.cpp:194
       // Strip input and output files.
-      if (option.matches(clang::driver::options::OPT_INPUT) ||
-          option.matches(clang::driver::options::OPT_o)) {
+      if (isUntypedInputOrOutput(*Arg))
         continue;
----------------
can we inline this and just `||` all the options together here?


================
Comment at: lib/Tooling/InterpolatingCompilationDatabase.cpp:197
+
+      // Strip type specifiers, but record the overridden language.
+      if (const auto GivenType = isTypeSpecArg(*Arg)) {
----------------
please keep the mentions of -x etc even if not comprehensive - it's hard to precisely+tersely describe these flags in prose.


================
Comment at: lib/Tooling/InterpolatingCompilationDatabase.cpp:203
+
+      // Strip std, but record the value.
+      if (const auto GivenStd = isStdArg(*Arg)) {
----------------
ditto --std


================
Comment at: lib/Tooling/InterpolatingCompilationDatabase.cpp:205
+      if (const auto GivenStd = isStdArg(*Arg)) {
+        if (*GivenStd != LangStandard::lang_unspecified)
+          Std = *GivenStd;
----------------
prefer *either* optional<> or allowing the function to return lang_unspecified, but not both. (There's no way a user can *explicitly* pass a flag saying the language is unspecified, right?)


================
Comment at: lib/Tooling/InterpolatingCompilationDatabase.cpp:225
     auto TargetType = guessType(Filename, &TypeCertain);
-    // If the filename doesn't determine the language (.h), transfer with -x.
+    // If the filename doesn't determine the language (.h), use an appropriate
+    // argument to transfer it.
----------------
as above, please revert comment


================
Comment at: lib/Tooling/InterpolatingCompilationDatabase.cpp:231
                        : Type;
-      Result.CommandLine.push_back("-x");
-      Result.CommandLine.push_back(types::getTypeName(TargetType));
+      if (IsCLMode) {
+        const StringRef Flag = toCLFlag(TargetType);
----------------
can we unify as `addTypeFlag(TargetType, Mode, Result.CommandLine)`?


================
Comment at: lib/Tooling/InterpolatingCompilationDatabase.cpp:258
+
+    if (IsCLMode)
+      return Opt.matches(driver::options::OPT__SLASH_Fa) ||
----------------
Why do we need to take IsClMode into account while reading flags?
How would `OPT__SLASH_Fa` get parsed if we weren't in CL mode? Would we care?

(and below)


================
Comment at: lib/Tooling/InterpolatingCompilationDatabase.cpp:267
+
+  Optional<types::ID> isTypeSpecArg(const llvm::opt::Arg &Arg) const {
+    const llvm::opt::Option &Opt = Arg.getOption();
----------------
parseTypeArg?


================
Comment at: lib/Tooling/InterpolatingCompilationDatabase.cpp:283
+
+  Optional<LangStandard::Kind> isStdArg(const llvm::opt::Arg &Arg) const {
+    const llvm::opt::Option &Opt = Arg.getOption();
----------------
parseStandardArg?


================
Comment at: unittests/Tooling/CompilationDatabaseTest.cpp:652
+
+class InterpolateTest
+  : public ::testing::TestWithParam<std::tuple<DriverExe, DriverMode>> {
----------------
I'm not sure the parameterized test is the right model here.
The `INSTANTIATE_TEST_P` and friends makes the test fixture pretty complicated, and failure messages hard to relate to input conditions.
Then the actual tests have a bunch of assertions conditional on which mode we're in.
Finally, we don't actually test any mixed-mode database.

Could we write this a little more directly?:
 - pass the driver mode flag to `add()` in the normal way, in the Flags param
 - require callers to "add" to pass "clang" or "clang-cl". (It's OK that this makes existing cases a bit more verbose)
 - explicitly test the clang-cl and --driver-mode cases we care about, rather than running every assertion in every configuration

e.g.
```
TEST_F(InterpolateTest, ClangCL) {
  add("foo.cc", "clang");
  add("bar.cc", "clang-cl");
  add("baz.cc", "clang --driver-mode=clang-cl");

  EXPECT_EQ(getCommand("a/bar.h"), "clang-cl -D a/baz.cc /TP");
}
```


https://reviews.llvm.org/D51321





More information about the cfe-commits mailing list