[PATCH] [extra] pptrace - preprocessor tracing and testing tool

Sean Silva silvas at purdue.edu
Fri Oct 25 12:53:22 PDT 2013



================
Comment at: pp-trace/PPCallbacksTracker.h:82-83
@@ +81,4 @@
+
+  LLVM_OVERRIDE void FileChanged(clang::SourceLocation Loc,
+                                 clang::PPCallbacks::FileChangeReason Reason,
+                                 clang::SrcMgr::CharacteristicKind FileType,
----------------
This is not correct use of this macro, which is semantically equivalent to the C++11 `override` specifier. See <http://en.cppreference.com/w/cpp/language/override>.

================
Comment at: pp-trace/PPCallbacksTracker.cpp:256-262
@@ +255,9 @@
+
+  std::string IdsString;
+  for (llvm::ArrayRef<int>::iterator I = Ids.begin(), E = Ids.end(); I != E;
+       ++I) {
+    char Buffer[64];
+    sprintf(Buffer, "%d, ", *I);
+    IdsString.append(Buffer);
+  }
+  appendArgument("Ids", IdsString);
----------------
Use `raw_string_ostream` or `raw_svector_ostream` here and anywhere else you have used sprintf.

================
Comment at: pp-trace/PPTrace.cpp:32-33
@@ +31,4 @@
+//
+//    -output (file|-)            Output trace to the given file (or stdout
+//                                for "-") in a YAML format, i.e.:
+//
----------------
This is not correct use of "i.e." (which basically means "in other words"). You probably meant "e.g." which basically means "for example".

================
Comment at: pp-trace/PPTrace.cpp:157-163
@@ +156,9 @@
+       I != E; ++I) {
+    CallbackCall &Callback(*I);
+    OS << "- Callback: " << Callback.Name << "\n";
+
+    for (std::vector<Argument>::iterator AI = Callback.Arguments.begin(),
+                                         AE = Callback.Arguments.end();
+         AI != AE; ++AI) {
+      Argument &Arg(*AI);
+      OS << "  " << Arg.Name << ": " << Arg.Value << "\n";
----------------
I feel like I'm probably nitpicking here, but I've never seen a reference in the LLVM codebase initialized with parentheses. Please use e.g. `Argument &Arg = *AI;` here and elsewhere.

(hopefully someday clang-tidy will catch and fix these things).

================
Comment at: pp-trace/PPCallbacksTracker.cpp:564-567
@@ +563,6 @@
+  SS << "MacroArgs(";
+  for (int Index = 0; Index < Count; Index++) {
+    const clang::Token *ArgTok = Value->getUnexpArgument(Index);
+    SS << PP.getSpelling(*ArgTok) << ((Index == Count - 1) ? "" : ", ");
+  }
+  SS << ")";
----------------
Tiny bit of LLVM style guidance:

* unless there's a good reason, a for loop integer induction variable should be called `i` and the upper bound should be called `e` (which is inconsistent with the rest of the coding style, which is weird, but that's the style).

* In C++, when you are iterating through a sequence sequentially, the convention is to use `!=` to compare (this generalizes to e.g. linked lists, which can be traversed sequentially, but whose iterators can't have a meaningful `<`) and `++var` (which avoids potentially making a copy of the iterator); obviously it doesn't matter in this case, but for consistency and clarity that is the convention. These conventions originate in the STL. If you aren't familiar with the various conceptual iterator types, you can read a summary here: <http://www.sgi.com/tech/stl/Iterators.html>.

So I would recommend rewriting this loop header as:

    for (int i = 0, e =  Value->getNumArguments(); i != e; ++i) {

* When outputting constructs with "separator" semantics (e.g. a comma-separated list) rather than "terminator semantics" (e.g. a semicolon follows each statement, including the last), the pattern I've seen most commonly in LLVM is

    for (int i = ..., e = ...; i != e; ++i) {
      if (i)
        OS << ", ";
      [...]
    }

and I would recommend following this since it makes it very clear "up front" (at the beginning of the loop) that this is outputting a comma-separated list.


So overall I would recommend writing this loop as:

    for (int i = 0, e =  Value->getNumArguments(); i != e; ++i) {
      if (i)
        SS << ", ";
      SS << PP.getSpelling(*Value->getUnexpArgument(i));
    }

================
Comment at: pp-trace/PPCallbacksTracker.cpp:562-563
@@ +561,4 @@
+  std::string Str;
+  llvm::raw_string_ostream SS(Str);
+  SS << "MacroArgs(";
+  for (int Index = 0; Index < Count; Index++) {
----------------
This seems like it should produce a YAML list in the end, instead of a YAML string formatted like a function call (which would have to be parsed by a consumer, instead of being handled in the YAML parser itself). The YAML flow style can be used, e.g. `[ foo, bar, baz ]`. For example, this is easily consumed by a client using a YAML parser without an additional parsing step:

```
- Callback: Name
  Args: [ foo, bar, baz ]
  Argument2: Value2
```

The fact that it is a list of macro arguments should be clear from the context.

================
Comment at: pp-trace/PPCallbacksTracker.cpp:521-529
@@ +520,11 @@
+  std::string Str;
+  llvm::raw_string_ostream SS(Str);
+  for (clang::ModuleIdPath::iterator I = Value.begin(), E = Value.end(); I != E;
+       ++I) {
+    clang::IdentifierInfo *Info = I->first;
+    clang::SourceLocation Loc = I->second;
+    SS << "(" << Info->getName() << ", " << getSourceLocationString(PP, Loc)
+       << "), ";
+  }
+  appendArgument(Name, SS.str());
+}
----------------
I recommend formatting this as a list of YAML flow records (i.e., basically JSON) for easy downstream consumption. So it would be something like:

```
[{name: foo, loc: sourcelocationstring}, ...]
```

================
Comment at: pp-trace/PPCallbacksTracker.cpp:515-517
@@ +514,5 @@
+
+// Append a SourceLocation argument to the top trace item.
+void PPCallbacksTracker::appendArgument(const char *Name,
+                                        clang::ModuleIdPath Value) {
+  if (DisableTrace)
----------------
This comment is out of date. Please check the comments on the other methods as well.

================
Comment at: pp-trace/PPCallbacksTracker.cpp:415-432
@@ +414,20 @@
+
+// Append an "Outer(Inner)" string argument to the top trace item.
+void PPCallbacksTracker::appendArgument(const char *Name, const char *Outer,
+                                        const char *Inner) {
+  std::string Str;
+  llvm::raw_string_ostream SS(Str);
+  SS << Outer << "(" << Inner << ")";
+  appendArgument(Name, SS.str());
+}
+
+// Append an "Outer(Inner1, Inner2)" string argument to the top trace item.
+void PPCallbacksTracker::appendArgument(const char *Name, const char *Outer,
+                                        const char *Inner1,
+                                        const char *Inner2) {
+  std::string Str;
+  llvm::raw_string_ostream SS(Str);
+  SS << Outer << "(" << Inner1 << ", " << Inner2 << ")";
+  appendArgument(Name, SS.str());
+}
+
----------------
This format is annoying to parse downstream. Most of the uses of this seem to be just to tag the type of data. For example, SourceRange's should be just a YAML list of two entries: `[ loc1, loc2 ]`, and enum constants could be just their name (or maybe better `EnumName::EN_EnumerantName`).

The SourceLocation format that you are using to print them seems fine, since that can be deconstructed with a single call to `.split(':')` (in Python) or similar.

================
Comment at: pp-trace/PPCallbacksTracker.cpp:25-32
@@ +24,10 @@
+
+// Get a "file:line:column" source location string.
+static std::string getSourceLocationString(clang::Preprocessor &PP,
+                                           clang::SourceLocation Loc) {
+  if (Loc.isInvalid())
+    return std::string("(none)");
+  else
+    return Loc.printToString(PP.getSourceManager());
+}
+
----------------
I think that `Loc.printToString` may sometimes not do quite what you want. See <http://clang.llvm.org/doxygen/SourceLocation_8cpp_source.html#l00038>. You might want to replicate that logic in a customized way here to make sure that the results are consistently as intended.

================
Comment at: pp-trace/PPCallbacksTracker.cpp:411-412
@@ +410,4 @@
+    return;
+  Argument Arg(Name, Value);
+  CallbackCalls.back().Arguments.push_back(Arg);
+}
----------------
You can simplify this to just `.push_back(Argument(Name, Value))`

================
Comment at: test/pp-trace/pp-trace-macro.cpp:1
@@ +1,2 @@
+// RUN: pp-trace -ignore FileChanged %s -undef -target x86_64 -Xclang -std=c++11 -Xclang -triple=x86_64 | FileCheck --strict-whitespace %s
+
----------------
Why do you need `-Xclang` for `-std=c++11`?

Also, why does `-Xclang -triple=x86_64` need to be there if you are already passing `-target x86_64`?

================
Comment at: pp-trace/PPTrace.cpp:19-24
@@ +18,8 @@
+//
+// The pp-trace tool supports the following general command line format:
+//
+//    pp-trace [pp-trace options] (source file(s)) [compiler options]
+//
+// Basically you put the pp-trace options first, then the source file or files,
+// and then any options you want to pass to the compiler.
+//
----------------
I assume that a `--` can be used to separate the `[compiler options]` from the rest. Is that correct? You probably should mention the behavior in that case (especially if what I described is *not* the case, to prevent confusion).

================
Comment at: pp-trace/PPTrace.cpp:216
@@ +215,3 @@
+    llvm::tool_output_file Out(OutputFileName.c_str(), Error);
+    if (Error.empty()) {
+      HadErrors = outputPPTrace(CallbackCalls, Out.os());
----------------
Use an early exit to simplify this:

```
if (!Error.empty()) {
  // print message
  return 1;
}
```


http://llvm-reviews.chandlerc.com/D2020



More information about the cfe-commits mailing list