[PATCH] D59887: [Syntax] Introduce TokenBuffer, start clangToolingSyntax library

Dmitri Gribenko via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Mar 28 11:15:05 PDT 2019


gribozavr added inline comments.


================
Comment at: clang/include/clang/Tooling/Syntax/TokenBuffer.h:69
+  /// The tokens after preprocessor replacements.
+  llvm::ArrayRef<syntax::Token> tokens(const TokenBuffer &B) const;
+  /// Tokens that appear in the text of the file, i.e. a name of an object-like
----------------
`expandedTokens`?


================
Comment at: clang/include/clang/Tooling/Syntax/TokenBuffer.h:72
+  /// macro or a name, arguments and parentheses of a function-like macro.
+  llvm::ArrayRef<syntax::Token> macroTokens(const TokenBuffer &B) const;
+  /// The range covering macroTokens().
----------------
`invocationTokens` or `macroInvocationTokens`


================
Comment at: clang/include/clang/Tooling/Syntax/TokenBuffer.h:74
+  /// The range covering macroTokens().
+  std::pair<unsigned, unsigned> macroRange(const TokenBuffer &B,
+                                           const SourceManager &SM) const;
----------------
`invocationSourceRange` or `macroInvocationSourceRange` depending on what you choose for the function above.



================
Comment at: clang/include/clang/Tooling/Syntax/TokenBuffer.h:83
+  unsigned BeginMacroToken = 0;
+  unsigned EndMacroToken = 0;
+};
----------------
Please add brief doc comments to these variables. Having read the public API of this class, I still don't have an idea what these variables are.




================
Comment at: clang/include/clang/Tooling/Syntax/TokenBuffer.h:120
+  /// the original source file. The tranformation may not be possible if the
+  /// tokens cross macro invocations in the middle, e.g.
+  ///    #define FOO 1*2
----------------
"The mapping fails if cross boundaries of macro expansions, that is, don't correspond to a complete top-level macro invocation"

Consider adding examples.

```
Given this source file:

#define FIRST f1 f2 f3
#define SECOND s1 s2 3

a FIRST b SECOND c  // expansion: a f1 f2 f3 b s1 s2 s3 c

toOffsetRange will map tokens like this:

input range => output range
a => a
s1 s2 s3 => SECOND
a f1 f2 f3 => a FIRST
a f1 => can't map
s1 s2 => s1 s2 within the macro definition
```

Could you add these to tests as well?


================
Comment at: clang/include/clang/Tooling/Syntax/TokenBuffer.h:153
+  /// FIXME: we do not yet store tokens of directives, like #include, #define,
+  ///        #pragma, etc.
+  llvm::ArrayRef<syntax::Token> macroTokens() const { return MacroTokens; }
----------------
... For the input above, macroTokens() should return {desired output}


================
Comment at: clang/include/clang/Tooling/Syntax/TokenBuffer.h:191
+  /// i.e. after running Execute().
+  TokenBuffer consume() &&;
+
----------------
"Finalizes token collection"

Of course a function called "consume" consumes the result :)


================
Comment at: clang/include/clang/Tooling/Syntax/TokenBuffer.h:191
+  /// i.e. after running Execute().
+  TokenBuffer consume() &&;
+
----------------
gribozavr wrote:
> "Finalizes token collection"
> 
> Of course a function called "consume" consumes the result :)
LLVM_NODISCARD?


================
Comment at: clang/lib/Tooling/Syntax/TokenBuffer.cpp:295
+MacroExpansion::tokens(const TokenBuffer &B) const {
+  return B.tokens().slice(BeginExpansionToken,
+                          EndExpansionToken - BeginExpansionToken);
----------------
"ExpansionTokenBegin"?  "ExpansionTokenStartIndex"?


================
Comment at: clang/lib/Tooling/Syntax/TokenBuffer.cpp:367
+  if (FirstCall != Expansions.end() &&
+      (FirstCall->BeginExpansionToken < BeginIndex ||
+       EndIndex < FirstCall->EndExpansionToken)) {
----------------
`FirstCall->BeginExpansionToken != BeginIndex` ?


================
Comment at: clang/lib/Tooling/Syntax/TokenBuffer.cpp:368
+      (FirstCall->BeginExpansionToken < BeginIndex ||
+       EndIndex < FirstCall->EndExpansionToken)) {
+    return llvm::None;
----------------
I don't understand why `EndIndex < FirstCall->EndExpansionToken` is needed -- isn't it redundant with the `LastCall` checks below?


================
Comment at: clang/lib/Tooling/Syntax/TokenBuffer.cpp:373
+  if (LastCall != Expansions.end() &&
+      (LastCall->BeginExpansionToken < BeginIndex ||
+       EndIndex < LastCall->EndExpansionToken)) {
----------------
`LastCall->BeginExpansionToken != BeginIndex`?


================
Comment at: clang/lib/Tooling/Syntax/TokenBuffer.cpp:374
+      (LastCall->BeginExpansionToken < BeginIndex ||
+       EndIndex < LastCall->EndExpansionToken)) {
+    return llvm::None;
----------------
Also redundant with `FirstCall` checks?


================
Comment at: clang/unittests/Tooling/Syntax/TokenBufferTest.cpp:355
+  checkTokens("");
+  checkMacroInvocations({{"EMPTY", "", Code.range("m")},
+                   {"EMPTY_FUNC(1+2+3)", "", Code.range("f")}});
----------------
`expectTokens`, `expectMacroInvocations`?


================
Comment at: clang/unittests/Tooling/Syntax/TokenBufferTest.cpp:376
+  // The parser eventually breaks the first '>>' into two tokens ('>' and '>'),
+  // but we chooses to record them as a single token (for now).
+  llvm::Annotations Code(R"cpp(
----------------
"we choose"


================
Comment at: clang/unittests/Tooling/Syntax/TokenBufferTest.cpp:395
+      int method() {
+        // Parser will visit method bodies and initializers multiple time, but
+        // TokenBuffer should only record the first walk over the tokens;
----------------
"times"


================
Comment at: clang/unittests/Tooling/Syntax/TokenBufferTest.cpp:429
+  Code = llvm::Annotations(R"cpp(
+    $all[[int $a[[a]] = $numbers[[100 + 200]];]]
+  )cpp");
----------------
Could we instead use some nonsensical code like "a1 a2 a3 FIRST a3 a4 a5 SECOND a6 a7 a8", and instead of `OfKind` we can make a helper that finds the identifier with the given name?


================
Comment at: clang/unittests/Tooling/Syntax/TokenBufferTest.cpp:467
+                                 std::next(OfKind(tok::plus)), *SourceMgr),
+            llvm::None);
+}
----------------
Please add tests with multiple macro calls.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59887/new/

https://reviews.llvm.org/D59887





More information about the cfe-commits mailing list