[PATCH] D134942: [Lex] Simplify and cleanup the updateConsecutiveMacroArgTokens implementation.

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Oct 4 12:13:08 PDT 2022


sammccall added a comment.

Thanks, I think this is worthwhile for the simpler code, better (non-lying) comments, avoiding arg-evaluation-order stuff.

-0.05% time and -0.5% SLocEntries is interesting too! Clearly not earth-shattering but the the offset table idea gives us a lead to follow.
I do have one idea to skip a getFileID() that might not be cached...



================
Comment at: clang/lib/Lex/TokenLexer.cpp:1019
+    // sourcelocation-against-bounds comparison.
+    FileID BeginFID = SM.getFileID(BeginLoc);
+    SourceLocation Limit =
----------------
this getFileID() call is unneccesary when `All.empty() || All.front().getLocation().isFileID()`.

Worth checking whether bailing out in that case is profitable?

You could do it right at the top:
```
if (All.size() == 1 || All[0].getLocation().isFileID() != All[1].getLocation().isFileID())
  return All.take_front(1);
```


================
Comment at: clang/lib/Lex/TokenLexer.cpp:1030
+      Partition.back().getEndLoc().getRawEncoding() -
+      Partition.begin()->getLocation().getRawEncoding();
   // Create a macro expansion SLocEntry that will "contain" all of the tokens.
----------------
nit: s/begin/front/ if you're using back()


================
Comment at: clang/lib/Lex/TokenLexer.cpp:1038
+  for (Token& T : Partition) {
+    SourceLocation::IntTy RelativeOffset = T.getLocation().getRawEncoding() -
+       BeginLoc.getRawEncoding();
----------------
hmm, actually maybe just before this line would be the best place to assert that T and BeginLoc are in the same FileID, as it justifies the subtraction


================
Comment at: clang/lib/Lex/TokenLexer.cpp:1001
+  if (BeginLoc.isFileID()) {
+    // We can avoid calling getFileID at all for *file* consecutive tokens --
+    // for this case it is impossible to switch between files in marco arguments
----------------
I think this comment can be shorter while still getting its point across.

```
// Consecutive tokens not written in macros must be from the same file.
// (Neither #include nor eof can occur inside a macro argument.)
```


================
Comment at: clang/lib/Lex/TokenLexer.cpp:1008
+    });
+    assert(!Partition.empty() &&
+           llvm::all_of(Partition.drop_front(),
----------------
nickdesaulniers wrote:
> Could you just check that all of the tokens in the partition have the same fileID as the first token?
> 
> ```
> FileID FirstFID = SM.getFileID(Partition[0]->getLocation());
> llvm::all_of(Partition, [&SM, &FirstID](const Token &T) { return SM.getFileID(T.getLocation() == FID; });
> ```
> or move the assertion into the take_while above so we iterate less?
this assertion seems to belong outside the if() - it applies to both the file/macro case?
I'd suggest asserting nonempty first and then the rest as another assertion.
also missing an assertion that if there are any more tokens, the next token has a different FileID

that said with these assertions we should probably check we're not regressing debug performance too much!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134942/new/

https://reviews.llvm.org/D134942



More information about the cfe-commits mailing list