[all-commits] [llvm/llvm-project] aa9790: [clang][Syntax] Optimize expandedTokens for token ...

Utkarsh Saxena via All-commits all-commits at lists.llvm.org
Thu Mar 25 10:54:39 PDT 2021


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: aa979084dffba86a3e170826b4e89d90820bb78b
      https://github.com/llvm/llvm-project/commit/aa979084dffba86a3e170826b4e89d90820bb78b
  Author: Utkarsh Saxena <usx at google.com>
  Date:   2021-03-25 (Thu, 25 Mar 2021)

  Changed paths:
    M clang-tools-extra/clangd/ParsedAST.cpp
    M clang/include/clang/Tooling/Syntax/Tokens.h
    M clang/lib/Tooling/Syntax/Tokens.cpp
    M clang/unittests/Tooling/Syntax/TokensTest.cpp

  Log Message:
  -----------
  [clang][Syntax] Optimize expandedTokens for token ranges.

`expandedTokens(SourceRange)` used to do a binary search to get the
expanded tokens belonging to a source range. Each step of the binary
search calls `isBeforeInTranslationUnit` to order two source locations,
which is inherently slow.
Profiling clangd showed that callers such as clangd::SelectionTree
spend 95% of their time in `isBeforeInTranslationUnit`. It is also worth
noting that callers of `expandedTokens(SourceRange)` mostly query this
function with ranges provided by the AST. Those ranges are token ranges
(starting at the beginning of a token and ending at the beginning of
another token).

Therefore we can avoid the binary search in the majority of cases by
maintaining an index of ExpandedToken by their SourceLocations. We still
do a binary search for ranges that are not token ranges, but such
ranges are rare.
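
The idea can be illustrated with a minimal, self-contained sketch. This
is not the code from the patch: `Loc`, `Token`, `TokenIndex` and
`tokensIn` are hypothetical names, locations are modelled as plain
integers, and an ordinary `<` comparison stands in for the expensive
`isBeforeInTranslationUnit` call. It only shows the shape of the
technique: a hash lookup when both ends of the range are token starts,
and a binary search otherwise.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only: a location is a plain
// integer offset, and a token records just the location where it begins.
using Loc = unsigned;
struct Token {
  Loc Begin;
};

class TokenIndex {
public:
  // Assumes Toks is sorted by Begin and no two tokens share a Begin.
  explicit TokenIndex(std::vector<Token> Toks) : Tokens(std::move(Toks)) {
    for (std::size_t I = 0; I < Tokens.size(); ++I)
      ByBegin.emplace(Tokens[I].Begin, I);
  }

  // Returns the half-open index range [First, Last) of the tokens that
  // begin inside the closed location range [Begin, End].
  std::pair<std::size_t, std::size_t> tokensIn(Loc Begin, Loc End) const {
    if (End < Begin)
      return {0, 0}; // Invalid range: report no tokens.

    // Fast path: both ends are token starts, so two hash lookups suffice
    // and no location ordering is needed.
    auto B = ByBegin.find(Begin);
    auto E = ByBegin.find(End);
    if (B != ByBegin.end() && E != ByBegin.end())
      return {B->second, E->second + 1};

    // Slow path: fall back to binary search. In this sketch the `<`
    // comparisons play the role of the costly location ordering.
    auto StartsBefore = [](const Token &T, Loc L) { return T.Begin < L; };
    auto First =
        std::lower_bound(Tokens.begin(), Tokens.end(), Begin, StartsBefore);
    auto StartsAfter = [](Loc L, const Token &T) { return L < T.Begin; };
    auto Last = std::upper_bound(First, Tokens.end(), End, StartsAfter);
    return {static_cast<std::size_t>(First - Tokens.begin()),
            static_cast<std::size_t>(Last - Tokens.begin())};
  }

private:
  std::vector<Token> Tokens;
  std::unordered_map<Loc, std::size_t> ByBegin;
};
```

For tokens beginning at 0, 4, 6, 10 and 15, `tokensIn(4, 10)` takes the
fast path and `tokensIn(5, 12)` falls back to the binary search. Both
paths agree on every token range, so the index only changes the cost of
a query, not its result.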

Performance:
`~/build/bin/clangd --check=clang/lib/Serialization/ASTReader.cpp`
Before: Took 2:10s to complete.
Now: Took 1:13s to complete.

Differential Revision: https://reviews.llvm.org/D99086



