[all-commits] [llvm/llvm-project] 0d300d: [Clang] Fix compile time regression caused by D126...

Tue Jun 21 14:15:59 PDT 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 0d300da799b06931eb6b974198d683548a8c8392
      https://github.com/llvm/llvm-project/commit/0d300da799b06931eb6b974198d683548a8c8392
  Author: Martin Boehme <mboehme at google.com>
  Date:   2022-06-21 (Tue, 21 Jun 2022)

  Changed paths:
    M clang/include/clang/Sema/ParsedAttr.h

  Log Message:
  -----------
  [Clang] Fix compile time regression caused by D126061.

As noted by @nikic, D126061 causes a compile time regression of about
0.5% on -O0 builds:

http://llvm-compile-time-tracker.com/compare.php?from=7acc88be0312c721bc082ed9934e381d297f4707&to=8c7b64b5ae2a09027c38db969a04fc9ddd0cd6bb&stat=instructions

This happens because, in a number of places, D126061 creates an
additional local variable of type `ParsedAttributes`. In the large
majority of cases, no attributes are added to this `ParsedAttributes`,
but it turns out that creating an empty `ParsedAttributes`, then
destroying it is a relatively expensive operation.

The reason is that `AttributePool` uses a `TinyPtrVector` as its
underlying vector class, and the technique that `TinyPtrVector` employs
to achieve its extreme memory frugality makes the `begin()` and `end()`
member functions relatively slow. The `ParsedAttributes` destructor
iterates over the attributes in its `AttributePool`, so it pays this
cost even when the pool is empty.
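
To illustrate where the cost could come from, here is a minimal sketch of a
vector that stores either one inline pointer or a heap-allocated vector.
`TinyVecSketch` is a hypothetical stand-in written for this note, not LLVM's
actual `TinyPtrVector`, and it assumes only non-null pointers are stored. The
point is that `begin()` and `end()` must branch on the current representation
on every call, including when the container is empty.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in; NOT LLVM's TinyPtrVector.
// Assumption: callers only ever store non-null pointers.
template <typename T>
class TinyVecSketch {
  T *Single = nullptr;               // used while the size is 0 or 1
  std::vector<T *> *Many = nullptr;  // allocated when a second element arrives

public:
  TinyVecSketch() = default;
  TinyVecSketch(const TinyVecSketch &) = delete;
  TinyVecSketch &operator=(const TinyVecSketch &) = delete;
  ~TinyVecSketch() { delete Many; }

  void push_back(T *P) {
    if (Many) { Many->push_back(P); return; }
    if (!Single) { Single = P; return; }
    Many = new std::vector<T *>{Single, P};  // switch to the heap form
  }

  // Both accessors have to check which representation is active.
  T **begin() { return Many ? Many->data() : &Single; }
  T **end() {
    if (Many) return Many->data() + Many->size();
    return &Single + (Single ? 1 : 0);  // empty or exactly one element
  }
  std::size_t size() { return static_cast<std::size_t>(end() - begin()); }
};
```

An empty `TinyVecSketch` occupies only two pointers, but a loop such as
`for (T **I = V.begin(), **E = V.end(); I != E; ++I)` still executes those
checks before it can conclude that there is nothing to visit.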

The fix is to replace `TinyPtrVector` in `ParsedAttributes` and
`AttributePool` with `SmallVector`. `ParsedAttributes` and
`AttributePool` objects are only ever allocated on the stack (they're
not part of the AST), and only a small number of these objects are live
at any given time, so they don't need the extreme memory frugality of
`TinyPtrVector`.
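
As a rough sketch of the trade-off, the container below keeps its first `N`
elements inside the object and always holds a pointer to the live buffer.
`InlineVecSketch` is a hypothetical stand-in written for this note, not
`llvm::SmallVector`. It spends more bytes per object, which matters little for
a few stack-allocated instances, and in return `begin()` and `end()` need no
check of the storage mode.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a small-size-optimized pointer vector;
// NOT llvm::SmallVector.
template <typename T, std::size_t N>
class InlineVecSketch {
  static_assert(N > 0, "inline capacity must be non-zero");

  T *Inline[N];           // in-object storage, no heap use while size <= N
  T **Begin = Inline;     // always points at the buffer currently in use
  std::size_t Size = 0;
  std::size_t Capacity = N;

public:
  InlineVecSketch() = default;
  InlineVecSketch(const InlineVecSketch &) = delete;
  InlineVecSketch &operator=(const InlineVecSketch &) = delete;
  ~InlineVecSketch() {
    if (Begin != Inline) delete[] Begin;
  }

  void push_back(T *P) {
    if (Size == Capacity) {  // move to the heap only when the buffer is full
      T **Bigger = new T *[Capacity * 2];
      std::memcpy(Bigger, Begin, Size * sizeof(T *));
      if (Begin != Inline) delete[] Begin;
      Begin = Bigger;
      Capacity *= 2;
    }
    Begin[Size++] = P;
  }

  // No branch on the representation: a load, and a load plus an add.
  T **begin() { return Begin; }
  T **end() { return Begin + Size; }
  bool usesHeap() const { return Begin != Inline; }
};
```

With this layout, iterating over an empty container reduces to comparing two
equal pointers, and nothing is heap-allocated until more than `N` elements
have been added.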

I've confirmed with valgrind that this patch does not increase heap
memory usage, and it actually makes compilation slightly faster than it
was before D126061.

Here are instruction count measurements (obtained with callgrind) for
running `clang -c MultiSource/Applications/JM/lencod/parsetcommon.c`
(a file from llvm-test-suite that exhibited a particularly large
compile-time regression):

7acc88be0312c721bc082ed9934e381d297f4707
(baseline one commit before D126061 landed)
102,280,068 instructions

8c7b64b5ae2a09027c38db969a04fc9ddd0cd6bb
(the patch that landed D126061)
103,289,454 instructions
(+0.99% relative to baseline)

This patch applied onto
8c7b64b5ae2a09027c38db969a04fc9ddd0cd6bb
101,117,584 instructions
(-1.14% relative to baseline)
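
For reference, the percentages can be reproduced from the raw counts with a
small helper. The function and constant names below are made up for this
illustration; only the three instruction counts come from the measurements.

```cpp
// Relative change from Before to After, in percent.
constexpr double relativeChangePercent(long long Before, long long After) {
  return 100.0 * static_cast<double>(After - Before) /
         static_cast<double>(Before);
}

// Instruction counts reported for the three builds.
constexpr long long Baseline = 102280068;       // 7acc88be, before D126061
constexpr long long WithD126061 = 103289454;    // 8c7b64b5
constexpr long long WithThisPatch = 101117584;  // this patch on 8c7b64b5

// Regression introduced by D126061, relative to the baseline.
constexpr double Regression = relativeChangePercent(Baseline, WithD126061);
// Result of this patch, relative to the same baseline.
constexpr double Fixed = relativeChangePercent(Baseline, WithThisPatch);
```

`Regression` comes out at roughly 0.987 and `Fixed` at roughly -1.137, which
round to the +0.99% and -1.14% quoted above. Measured against
8c7b64b5ae2a09027c38db969a04fc9ddd0cd6bb rather than the baseline, the same
helper gives about -2.10% for this patch.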

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D128097