[all-commits] [llvm/llvm-project] 55eb71: [NFC] OpenMPOpt: add a statistic for num of parall...

Roman Lebedev via All-commits all-commits at lists.llvm.org
Fri Jun 12 13:13:51 PDT 2020


  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 55eb714a0e8dd4e83f987979de823ce8e8bbd2f0
      https://github.com/llvm/llvm-project/commit/55eb714a0e8dd4e83f987979de823ce8e8bbd2f0
  Author: Roman Lebedev <lebedev.ri at gmail.com>
  Date:   2020-06-12 (Fri, 12 Jun 2020)

  Changed paths:
    M llvm/lib/Transforms/IPO/OpenMPOpt.cpp

  Log Message:
  -----------
  [NFC] OpenMPOpt: add a statistic for num of parallel regions deleted


  Commit: 7aeb41b3c8446e5f5df67a20cba3101d899da27e
      https://github.com/llvm/llvm-project/commit/7aeb41b3c8446e5f5df67a20cba3101d899da27e
  Author: Roman Lebedev <lebedev.ri at gmail.com>
  Date:   2020-06-12 (Fri, 12 Jun 2020)

  Changed paths:
    M llvm/lib/Transforms/Vectorize/VectorCombine.cpp

  Log Message:
  -----------
  [NFCI] VectorCombine: add statistic for bitcast(shuf()) -> shuf(bitcast()) xform


  Commit: 17f765415245dc59bdf17d8e2b6911cba3aeb504
      https://github.com/llvm/llvm-project/commit/17f765415245dc59bdf17d8e2b6911cba3aeb504
  Author: Roman Lebedev <lebedev.ri at gmail.com>
  Date:   2020-06-12 (Fri, 12 Jun 2020)

  Changed paths:
    M llvm/lib/CodeGen/MachineCopyPropagation.cpp

  Log Message:
  -----------
  [NFCI][MachineCopyPropagation] invalidateRegister(): use SmallSet<8> instead of DenseSet.

This decreases the time consumed by the pass [during RawSpeed unity build]
by 25% (0.0586 s -> 0.04388 s).

While that isn't really impressive overall, runtime wasn't the goal here.
The memory results are what is noticeable.
The baseline results are:
```
total runtime: 55.65s.
calls to allocation functions: 19754254 (354960/s)
temporary memory allocations: 4951609 (88974/s)
peak heap memory consumption: 239.13MB
peak RSS (including heaptrack overhead): 463.79MB
total memory leaked: 198.01MB
```
While with this patch the results are:
```
total runtime: 55.37s.
calls to allocation functions: 19068237 (344403/s)   # -3.47 %
temporary memory allocations: 4261772 (76974/s)      # -13.93 % (!!!)
peak heap memory consumption: 239.13MB
peak RSS (including heaptrack overhead): 463.73MB
total memory leaked: 198.01MB
```

So we get rid of *a lot* of temporary allocations.

Using `SmallSet<8>` makes sense to me because at least here
for x86 BdVer2, the size of that set is *never* more than 3,
over all of llvm test-suite + RawSpeed.

The story might be different on other targets.
I'm not sure it will ever justify a whole DenseSet,
but if it does, SmallDenseSet might be a compromise.
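
To illustrate why a small-size-optimized set avoids temporary allocations
when it holds only a handful of elements, here is a minimal sketch.
`TinySet` is a hypothetical class written for this illustration only; it is
not LLVM's `SmallSet` and not the code changed by this commit. It assumes the
element type is default-constructible and equality-comparable. Up to N
elements live in an inline buffer, and the heap-backed `std::set` is touched
only when that buffer overflows.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical illustration (not LLVM's SmallSet): up to N elements are kept
// in inline storage; a heap-allocating std::set is used only after overflow.
template <typename T, std::size_t N>
class TinySet {
  std::array<T, N> Inline{};  // inline buffer, no heap allocation
  std::size_t NumInline = 0;  // number of valid entries in Inline
  std::set<T> Big;            // non-empty only after more than N insertions

public:
  // True once the set has spilled to the heap-backed container.
  bool usesHeap() const { return !Big.empty(); }

  std::size_t size() const { return usesHeap() ? Big.size() : NumInline; }

  bool contains(const T &V) const {
    if (usesHeap())
      return Big.count(V) != 0;
    // Linear scan is cheap when only a few elements are stored.
    for (std::size_t I = 0; I != NumInline; ++I)
      if (Inline[I] == V)
        return true;
    return false;
  }

  // Returns true if V was newly inserted.
  bool insert(const T &V) {
    assert(NumInline <= N && "inline element count out of range");
    if (usesHeap())
      return Big.insert(V).second;
    if (contains(V))
      return false;
    if (NumInline < N) {
      Inline[NumInline++] = V;
      return true;
    }
    // Inline buffer is full: migrate everything to the heap-backed set.
    Big.insert(Inline.begin(), Inline.begin() + NumInline);
    NumInline = 0;
    Big.insert(V);
    return true;
  }
};
```

With `TinySet<unsigned, 8>`, a set that never grows past 3 elements (as
observed above for x86 BdVer2) stays entirely in the inline buffer, so
`usesHeap()` remains false and no allocation is made for it.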


Compare: https://github.com/llvm/llvm-project/compare/ca77aa03fef7...17f765415245

