[all-commits] [llvm/llvm-project] 55eb71: [NFC] OpenMPOpt: add a statistic for num of parall...
Roman Lebedev via All-commits
all-commits at lists.llvm.org
Fri Jun 12 13:13:51 PDT 2020
Branch: refs/heads/master
Home: https://github.com/llvm/llvm-project
Commit: 55eb714a0e8dd4e83f987979de823ce8e8bbd2f0
https://github.com/llvm/llvm-project/commit/55eb714a0e8dd4e83f987979de823ce8e8bbd2f0
Author: Roman Lebedev <lebedev.ri at gmail.com>
Date: 2020-06-12 (Fri, 12 Jun 2020)
Changed paths:
M llvm/lib/Transforms/IPO/OpenMPOpt.cpp
Log Message:
-----------
[NFC] OpenMPOpt: add a statistic for num of parallel regions deleted
Commit: 7aeb41b3c8446e5f5df67a20cba3101d899da27e
https://github.com/llvm/llvm-project/commit/7aeb41b3c8446e5f5df67a20cba3101d899da27e
Author: Roman Lebedev <lebedev.ri at gmail.com>
Date: 2020-06-12 (Fri, 12 Jun 2020)
Changed paths:
M llvm/lib/Transforms/Vectorize/VectorCombine.cpp
Log Message:
-----------
[NFCI] VectorCombine: add statistic for bitcast(shuf()) -> shuf(bitcast()) xform
Commit: 17f765415245dc59bdf17d8e2b6911cba3aeb504
https://github.com/llvm/llvm-project/commit/17f765415245dc59bdf17d8e2b6911cba3aeb504
Author: Roman Lebedev <lebedev.ri at gmail.com>
Date: 2020-06-12 (Fri, 12 Jun 2020)
Changed paths:
M llvm/lib/CodeGen/MachineCopyPropagation.cpp
Log Message:
-----------
[NFCI][MachineCopyPropagation] invalidateRegister(): use SmallSet<8> instead of DenseSet.
This reduces the time spent in the pass (during a RawSpeed unity build)
by about 25% (0.0586 s -> 0.04388 s).
That is not a large win in absolute terms, but run time was not the goal here.
The allocation counts are where the change is noticeable.
The baseline heaptrack results are:
```
total runtime: 55.65s.
calls to allocation functions: 19754254 (354960/s)
temporary memory allocations: 4951609 (88974/s)
peak heap memory consumption: 239.13MB
peak RSS (including heaptrack overhead): 463.79MB
total memory leaked: 198.01MB
```
With this patch the results are:
```
total runtime: 55.37s.
calls to allocation functions: 19068237 (344403/s) # -3.47 %
temporary memory allocations: 4261772 (76974/s) # -13.93 % (!!!)
peak heap memory consumption: 239.13MB
peak RSS (including heaptrack overhead): 463.73MB
total memory leaked: 198.01MB
```
So we get rid of *a lot* of temporary allocations.
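The percentages quoted above can be re-derived from the raw counts. The following is only a small arithmetic sketch; `percentChange` and `approxEqual` are hypothetical helper names and are not part of the patch.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: relative change from Before to After, in percent.
// A negative result means the value went down.
double percentChange(double Before, double After) {
  assert(Before != 0.0 && "baseline must be non-zero");
  return (After - Before) / Before * 100.0;
}

// Hypothetical helper: compare two doubles within a tolerance.
bool approxEqual(double A, double B, double Tolerance) {
  return std::fabs(A - B) < Tolerance;
}
```

Feeding in the figures from the two heaptrack runs (19754254 -> 19068237 calls, 4951609 -> 4261772 temporary allocations) reproduces the -3.47 % and -13.93 % annotations, and the pass timings (0.0586 s -> 0.04388 s) give roughly -25 %.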
Using `SmallSet<8>` makes sense to me because, at least here
for x86 BdVer2, the size of that set is *never* more than 3
across all of the llvm test-suite plus RawSpeed.
The story might be different on other targets.
I am not sure it will ever justify a whole `DenseSet`,
but if it does, `SmallDenseSet` might be a compromise.
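To illustrate why a small-size-optimized set avoids temporary allocations when it only ever holds a few elements, here is a minimal standalone sketch. `TinySet` is a hypothetical stand-in written for this illustration; it is not `llvm::SmallSet` and does not reproduce its API, it only mimics the general idea of inline storage with a heap-backed fallback.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical illustration only: keeps up to N elements in inline storage
// (no heap allocation) and spills to a std::set once that capacity is exceeded.
// T must be default-constructible and comparable with == and <.
template <typename T, std::size_t N>
class TinySet {
  std::array<T, N> Inline{};  // inline storage, lives inside the object
  std::size_t NumInline = 0;  // number of valid entries in Inline
  std::set<T> Overflow;       // heap-backed; non-empty only after a spill

  bool isSmall() const { return Overflow.empty(); }

public:
  bool contains(const T &V) const {
    if (!isSmall())
      return Overflow.count(V) != 0;
    // Linear scan is cheap for a handful of elements.
    for (std::size_t I = 0; I != NumInline; ++I)
      if (Inline[I] == V)
        return true;
    return false;
  }

  // Returns true if V was newly inserted.
  bool insert(const T &V) {
    if (!isSmall())
      return Overflow.insert(V).second;
    if (contains(V))
      return false;
    if (NumInline < N) {
      Inline[NumInline++] = V;
      return true;
    }
    // Inline storage is full: move everything to the heap-backed set.
    assert(NumInline == N && "spill only happens at full capacity");
    Overflow.insert(Inline.begin(), Inline.begin() + NumInline);
    NumInline = 0;
    Overflow.insert(V);
    return true;
  }

  std::size_t size() const { return isSmall() ? NumInline : Overflow.size(); }

  // True once the set has fallen back to heap allocation.
  bool usesHeap() const { return !isSmall(); }
};
```

With a capacity of 8 and at most 3 live elements, as observed above, such a set would never reach the spill path, so constructing and destroying it would not touch the allocator at all.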
Compare: https://github.com/llvm/llvm-project/compare/ca77aa03fef7...17f765415245