[PATCH] D95631: [TailDuplicator] Add TargetInstrInfo hook to modify the TailDuplicateSize default threshold

Tue Feb 2 08:01:43 PST 2021

NickGuy updated this revision to Diff 320791.
NickGuy added a comment.

> I don't think I would mind merging this with D95632 <https://reviews.llvm.org/D95632> to keep things in one place; don't think separating brings much value in this case.

Done

In D95632#2533284 <https://reviews.llvm.org/D95632#2533284>, @dmgreen wrote:

> This sounds OK to me, but 6 is quite a big jump up from 2. Is is possible to make the normal threshold 2 or 3, and the aggressive threshold 6?
> Oh. It looks like there is already an aggressive limit. TailDuplicator is either used by itself or in MachineBlockPlacement. MachineBlockPlacement already sets the limit to 4 when optimizing aggressively. Would it work for your usecase to alter the value there based on the target instead?

I don't see a problem with only performing this at -O3, and moving the override from TailDuplicator to MachineBlockPlacement doesn't seem to cause any problems (at least, as far as I could tell).

> you need to show that this is good/bad/neutral for some more code other than our one motivation example.

Here are the results from the benchmarking performed with this patch; It causes an improvement in a number of cases, however due to the size of said tests, the results are rather susceptible to system noise.

LLVM Test suite, AArch64 exec_time results (cherry-picked for the most interesting/significant tests), lower is better

| Benchmark                                                                  | Duplication threshold = 2 | Duplication threshold = 6 | Difference % |
| -------------------------------------------------------------------------- | ------------------------- | ------------------------- | ------------ |
| test-suite :: SingleSource/Benchmarks/Misc/evalloop.test                   | 1.212                     | 0.6376                    | -47.39%      |
| test-suite :: MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl.test | 4.38                      | 6.408                     | 46.30%       |
| test-suite :: MultiSource/Benchmarks/TSVC/Searching-flt/Searching-flt.test | 6.408                     | 4.356                     | -32.02%      |
| test-suite :: SingleSource/Benchmarks/Shootout-C++/Shootout-C++-ary2.test  | 0.02                      | 0.0136                    | -32.00%      |
| test-suite :: SingleSource/Benchmarks/McGill/misr.test                     | 0.252                     | 0.2968                    | 17.78%       |
| test-suite :: MultiSource/Applications/ALAC/decode/alacconvert-decode.test | 0.016                     | 0.0184                    | 15.00%       |
| test-suite :: MultiSource/Benchmarks/Ptrdist/ks/ks.test                    | 1.7544                    | 1.5                       | -14.50%      |
| test-suite :: SingleSource/Benchmarks/Misc/himenobmtxpa.test               | 1.536                     | 1.3496                    | -12.14%      |
| test-suite :: MultiSource/Applications/kimwitu++/kc.test                   | 0.048                     | 0.0432                    | -10.00%      |
| test-suite :: SingleSource/Benchmarks/Stanford/Towers.test                 | 0.0184                    | 0.0168                    | -8.70%       |
| test-suite :: MultiSource/Benchmarks/Olden/mst/mst.test                    | 0.1168                    | 0.1072                    | -8.22%       |
| test-suite :: MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt.test     | 5.7104                    | 6.1456                    | 7.62%        |
| test-suite :: MultiSource/Applications/hexxagon/hexxagon.test              | 12.4872                   | 11.6384                   | -6.80%       |
| test-suite :: MultiSource/Benchmarks/mediabench/gsm/toast/toast.test       | 0.0256                    | 0.0272                    | 6.25%        |
| test-suite :: MultiSource/Applications/lambda-0.1.3/lambda.test            | 5.7496                    | 6.1                       | 6.09%        |
| test-suite :: MultiSource/Benchmarks/Ptrdist/anagram/anagram.test          | 1.2448                    | 1.32                      | 6.04%        |
| test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/miniFE/miniFE.test  | 6.048                     | 5.7024                    | -5.71%       |
| test-suite :: MultiSource/Benchmarks/Olden/bisort/bisort.test              | 0.6808                    | 0.648                     | -4.82%       |
| test-suite :: MultiSource/Applications/ClamAV/clamscan.test                | 0.1984                    | 0.1896                    | -4.44%       |
| test-suite :: MultiSource/Benchmarks/McCat/03-testtrie/testtrie.test       | 0.0192                    | 0.02                      | 4.17%        |
|

Spec results are a bit more stable, and show a similar pattern (a small delta in both directions, but overall an improvement).
Beyond the benchmarks shown in this table, no other meaningful differences were identified.

Spec2017 results, higher is better

| Benchmark       | Difference % |
| --------------- | ------------ |
| 505.mcf_r       | 0.675%       |
| 523.xalancbmk_r | -0.24%       |
| 557.xz_r        | 0.417%       |

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D95631/new/

https://reviews.llvm.org/D95631

Files:
  llvm/include/llvm/CodeGen/TargetInstrInfo.h
  llvm/lib/CodeGen/MachineBlockPlacement.cpp
  llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
  llvm/lib/Target/AArch64/AArch64InstrInfo.h
  llvm/test/CodeGen/AArch64/aarch64-tail-dup-size.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D95631.320791.patch
Type: text/x-patch
Size: 6180 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210202/29d177c5/attachment.bin>