[PATCH] D86530: [InstCombine] PHI-of-insertvalues -> insertvalue-of-PHI's, aka invokes are bad

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 25 04:51:13 PDT 2020


lebedev.ri created this revision.
lebedev.ri added a reviewer: spatel.
lebedev.ri added a project: LLVM.
Herald added a subscriber: hiraditya.
lebedev.ri requested review of this revision.

While since D86306 <https://reviews.llvm.org/D86306> we do it's sibling fold for `insertvalue`,
we should also do this for `extractvalue`'s.

And unlike that one, the results here are quite honestly shocking,
as it can be observed here on vanilla llvm test-suite + RawSpeed results:

  | statistic name                                     | baseline  | proposed  |       Δ |       % |    |%| |
  |----------------------------------------------------|-----------|-----------|--------:|--------:|-------:|
  | asm-printer.EmittedInsts                           | 7945095   | 7942507   |   -2588 |  -0.03% |  0.03% |
  | assembler.ObjectBytes                              | 273209920 | 273069800 | -140120 |  -0.05% |  0.05% |
  | early-cse.NumCSE                                   | 2183363   | 2183398   |      35 |   0.00% |  0.00% |
  | early-cse.NumSimplify                              | 541847    | 550017    |    8170 |   1.51% |  1.51% |
  | instcombine.NumAggregateReconstructionsSimplified  | 2139      | 108       |   -2031 | -94.95% | 94.95% |
  | instcombine.NumCombined                            | 3601364   | 3635448   |   34084 |   0.95% |  0.95% |
  | instcombine.NumConstProp                           | 27153     | 27157     |       4 |   0.01% |  0.01% |
  | instcombine.NumDeadInst                            | 1694521   | 1765022   |   70501 |   4.16% |  4.16% |
  | instcombine.NumPHIsOfExtractValues                 | 0         | 37546     |   37546 |   0.00% |  0.00% |
  | instcombine.NumSunkInst                            | 63158     | 63686     |     528 |   0.84% |  0.84% |
  | instcount.NumBrInst                                | 874304    | 871857    |   -2447 |  -0.28% |  0.28% |
  | instcount.NumCallInst                              | 1757657   | 1758402   |     745 |   0.04% |  0.04% |
  | instcount.NumExtractValueInst                      | 45623     | 11483     |  -34140 | -74.83% | 74.83% |
  | instcount.NumInsertValueInst                       | 4983      | 580       |   -4403 | -88.36% | 88.36% |
  | instcount.NumInvokeInst                            | 61018     | 59478     |   -1540 |  -2.52% |  2.52% |
  | instcount.NumLandingPadInst                        | 35334     | 34215     |   -1119 |  -3.17% |  3.17% |
  | instcount.NumPHIInst                               | 344428    | 331116    |  -13312 |  -3.86% |  3.86% |
  | instcount.NumRetInst                               | 100773    | 100772    |      -1 |   0.00% |  0.00% |
  | instcount.TotalBlocks                              | 1081154   | 1077166   |   -3988 |  -0.37% |  0.37% |
  | instcount.TotalFuncs                               | 101443    | 101442    |      -1 |   0.00% |  0.00% |
  | instcount.TotalInsts                               | 8890201   | 8833747   |  -56454 |  -0.64% |  0.64% |
  | instsimplify.NumSimplified                         | 75822     | 75707     |    -115 |  -0.15% |  0.15% |
  | simplifycfg.NumHoistCommonCode                     | 24203     | 24197     |      -6 |  -0.02% |  0.02% |
  | simplifycfg.NumHoistCommonInstrs                   | 48201     | 48195     |      -6 |  -0.01% |  0.01% |
  | simplifycfg.NumInvokes                             | 2785      | 4298      |    1513 |  54.33% | 54.33% |
  | simplifycfg.NumSimpl                               | 997332    | 1018189   |   20857 |   2.09% |  2.09% |
  | simplifycfg.NumSinkCommonCode                      | 7088      | 6464      |    -624 |  -8.80% |  8.80% |
  | simplifycfg.NumSinkCommonInstrs                    | 15117     | 14021     |   -1096 |  -7.25% |  7.25% |

... which tells us that this new fold fires whopping 38k times,
increasing the amount of SimplifyCFG's `invoke`->`call` transforms by +54% (+1513) (again, D85787 <https://reviews.llvm.org/D85787> did that last time),
decreasing total instruction count by -0.64% (-56454),
and sharply decreasing count of `insertvalue`'s (-88.36%, i.e. 9 times less)
and `extractvalue`'s (-74.83%, i.e. four times less),
so this may have nice binary size decrease on it's own.

The other thing that tells is, is that while this is a massive win for `invoke`->`call` transform
`InstCombinerImpl::foldAggregateConstructionIntoAggregateReuse()` fold, which is supposed to be dealing with such aggregate reconstructions,
fires a lot less now. There are two reasons why:

1. After this fold, as it can be seen in tests, we may (will) end up with trivially redundant PHI nodes. We don't CSE them in InstCombine presently, which means that EarlyCSE needs to run and then InstCombine rerun.
2. But then, EarlyCSE not only manages to fold such redundant PHI's, it also sees that the extract-insert chain recreates the original aggregate, and replaces it with the original aggregate.

The take-aways are

1. We maybe should do most trivial, same-BB PHI CSE in InstCombine
2. I need to check if what other patterns remain, and how they can be resolved. (i.e. i wonder if `foldAggregateConstructionIntoAggregateReuse()` might go away)


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D86530

Files:
  llvm/lib/Transforms/InstCombine/InstCombineInternal.h
  llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
  llvm/test/Transforms/InstCombine/phi-aware-aggregate-reconstruction.ll
  llvm/test/Transforms/InstCombine/phi-of-extractvalues.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D86530.287625.patch
Type: text/x-patch
Size: 13297 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200825/a37f991b/attachment.bin>


More information about the llvm-commits mailing list