[PATCH] D78932: [DSE,MSSA] Relax post-dom restriction for objs visible after return.

Florian Hahn via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 27 08:01:48 PDT 2020


fhahn created this revision.
fhahn added reviewers: dmgreen, bryant, asbirlea, Tyker, efriedma, george.burgess.iv.
Herald added subscribers: hiraditya, Prazek.
Herald added a project: LLVM.

This patch relaxes the post-dominance requirement for accesses to
objects visible after the function returns.

Instead of requiring the killing def to post-dominate the access to
eliminate, the set of 'killing blocks' (= blocks that completely
overwrite the original access) is collected.

If all paths from the access to eliminate and an exit block go through a
killing block, the access can be removed.

To check this property, we first get the common post-dominator block for
the killing blocks. If this block does not post-dominate the access
block, there may be a path from DomAccess to an exit block not involving
any killing block.

Otherwise we have to check if there is a path from the DomAccess to the
common post-dominator, that does not contain a killing block. If there
is no such path, we can remove DomAccess. For this check, we start at
the common post-dominator and then traverse the CFG backwards. Paths are
terminated when we hit a killing block or a block that is not executed
between DomAccess and a killing block according to the post-order
numbering (if the post order number of a block is greater than the one
of DomAccess, the block cannot be in in a path starting at DomAccess).

This gives the following improvements on the total number of stores
after DSE for MultiSource, SPEC2K, SPEC2006:

Tests: 237
Same hash: 206 (filtered out)
Remaining: 31
Metric: dse.NumRemainingStores

Program                                        base      new100    diff
 test-suite...CFP2000/188.ammp/188.ammp.test   3624.00   3544.00   -2.2%
 test-suite...ch/g721/g721encode/encode.test   128.00    126.00    -1.6%
 test-suite.../Benchmarks/Olden/mst/mst.test    73.00     72.00    -1.4%
 test-suite...CFP2006/433.milc/433.milc.test   3202.00   3163.00   -1.2%
 test-suite...000/186.crafty/186.crafty.test   5062.00   5010.00   -1.0%
 test-suite...-typeset/consumer-typeset.test   40460.00  40248.00  -0.5%
 test-suite...Source/Benchmarks/sim/sim.test   642.00    639.00    -0.5%
 test-suite...nchmarks/McCat/09-vor/vor.test   642.00    644.00     0.3%
 test-suite...lications/sqlite3/sqlite3.test   35664.00  35563.00  -0.3%
 test-suite...T2000/300.twolf/300.twolf.test   7202.00   7184.00   -0.2%
 test-suite...lications/ClamAV/clamscan.test   19475.00  19444.00  -0.2%
 test-suite...INT2000/164.gzip/164.gzip.test   2199.00   2196.00   -0.1%
 test-suite...peg2/mpeg2dec/mpeg2decode.test   2380.00   2378.00   -0.1%
 test-suite.../Benchmarks/Bullet/bullet.test   39335.00  39309.00  -0.1%
 test-suite...:: External/Povray/povray.test   36951.00  36927.00  -0.1%
 test-suite...marks/7zip/7zip-benchmark.test   67396.00  67356.00  -0.1%
 test-suite...6/464.h264ref/464.h264ref.test   31497.00  31481.00  -0.1%
 test-suite...006/453.povray/453.povray.test   51441.00  51416.00  -0.0%
 test-suite...T2006/401.bzip2/401.bzip2.test   4450.00   4448.00   -0.0%
 test-suite...Applications/kimwitu++/kc.test   23481.00  23471.00  -0.0%
 test-suite...chmarks/MallocBench/gs/gs.test   6286.00   6284.00   -0.0%
 test-suite.../CINT2000/254.gap/254.gap.test   13719.00  13715.00  -0.0%
 test-suite.../Applications/SPASS/SPASS.test   30345.00  30338.00  -0.0%
 test-suite...006/450.soplex/450.soplex.test   15018.00  15016.00  -0.0%
 test-suite...ications/JM/lencod/lencod.test   27780.00  27777.00  -0.0%
 test-suite.../CINT2006/403.gcc/403.gcc.test   105285.00 105276.00 -0.0%

There might be potential to pre-compute some of the information of which
blocks are on the path to an exit for each block, but the overall
benefit might be comparatively small.

On the set of benchmarks, 15738 times out of 20322 we reach the
CFG check, the CFG check is successful. The total number of iterations
in the CFG check is 187810, so on average we need less than 10 steps in
the check loop. Bumping the threshold in the loop from 50 to 150 gives a
few small improvements, but I don't think they warrant such a big bump
at the moment. This is all pending further tuning in the future.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D78932

Files:
  llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
  llvm/test/Transforms/DeadStoreElimination/MSSA/combined-partial-overwrites.ll
  llvm/test/Transforms/DeadStoreElimination/MSSA/multiblock-memintrinsics.ll
  llvm/test/Transforms/DeadStoreElimination/MSSA/multiblock-multipath.ll
  llvm/test/Transforms/DeadStoreElimination/MSSA/multiblock-simple.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D78932.260325.patch
Type: text/x-patch
Size: 12549 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200427/4c2aeeb2/attachment.bin>


More information about the llvm-commits mailing list