[PATCH] D114401: [Passes] Run LowerConstantIntrinsics after SCCP/before DSE.

Florian Hahn via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 22 14:35:59 PST 2021


fhahn created this revision.
fhahn added reviewers: joerg, whitequark, rjmccall, void, chandlerc, asbirlea.
Herald added subscribers: ormris, wenlei, steven_wu, hiraditya.
fhahn requested review of this revision.
Herald added a project: LLVM.

This patch moves the LowerConstantIntrinsics pass to an earlier point
in the pipeline.

My main motivation for running it earlier is to improve codegen for code
using __builtin___mem*_chk / __builtin_object_size combinations.

At the moment, we miss out on various memory-related optimizations when
such builtins are involved, because they are not transformed to
llvm.memset/llvm.memcpy calls early enough.

By running it before DSE/MemCpyOpt, we will be able to perform DSE on
code such as below.

  void memset_chk(int** ptr, int *foo) {
      __builtin___memset_chk (ptr, 0, 128, __builtin_object_size (ptr, 0));
      ptr[1] = foo;
      ptr[2] = ((int *)0);
  }
  
  void use(char*);
  
  void memcpy_chk(int* ptr, int *bar) {
      char buf[128];
      use(&buf[0]);
  
      bar[0] = 10;
      __builtin___memcpy_chk (bar, &buf[0], 128, __builtin_object_size (bar, 0));
  }
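
To make the intended effect concrete, here is a minimal sketch of what
the two functions above are expected to look like once the `_chk`
builtins have been lowered to plain memset/memcpy. The `*_lowered`
functions, the body of `use`, and the `lowered_forms_match` helper are
hypothetical and written by hand for illustration; they are not compiler
output. The sketch assumes a libc that provides `__memset_chk` and
`__memcpy_chk` (e.g. glibc) and a target where a null pointer is all-zero
bits.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical definition of use(): fills the buffer with a fixed pattern. */
void use(char *p) {
    for (int i = 0; i < 128; ++i)
        p[i] = (char)i;
}

/* Original forms from the patch description. */
void memset_chk(int **ptr, int *foo) {
    __builtin___memset_chk(ptr, 0, 128, __builtin_object_size(ptr, 0));
    ptr[1] = foo;
    ptr[2] = ((int *)0);
}

void memcpy_chk(int *ptr, int *bar) {
    char buf[128];
    (void)ptr; /* unused, kept to match the original signature */
    use(&buf[0]);
    bar[0] = 10;
    __builtin___memcpy_chk(bar, &buf[0], 128, __builtin_object_size(bar, 0));
}

/* Hand-written lowered forms: once the object size is resolved, the
 * checked calls are equivalent to plain memset/memcpy. */
void memset_chk_lowered(int **ptr, int *foo) {
    memset(ptr, 0, 128);
    ptr[1] = foo;        /* overwrites part of the memset */
    ptr[2] = ((int *)0); /* redundant: the memset already stored zeros */
}

void memcpy_chk_lowered(int *ptr, int *bar) {
    char buf[128];
    (void)ptr;
    use(&buf[0]);
    bar[0] = 10;         /* dead: fully overwritten by the memcpy below */
    memcpy(bar, &buf[0], 128);
}

/* Returns 1 if the original and lowered forms leave memory identical. */
int lowered_forms_match(void) {
    int x = 7;
    int *a[128 / sizeof(int *)], *b[128 / sizeof(int *)];
    int c[128 / sizeof(int)], d[128 / sizeof(int)];

    memset(a, 0xff, sizeof a);
    memset(b, 0xff, sizeof b);
    memset_chk(a, &x);
    memset_chk_lowered(b, &x);

    memcpy_chk(NULL, c);
    memcpy_chk_lowered(NULL, d);

    return memcmp(a, b, sizeof a) == 0 && memcmp(c, d, sizeof c) == 0;
}
```

In the lowered forms, the stores marked redundant or dead are the kind of
stores DSE can only reason about once it sees memset/memcpy rather than
the `_chk` calls.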

It would be great if people familiar with the pass could chime in if
there are any concerns about adjusting the placement. I *think* the new
position should not impact the LowerConstantIntrinsics pass results
too much, as most reasoning/simplification passes have already run by then.

Note that the placement for -O1 is a bit odd, because SCCP gets run
*after* MemCpyOpt, not before as for other optimization levels. For -O1
it is placed after SCCP, because SCCP can help simplify the IR feeding
LowerConstantIntrinsics.

I collected stats for building SPEC2006, SPEC2017 and MultiSource with
-O3 and a stdlib that uses the `_chk` builtins. The results are below
and show a few notable improvements with respect to memory
optimizations.

  Metric: memcpyopt.NumCpyToSet
  Program                                        base   patch  diff
   test-suite...6.blender_r/526.blender_r.test    41.00  42.00  2.4%
   Geomean difference                                           2.4%
  
  Metric: memcpyopt.NumMemCpyInstr
  Program                                        base   patch  diff
   test-suite.../CINT2006/403.gcc/403.gcc.test    48.00  51.00  6.2%
   Geomean difference                                           6.2%
  
  Metric: memcpyopt.NumMemSetInfer
  Program                                        base   patch  diff
   test-suite...lications/sqlite3/sqlite3.test    39.00 136.00 248.7%
   test-suite...plications/d/make_dparser.test     1.00   2.00 100.0%
   test-suite...ications/JM/ldecod/ldecod.test    17.00  22.00 29.4%
   test-suite...6/482.sphinx3/482.sphinx3.test     9.00  10.00 11.1%
   test-suite...chmarks/MallocBench/gs/gs.test    10.00  11.00 10.0%
   test-suite...nsumer-lame/consumer-lame.test    13.00  14.00  7.7%
   test-suite...0.perlbench/400.perlbench.test    39.00  41.00  5.1%
   test-suite...ications/JM/lencod/lencod.test    56.00  58.00  3.6%
   test-suite...017rate/557.xz_r/557.xz_r.test    37.00  38.00  2.7%
   test-suite...lications/ClamAV/clamscan.test    44.00  45.00  2.3%
   test-suite...7rate/502.gcc_r/502.gcc_r.test   605.00 617.00  2.0%
   test-suite...ate/525.x264_r/525.x264_r.test    71.00  72.00  1.4%
   test-suite...6.blender_r/526.blender_r.test   751.00 758.00  0.9%
   test-suite.../CINT2006/403.gcc/403.gcc.test   148.00 149.00  0.7%
   test-suite...marks/7zip/7zip-benchmark.test   416.00 417.00  0.2%
   Geomean difference                                          19.4%
  
  Metric: dse.NumFastOther
  Program                                        base   patch  diff
   test-suite...lications/viterbi/viterbi.test     0.00  12.00  inf%
   test-suite...lications/sqlite3/sqlite3.test    12.00  43.00 258.3%
   test-suite...T2006/445.gobmk/445.gobmk.test     4.00   7.00 75.0%
   test-suite...0.perlbench/400.perlbench.test    49.00  75.00 53.1%
   test-suite...abench/jpeg/jpeg-6a/cjpeg.test     4.00   5.00 25.0%
   test-suite...chmarks/MallocBench/gs/gs.test     9.00  11.00 22.2%
   test-suite...nsumer-lame/consumer-lame.test     7.00   8.00 14.3%
   test-suite...plications/d/make_dparser.test    12.00  13.00  8.3%
   test-suite...nsumer-jpeg/consumer-jpeg.test    13.00  14.00  7.7%
   test-suite...7rate/502.gcc_r/502.gcc_r.test   258.00 271.00  5.0%
   test-suite...rlbench_r/500.perlbench_r.test    42.00  44.00  4.8%
   test-suite...pplications/oggenc/oggenc.test    22.00  23.00  4.5%
   test-suite...6.blender_r/526.blender_r.test   503.00 519.00  3.2%
   test-suite.../CINT2006/403.gcc/403.gcc.test    78.00  80.00  2.6%
  
  Metric: dse.NumFastStores
  Program                                        base    patch   diff
   test-suite...chmarks/MallocBench/gs/gs.test    55.00   59.00   7.3%
   test-suite...pplications/oggenc/oggenc.test   129.00  132.00   2.3%
   test-suite...7rate/502.gcc_r/502.gcc_r.test   1491.00 1524.00  2.2%
   test-suite...rlbench_r/500.perlbench_r.test   235.00  238.00   1.3%
   test-suite...lications/ClamAV/clamscan.test   214.00  216.00   0.9%
   test-suite...0.perlbench/400.perlbench.test   177.00  178.00   0.6%
   test-suite...6.blender_r/526.blender_r.test   3683.00 3703.00  0.5%
   test-suite...ate/525.x264_r/525.x264_r.test   410.00  412.00   0.5%
  
  Metric: dse.NumRedundantStores
  Program                                        base   patch  diff
   test-suite...lications/viterbi/viterbi.test     1.00  15.00 1400.0%
   test-suite...0.perlbench/400.perlbench.test     3.00  17.00 466.7%
   test-suite...lications/sqlite3/sqlite3.test     7.00  27.00 285.7%
   test-suite...pplications/oggenc/oggenc.test     8.00  18.00 125.0%
   test-suite...017rate/557.xz_r/557.xz_r.test     1.00   2.00 100.0%
   test-suite.../Benchmarks/Ptrdist/bc/bc.test     1.00   2.00 100.0%
   test-suite...7rate/502.gcc_r/502.gcc_r.test    14.00  22.00 57.1%
   test-suite...nsumer-jpeg/consumer-jpeg.test     2.00   3.00 50.0%
   test-suite...plications/d/make_dparser.test    22.00  32.00 45.5%
   test-suite...rlbench_r/500.perlbench_r.test     4.00   5.00 25.0%
   test-suite.../CINT2006/403.gcc/403.gcc.test    11.00  13.00 18.2%
   test-suite...6.blender_r/526.blender_r.test    79.00  90.00 13.9%
   test-suite...nsumer-lame/consumer-lame.test     8.00   9.00 12.5%


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D114401

Files:
  llvm/lib/Passes/PassBuilderPipelines.cpp
  llvm/test/Other/new-pm-defaults.ll
  llvm/test/Other/new-pm-thinlto-defaults.ll
  llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll
  llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D114401.389047.patch
Type: text/x-patch
Size: 8389 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20211122/c3dfee5f/attachment.bin>

