[PATCH] D105700: [LoopSimplify] Convert loop with multiple latches to nested loop using dominator tree

Tue Jul 13 03:50:42 PDT 2021

jaykang10 added a comment.

In D105700#2872907 <https://reviews.llvm.org/D105700#2872907>, @efriedma wrote:

> The transform here is clearly correct... the part I'm not sure about is whether it's profitable in general.  I'm particularly worried about cases where we make PHI nodes in the outer loop more difficult to analyze.  Have you done any experiments to try to determine the performance impact?

I measured the performance impact on the benchmarks below.

  llvm-test-suite for x86

  Tests: 2939
  Short Running: 2374 (filtered out)
  Remaining: 565
  Metric: exec_time

  Program                                        results.org results.multi.latches diff 
   test-suite...emCmp<8, GreaterThanZero, Mid>   601.59      1056.69               75.7%
   test-suite...emCmp<15, LessThanZero, First>   299.16      517.11                72.9%
   test-suite...Cmp<8, GreaterThanZero, First>   577.34      902.53                56.3%
   test-suite...Cmp<31, GreaterThanZero, None>   410.89      600.97                46.3%
   test-suite...MemCmp<15, LessThanZero, None>   453.93      658.92                45.2%
   test-suite...aw.test:BM_MAT_X_MAT_RAW/44217   391924.61   536335.61             36.8%
   test-suite...sCRaw.test:BM_HYDRO_2D_RAW/171     8.22       10.95                33.2%
   test-suite...Cmp<31, GreaterThanZero, Last>   420.00      553.94                31.9%
   test-suite...t:BM_MemCmp<31, EqZero, First>   155.73      204.14                31.1%
   test-suite...Source/Benchmarks/sim/sim.test     2.12        2.78                30.7%
   test-suite...lications/sqlite3/sqlite3.test     1.70        2.12                25.0%
   test-suite....test:BENCHMARK_HARRIS/512/512   2272.75     2806.47               23.5%
   test-suite...tions/lambda-0.1.3/lambda.test     2.09        2.57                22.6%
   test-suite...s-C/Pathfinder/PathFinder.test     1.66        2.02                21.9%
   test-suite...C/Packing-flt/Packing-flt.test     1.64        2.00                21.7%
   Geomean difference                                                               nan%
           results.org  results.multi.latches        diff
  count  564.000000     562.000000             561.000000
  mean   2062.547024    2174.176751            0.007214  
  std    23151.613672   25917.675478           0.091953  
  min    0.614800       0.616300              -0.490838  
  25%    2.715425       2.744382              -0.007251  
  50%    98.130479      98.001753             -0.000128  
  75%    517.301815     525.848058             0.010050  
  max    391924.609000  536335.615000          0.756508

  SPEC2006 for AArch64

  Benchmark		diff(%)
  400.perlbench		0.361501846
  401.bzip2		0.279018415
  403.gcc		-0.565577003
  429.mcf		-2.208262958
  445.gobmk		-0.31082178
  456.hmmer		-0.467206165
  458.sjeng		-0.235398032
  462.libquantum		-3.935842025
  464.h264ref		0.057919804
  471.omnetpp		-1.779197968
  473.astar		-0.518484615
  483.xalancbmk		-0.36306218

  SPEC2017 for AArch64

  Benchmark		diff(%)
  500.perlbench_r		1.547117225
  502.gcc_r		-0.76960912
  505.mcf_r		0
  520.omnetpp_r		-1.730885443
  523.xalancbmk_r		-0.697377016
  525.x264_r		0.052501261
  531.deepsjeng_r		-0.150069504
  541.leela_r		-0.042836837
  548.exchange2_r		0.011397094
  557.xz_r		-0.416533465

As you mentioned, existing analyses can fail on the cascaded phi nodes from the outer loop. I found a test in which `CanProveNotTakenFirstIteration` fails in the LICM pass.
To canonicalize a loop with multiple latches, the LoopSimplify pass normally creates a new latch and connects the old latches to it. With this patch, if the pass detects a certain condition on the phi nodes of a loop with multiple latches, it converts the loop into a nested loop instead. So the question is which form is better: a single loop with multiple induction variables and a conditional branch, or a nested loop.
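
To make the three shapes concrete, here is a rough source-level sketch in C++. It is only an illustration, not the IR the pass produces or the exact condition the patch checks; the function names and the (i, j) trace are hypothetical. All three functions visit the same (i, j) pairs in the same order.

```cpp
#include <utility>
#include <vector>

using Trace = std::vector<std::pair<int, int>>;

// One loop with two latches: the `continue` path and the fall-through
// path both branch back to the loop header.
Trace traceMultiLatch(int n, int m) {
  Trace t;
  int i = 0, j = 0;
  while (i < n) {
    if (j < m) {
      t.emplace_back(i, j);
      ++j;
      continue;  // latch 1: only j advances
    }
    j = 0;
    ++i;         // latch 2: i advances, j resets
  }
  return t;
}

// Same loop with a single merged latch, roughly what the default
// canonicalization gives: both paths join, and the next values of
// i and j are selected at the join (like phi nodes in the new latch).
Trace traceSingleLatch(int n, int m) {
  Trace t;
  int i = 0, j = 0;
  while (i < n) {
    bool inner = j < m;
    if (inner)
      t.emplace_back(i, j);
    j = inner ? j + 1 : 0;
    i = inner ? i : i + 1;
  }
  return t;
}

// Nested form: latch 1 becomes the inner loop's backedge and
// latch 2 becomes the outer loop's backedge.
Trace traceNested(int n, int m) {
  Trace t;
  for (int i = 0; i < n; ++i)
    for (int j = 0; j < m; ++j)
      t.emplace_back(i, j);
  return t;
}
```

In the single-latch form, i and j are both updated through a conditional on every iteration, whereas in the nested form each loop has one simple induction variable, but the outer loop's values now pass through the inner loop.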

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105700/new/

https://reviews.llvm.org/D105700