[PATCH] D68414: [SROA] Enhance AggLoadStoreRewriter to rewrite integer load/store if it covers multi fields in original aggregate

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 21 02:52:20 PDT 2020


lebedev.ri added a comment.

Compile-time results: https://llvm-compile-time-tracker.com/compare.php?from=e616a4259889b55ed1bf5bf095f0e59658c6e311&to=0a2a92f815130c9ed2f0fed11850079bbd55038e&stat=instructions

On the vanilla llvm test-suite plus RawSpeed, `presplitOverlappedSlices()` fires `629240` times and succeeds `3` (three) times.
I would say that it should either succeed more often or cost less :)
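
To put those two counters in proportion, here is a rough sketch of the arithmetic. The helper names `successPercent` and `attemptsPerSuccess` are made up for illustration and are not part of the patch; only the two counter values come from the statistics in this comment.

```cpp
#include <cassert>
#include <cstdint>

// Counter values reported in this comment for
// sroa.NumPresplitAttempted and sroa.NumPresplitSuccess.
constexpr std::uint64_t NumPresplitAttempted = 629240;
constexpr std::uint64_t NumPresplitSuccess = 3;

// Hypothetical helper: share of attempts that succeeded, in percent.
double successPercent(std::uint64_t Success, std::uint64_t Attempted) {
  assert(Success <= Attempted && "cannot succeed more often than attempted");
  if (Attempted == 0)
    return 0.0; // nothing attempted, so report 0 rather than divide by zero
  return 100.0 * static_cast<double>(Success) / static_cast<double>(Attempted);
}

// Hypothetical helper: whole number of attempts per success (rounded down).
std::uint64_t attemptsPerSuccess(std::uint64_t Success, std::uint64_t Attempted) {
  return Success == 0 ? 0 : Attempted / Success;
}
```

With the numbers above, that is a success rate of roughly 0.0005%, or about one success per 209746 attempts.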

  |             statistic name             |  baseline |  proposed |    Δ   |    %   | \|%\| |
  |:--------------------------------------:|:---------:|:---------:|:------:|:------:|:-----:|
  | sroa.NumPresplitAttempted              |         0 |    629240 | 629240 |  0.00% | 0.00% |
  | sroa.NumPresplitSuccess                |         0 |         3 |      3 |  0.00% | 0.00% |
  | correlated-value-propagation.NumShlNUW |      4210 |      4209 |     -1 | -0.02% | 0.02% |
  | correlated-value-propagation.NumShlNW  |      6274 |      6273 |     -1 | -0.02% | 0.02% |
  | mem2reg.NumLocalPromoted               |      6245 |      6246 |      1 |  0.02% | 0.02% |
  | correlated-value-propagation.NumNUW    |     15086 |     15085 |     -1 | -0.01% | 0.01% |
  | sroa.MaxPartitionsPerAlloca            |     11933 |     11934 |      1 |  0.01% | 0.01% |
  | stack-coloring.StackSlotMerged         |     12160 |     12159 |     -1 | -0.01% | 0.01% |
  | SLP.NumVectorInstructions              |     34625 |     34626 |      1 |  0.00% | 0.00% |
  | asm-printer.EmittedInsts               |   7936899 |   7936895 |     -4 |  0.00% | 0.00% |
  | assembler.EmittedDataFragments         |   2500398 |   2500399 |      1 |  0.00% | 0.00% |
  | assembler.EmittedFillFragments         |    423471 |    423472 |      1 |  0.00% | 0.00% |
  | assembler.EmittedFragments             |   5157859 |   5157861 |      2 |  0.00% | 0.00% |
  | assembler.FragmentLayouts              |  12143832 |  12143834 |      2 |  0.00% | 0.00% |
  | assembler.ObjectBytes                  | 254675544 | 254675584 |     40 |  0.00% | 0.00% |
  | assembler.evaluateFixup                |   7937674 |   7937675 |      1 |  0.00% | 0.00% |
  | assume-queries.NumAssumeQueries        |   8436268 |   8436587 |    319 |  0.00% | 0.00% |
  | basicaa.SearchTimes                    |  66366214 |  66366216 |      2 |  0.00% | 0.00% |
  | bdce.NumRemoved                        |     43590 |     43589 |     -1 |  0.00% | 0.00% |
  | codegenprepare.NumCastUses             |    375363 |    375361 |     -2 |  0.00% | 0.00% |
  | codegenprepare.NumGEPsElim             |    106610 |    106609 |     -1 |  0.00% | 0.00% |
  | correlated-value-propagation.NumNW     |     25516 |     25515 |     -1 |  0.00% | 0.00% |
  | dagcombine.NodesCombined               |   3881288 |   3881285 |     -3 |  0.00% | 0.00% |
  | dse.NumDomMemDefChecks                 |   3131956 |   3131955 |     -1 |  0.00% | 0.00% |
  | dse.NumGetDomMemoryDefPassed           |   1084707 |   1084706 |     -1 |  0.00% | 0.00% |
  | dse.NumRemainingStores                 |    846110 |    846108 |     -2 |  0.00% | 0.00% |
  | early-cse.NumCSE                       |   2188895 |   2188891 |     -4 |  0.00% | 0.00% |
  | early-cse.NumSimplify                  |    542909 |    542924 |     15 |  0.00% | 0.00% |
  | gvn.NumGVNInstr                        |    325697 |    325693 |     -4 |  0.00% | 0.00% |
  | gvn.NumGVNLoad                         |     76337 |     76336 |     -1 |  0.00% | 0.00% |
  | gvn.NumGVNSimpl                        |     96093 |     96090 |     -3 |  0.00% | 0.00% |
  | instcombine.NumCombined                |   3674269 |   3674267 |     -2 |  0.00% | 0.00% |
  | instcombine.NumSunkInst                |     63820 |     63817 |     -3 |  0.00% | 0.00% |
  | instcombine.NumWorklistIterations      |   2024510 |   2024511 |      1 |  0.00% | 0.00% |
  | instcount.NumAllocaInst                |     45896 |     45895 |     -1 |  0.00% | 0.00% |
  | instcount.NumBitCastInst               |    607901 |    607898 |     -3 |  0.00% | 0.00% |
  | instcount.NumCallInst                  |   1760607 |   1760604 |     -3 |  0.00% | 0.00% |
  | instcount.NumGetElementPtrInst         |   1177736 |   1177732 |     -4 |  0.00% | 0.00% |
  | instcount.NumLoadInst                  |   1006106 |   1006105 |     -1 |  0.00% | 0.00% |
  | instcount.NumStoreInst                 |    706082 |    706079 |     -3 |  0.00% | 0.00% |
  | instcount.TotalInsts                   |   8826738 |   8826723 |    -15 |  0.00% | 0.00% |
  | instcount.TotalIntegerInsts            |   2263812 |   2263811 |     -1 |  0.00% | 0.00% |
  | instcount.TotalIntegerScalarInsts      |   2154781 |   2154780 |     -1 |  0.00% | 0.00% |
  | instcount.TotalScalarInsts             |   8163514 |   8163499 |    -15 |  0.00% | 0.00% |
  | isel.NumDAGIselRetries                 |  56963972 |  56963901 |    -71 |  0.00% | 0.00% |
  | mcexpr.MCExprEvaluate                  |  39299357 |  39299360 |      3 |  0.00% | 0.00% |
  | mem2reg.NumSingleStore                 |    556895 |    556897 |      2 |  0.00% | 0.00% |
  | memory-builtins.ObjectVisitorArgument  |   1652515 |   1652519 |      4 |  0.00% | 0.00% |
  | memory-builtins.ObjectVisitorLoad      |    560185 |    560201 |     16 |  0.00% | 0.00% |
  | post-RA-sched.NumFixedAnti             |     52447 |     52446 |     -1 |  0.00% | 0.00% |
  | post-RA-sched.NumStalls                |   3686107 |   3686108 |      1 |  0.00% | 0.00% |
  | regalloc.NumAssigned                   |   4117619 |   4117618 |     -1 |  0.00% | 0.00% |
  | simplifycfg.NumSimpl                   |    985893 |    985892 |     -1 |  0.00% | 0.00% |
  | sroa.NumAllocaPartitionUses            |   3095830 |   3095827 |     -3 |  0.00% | 0.00% |
  | sroa.NumAllocaPartitions               |    698974 |    698973 |     -1 |  0.00% | 0.00% |
  | sroa.NumAllocasAnalyzed                |    796794 |    796791 |     -3 |  0.00% | 0.00% |
  | sroa.NumDeleted                        |   3687318 |   3687311 |     -7 |  0.00% | 0.00% |
  | sroa.NumPromoted                       |    689146 |    689149 |      3 |  0.00% | 0.00% |
  | stack-coloring.NumMarkerSeen           |    105177 |    105174 |     -3 |  0.00% | 0.00% |
  | stack-coloring.StackSpaceSaved         |    383217 |    383205 |    -12 |  0.00% | 0.00% |



================
Comment at: llvm/lib/Transforms/Scalar/SROA.cpp:4535
   Changed |= presplitLoadsAndStores(AI, AS);
+  int PresplitTimes = 0;
+  bool LocalChanged = true;
----------------
On the vanilla llvm test-suite, `PresplitTimes` never exceeds `2`,
so I think `MAX_PRESPLIT_ITERATIONS` could be much lower than `256`.
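
For illustration only, a minimal sketch of what a capped fixpoint loop with a smaller bound might look like. This is not the code from the patch: `runPresplitToFixpoint`, `PresplitRound`, `PresplitResult`, and the cap value of `4` are all assumptions, and here `PresplitTimes` counts every round, including the final one that changes nothing, which may differ from how the patch counts.

```cpp
#include <cassert>
#include <functional>

// Assumed cap, far below the 256 in the patch; the exact value is a guess.
constexpr int MAX_PRESPLIT_ITERATIONS = 4;

// Hypothetical stand-in for one presplit round.
// Returns true if the round changed anything.
using PresplitRound = std::function<bool()>;

struct PresplitResult {
  bool Changed;      // did any round make a change?
  int PresplitTimes; // number of rounds executed
};

// Repeat the round until it stops making changes or the cap is reached.
PresplitResult runPresplitToFixpoint(const PresplitRound &Round,
                                     int MaxIterations = MAX_PRESPLIT_ITERATIONS) {
  assert(MaxIterations > 0 && "the cap must allow at least one round");
  bool Changed = false;
  int PresplitTimes = 0;
  bool LocalChanged = true;
  while (LocalChanged && PresplitTimes < MaxIterations) {
    LocalChanged = Round();
    Changed |= LocalChanged;
    ++PresplitTimes;
  }
  return {Changed, PresplitTimes};
}
```

The cap only matters for inputs that keep reporting changes; anything that converges in a couple of rounds exits through `LocalChanged` long before the bound is hit.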


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68414/new/

https://reviews.llvm.org/D68414


