[PATCH] D68414: [SROA] Enhance AggLoadStoreRewriter to rewrite integer load/store if it covers multi fields in original aggregate
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 21 02:52:20 PDT 2020
lebedev.ri added a comment.
compile-time results: https://llvm-compile-time-tracker.com/compare.php?from=e616a4259889b55ed1bf5bf095f0e59658c6e311&to=0a2a92f815130c9ed2f0fed11850079bbd55038e&stat=instructions
On vanilla llvm test-suite + RawSpeed, `presplitOverlappedSlices()` fires `629240` times and succeeds only `3` (three) times.
I would say that it should either succeed more, or cost less :)
| statistic name | baseline | proposed | Δ | % | \|%\| |
|:--------------------------------------:|:---------:|:---------:|:------:|:------:|:-----:|
| sroa.NumPresplitAttempted | 0 | 629240 | 629240 | 0.00% | 0.00% |
| sroa.NumPresplitSuccess | 0 | 3 | 3 | 0.00% | 0.00% |
| correlated-value-propagation.NumShlNUW | 4210 | 4209 | -1 | -0.02% | 0.02% |
| correlated-value-propagation.NumShlNW | 6274 | 6273 | -1 | -0.02% | 0.02% |
| mem2reg.NumLocalPromoted | 6245 | 6246 | 1 | 0.02% | 0.02% |
| correlated-value-propagation.NumNUW | 15086 | 15085 | -1 | -0.01% | 0.01% |
| sroa.MaxPartitionsPerAlloca | 11933 | 11934 | 1 | 0.01% | 0.01% |
| stack-coloring.StackSlotMerged | 12160 | 12159 | -1 | -0.01% | 0.01% |
| SLP.NumVectorInstructions | 34625 | 34626 | 1 | 0.00% | 0.00% |
| asm-printer.EmittedInsts | 7936899 | 7936895 | -4 | 0.00% | 0.00% |
| assembler.EmittedDataFragments | 2500398 | 2500399 | 1 | 0.00% | 0.00% |
| assembler.EmittedFillFragments | 423471 | 423472 | 1 | 0.00% | 0.00% |
| assembler.EmittedFragments | 5157859 | 5157861 | 2 | 0.00% | 0.00% |
| assembler.FragmentLayouts | 12143832 | 12143834 | 2 | 0.00% | 0.00% |
| assembler.ObjectBytes | 254675544 | 254675584 | 40 | 0.00% | 0.00% |
| assembler.evaluateFixup | 7937674 | 7937675 | 1 | 0.00% | 0.00% |
| assume-queries.NumAssumeQueries | 8436268 | 8436587 | 319 | 0.00% | 0.00% |
| basicaa.SearchTimes | 66366214 | 66366216 | 2 | 0.00% | 0.00% |
| bdce.NumRemoved | 43590 | 43589 | -1 | 0.00% | 0.00% |
| codegenprepare.NumCastUses | 375363 | 375361 | -2 | 0.00% | 0.00% |
| codegenprepare.NumGEPsElim | 106610 | 106609 | -1 | 0.00% | 0.00% |
| correlated-value-propagation.NumNW | 25516 | 25515 | -1 | 0.00% | 0.00% |
| dagcombine.NodesCombined | 3881288 | 3881285 | -3 | 0.00% | 0.00% |
| dse.NumDomMemDefChecks | 3131956 | 3131955 | -1 | 0.00% | 0.00% |
| dse.NumGetDomMemoryDefPassed | 1084707 | 1084706 | -1 | 0.00% | 0.00% |
| dse.NumRemainingStores | 846110 | 846108 | -2 | 0.00% | 0.00% |
| early-cse.NumCSE | 2188895 | 2188891 | -4 | 0.00% | 0.00% |
| early-cse.NumSimplify | 542909 | 542924 | 15 | 0.00% | 0.00% |
| gvn.NumGVNInstr | 325697 | 325693 | -4 | 0.00% | 0.00% |
| gvn.NumGVNLoad | 76337 | 76336 | -1 | 0.00% | 0.00% |
| gvn.NumGVNSimpl | 96093 | 96090 | -3 | 0.00% | 0.00% |
| instcombine.NumCombined | 3674269 | 3674267 | -2 | 0.00% | 0.00% |
| instcombine.NumSunkInst | 63820 | 63817 | -3 | 0.00% | 0.00% |
| instcombine.NumWorklistIterations | 2024510 | 2024511 | 1 | 0.00% | 0.00% |
| instcount.NumAllocaInst | 45896 | 45895 | -1 | 0.00% | 0.00% |
| instcount.NumBitCastInst | 607901 | 607898 | -3 | 0.00% | 0.00% |
| instcount.NumCallInst | 1760607 | 1760604 | -3 | 0.00% | 0.00% |
| instcount.NumGetElementPtrInst | 1177736 | 1177732 | -4 | 0.00% | 0.00% |
| instcount.NumLoadInst | 1006106 | 1006105 | -1 | 0.00% | 0.00% |
| instcount.NumStoreInst | 706082 | 706079 | -3 | 0.00% | 0.00% |
| instcount.TotalInsts | 8826738 | 8826723 | -15 | 0.00% | 0.00% |
| instcount.TotalIntegerInsts | 2263812 | 2263811 | -1 | 0.00% | 0.00% |
| instcount.TotalIntegerScalarInsts | 2154781 | 2154780 | -1 | 0.00% | 0.00% |
| instcount.TotalScalarInsts | 8163514 | 8163499 | -15 | 0.00% | 0.00% |
| isel.NumDAGIselRetries | 56963972 | 56963901 | -71 | 0.00% | 0.00% |
| mcexpr.MCExprEvaluate | 39299357 | 39299360 | 3 | 0.00% | 0.00% |
| mem2reg.NumSingleStore | 556895 | 556897 | 2 | 0.00% | 0.00% |
| memory-builtins.ObjectVisitorArgument | 1652515 | 1652519 | 4 | 0.00% | 0.00% |
| memory-builtins.ObjectVisitorLoad | 560185 | 560201 | 16 | 0.00% | 0.00% |
| post-RA-sched.NumFixedAnti | 52447 | 52446 | -1 | 0.00% | 0.00% |
| post-RA-sched.NumStalls | 3686107 | 3686108 | 1 | 0.00% | 0.00% |
| regalloc.NumAssigned | 4117619 | 4117618 | -1 | 0.00% | 0.00% |
| simplifycfg.NumSimpl | 985893 | 985892 | -1 | 0.00% | 0.00% |
| sroa.NumAllocaPartitionUses | 3095830 | 3095827 | -3 | 0.00% | 0.00% |
| sroa.NumAllocaPartitions | 698974 | 698973 | -1 | 0.00% | 0.00% |
| sroa.NumAllocasAnalyzed | 796794 | 796791 | -3 | 0.00% | 0.00% |
| sroa.NumDeleted | 3687318 | 3687311 | -7 | 0.00% | 0.00% |
| sroa.NumPromoted | 689146 | 689149 | 3 | 0.00% | 0.00% |
| stack-coloring.NumMarkerSeen | 105177 | 105174 | -3 | 0.00% | 0.00% |
| stack-coloring.StackSpaceSaved | 383217 | 383205 | -12 | 0.00% | 0.00% |
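To put the attempted/success pair from the table in perspective, here is a minimal standalone sketch of the arithmetic. `PresplitCounters` is a hypothetical stand-in for the two statistics, not code from the patch, and it does not use LLVM's own statistics machinery.

```cpp
#include <cstdint>

// Hypothetical stand-in for the two counters reported in the table
// (sroa.NumPresplitAttempted and sroa.NumPresplitSuccess).
struct PresplitCounters {
  std::uint64_t Attempted = 0;
  std::uint64_t Succeeded = 0;

  // Record one invocation and whether it actually changed anything.
  void record(bool Changed) {
    ++Attempted;
    if (Changed)
      ++Succeeded;
  }

  // Success rate in percent; 0 when nothing was attempted.
  double successPercent() const {
    return Attempted == 0 ? 0.0 : 100.0 * Succeeded / Attempted;
  }
};
```

With the numbers above (3 successes out of 629240 attempts) this works out to roughly 0.0005%, i.e. about one success per 210 thousand invocations.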
================
Comment at: llvm/lib/Transforms/Scalar/SROA.cpp:4535
Changed |= presplitLoadsAndStores(AI, AS);
+ int PresplitTimes = 0;
+ bool LocalChanged = true;
----------------
On vanilla llvm test-suite, `PresplitTimes` never exceeds `2`,
so I think `MAX_PRESPLIT_ITERATIONS` could be much lower than `256`.
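For illustration, a capped fixed-point loop of the kind the two added lines suggest could look like the sketch below. The rest of the loop isn't quoted in this comment, so this assumes it repeats while the last step changed something; `runBounded`, `PresplitResult`, and the cap value `4` are hypothetical, chosen only to show that a small bound still leaves headroom over the observed maximum of `2`.

```cpp
// Hypothetical cap, well below 256 but above the observed maximum of 2.
constexpr int MaxPresplitIterations = 4;

// Overall outcome plus how many times the step was run.
struct PresplitResult {
  bool Changed;
  int Iterations;
};

// Repeatedly runs Step until it reports no change or the cap is reached.
// Step is any callable returning true if it changed something.
template <typename StepFn>
PresplitResult runBounded(StepFn Step,
                          int MaxIterations = MaxPresplitIterations) {
  bool Changed = false;
  bool LocalChanged = true;
  int PresplitTimes = 0;
  while (LocalChanged && PresplitTimes < MaxIterations) {
    LocalChanged = Step();
    Changed |= LocalChanged;
    ++PresplitTimes;
  }
  return {Changed, PresplitTimes};
}
```

A step that succeeds once and then finds nothing more to do runs twice (the second run only confirms the fixed point), which matches the observed maximum; a step that keeps reporting changes is cut off at the cap.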
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D68414/new/
https://reviews.llvm.org/D68414