[llvm] bd7949b - reland e5581df60a35 [SimplifyCFG] accumulate bonus insts cost
Liu, Yaxun (Sam) via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 25 08:13:36 PDT 2022
[AMD Official Use Only - General]
The patch is intended to fix the worst-case scenario where a chain of branch folds happens without considering the accumulated cost. Since the option -bonus-inst-threshold (default 1) is now compared against the accumulated cost, some previously folded branches may no longer be folded. This causes mixed results in benchmarks. The default value of `-bonus-inst-threshold` may need to be tuned to get the most out of this change, e.g., by changing it to 2 or 3.
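To make the effect of the accumulated cost concrete, here is a minimal sketch of the idea. It is not the code from the patch: `BonusCostModel`, `tryFold`, and `countFolds` are hypothetical names, blocks are identified by strings rather than `BasicBlock *`, and the check is simplified to "record the cost only if the fold stays within the threshold".

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the per-block cost tracking described above.
// Blocks are named by strings here; the real patch keys on BasicBlock *.
class BonusCostModel {
  std::unordered_map<std::string, unsigned> NumBonusInsts;

public:
  // Returns true and records the cost if folding `Cost` more bonus
  // instructions into `Pred` keeps the accumulated total within `Threshold`.
  // Otherwise returns false and leaves the recorded total unchanged.
  bool tryFold(const std::string &Pred, unsigned Cost, unsigned Threshold) {
    unsigned Accumulated = NumBonusInsts[Pred] + Cost;
    if (Accumulated > Threshold)
      return false;
    NumBonusInsts[Pred] = Accumulated;
    return true;
  }
};

// Counts how many of `ChainLength` successive folds, each costing one bonus
// instruction, are accepted into the same predecessor block.
unsigned countFolds(unsigned ChainLength, unsigned Threshold) {
  BonusCostModel Model;
  unsigned Folded = 0;
  for (unsigned I = 0; I < ChainLength; ++I)
    if (Model.tryFold("entry", /*Cost=*/1, Threshold))
      ++Folded;
  return Folded;
}
```

Under this simplified model, a chain of three one-instruction folds into the same block is cut off after the first fold at threshold 1, and only goes through completely once the threshold is raised to 3.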
Is there a way for me to run your benchmark to tune this option? Thanks.
Sam
-----Original Message-----
From: Roman Lebedev <lebedev.ri at gmail.com>
Sent: Monday, October 24, 2022 6:17 PM
To: Liu, Yaxun (Sam) <Yaxun.Liu at amd.com>
Cc: llvm-commits at lists.llvm.org; Yaxun Liu <llvmlistbot at llvm.org>; Philip Reames <listmail at philipreames.com>; Nikita Popov <nikita.ppv at gmail.com>
Subject: Re: [llvm] bd7949b - reland e5581df60a35 [SimplifyCFG] accumulate bonus insts cost
Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
Some more perf numbers from me (attached); on that bench, there are 11 noticeable (>+1%) *statistically significant* regressions but only 4 similar improvements.
On Mon, Oct 24, 2022 at 11:22 PM Liu, Yaxun (Sam) <Yaxun.Liu at amd.com> wrote:
>
> Also, if the middle-end folds the following example:
>
> /// E.g. folding
> /// if (cond1) return false;
> /// if (cond2) return false;
> /// return true;
> /// into
> /// if (cond1 | cond2) return false;
>
> How could the backend unfold this?
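For reference, the two shapes in the example above can be written out as a small sketch. The `Probe` helper and the function names are hypothetical and exist only to count how many conditions are evaluated; this is source-level C++, not what SimplifyCFG emits.

```cpp
// Hypothetical helper that counts condition evaluations, so the extra
// ("bonus") work done by the folded form becomes visible.
struct Probe {
  int Evaluations = 0;
  bool eval(bool V) {
    ++Evaluations;
    return V;
  }
};

// Original shape: two early exits. cond2 is skipped when cond1 is true.
bool unfolded(Probe &P, bool Cond1, bool Cond2) {
  if (P.eval(Cond1))
    return false;
  if (P.eval(Cond2))
    return false;
  return true;
}

// Folded shape: a single branch on (cond1 | cond2). Both conditions are
// always evaluated, even when cond1 alone decides the result.
bool folded(Probe &P, bool Cond1, bool Cond2) {
  bool C1 = P.eval(Cond1);
  bool C2 = P.eval(Cond2);
  if (C1 | C2)
    return false;
  return true;
}
```

Both functions return the same result for every input; they differ only in how much work is done when `Cond1` is true.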
>
> Sam
>
> -----Original Message-----
> From: Liu, Yaxun (Sam)
> Sent: Monday, October 24, 2022 4:08 PM
> To: Roman Lebedev <lebedev.ri at gmail.com>; llvm-commits at lists.llvm.org
> Cc: Yaxun Liu <llvmlistbot at llvm.org>; Philip Reames
> <listmail at philipreames.com>; Nikita Popov <nikita.ppv at gmail.com>
> Subject: RE: [llvm] bd7949b - reland e5581df60a35 [SimplifyCFG]
> accumulate bonus insts cost
>
> The original review is at https://reviews.llvm.org/D132408
>
> The cost per BB is tracked by a ValueMap. If a BB is deleted, it is no longer tracked. Tracking of a merged BB only happens if the old BB is 'replaced all uses by' the new BB, which AFAIK does not happen for BBs.
>
> Sam
>
> -----Original Message-----
> From: Roman Lebedev <lebedev.ri at gmail.com>
> Sent: Monday, October 24, 2022 3:58 PM
> To: llvm-commits at lists.llvm.org
> Cc: Liu, Yaxun (Sam) <Yaxun.Liu at amd.com>; Yaxun Liu
> <llvmlistbot at llvm.org>; Philip Reames <listmail at philipreames.com>;
> Nikita Popov <nikita.ppv at gmail.com>
> Subject: Re: [llvm] bd7949b - reland e5581df60a35 [SimplifyCFG]
> accumulate bonus insts cost
>
> I was in a rather long AWOL when this was being reviewed, so post-commit it is.
>
> I'm really not comfortable with maintaining such a global state, especially because i know for a fact that the existing state, LoopHeaders, is already pretty broken - what should happen when a block is deleted?
> How do we know we are tracking the right block when they are merged?
> Etc.
>
> Also, i was under a rather strong impression that we decided that aggressively flattening blocks like that is the right decision for the middle-end, and back-end will need to selectively (pun intended) undo this with actual performance characteristics in mind.
>
> If you don't agree, i would recommend proceeding through an RFC...
>
> (Also, this is missing a link to the review.)
>
>
> Roman.
>
> On Mon, Oct 24, 2022 at 10:44 PM Yaxun Liu via llvm-commits <llvm-commits at lists.llvm.org> wrote:
> >
> >
> > Author: Yaxun (Sam) Liu
> > Date: 2022-10-24T15:43:53-04:00
> > New Revision: bd7949bcd86633bd4203b2ba6f891aea00fce4d1
> >
> > URL: https://github.com/llvm/llvm-project/commit/bd7949bcd86633bd4203b2ba6f891aea00fce4d1
> > DIFF: https://github.com/llvm/llvm-project/commit/bd7949bcd86633bd4203b2ba6f891aea00fce4d1.diff
> >
> > LOG: reland e5581df60a35 [SimplifyCFG] accumulate bonus insts cost
> >
> > Fixed compile time increase due to always constructing LocalCostTracker.
> > Now only construct LocalCostTracker when needed.
> >
> > Added:
> >
> >
> > Modified:
> > llvm/include/llvm/Transforms/Utils/Local.h
> > llvm/lib/Transforms/Scalar/SimplifyCFGPass.cpp
> > llvm/lib/Transforms/Utils/LoopSimplify.cpp
> > llvm/lib/Transforms/Utils/SimplifyCFG.cpp
> > llvm/test/Transforms/LoopUnroll/peel-loop-inner.ll
> > llvm/test/Transforms/PhaseOrdering/X86/vector-reductions-logical.ll
> > llvm/test/Transforms/SimplifyCFG/branch-fold-multiple.ll
> > llvm/test/Transforms/SimplifyCFG/branch-fold-threshold.ll
> >
> > llvm/test/Transforms/SimplifyCFG/fold-branch-to-common-dest-two-preds-cost.ll
> >
> > Removed:
> >
> >
> >
> > ################################################################################
> > diff --git a/llvm/include/llvm/Transforms/Utils/Local.h b/llvm/include/llvm/Transforms/Utils/Local.h
> > index 4db697c1ffcec..1397420b2950f 100644
> > --- a/llvm/include/llvm/Transforms/Utils/Local.h
> > +++ b/llvm/include/llvm/Transforms/Utils/Local.h
> > @@ -16,6 +16,7 @@
> >
> > #include "llvm/ADT/ArrayRef.h"
> > #include "llvm/IR/Dominators.h"
> > +#include "llvm/IR/ValueMap.h"
> > #include "llvm/Support/CommandLine.h"
> > #include "llvm/Transforms/Utils/SimplifyCFGOptions.h"
> > #include <cstdint>
> > @@ -164,6 +165,26 @@ bool TryToSimplifyUncondBranchFromEmptyBlock(BasicBlock *BB,
> > /// values, but instcombine orders them so it usually won't matter.
> > bool EliminateDuplicatePHINodes(BasicBlock *BB);
> >
> > +/// Class to track cost of simplify CFG transformations.
> > +class SimplifyCFGCostTracker {
> > + /// Number of bonus instructions due to folding branches into predecessors.
> > + /// E.g. folding
> > + /// if (cond1) return false;
> > + /// if (cond2) return false;
> > + /// return true;
> > + /// into
> > + /// if (cond1 | cond2) return false;
> > + /// return true;
> > + /// In this case cond2 is always executed whereas originally it may be
> > + /// evicted due to early exit of cond1. 'cond2' is called bonus instructions
> > + /// and such bonus instructions could accumulate for unrolled loops, therefore
> > + /// use a value map to accumulate their costs across transformations.
> > + ValueMap<BasicBlock *, unsigned> NumBonusInsts;
> > +
> > +public:
> > + void updateNumBonusInsts(BasicBlock *Parent, unsigned InstCount);
> > + unsigned getNumBonusInsts(BasicBlock *Parent);
> > +};
> > /// This function is used to do simplification of a CFG. For example, it
> > /// adjusts branches to branches to eliminate the extra hop, it eliminates
> > /// unreachable basic blocks, and does other peephole optimization of the CFG.
> > @@ -174,7 +195,8 @@ extern cl::opt<bool> RequireAndPreserveDomTree;
> > bool simplifyCFG(BasicBlock *BB, const TargetTransformInfo &TTI,
> > DomTreeUpdater *DTU = nullptr,
> > const SimplifyCFGOptions &Options = {},
> > - ArrayRef<WeakVH> LoopHeaders = {});
> > + ArrayRef<WeakVH> LoopHeaders = {},
> > + SimplifyCFGCostTracker *CostTracker = nullptr);
> >
> > /// This function is used to flatten a CFG. For example, it uses parallel-and
> > /// and parallel-or mode to collapse if-conditions and merge if-regions with
> > @@ -184,7 +206,8 @@ bool FlattenCFG(BasicBlock *BB, AAResults *AA = nullptr);
> > /// If this basic block is ONLY a setcc and a branch, and if a predecessor
> > /// branches to us and one of our successors, fold the setcc into the
> > /// predecessor and use logical operations to pick the right destination.
> > -bool FoldBranchToCommonDest(BranchInst *BI, llvm::DomTreeUpdater *DTU = nullptr,
> > +bool FoldBranchToCommonDest(BranchInst *BI, SimplifyCFGCostTracker &CostTracker,
> > + DomTreeUpdater *DTU = nullptr,
> > MemorySSAUpdater *MSSAU = nullptr,
> > const TargetTransformInfo *TTI = nullptr,
> > unsigned BonusInstThreshold = 1);
> >
> > diff --git a/llvm/lib/Transforms/Scalar/SimplifyCFGPass.cpp b/llvm/lib/Transforms/Scalar/SimplifyCFGPass.cpp
> > index fb2d812a186df..e2646eda06c54 100644
> > --- a/llvm/lib/Transforms/Scalar/SimplifyCFGPass.cpp
> > +++ b/llvm/lib/Transforms/Scalar/SimplifyCFGPass.cpp
> > @@ -221,7 +221,8 @@ static bool tailMergeBlocksWithSimilarFunctionTerminators(Function &F,
> > /// iterating until no more changes are made.
> > static bool iterativelySimplifyCFG(Function &F, const TargetTransformInfo &TTI,
> > DomTreeUpdater *DTU,
> > - const SimplifyCFGOptions &Options) {
> > + const SimplifyCFGOptions &Options,
> > + SimplifyCFGCostTracker &CostTracker) {
> > bool Changed = false;
> > bool LocalChange = true;
> >
> > @@ -252,7 +253,7 @@ static bool iterativelySimplifyCFG(Function &F, const TargetTransformInfo &TTI,
> > while (BBIt != F.end() && DTU->isBBPendingDeletion(&*BBIt))
> > ++BBIt;
> > }
> > - if (simplifyCFG(&BB, TTI, DTU, Options, LoopHeaders)) {
> > + if (simplifyCFG(&BB, TTI, DTU, Options, LoopHeaders, &CostTracker)) {
> > LocalChange = true;
> > ++NumSimpl;
> > }
> > @@ -266,11 +267,13 @@ static bool simplifyFunctionCFGImpl(Function &F, const TargetTransformInfo &TTI,
> > DominatorTree *DT,
> > const SimplifyCFGOptions &Options) {
> > DomTreeUpdater DTU(DT, DomTreeUpdater::UpdateStrategy::Eager);
> > + SimplifyCFGCostTracker CostTracker;
> >
> > bool EverChanged = removeUnreachableBlocks(F, DT ? &DTU : nullptr);
> > EverChanged |=
> > tailMergeBlocksWithSimilarFunctionTerminators(F, DT ? &DTU : nullptr);
> > - EverChanged |= iterativelySimplifyCFG(F, TTI, DT ? &DTU : nullptr, Options);
> > + EverChanged |=
> > + iterativelySimplifyCFG(F, TTI, DT ? &DTU : nullptr, Options, CostTracker);
> >
> > // If neither pass changed anything, we're done.
> > if (!EverChanged) return false;
> > @@ -284,7 +287,8 @@ static bool simplifyFunctionCFGImpl(Function &F, const TargetTransformInfo &TTI,
> > return true;
> >
> > do {
> > - EverChanged = iterativelySimplifyCFG(F, TTI, DT ? &DTU : nullptr, Options);
> > + EverChanged = iterativelySimplifyCFG(F, TTI, DT ? &DTU : nullptr, Options,
> > + CostTracker);
> > EverChanged |= removeUnreachableBlocks(F, DT ? &DTU : nullptr);
> > } while (EverChanged);
> >
> >
> > diff --git a/llvm/lib/Transforms/Utils/LoopSimplify.cpp b/llvm/lib/Transforms/Utils/LoopSimplify.cpp
> > index 8943b4bb651f6..13f1080ffc044 100644
> > --- a/llvm/lib/Transforms/Utils/LoopSimplify.cpp
> > +++ b/llvm/lib/Transforms/Utils/LoopSimplify.cpp
> > @@ -480,6 +480,7 @@ static bool simplifyOneLoop(Loop *L, SmallVectorImpl<Loop *> &Worklist,
> > DominatorTree *DT, LoopInfo *LI,
> > ScalarEvolution *SE, AssumptionCache *AC,
> > MemorySSAUpdater *MSSAU, bool PreserveLCSSA) {
> > + SimplifyCFGCostTracker CostTracker;
> > bool Changed = false;
> > if (MSSAU && VerifyMemorySSA)
> > MSSAU->getMemorySSA()->verifyMemorySSA();
> > @@ -661,7 +662,7 @@ static bool simplifyOneLoop(Loop *L, SmallVectorImpl<Loop *> &Worklist,
> > // The block has now been cleared of all instructions except for
> > // a comparison and a conditional branch. SimplifyCFG may be able
> > // to fold it now.
> > - if (!FoldBranchToCommonDest(BI, /*DTU=*/nullptr, MSSAU))
> > + if (!FoldBranchToCommonDest(BI, CostTracker, /*DTU=*/nullptr, MSSAU))
> > continue;
> >
> > // Success. The block is now dead, so remove it from the loop,
> >
> > diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
> > index fcdd85838340d..7008f9b152f7f 100644
> > --- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
> > +++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
> > @@ -207,6 +207,21 @@ STATISTIC(NumInvokes,
> > STATISTIC(NumInvokesMerged, "Number of invokes that were merged together");
> > STATISTIC(NumInvokeSetsFormed, "Number of invoke sets that were formed");
> >
> > +namespace llvm {
> > +
> > +void SimplifyCFGCostTracker::updateNumBonusInsts(BasicBlock *BB,
> > + unsigned InstCount) {
> > + auto Loc = NumBonusInsts.find(BB);
> > + if (Loc == NumBonusInsts.end())
> > + Loc = NumBonusInsts.insert({BB, 0}).first;
> > + Loc->second = Loc->second + InstCount;
> > +}
> > +unsigned SimplifyCFGCostTracker::getNumBonusInsts(BasicBlock *BB) {
> > + return NumBonusInsts.lookup(BB);
> > +}
> > +
> > +} // namespace llvm
> > +
> > namespace {
> >
> > // The first field contains the value that the switch produces when a certain
> > @@ -243,6 +258,10 @@ class SimplifyCFGOpt {
> > ArrayRef<WeakVH> LoopHeaders;
> > const SimplifyCFGOptions &Options;
> > bool Resimplify;
> > + // Accumulates number of bonus instructions due to merging basic blocks
> > + // of common destination.
> > + SimplifyCFGCostTracker *CostTracker;
> > + std::unique_ptr<SimplifyCFGCostTracker> LocalCostTracker;
> >
> > Value *isValueEqualityComparison(Instruction *TI);
> > BasicBlock *GetValueEqualityComparisonCases(
> > @@ -286,8 +305,15 @@ class SimplifyCFGOpt {
> > public:
> > SimplifyCFGOpt(const TargetTransformInfo &TTI, DomTreeUpdater *DTU,
> > const DataLayout &DL, ArrayRef<WeakVH> LoopHeaders,
> > - const SimplifyCFGOptions &Opts)
> > + const SimplifyCFGOptions &Opts,
> > + SimplifyCFGCostTracker *CostTracker_)
> > : TTI(TTI), DTU(DTU), DL(DL), LoopHeaders(LoopHeaders), Options(Opts) {
> > + // Cannot do this with member initializer list since LocalCostTracker is not
> > + // initialized there yet.
> > + CostTracker = CostTracker_
> > + ? CostTracker_
> > + : (LocalCostTracker.reset(new SimplifyCFGCostTracker()),
> > + LocalCostTracker.get());
> > assert((!DTU || !DTU->hasPostDomTree()) &&
> > "SimplifyCFG is not yet capable of maintaining validity of a "
> > "PostDomTree, so don't ask for it.");
> > @@ -3624,8 +3650,9 @@ static bool isVectorOp(Instruction &I) {
> > /// If this basic block is simple enough, and if a predecessor branches to us
> > /// and one of our successors, fold the block into the predecessor and use
> > /// logical operations to pick the right destination.
> > -bool llvm::FoldBranchToCommonDest(BranchInst *BI, DomTreeUpdater *DTU,
> > - MemorySSAUpdater *MSSAU,
> > +bool llvm::FoldBranchToCommonDest(BranchInst *BI,
> > + SimplifyCFGCostTracker &CostTracker,
> > + DomTreeUpdater *DTU, MemorySSAUpdater *MSSAU,
> > const TargetTransformInfo *TTI,
> > unsigned BonusInstThreshold) {
> > // If this block ends with an unconditional branch,
> > @@ -3697,7 +3724,6 @@ bool llvm::FoldBranchToCommonDest(BranchInst *BI, DomTreeUpdater *DTU,
> > // as "bonus instructions", and only allow this transformation when the
> > // number of the bonus instructions we'll need to create when cloning into
> > // each predecessor does not exceed a certain threshold.
> > - unsigned NumBonusInsts = 0;
> > bool SawVectorOp = false;
> > const unsigned PredCount = Preds.size();
> > for (Instruction &I : *BB) {
> > @@ -3716,12 +3742,13 @@ bool llvm::FoldBranchToCommonDest(BranchInst *BI, DomTreeUpdater *DTU,
> > // predecessor. Ignore free instructions.
> > if (!TTI || TTI->getInstructionCost(&I, CostKind) !=
> > TargetTransformInfo::TCC_Free) {
> > - NumBonusInsts += PredCount;
> > -
> > - // Early exits once we reach the limit.
> > - if (NumBonusInsts >
> > - BonusInstThreshold * BranchFoldToCommonDestVectorMultiplier)
> > - return false;
> > + for (auto PredBB : Preds) {
> > + CostTracker.updateNumBonusInsts(PredBB, PredCount);
> > + // Early exits once we reach the limit.
> > + if (CostTracker.getNumBonusInsts(PredBB) >
> > + BonusInstThreshold * BranchFoldToCommonDestVectorMultiplier)
> > + return false;
> > + }
> > }
> >
> > auto IsBCSSAUse = [BB, &I](Use &U) {
> > @@ -3735,10 +3762,12 @@ bool llvm::FoldBranchToCommonDest(BranchInst *BI, DomTreeUpdater *DTU,
> > if (!all_of(I.uses(), IsBCSSAUse))
> > return false;
> > }
> > - if (NumBonusInsts >
> > - BonusInstThreshold *
> > - (SawVectorOp ? BranchFoldToCommonDestVectorMultiplier : 1))
> > - return false;
> > + for (auto PredBB : Preds) {
> > + if (CostTracker.getNumBonusInsts(PredBB) >
> > + BonusInstThreshold *
> > + (SawVectorOp ? BranchFoldToCommonDestVectorMultiplier : 1))
> > + return false;
> > + }
> >
> > // Ok, we have the budget. Perform the transformation.
> > for (BasicBlock *PredBlock : Preds) {
> > @@ -6889,7 +6918,7 @@ bool SimplifyCFGOpt::simplifyUncondBranch(BranchInst *BI,
> > // branches to us and our successor, fold the comparison into the
> > // predecessor and use logical operations to update the incoming value
> > // for PHI nodes in common successor.
> > - if (FoldBranchToCommonDest(BI, DTU, /*MSSAU=*/nullptr, &TTI,
> > + if (FoldBranchToCommonDest(BI, *CostTracker, DTU, /*MSSAU=*/nullptr, &TTI,
> > Options.BonusInstThreshold))
> > return requestResimplify();
> > return false;
> > @@ -6958,7 +6987,7 @@ bool SimplifyCFGOpt::simplifyCondBranch(BranchInst *BI, IRBuilder<> &Builder) {
> > // If this basic block is ONLY a compare and a branch, and if a predecessor
> > // branches to us and one of our successors, fold the comparison into the
> > // predecessor and use logical operations to pick the right destination.
> > - if (FoldBranchToCommonDest(BI, DTU, /*MSSAU=*/nullptr, &TTI,
> > + if (FoldBranchToCommonDest(BI, *CostTracker, DTU, /*MSSAU=*/nullptr, &TTI,
> > Options.BonusInstThreshold))
> > return requestResimplify();
> >
> > @@ -7257,8 +7286,9 @@ bool SimplifyCFGOpt::run(BasicBlock *BB) {
> >
> > bool llvm::simplifyCFG(BasicBlock *BB, const TargetTransformInfo &TTI,
> > DomTreeUpdater *DTU, const SimplifyCFGOptions &Options,
> > - ArrayRef<WeakVH> LoopHeaders) {
> > + ArrayRef<WeakVH> LoopHeaders,
> > + SimplifyCFGCostTracker *CostTracker) {
> > return SimplifyCFGOpt(TTI, DTU, BB->getModule()->getDataLayout(), LoopHeaders,
> > - Options)
> > + Options, CostTracker)
> > .run(BB);
> > }
> >
> > diff --git a/llvm/test/Transforms/LoopUnroll/peel-loop-inner.ll b/llvm/test/Transforms/LoopUnroll/peel-loop-inner.ll
> > index fa39b77aae36a..40493b48dfe05 100644
> > --- a/llvm/test/Transforms/LoopUnroll/peel-loop-inner.ll
> > +++ b/llvm/test/Transforms/LoopUnroll/peel-loop-inner.ll
> > @@ -1,5 +1,5 @@
> > ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
> > -; RUN: opt < %s -S -passes='require<opt-remark-emit>,loop-unroll<peeling;no-runtime>,simplifycfg,instcombine' -unroll-force-peel-count=3 -verify-dom-info | FileCheck %s
> > +; RUN: opt < %s -S -passes='require<opt-remark-emit>,loop-unroll<peeling;no-runtime>,simplifycfg<bonus-inst-threshold=3>,instcombine' -unroll-force-peel-count=3 -verify-dom-info | FileCheck %s
> >
> > define void @basic(i32 %K, i32 %N) {
> > ; CHECK-LABEL: @basic(
> >
> > diff --git a/llvm/test/Transforms/PhaseOrdering/X86/vector-reductions-logical.ll b/llvm/test/Transforms/PhaseOrdering/X86/vector-reductions-logical.ll
> > index 05ef21d5a1123..c126dbcd6ca96 100644
> > --- a/llvm/test/Transforms/PhaseOrdering/X86/vector-reductions-logical.ll
> > +++ b/llvm/test/Transforms/PhaseOrdering/X86/vector-reductions-logical.ll
> > @@ -1,5 +1,5 @@
> > ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
> > -; RUN: opt -O2 -S < %s | FileCheck %s
> > +; RUN: opt -bonus-inst-threshold=4 -O2 -S < %s | FileCheck %s
> >
> > target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
> > target triple = "x86_64--"
> >
> > diff --git a/llvm/test/Transforms/SimplifyCFG/branch-fold-multiple.ll b/llvm/test/Transforms/SimplifyCFG/branch-fold-multiple.ll
> > index fd400632a5916..f332cf82b5573 100644
> > --- a/llvm/test/Transforms/SimplifyCFG/branch-fold-multiple.ll
> > +++ b/llvm/test/Transforms/SimplifyCFG/branch-fold-multiple.ll
> > @@ -3,9 +3,12 @@
> >
> > %struct.S = type { [4 x i32] }
> >
> > -; Check the second, third, and fourth basic blocks are folded into
> > -; the first basic block since each has one bonus intruction, which
> > -; does not exceed the default bouns instruction threshold of 1.
> > +; Check the second basic block is folded into the first basic block
> > +; since it has one bonus intruction. The third basic block is not
> > +; folded into the first basic block since the accumulated bonus
> > +; instructions will exceed the default threshold of 1. The fourth basic
> > +; block is foled into the third basic block since the accumulated
> > +; bonus instruction cost is 1.
> >
> > define i1 @test1(i32 %0, i32 %1, i32 %2, i32 %3) {
> > ; CHECK-LABEL: @test1(
> > @@ -15,14 +18,18 @@ define i1 @test1(i32 %0, i32 %1, i32 %2, i32 %3) {
> > ; CHECK-NEXT: [[MUL1:%.*]] = mul i32 [[TMP1:%.*]], [[TMP1]]
> > ; CHECK-NEXT: [[CMP2_1:%.*]] = icmp sgt i32 [[MUL1]], 0
> > ; CHECK-NEXT: [[OR_COND:%.*]] = select i1 [[CMP2]], i1 true, i1 [[CMP2_1]]
> > +; CHECK-NEXT: br i1 [[OR_COND]], label [[CLEANUP:%.*]], label [[FOR_COND_1:%.*]]
> > +; CHECK: for.cond.1:
> > ; CHECK-NEXT: [[MUL2:%.*]] = mul i32 [[TMP2:%.*]], [[TMP2]]
> > ; CHECK-NEXT: [[CMP2_2:%.*]] = icmp sgt i32 [[MUL2]], 0
> > -; CHECK-NEXT: [[OR_COND1:%.*]] = select i1 [[OR_COND]], i1 true, i1 [[CMP2_2]]
> > ; CHECK-NEXT: [[MUL3:%.*]] = mul i32 [[TMP3:%.*]], [[TMP3]]
> > ; CHECK-NEXT: [[CMP2_3:%.*]] = icmp sgt i32 [[MUL3]], 0
> > -; CHECK-NEXT: [[OR_COND2:%.*]] = select i1 [[OR_COND1]], i1 true, i1 [[CMP2_3]]
> > -; CHECK-NEXT: [[SPEC_SELECT:%.*]] = select i1 [[OR_COND2]], i1 false, i1 true
> > -; CHECK-NEXT: ret i1 [[SPEC_SELECT]]
> > +; CHECK-NEXT: [[OR_COND1:%.*]] = select i1 [[CMP2_2]], i1 true, i1 [[CMP2_3]]
> > +; CHECK-NEXT: [[SPEC_SELECT:%.*]] = select i1 [[OR_COND1]], i1 false, i1 true
> > +; CHECK-NEXT: br label [[CLEANUP]]
> > +; CHECK: cleanup:
> > +; CHECK-NEXT: [[CMP:%.*]] = phi i1 [ false, [[ENTRY:%.*]] ], [ [[SPEC_SELECT]], [[FOR_COND_1]] ]
> > +; CHECK-NEXT: ret i1 [[CMP]]
> > ;
> > entry:
> > %mul0 = mul i32 %0, %0
> >
> > diff --git a/llvm/test/Transforms/SimplifyCFG/branch-fold-threshold.ll b/llvm/test/Transforms/SimplifyCFG/branch-fold-threshold.ll
> > index c1fd267aef93f..0482fa57227f6 100644
> > --- a/llvm/test/Transforms/SimplifyCFG/branch-fold-threshold.ll
> > +++ b/llvm/test/Transforms/SimplifyCFG/branch-fold-threshold.ll
> > @@ -1,9 +1,9 @@
> > ; RUN: opt %s -simplifycfg -simplifycfg-require-and-preserve-domtree=1 -S | FileCheck %s --check-prefix=NORMAL
> > -; RUN: opt %s -simplifycfg -simplifycfg-require-and-preserve-domtree=1 -S -bonus-inst-threshold=2 | FileCheck %s --check-prefix=AGGRESSIVE
> > -; RUN: opt %s -simplifycfg -simplifycfg-require-and-preserve-domtree=1 -S -bonus-inst-threshold=4 | FileCheck %s --check-prefix=WAYAGGRESSIVE
> > +; RUN: opt %s -simplifycfg -simplifycfg-require-and-preserve-domtree=1 -S -bonus-inst-threshold=3 | FileCheck %s --check-prefix=AGGRESSIVE
> > +; RUN: opt %s -simplifycfg -simplifycfg-require-and-preserve-domtree=1 -S -bonus-inst-threshold=6 | FileCheck %s --check-prefix=WAYAGGRESSIVE
> > ; RUN: opt %s -passes=simplifycfg -S | FileCheck %s --check-prefix=NORMAL
> > -; RUN: opt %s -passes='simplifycfg<bonus-inst-threshold=2>' -S | FileCheck %s --check-prefix=AGGRESSIVE
> > -; RUN: opt %s -passes='simplifycfg<bonus-inst-threshold=4>' -S | FileCheck %s --check-prefix=WAYAGGRESSIVE
> > +; RUN: opt %s -passes='simplifycfg<bonus-inst-threshold=3>' -S | FileCheck %s --check-prefix=AGGRESSIVE
> > +; RUN: opt %s -passes='simplifycfg<bonus-inst-threshold=6>' -S | FileCheck %s --check-prefix=WAYAGGRESSIVE
> >
> > define i32 @foo(i32 %a, i32 %b, i32 %c, i32 %d, i32* %input) {
> > ; NORMAL-LABEL: @foo(
> >
> > diff --git a/llvm/test/Transforms/SimplifyCFG/fold-branch-to-common-dest-two-preds-cost.ll b/llvm/test/Transforms/SimplifyCFG/fold-branch-to-common-dest-two-preds-cost.ll
> > index 71b8ef5c7612c..6b8ebd9054dc6 100644
> > --- a/llvm/test/Transforms/SimplifyCFG/fold-branch-to-common-dest-two-preds-cost.ll
> > +++ b/llvm/test/Transforms/SimplifyCFG/fold-branch-to-common-dest-two-preds-cost.ll
> > @@ -1,6 +1,6 @@
> > ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
> > ; RUN: opt < %s -S -simplifycfg -simplifycfg-require-and-preserve-domtree=1 -bonus-inst-threshold=1 | FileCheck --check-prefixes=ALL,THR1 %s
> > -; RUN: opt < %s -S -simplifycfg -simplifycfg-require-and-preserve-domtree=1 -bonus-inst-threshold=2 | FileCheck --check-prefixes=ALL,THR2 %s
> > +; RUN: opt < %s -S -simplifycfg -simplifycfg-require-and-preserve-domtree=1 -bonus-inst-threshold=3 | FileCheck --check-prefixes=ALL,THR2 %s
> >
> > declare void @sideeffect0()
> > declare void @sideeffect1()
> > @@ -10,7 +10,7 @@ declare i1 @gen1()
> >
> > ; Here we'd want to duplicate %v3_adj into two predecessors,
> > ; but -bonus-inst-threshold=1 says that we can only clone it into one.
> > -; With -bonus-inst-threshold=2 we can clone it into both though.
> > +; With -bonus-inst-threshold=3 we can clone it into both though.
> > define void @two_preds_with_extra_op(i8 %v0, i8 %v1, i8 %v2, i8 %v3) {
> > ; THR1-LABEL: @two_preds_with_extra_op(
> > ; THR1-NEXT: entry:
> >
> >
> >
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits