[llvm] r296149 - [CGP] Split some critical edges coming out of indirect branches
Daniel Jasper via llvm-commits
llvm-commits at lists.llvm.org
Sun Feb 26 03:21:11 PST 2017
This lead to segfaults when compiling for PPC. I have reverted this in
r296295 and will forward reproduction instructions to you.
On Fri, Feb 24, 2017 at 7:41 PM, Michael Kuperstein via llvm-commits <
llvm-commits at lists.llvm.org> wrote:
> Author: mkuper
> Date: Fri Feb 24 12:41:32 2017
> New Revision: 296149
>
> URL: http://llvm.org/viewvc/llvm-project?rev=296149&view=rev
> Log:
> [CGP] Split some critical edges coming out of indirect branches
>
> Splitting critical edges when one of the source edges is an indirectbr
> is hard in general (because it requires changing the memory the indirectbr
> reads). But if a block only has a single indirectbr predecessor (which is
> the common case), we can simulate splitting that edge by splitting
> the destination block, and retargeting the *direct* branches.
>
> This is motivated by the use of computed gotos in python 2.7:
> PyEval_EvalFrame()
> ends up using an indirect branch with ~100 successors, and passing a
> constant to
> each of those. Since MachineSink can't break indirect critical edges on
> demand
> (and doing this in MIR doesn't look feasible), this causes us to emit
> about ~100
> defs of registers containing constants, which we in the predecessor block,
> where
> only one of those constants is used in each successor. So, at each
> computed goto,
> we needlessly spill about a 100 constants to stack. The end result is that
> a
> clang-compiled python interpreter can be about ~2.5x slower on a simple
> python
> reduction loop than a gcc-compiled interpreter.
>
> Differential Revision: https://reviews.llvm.org/D29916
>
> Added:
> llvm/trunk/test/Transforms/CodeGenPrepare/X86/computedgoto.ll
> - copied unchanged from r296063, llvm/trunk/test/Transforms/
> CodeGenPrepare/computedgoto.ll
> Modified:
> llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp
> llvm/trunk/test/CodeGen/ARM/indirectbr.ll
> llvm/trunk/test/CodeGen/MSP430/indirectbr2.ll
> llvm/trunk/test/CodeGen/PowerPC/indirectbr.ll
>
> Modified: llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/
> CodeGen/CodeGenPrepare.cpp?rev=296149&r1=296148&r2=296149&view=diff
> ============================================================
> ==================
> --- llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp (original)
> +++ llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp Fri Feb 24 12:41:32 2017
> @@ -15,10 +15,12 @@
>
> #include "llvm/CodeGen/Passes.h"
> #include "llvm/ADT/DenseMap.h"
> +#include "llvm/ADT/SetVector.h"
> #include "llvm/ADT/SmallSet.h"
> #include "llvm/ADT/Statistic.h"
> #include "llvm/Analysis/BlockFrequencyInfo.h"
> #include "llvm/Analysis/BranchProbabilityInfo.h"
> +#include "llvm/Analysis/CFG.h"
> #include "llvm/Analysis/InstructionSimplify.h"
> #include "llvm/Analysis/LoopInfo.h"
> #include "llvm/Analysis/ProfileSummaryInfo.h"
> @@ -53,8 +55,10 @@
> #include "llvm/Transforms/Utils/BasicBlockUtils.h"
> #include "llvm/Transforms/Utils/BuildLibCalls.h"
> #include "llvm/Transforms/Utils/BypassSlowDivision.h"
> +#include "llvm/Transforms/Utils/Cloning.h"
> #include "llvm/Transforms/Utils/Local.h"
> #include "llvm/Transforms/Utils/SimplifyLibCalls.h"
> +#include "llvm/Transforms/Utils/ValueMapper.h"
> using namespace llvm;
> using namespace llvm::PatternMatch;
>
> @@ -222,6 +226,7 @@ class TypePromotionTransaction;
> unsigned CreatedInstCost);
> bool splitBranchCondition(Function &F);
> bool simplifyOffsetableRelocate(Instruction &I);
> + bool splitIndirectCriticalEdges(Function &F);
> };
> }
>
> @@ -296,6 +301,10 @@ bool CodeGenPrepare::runOnFunction(Funct
> if (!DisableBranchOpts)
> EverMadeChange |= splitBranchCondition(F);
>
> + // Split some critical edges where one of the sources is an indirect
> branch,
> + // to help generate sane code for PHIs involving such edges.
> + EverMadeChange |= splitIndirectCriticalEdges(F);
> +
> bool MadeChange = true;
> while (MadeChange) {
> MadeChange = false;
> @@ -429,6 +438,152 @@ BasicBlock *CodeGenPrepare::findDestBloc
> return DestBB;
> }
>
> +// Return the unique indirectbr predecessor of a block. This may return
> null
> +// even if such a predecessor exists, if it's not useful for splitting.
> +// If a predecessor is found, OtherPreds will contain all other
> (non-indirectbr)
> +// predecessors of BB.
> +static BasicBlock *
> +findIBRPredecessor(BasicBlock *BB, SmallVectorImpl<BasicBlock *>
> &OtherPreds) {
> + // If the block doesn't have any PHIs, we don't care about it, since
> there's
> + // no point in splitting it.
> + PHINode *PN = dyn_cast<PHINode>(BB->begin());
> + if (!PN)
> + return nullptr;
> +
> + // Verify we have exactly one IBR predecessor.
> + // Conservatively bail out if one of the other predecessors is not a
> "regular"
> + // terminator (that is, not a switch or a br).
> + BasicBlock *IBB = nullptr;
> + for (unsigned Pred = 0, E = PN->getNumIncomingValues(); Pred != E;
> ++Pred) {
> + BasicBlock *PredBB = PN->getIncomingBlock(Pred);
> + TerminatorInst *PredTerm = PredBB->getTerminator();
> + switch (PredTerm->getOpcode()) {
> + case Instruction::IndirectBr:
> + if (IBB)
> + return nullptr;
> + IBB = PredBB;
> + break;
> + case Instruction::Br:
> + case Instruction::Switch:
> + OtherPreds.push_back(PredBB);
> + continue;
> + default:
> + return nullptr;
> + }
> + }
> +
> + return IBB;
> +}
> +
> +// Split critical edges where the source of the edge is an indirectbr
> +// instruction. This isn't always possible, but we can handle some easy
> cases.
> +// This is useful because MI is unable to split such critical edges,
> +// which means it will not be able to sink instructions along those edges.
> +// This is especially painful for indirect branches with many successors,
> where
> +// we end up having to prepare all outgoing values in the origin block.
> +//
> +// Our normal algorithm for splitting critical edges requires us to update
> +// the outgoing edges of the edge origin block, but for an indirectbr this
> +// is hard, since it would require finding and updating the block
> addresses
> +// the indirect branch uses. But if a block only has a single indirectbr
> +// predecessor, with the others being regular branches, we can do it in a
> +// different way.
> +// Say we have A -> D, B -> D, I -> D where only I -> D is an indirectbr.
> +// We can split D into D0 and D1, where D0 contains only the PHIs from D,
> +// and D1 is the D block body. We can then duplicate D0 as D0A and D0B,
> and
> +// create the following structure:
> +// A -> D0A, B -> D0A, I -> D0B, D0A -> D1, D0B -> D1
> +bool CodeGenPrepare::splitIndirectCriticalEdges(Function &F) {
> + // Check whether the function has any indirectbrs, and collect which
> blocks
> + // they may jump to. Since most functions don't have indirect branches,
> + // this lowers the common case's overhead to O(Blocks) instead of
> O(Edges).
> + SmallSetVector<BasicBlock *, 16> Targets;
> + for (auto &BB : F) {
> + auto *IBI = dyn_cast<IndirectBrInst>(BB.getTerminator());
> + if (!IBI)
> + continue;
> +
> + for (unsigned Succ = 0, E = IBI->getNumSuccessors(); Succ != E;
> ++Succ)
> + Targets.insert(IBI->getSuccessor(Succ));
> + }
> +
> + if (Targets.empty())
> + return false;
> +
> + bool Changed = false;
> + for (BasicBlock *Target : Targets) {
> + SmallVector<BasicBlock *, 16> OtherPreds;
> + BasicBlock *IBRPred = findIBRPredecessor(Target, OtherPreds);
> + if (!IBRPred)
> + continue;
> +
> + // Don't even think about ehpads/landingpads.
> + Instruction *FirstNonPHI = Target->getFirstNonPHI();
> + if (FirstNonPHI->isEHPad() || Target->isLandingPad())
> + continue;
> +
> + BasicBlock *BodyBlock = Target->splitBasicBlock(FirstNonPHI,
> ".split");
> + // It's possible Target was its own successor through an indirectbr.
> + // In this case, the indirectbr now comes from BodyBlock.
> + if (IBRPred == Target)
> + IBRPred = BodyBlock;
> +
> + // At this point Target only has PHIs, and BodyBlock has the rest of
> the
> + // block's body. Create a copy of Target that will be used by the
> "direct"
> + // preds.
> + ValueToValueMapTy VMap;
> + BasicBlock *DirectSucc = CloneBasicBlock(Target, VMap, ".clone", &F);
> +
> + for (BasicBlock *Pred : OtherPreds)
> + Pred->getTerminator()->replaceUsesOfWith(Target, DirectSucc);
> +
> + // Ok, now fix up the PHIs. We know the two blocks only have PHIs,
> and that
> + // they are clones, so the number of PHIs are the same.
> + // (a) Remove the edge coming from IBRPred from the "Direct" PHI
> + // (b) Leave that as the only edge in the "Indirect" PHI.
> + // (c) Merge the two in the body block.
> + BasicBlock::iterator Indirect = Target->begin(),
> + End = Target->getFirstNonPHI()->getIterator();
> + BasicBlock::iterator Direct = DirectSucc->begin();
> + BasicBlock::iterator MergeInsert = BodyBlock->getFirstInsertionPt();
> +
> + assert(&*End == Target->getTerminator() &&
> + "Block was expected to only contain PHIs");
> +
> + while (Indirect != End) {
> + PHINode *DirPHI = cast<PHINode>(Direct);
> + PHINode *IndPHI = cast<PHINode>(Indirect);
> +
> + // Now, clean up - the direct block shouldn't get the indirect
> value,
> + // and vice versa.
> + DirPHI->removeIncomingValue(IBRPred);
> + Direct++;
> +
> + // Advance the pointer here, to avoid invalidation issues when the
> old
> + // PHI is erased.
> + Indirect++;
> +
> + PHINode *NewIndPHI = PHINode::Create(IndPHI->getType(), 1, "ind",
> IndPHI);
> + NewIndPHI->addIncoming(IndPHI->getIncomingValueForBlock(IBRPred),
> + IBRPred);
> +
> + // Create a PHI in the body block, to merge the direct and indirect
> + // predecessors.
> + PHINode *MergePHI =
> + PHINode::Create(IndPHI->getType(), 2, "merge", &*MergeInsert);
> + MergePHI->addIncoming(NewIndPHI, Target);
> + MergePHI->addIncoming(DirPHI, DirectSucc);
> +
> + IndPHI->replaceAllUsesWith(MergePHI);
> + IndPHI->eraseFromParent();
> + }
> +
> + Changed = true;
> + }
> +
> + return Changed;
> +}
> +
> /// Eliminate blocks that contain only PHI nodes, debug info directives,
> and an
> /// unconditional branch. Passes before isel (e.g. LSR/loopsimplify)
> often split
> /// edges in ways that are non-optimal for isel. Start by eliminating
> these
>
> Modified: llvm/trunk/test/CodeGen/ARM/indirectbr.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/
> CodeGen/ARM/indirectbr.ll?rev=296149&r1=296148&r2=296149&view=diff
> ============================================================
> ==================
> --- llvm/trunk/test/CodeGen/ARM/indirectbr.ll (original)
> +++ llvm/trunk/test/CodeGen/ARM/indirectbr.ll Fri Feb 24 12:41:32 2017
> @@ -47,6 +47,7 @@ L3:
> br label %L2
>
> L2: ; preds = %L3, %bb2
> +; THUMB-LABEL: %L1.clone
> ; THUMB: muls
> %res.2 = phi i32 [ %res.1, %L3 ], [ 1, %bb2 ] ; <i32> [#uses=1]
> %phitmp = mul i32 %res.2, 6 ; <i32> [#uses=1]
>
> Modified: llvm/trunk/test/CodeGen/MSP430/indirectbr2.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/
> CodeGen/MSP430/indirectbr2.ll?rev=296149&r1=296148&r2=296149&view=diff
> ============================================================
> ==================
> --- llvm/trunk/test/CodeGen/MSP430/indirectbr2.ll (original)
> +++ llvm/trunk/test/CodeGen/MSP430/indirectbr2.ll Fri Feb 24 12:41:32 2017
> @@ -5,7 +5,7 @@ define internal i16 @foo(i16 %i) nounwin
> entry:
> %tmp1 = getelementptr inbounds [5 x i8*], [5 x i8*]* @C.0.2070, i16 0,
> i16 %i ; <i8**> [#uses=1]
> %gotovar.4.0 = load i8*, i8** %tmp1, align 4 ; <i8*> [#uses=1]
> -; CHECK: br .LC.0.2070(r12)
> +; CHECK: br .LC.0.2070(r15)
> indirectbr i8* %gotovar.4.0, [label %L5, label %L4, label %L3, label
> %L2, label %L1]
>
> L5: ; preds = %bb2
>
> Modified: llvm/trunk/test/CodeGen/PowerPC/indirectbr.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/
> CodeGen/PowerPC/indirectbr.ll?rev=296149&r1=296148&r2=296149&view=diff
> ============================================================
> ==================
> --- llvm/trunk/test/CodeGen/PowerPC/indirectbr.ll (original)
> +++ llvm/trunk/test/CodeGen/PowerPC/indirectbr.ll Fri Feb 24 12:41:32 2017
> @@ -17,23 +17,35 @@ entry:
> bb2: ; preds = %entry, %bb3
> %gotovar.4.0 = phi i8* [ %gotovar.4.0.pre, %bb3 ], [ %0, %entry ] ;
> <i8*> [#uses=1]
> ; PIC: mtctr
> -; PIC-NEXT: li
> -; PIC-NEXT: li
> -; PIC-NEXT: li
> -; PIC-NEXT: li
> ; PIC-NEXT: bctr
> +; PIC: li
> +; PIC: b LBB
> +; PIC: li
> +; PIC: b LBB
> +; PIC: li
> +; PIC: b LBB
> +; PIC: li
> +; PIC: b LBB
> ; STATIC: mtctr
> -; STATIC-NEXT: li
> -; STATIC-NEXT: li
> -; STATIC-NEXT: li
> -; STATIC-NEXT: li
> ; STATIC-NEXT: bctr
> +; STATIC: li
> +; STATIC: b LBB
> +; STATIC: li
> +; STATIC: b LBB
> +; STATIC: li
> +; STATIC: b LBB
> +; STATIC: li
> +; STATIC: b LBB
> ; PPC64: mtctr
> -; PPC64-NEXT: li
> -; PPC64-NEXT: li
> -; PPC64-NEXT: li
> -; PPC64-NEXT: li
> ; PPC64-NEXT: bctr
> +; PPC64: li
> +; PPC64: b LBB
> +; PPC64: li
> +; PPC64: b LBB
> +; PPC64: li
> +; PPC64: b LBB
> +; PPC64: li
> +; PPC64: b LBB
> indirectbr i8* %gotovar.4.0, [label %L5, label %L4, label %L3, label
> %L2, label %L1]
>
> bb3: ; preds = %entry
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170226/e0fccc59/attachment.html>
More information about the llvm-commits
mailing list