[llvm] r179957 - SimplifyCFG: If convert single conditional stores

Shuxin Yang shuxin.llvm at gmail.com
Sat Apr 20 17:24:53 PDT 2013


I'm not going to discuss how to optimize/hack the famous loops in hmmer, 
I just focus on
this transformation.

IMHO, this xform is better motivated by the subsequent loads, which is 
the case in hmmer.

   ---------------
   if (cond)
      store val to x ; // x is just a memory unit, dose not necessary 
has name.

     ... load from x
   ---------------

=>
    if (cond)
      t = val
    store x = t

    ... t

   ------------------

   by stripping off the condition of the store stmt, we can git rid of 
the load expr, which
seems to be big save regardless the which combination of following 
conditions take place
    a)
       a1 the branch is biased (likely taken)
       a2) the branch is biased (likely fall-thru)
       a3) the branch is balanced

    b) the load-store unit of the underlying architecture has load-store 
forwarding capability.

   if b) dose not hold, it always win as we reduce a expensive load
   for combination a1 + b, the load request dose not go to cache, 
however, we still suffer
    from latency. (FIXME: load-store unit will blindly go through the 
pipeline).
   ....






On 4/20/13 2:42 PM, Arnold Schwaighofer wrote:
> Author: arnolds
> Date: Sat Apr 20 16:42:09 2013
> New Revision: 179957
>
> URL: http://llvm.org/viewvc/llvm-project?rev=179957&view=rev
> Log:
> SimplifyCFG: If convert single conditional stores
>
> This transformation will transform a conditional store with a preceeding
> uncondtional store to the same location:
>
>   a[i] =
>   may-alias with a[i] load
>   if (cond)
>     a[i] = Y
>
> into an unconditional store.
>
>   a[i] = X
>   may-alias with a[i] load
>   tmp = cond ? Y : X;
>   a[i] = tmp
>
> We assume that on average the cost of a mispredicted branch is going to be
> higher than the cost of a second store to the same location, and that the
> secondary benefits of creating a bigger basic block for other optimizations to
> work on outway the potential case were the branch would be correctly predicted
> and the cost of the executing the second store would be noticably reflected in
> performance.
>
> hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With
> this change we are on par with gcc's performance (gcc also performs this
> transformation). There was a 1.2 % performance improvement on a ARM swift chip.
> Other tests in the test-suite+external seem to be mostly uninfluenced in my
> experiments:
> This optimization was triggered on 41 tests such that the executable was
> different before/after the patch. Only 1 out of the 40 tests (dealII) was
> reproducable below 100% (by about .4%). Given that hmmer benefits so much I
> believe this to be a fair trade off.
>
> I am going to watch performance numbers across the builtbots and will revert
> this if anything unexpected comes up.
>
> Added:
>      llvm/trunk/test/Transforms/SimplifyCFG/speculate-store.ll
> Modified:
>      llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp
>
> Modified: llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp?rev=179957&r1=179956&r2=179957&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp (original)
> +++ llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp Sat Apr 20 16:42:09 2013
> @@ -59,6 +59,10 @@ static cl::opt<bool>
>   SinkCommon("simplifycfg-sink-common", cl::Hidden, cl::init(true),
>          cl::desc("Sink common instructions down to the end block"));
>   
> +static cl::opt<bool>
> +HoistCondStores("simplifycfg-hoist-cond-stores", cl::Hidden, cl::init(true),
> +       cl::desc("Hoist conditional stores if an unconditional store preceeds"));
> +
>   STATISTIC(NumBitMaps, "Number of switch instructions turned into bitmaps");
>   STATISTIC(NumLookupTables, "Number of switch instructions turned into lookup tables");
>   STATISTIC(NumSinkCommons, "Number of common instructions sunk down to the end block");
> @@ -1332,6 +1336,64 @@ static bool SinkThenElseCodeToEnd(Branch
>     return Changed;
>   }
>   
> +/// \brief Determine if we can hoist sink a sole store instruction out of a
> +/// conditional block.
> +///
> +/// We are looking for code like the following:
> +///   BrBB:
> +///     store i32 %add, i32* %arrayidx2
> +///     ... // No other stores or function calls (we could be calling a memory
> +///     ... // function).
> +///     %cmp = icmp ult %x, %y
> +///     br i1 %cmp, label %EndBB, label %ThenBB
> +///   ThenBB:
> +///     store i32 %add5, i32* %arrayidx2
> +///     br label EndBB
> +///   EndBB:
> +///     ...
> +///   We are going to transform this into:
> +///   BrBB:
> +///     store i32 %add, i32* %arrayidx2
> +///     ... //
> +///     %cmp = icmp ult %x, %y
> +///     %add.add5 = select i1 %cmp, i32 %add, %add5
> +///     store i32 %add.add5, i32* %arrayidx2
> +///     ...
> +///
> +/// \return The pointer to the value of the previous store if the store can be
> +///         hoisted into the predecessor block. 0 otherwise.
> +Value *isSafeToSpeculateStore(Instruction *I, BasicBlock *BrBB,
> +                              BasicBlock *StoreBB, BasicBlock *EndBB) {
> +  StoreInst *StoreToHoist = dyn_cast<StoreInst>(I);
> +  if (!StoreToHoist)
> +    return 0;
> +
> +  Value *StorePtr = StoreToHoist->getPointerOperand();
> +  StoreInst *PreviousStore = 0;
> +
> +  // Look for a store to the same pointer in BrBB.
> +  unsigned MaxNumInstToLookAt = 10;
> +  for (BasicBlock::reverse_iterator RI = BrBB->rbegin(),
> +       RE = BrBB->rend(); RI != RE && (--MaxNumInstToLookAt); ++RI) {
> +    Instruction *CurI = &*RI;
> +
> +    // Could be calling an instruction that effects memory like free().
> +    if (CurI->mayHaveSideEffects() && !isa<StoreInst>(CurI))
> +      return 0;
> +
> +    // Found the previous store make sure it stores to the same location.
> +    if ((PreviousStore = dyn_cast<StoreInst>(CurI))) {
> +      if (PreviousStore->getPointerOperand() == StorePtr)
> +        // Found the previous store, return its value operand.
> +        return PreviousStore->getValueOperand();
> +      else
> +        return 0; // Unknown store.
> +    }
> +  }
> +
> +  return 0;
> +}
> +
>   /// \brief Speculate a conditional basic block flattening the CFG.
>   ///
>   /// Note that this is a very risky transform currently. Speculating
> @@ -1395,6 +1457,8 @@ static bool SpeculativelyExecuteBB(Branc
>     SmallDenseMap<Instruction *, unsigned, 4> SinkCandidateUseCounts;
>   
>     unsigned SpeculationCost = 0;
> +  Value *SpeculatedStoreValue = 0;
> +  StoreInst *SpeculatedStore = 0;
>     for (BasicBlock::iterator BBI = ThenBB->begin(),
>                               BBE = llvm::prior(ThenBB->end());
>          BBI != BBE; ++BBI) {
> @@ -1410,13 +1474,21 @@ static bool SpeculativelyExecuteBB(Branc
>         return false;
>   
>       // Don't hoist the instruction if it's unsafe or expensive.
> -    if (!isSafeToSpeculativelyExecute(I))
> +    if (!isSafeToSpeculativelyExecute(I) &&
> +        !(HoistCondStores &&
> +          (SpeculatedStoreValue = isSafeToSpeculateStore(I, BB, ThenBB,
> +                                                         EndBB))))
>         return false;
> -    if (ComputeSpeculationCost(I) > PHINodeFoldingThreshold)
> +    if (!SpeculatedStoreValue &&
> +        ComputeSpeculationCost(I) > PHINodeFoldingThreshold)
>         return false;
>   
> +    // Store the store speculation candidate.
> +    if (SpeculatedStoreValue)
> +      SpeculatedStore = cast<StoreInst>(I);
> +
>       // Do not hoist the instruction if any of its operands are defined but not
> -    // used in this BB. The transformation will prevent the operand from
> +    // used in BB. The transformation will prevent the operand from
>       // being sunk into the use block.
>       for (User::op_iterator i = I->op_begin(), e = I->op_end();
>            i != e; ++i) {
> @@ -1473,12 +1545,24 @@ static bool SpeculativelyExecuteBB(Branc
>   
>     // If there are no PHIs to process, bail early. This helps ensure idempotence
>     // as well.
> -  if (!HaveRewritablePHIs)
> +  if (!HaveRewritablePHIs && !(HoistCondStores && SpeculatedStoreValue))
>       return false;
>   
>     // If we get here, we can hoist the instruction and if-convert.
>     DEBUG(dbgs() << "SPECULATIVELY EXECUTING BB" << *ThenBB << "\n";);
>   
> +  // Insert a select of the value of the speculated store.
> +  if (SpeculatedStoreValue) {
> +    IRBuilder<true, NoFolder> Builder(BI);
> +    Value *TrueV = SpeculatedStore->getValueOperand();
> +    Value *FalseV = SpeculatedStoreValue;
> +    if (Invert)
> +      std::swap(TrueV, FalseV);
> +    Value *S = Builder.CreateSelect(BrCond, TrueV, FalseV, TrueV->getName() +
> +                                    "." + FalseV->getName());
> +    SpeculatedStore->setOperand(0, S);
> +  }
> +
>     // Hoist the instructions.
>     BB->getInstList().splice(BI, ThenBB->getInstList(), ThenBB->begin(),
>                              llvm::prior(ThenBB->end()));
>
> Added: llvm/trunk/test/Transforms/SimplifyCFG/speculate-store.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SimplifyCFG/speculate-store.ll?rev=179957&view=auto
> ==============================================================================
> --- llvm/trunk/test/Transforms/SimplifyCFG/speculate-store.ll (added)
> +++ llvm/trunk/test/Transforms/SimplifyCFG/speculate-store.ll Sat Apr 20 16:42:09 2013
> @@ -0,0 +1,83 @@
> +; RUN: opt -simplifycfg -S < %s | FileCheck %s
> +
> +define void @ifconvertstore(i32 %m, i32* %A, i32* %B, i32 %C, i32 %D) {
> +entry:
> +  %arrayidx = getelementptr inbounds i32* %B, i64 0
> +  %0 = load i32* %arrayidx, align 4
> +  %add = add nsw i32 %0, %C
> +  %arrayidx2 = getelementptr inbounds i32* %A, i64 0
> +
> +; First store to the location.
> +  store i32 %add, i32* %arrayidx2, align 4
> +  %arrayidx4 = getelementptr inbounds i32* %B, i64 1
> +  %1 = load i32* %arrayidx4, align 4
> +  %add5 = add nsw i32 %1, %D
> +  %cmp6 = icmp sgt i32 %add5, %C
> +  br i1 %cmp6, label %if.then, label %ret.end
> +
> +; Make sure we speculate stores like the following one. It is cheap compared to
> +; a mispredicated branch.
> +; CHECK: @ifconvertstore
> +; CHECK: %add5.add = select i1 %cmp6, i32 %add5, i32 %add
> +; CHECK: store i32 %add5.add, i32* %arrayidx2, align 4
> +if.then:
> +  store i32 %add5, i32* %arrayidx2, align 4
> +  br label %ret.end
> +
> +ret.end:
> +  ret void
> +}
> +
> +define void @noifconvertstore1(i32 %m, i32* %A, i32* %B, i32 %C, i32 %D) {
> +entry:
> +  %arrayidx = getelementptr inbounds i32* %B, i64 0
> +  %0 = load i32* %arrayidx, align 4
> +  %add = add nsw i32 %0, %C
> +  %arrayidx2 = getelementptr inbounds i32* %A, i64 0
> +
> +; Store to a different location.
> +  store i32 %add, i32* %arrayidx, align 4
> +  %arrayidx4 = getelementptr inbounds i32* %B, i64 1
> +  %1 = load i32* %arrayidx4, align 4
> +  %add5 = add nsw i32 %1, %D
> +  %cmp6 = icmp sgt i32 %add5, %C
> +  br i1 %cmp6, label %if.then, label %ret.end
> +
> +; CHECK: @noifconvertstore1
> +; CHECK-NOT: select
> +if.then:
> +  store i32 %add5, i32* %arrayidx2, align 4
> +  br label %ret.end
> +
> +ret.end:
> +  ret void
> +}
> +
> +declare void @unknown_fun()
> +
> +define void @noifconvertstore2(i32 %m, i32* %A, i32* %B, i32 %C, i32 %D) {
> +entry:
> +  %arrayidx = getelementptr inbounds i32* %B, i64 0
> +  %0 = load i32* %arrayidx, align 4
> +  %add = add nsw i32 %0, %C
> +  %arrayidx2 = getelementptr inbounds i32* %A, i64 0
> +
> +; First store to the location.
> +  store i32 %add, i32* %arrayidx2, align 4
> +  call void @unknown_fun()
> +  %arrayidx4 = getelementptr inbounds i32* %B, i64 1
> +  %1 = load i32* %arrayidx4, align 4
> +  %add5 = add nsw i32 %1, %D
> +  %cmp6 = icmp sgt i32 %add5, %C
> +  br i1 %cmp6, label %if.then, label %ret.end
> +
> +; CHECK: @noifconvertstore2
> +; CHECK-NOT: select
> +if.then:
> +  store i32 %add5, i32* %arrayidx2, align 4
> +  br label %ret.end
> +
> +ret.end:
> +  ret void
> +}
> +
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits




More information about the llvm-commits mailing list