[llvm] r241139 - [NaryReassociate] enhances nsw by leveraging @llvm.assume
Jingyue Wu
jingyue at google.com
Wed Jul 8 10:29:58 PDT 2015
Hi Philip,
The reason that I added this rule in NaryReassociate instead of InstCombine
is that InstCombine happens too early for this rule to be useful for our
cases. In our cases, this
a[i]
a[i + j]
pattern occurs after certain iterations of NaryReassociate, so I have to
put it over there.
I like the idea of putting the cover function in ValueTracking, and I
noticed there are some similar helpers for unsigned operations already, such
as computeOverflowForUnsignedAdd (
http://llvm.org/docs/doxygen/html/ValueTracking_8h_source.html#l00286). I'd
probably need to create a new interface computeOverflowForSignedAdd, and
add the rule
i >= 0 && i + j >= 0 => i +nsw j
Then, InstCombine can strengthen nsw using this interface. I haven't seen a
particular use case that benefits from doing this in InstCombine, but I
believe it's good in general.
Does that sound good to you?
Jingyue
On Wed, Jul 8, 2015 at 9:56 AM, Philip Reames <listmail at philipreames.com>
wrote:
> Sorry for the delayed comment, just got back to this in my email.
>
> Looking at the patch, I was wondering why the rule to add nsw wasn't part
> of instcombine? Is there a reason why we can assume nsw for purposes of
> reassociation, but not add the generic attribute? Doing this in
> instcombine would seem to be generally useful.
>
> Comment inline as well.
>
>
> On 06/30/2015 08:38 PM, Jingyue Wu wrote:
>
>> Author: jingyue
>> Date: Tue Jun 30 22:38:49 2015
>> New Revision: 241139
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=241139&view=rev
>> Log:
>> [NaryReassociate] enhances nsw by leveraging @llvm.assume
>>
>> Summary:
>> nsw are flaky and can often be removed by optimizations. This patch
>> enhances
>> nsw by leveraging @llvm.assume in the IR. Specifically, NaryReassociate
>> now
>> understands that
>>
>> assume(a + b >= 0) && assume(a >= 0) ==> a +nsw b
>>
>> As a result, it can split more sext(a + b) into sext(a) + sext(b) for CSE.
>>
>> Test Plan: nary-gep.ll
>>
>> Reviewers: broune, meheff
>>
>> Subscribers: jholewinski, llvm-commits
>>
>> Differential Revision: http://reviews.llvm.org/D10822
>>
>> Modified:
>> llvm/trunk/lib/Transforms/Scalar/NaryReassociate.cpp
>> llvm/trunk/test/Transforms/NaryReassociate/NVPTX/nary-gep.ll
>>
>> Modified: llvm/trunk/lib/Transforms/Scalar/NaryReassociate.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/NaryReassociate.cpp?rev=241139&r1=241138&r2=241139&view=diff
>>
>> ==============================================================================
>> --- llvm/trunk/lib/Transforms/Scalar/NaryReassociate.cpp (original)
>> +++ llvm/trunk/lib/Transforms/Scalar/NaryReassociate.cpp Tue Jun 30
>> 22:38:49 2015
>> @@ -74,21 +74,18 @@
>> // 1) We only considers n-ary adds for now. This should be extended and
>> // generalized.
>> //
>> -// 2) Besides arithmetic operations, similar reassociation can be
>> applied to
>> -// GEPs. For example, if
>> -// X = &arr[a]
>> -// dominates
>> -// Y = &arr[a + b]
>> -// we may rewrite Y into X + b.
>> -//
>>
>> //===----------------------------------------------------------------------===//
>> +#include "llvm/Analysis/AssumptionCache.h"
>> #include "llvm/Analysis/ScalarEvolution.h"
>> #include "llvm/Analysis/TargetLibraryInfo.h"
>> #include "llvm/Analysis/TargetTransformInfo.h"
>> +#include "llvm/Analysis/ValueTracking.h"
>> #include "llvm/IR/Dominators.h"
>> #include "llvm/IR/Module.h"
>> #include "llvm/IR/PatternMatch.h"
>> +#include "llvm/Support/Debug.h"
>> +#include "llvm/Support/raw_ostream.h"
>> #include "llvm/Transforms/Scalar.h"
>> #include "llvm/Transforms/Utils/Local.h"
>> using namespace llvm;
>> @@ -115,6 +112,7 @@ public:
>> AU.addPreserved<DominatorTreeWrapperPass>();
>> AU.addPreserved<ScalarEvolution>();
>> AU.addPreserved<TargetLibraryInfoWrapperPass>();
>> + AU.addRequired<AssumptionCacheTracker>();
>> AU.addRequired<DominatorTreeWrapperPass>();
>> AU.addRequired<ScalarEvolution>();
>> AU.addRequired<TargetLibraryInfoWrapperPass>();
>> @@ -163,12 +161,18 @@ private:
>> // GEP's pointer size, i.e., whether Index needs to be sign-extended
>> in order
>> // to be an index of GEP.
>> bool requiresSignExtension(Value *Index, GetElementPtrInst *GEP);
>> + // Returns whether V is known to be non-negative at context \c Ctxt.
>> + bool isKnownNonNegative(Value *V, Instruction *Ctxt);
>> + // Returns whether AO may sign overflow at context \c Ctxt. It
>> computes a
>> + // conservative result -- it answers true when not sure.
>> + bool maySignOverflow(AddOperator *AO, Instruction *Ctxt);
>> + AssumptionCache *AC;
>> + const DataLayout *DL;
>> DominatorTree *DT;
>> ScalarEvolution *SE;
>> TargetLibraryInfo *TLI;
>> TargetTransformInfo *TTI;
>> - const DataLayout *DL;
>> // A lookup table quickly telling which instructions compute the
>> given SCEV.
>> // Note that there can be multiple instructions at different locations
>> // computing to the same SCEV, so we map a SCEV to an instruction
>> list. For
>> @@ -185,6 +189,7 @@ private:
>> char NaryReassociate::ID = 0;
>> INITIALIZE_PASS_BEGIN(NaryReassociate, "nary-reassociate", "Nary
>> reassociation",
>> false, false)
>> +INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)
>> INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
>> INITIALIZE_PASS_DEPENDENCY(ScalarEvolution)
>> INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
>> @@ -200,6 +205,7 @@ bool NaryReassociate::runOnFunction(Func
>> if (skipOptnoneFunction(F))
>> return false;
>> + AC = &getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F);
>> DT = &getAnalysis<DominatorTreeWrapperPass>().getDomTree();
>> SE = &getAnalysis<ScalarEvolution>();
>> TLI = &getAnalysis<TargetLibraryInfoWrapperPass>().getTLI();
>> @@ -346,18 +352,44 @@ bool NaryReassociate::requiresSignExtens
>> return cast<IntegerType>(Index->getType())->getBitWidth() <
>> PointerSizeInBits;
>> }
>> +bool NaryReassociate::isKnownNonNegative(Value *V, Instruction *Ctxt) {
>> + bool NonNegative, Negative;
>> + // TODO: ComputeSignBits is expensive. Consider caching the results.
>> + ComputeSignBit(V, NonNegative, Negative, *DL, 0, AC, Ctxt, DT);
>> + return NonNegative;
>> +}
>>
> This really seems like a cover function which should live in ValueTracking.
>
> +
>> +bool NaryReassociate::maySignOverflow(AddOperator *AO, Instruction
>> *Ctxt) {
>> + if (AO->hasNoSignedWrap())
>> + return false;
>> +
>> + Value *LHS = AO->getOperand(0), *RHS = AO->getOperand(1);
>> + // If LHS or RHS has the same sign as the sum, AO doesn't sign
>> overflow.
>> + // TODO: handle the negative case as well.
>> + if (isKnownNonNegative(AO, Ctxt) &&
>> + (isKnownNonNegative(LHS, Ctxt) || isKnownNonNegative(RHS, Ctxt)))
>> + return false;
>> +
>> + return true;
>> +}
>> +
>> GetElementPtrInst *
>> NaryReassociate::tryReassociateGEPAtIndex(GetElementPtrInst *GEP,
>> unsigned I,
>> Type *IndexedType) {
>> Value *IndexToSplit = GEP->getOperand(I + 1);
>> - if (SExtInst *SExt = dyn_cast<SExtInst>(IndexToSplit))
>> + if (SExtInst *SExt = dyn_cast<SExtInst>(IndexToSplit)) {
>> IndexToSplit = SExt->getOperand(0);
>> + } else if (ZExtInst *ZExt = dyn_cast<ZExtInst>(IndexToSplit)) {
>> + // zext can be treated as sext if the source is non-negative.
>> + if (isKnownNonNegative(ZExt->getOperand(0), GEP))
>> + IndexToSplit = ZExt->getOperand(0);
>> + }
>> if (AddOperator *AO = dyn_cast<AddOperator>(IndexToSplit)) {
>> // If the I-th index needs sext and the underlying add is not
>> equipped with
>> // nsw, we cannot split the add because
>> // sext(LHS + RHS) != sext(LHS) + sext(RHS).
>> - if (requiresSignExtension(IndexToSplit, GEP) &&
>> !AO->hasNoSignedWrap())
>> + if (requiresSignExtension(IndexToSplit, GEP) && maySignOverflow(AO,
>> GEP))
>> return nullptr;
>> Value *LHS = AO->getOperand(0), *RHS = AO->getOperand(1);
>> // IndexToSplit = LHS + RHS.
>> @@ -373,10 +405,9 @@ NaryReassociate::tryReassociateGEPAtInde
>> return nullptr;
>> }
>> -GetElementPtrInst *
>> -NaryReassociate::tryReassociateGEPAtIndex(GetElementPtrInst *GEP,
>> unsigned I,
>> - Value *LHS, Value *RHS,
>> - Type *IndexedType) {
>> +GetElementPtrInst *NaryReassociate::tryReassociateGEPAtIndex(
>> + GetElementPtrInst *GEP, unsigned I, Value *LHS, Value *RHS,
>> + Type *IndexedType) {
>> // Look for GEP's closest dominator that has the same SCEV as GEP
>> except that
>> // the I-th index is replaced with LHS.
>> SmallVector<const SCEV *, 4> IndexExprs;
>> @@ -384,6 +415,16 @@ NaryReassociate::tryReassociateGEPAtInde
>> IndexExprs.push_back(SE->getSCEV(*Index));
>> // Replace the I-th index with LHS.
>> IndexExprs[I] = SE->getSCEV(LHS);
>> + if (isKnownNonNegative(LHS, GEP) &&
>> + DL->getTypeSizeInBits(LHS->getType()) <
>> + DL->getTypeSizeInBits(GEP->getOperand(I)->getType())) {
>> + // Zero-extend LHS if it is non-negative. InstCombine canonicalizes
>> sext to
>> + // zext if the source operand is proved non-negative. We should do
>> that
>> + // consistently so that CandidateExpr more likely appears before. See
>> + // @reassociate_gep_assume for an example of this canonicalization.
>> + IndexExprs[I] =
>> + SE->getZeroExtendExpr(IndexExprs[I],
>> GEP->getOperand(I)->getType());
>> + }
>> const SCEV *CandidateExpr = SE->getGEPExpr(
>> GEP->getSourceElementType(),
>> SE->getSCEV(GEP->getPointerOperand()),
>> IndexExprs, GEP->isInBounds());
>>
>> Modified: llvm/trunk/test/Transforms/NaryReassociate/NVPTX/nary-gep.ll
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/NaryReassociate/NVPTX/nary-gep.ll?rev=241139&r1=241138&r2=241139&view=diff
>>
>> ==============================================================================
>> --- llvm/trunk/test/Transforms/NaryReassociate/NVPTX/nary-gep.ll
>> (original)
>> +++ llvm/trunk/test/Transforms/NaryReassociate/NVPTX/nary-gep.ll Tue Jun
>> 30 22:38:49 2015
>> @@ -61,6 +61,40 @@ define void @reassociate_gep_nsw(float*
>> ret void
>> }
>> +; assume(j >= 0);
>> +; foo(&a[zext(j)]);
>> +; assume(i + j >= 0);
>> +; foo(&a[zext(i + j)]);
>> +; =>
>> +; t1 = &a[zext(j)];
>> +; foo(t1);
>> +; t2 = t1 + sext(i);
>> +; foo(t2);
>> +define void @reassociate_gep_assume(float* %a, i32 %i, i32 %j) {
>> +; CHECK-LABEL: @reassociate_gep_assume(
>> + ; assume(j >= 0)
>> + %cmp = icmp sgt i32 %j, -1
>> + call void @llvm.assume(i1 %cmp)
>> + %1 = add i32 %i, %j
>> + %cmp2 = icmp sgt i32 %1, -1
>> + call void @llvm.assume(i1 %cmp2)
>> +
>> + %idxprom.j = zext i32 %j to i64
>> + %2 = getelementptr float, float* %a, i64 %idxprom.j
>> +; CHECK: [[t1:[^ ]+]] = getelementptr float, float* %a, i64 %idxprom.j
>> + call void @foo(float* %2)
>> +; CHECK: call void @foo(float* [[t1]])
>> +
>> + %idxprom.1 = zext i32 %1 to i64
>> + %3 = getelementptr float, float* %a, i64 %idxprom.1
>> +; CHECK: [[sexti:[^ ]+]] = sext i32 %i to i64
>> +; CHECK: [[t2:[^ ]+]] = getelementptr float, float* [[t1]], i64 [[sexti]]
>> + call void @foo(float* %3)
>> +; CHECK: call void @foo(float* [[t2]])
>> +
>> + ret void
>> +}
>> +
>> ; Do not split the second GEP because sext(i + j) != sext(i) + sext(j).
>> define void @reassociate_gep_no_nsw(float* %a, i32 %i, i32 %j) {
>> ; CHECK-LABEL: @reassociate_gep_no_nsw(
>> @@ -88,3 +122,5 @@ define void @reassociate_gep_128(float*
>> ; CHECK: call void @foo(float* [[t2]])
>> ret void
>> }
>> +
>> +declare void @llvm.assume(i1)
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150708/93580df6/attachment.html>
More information about the llvm-commits
mailing list