<div dir="ltr"><div>Hi Philip, </div><div><br></div><div>The reason that I added this rule in NaryReassociate instead of InstCombine is that InstCombine happens too early for this rule to be useful for our cases. In our cases, this</div><div><br></div><div>a[i]</div><div>a[i + j]</div><div><br></div><div>pattern occurs after certain iterations of NaryReassociate, so I have to put it over there. </div><div><br></div><div>I like the idea of putting the cover function in ValueTracking, and I noticed there are some similar hosts for unsigned operations already such as computeOverflowForUnsignedAdd (<a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_docs_doxygen_html_ValueTracking-5F8h-5Fsource.html-23l00286&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=mQ4LZ2PUj9hpadE3cDHZnIdEwhEBrbAstXeMaFoB9tg&m=9mIqcl8503MaMPMT6efAnKaivVtxVeda3aj6UVbu6lk&s=1WxoynTdbLiIHJ3PirINCBpRyQsm5YPeGZZtC6d4zuw&e=">http://llvm.org/docs/doxygen/html/ValueTracking_8h_source.html#l00286</a>). I'd probably need to create a new interface computeOverflowForSignedAdd, and add the rule</div><div><br></div><div>i >= 0 && i + j >= 0 => i +nsw j</div><div><br></div><div>Then, InstCombine can strengthen nsw using this interface. I haven't seen a particular use case that benefits from doing this in InstCombine, but I believe it's good in general. </div><div><br></div><div>Sounds good to you? </div><div><br></div><div>Jingyue</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jul 8, 2015 at 9:56 AM, Philip Reames <span dir="ltr"><<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Sorry for the delayed comment, just got back to this in my email.<br>
<br>
Looking at the patch, I was wondering why the rule to add nsw wasn't part of instcombine?  Is there a reasoning why we can assume nsw for purposes of reassociation, but not add the generic attribute?  Doing this in instcombine would seem to be generally useful.<br>
<br>
Comment inline as well.<div><div class="h5"><br>
<br>
On 06/30/2015 08:38 PM, Jingyue Wu wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Author: jingyue<br>
Date: Tue Jun 30 22:38:49 2015<br>
New Revision: 241139<br>
<br>
URL: <a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_viewvc_llvm-2Dproject-3Frev-3D241139-26view-3Drev&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=mQ4LZ2PUj9hpadE3cDHZnIdEwhEBrbAstXeMaFoB9tg&m=9mIqcl8503MaMPMT6efAnKaivVtxVeda3aj6UVbu6lk&s=1ulaBbIKNCjekduT7GrqYBxSS35IObkJwWFEVTk5Kmg&e=" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=241139&view=rev</a><br>
Log:<br>
[NaryReassociate] enhances nsw by leveraging @llvm.assume<br>
<br>
Summary:<br>
nsw are flaky and can often be removed by optimizations. This patch enhances<br>
nsw by leveraging @llvm.assume in the IR. Specifically, NaryReassociate now<br>
understands that<br>
<br>
     assume(a + b >= 0) && assume(a >= 0) ==> a +nsw b<br>
<br>
As a result, it can split more sext(a + b) into sext(a) + sext(b) for CSE.<br>
<br>
Test Plan: nary-gep.ll<br>
<br>
Reviewers: broune, meheff<br>
<br>
Subscribers: jholewinski, llvm-commits<br>
<br>
Differential Revision: <a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__reviews.llvm.org_D10822&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=mQ4LZ2PUj9hpadE3cDHZnIdEwhEBrbAstXeMaFoB9tg&m=9mIqcl8503MaMPMT6efAnKaivVtxVeda3aj6UVbu6lk&s=pYWHBrYOb5ERnll9JvJOccXCxXvcabp_gzeAslYSjzw&e=" rel="noreferrer" target="_blank">http://reviews.llvm.org/D10822</a><br>
<br>
Modified:<br>
     llvm/trunk/lib/Transforms/Scalar/NaryReassociate.cpp<br>
     llvm/trunk/test/Transforms/NaryReassociate/NVPTX/nary-gep.ll<br>
<br>
Modified: llvm/trunk/lib/Transforms/Scalar/NaryReassociate.cpp<br>
URL: <a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_viewvc_llvm-2Dproject_llvm_trunk_lib_Transforms_Scalar_NaryReassociate.cpp-3Frev-3D241139-26r1-3D241138-26r2-3D241139-26view-3Ddiff&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=mQ4LZ2PUj9hpadE3cDHZnIdEwhEBrbAstXeMaFoB9tg&m=9mIqcl8503MaMPMT6efAnKaivVtxVeda3aj6UVbu6lk&s=jkeIE5vwuhHSY6LhT-QeuYvUKPc59BrtTkKPeJDcfVU&e=" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/NaryReassociate.cpp?rev=241139&r1=241138&r2=241139&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Transforms/Scalar/NaryReassociate.cpp (original)<br>
+++ llvm/trunk/lib/Transforms/Scalar/NaryReassociate.cpp Tue Jun 30 22:38:49 2015<br>
@@ -74,21 +74,18 @@<br>
  // 1) We only considers n-ary adds for now. This should be extended and<br>
  // generalized.<br>
  //<br>
-// 2) Besides arithmetic operations, similar reassociation can be applied to<br>
-// GEPs. For example, if<br>
-//   X = &arr[a]<br>
-// dominates<br>
-//   Y = &arr[a + b]<br>
-// we may rewrite Y into X + b.<br>
-//<br>
  //===----------------------------------------------------------------------===//<br>
  +#include "llvm/Analysis/AssumptionCache.h"<br>
  #include "llvm/Analysis/ScalarEvolution.h"<br>
  #include "llvm/Analysis/TargetLibraryInfo.h"<br>
  #include "llvm/Analysis/TargetTransformInfo.h"<br>
+#include "llvm/Analysis/ValueTracking.h"<br>
  #include "llvm/IR/Dominators.h"<br>
  #include "llvm/IR/Module.h"<br>
  #include "llvm/IR/PatternMatch.h"<br>
+#include "llvm/Support/Debug.h"<br>
+#include "llvm/Support/raw_ostream.h"<br>
  #include "llvm/Transforms/Scalar.h"<br>
  #include "llvm/Transforms/Utils/Local.h"<br>
  using namespace llvm;<br>
@@ -115,6 +112,7 @@ public:<br>
      AU.addPreserved<DominatorTreeWrapperPass>();<br>
      AU.addPreserved<ScalarEvolution>();<br>
      AU.addPreserved<TargetLibraryInfoWrapperPass>();<br>
+    AU.addRequired<AssumptionCacheTracker>();<br>
      AU.addRequired<DominatorTreeWrapperPass>();<br>
      AU.addRequired<ScalarEvolution>();<br>
      AU.addRequired<TargetLibraryInfoWrapperPass>();<br>
@@ -163,12 +161,18 @@ private:<br>
    // GEP's pointer size, i.e., whether Index needs to be sign-extended in order<br>
    // to be an index of GEP.<br>
    bool requiresSignExtension(Value *Index, GetElementPtrInst *GEP);<br>
+  // Returns whether V is known to be non-negative at context \c Ctxt.<br>
+  bool isKnownNonNegative(Value *V, Instruction *Ctxt);<br>
+  // Returns whether AO may sign overflow at context \c Ctxt. It computes a<br>
+  // conservative result -- it answers true when not sure.<br>
+  bool maySignOverflow(AddOperator *AO, Instruction *Ctxt);<br>
  +  AssumptionCache *AC;<br>
+  const DataLayout *DL;<br>
    DominatorTree *DT;<br>
    ScalarEvolution *SE;<br>
    TargetLibraryInfo *TLI;<br>
    TargetTransformInfo *TTI;<br>
-  const DataLayout *DL;<br>
    // A lookup table quickly telling which instructions compute the given SCEV.<br>
    // Note that there can be multiple instructions at different locations<br>
    // computing to the same SCEV, so we map a SCEV to an instruction list.  For<br>
@@ -185,6 +189,7 @@ private:<br>
  char NaryReassociate::ID = 0;<br>
  INITIALIZE_PASS_BEGIN(NaryReassociate, "nary-reassociate", "Nary reassociation",<br>
                        false, false)<br>
+INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)<br>
  INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)<br>
  INITIALIZE_PASS_DEPENDENCY(ScalarEvolution)<br>
  INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)<br>
@@ -200,6 +205,7 @@ bool NaryReassociate::runOnFunction(Func<br>
    if (skipOptnoneFunction(F))<br>
      return false;<br>
  +  AC = &getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F);<br>
    DT = &getAnalysis<DominatorTreeWrapperPass>().getDomTree();<br>
    SE = &getAnalysis<ScalarEvolution>();<br>
    TLI = &getAnalysis<TargetLibraryInfoWrapperPass>().getTLI();<br>
@@ -346,18 +352,44 @@ bool NaryReassociate::requiresSignExtens<br>
    return cast<IntegerType>(Index->getType())->getBitWidth() < PointerSizeInBits;<br>
  }<br>
  +bool NaryReassociate::isKnownNonNegative(Value *V, Instruction *Ctxt) {<br>
+  bool NonNegative, Negative;<br>
+  // TODO: ComputeSignBits is expensive. Consider caching the results.<br>
+  ComputeSignBit(V, NonNegative, Negative, *DL, 0, AC, Ctxt, DT);<br>
+  return NonNegative;<br>
+}<br>
</blockquote></div></div>
This really seems like a cover function which should live in ValueTracking.<div class="HOEnZb"><div class="h5"><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+<br>
+bool NaryReassociate::maySignOverflow(AddOperator *AO, Instruction *Ctxt) {<br>
+  if (AO->hasNoSignedWrap())<br>
+    return false;<br>
+<br>
+  Value *LHS = AO->getOperand(0), *RHS = AO->getOperand(1);<br>
+  // If LHS or RHS has the same sign as the sum, AO doesn't sign overflow.<br>
+  // TODO: handle the negative case as well.<br>
+  if (isKnownNonNegative(AO, Ctxt) &&<br>
+      (isKnownNonNegative(LHS, Ctxt) || isKnownNonNegative(RHS, Ctxt)))<br>
+    return false;<br>
+<br>
+  return true;<br>
+}<br>
+<br>
  GetElementPtrInst *<br>
  NaryReassociate::tryReassociateGEPAtIndex(GetElementPtrInst *GEP, unsigned I,<br>
                                            Type *IndexedType) {<br>
    Value *IndexToSplit = GEP->getOperand(I + 1);<br>
-  if (SExtInst *SExt = dyn_cast<SExtInst>(IndexToSplit))<br>
+  if (SExtInst *SExt = dyn_cast<SExtInst>(IndexToSplit)) {<br>
      IndexToSplit = SExt->getOperand(0);<br>
+  } else if (ZExtInst *ZExt = dyn_cast<ZExtInst>(IndexToSplit)) {<br>
+    // zext can be treated as sext if the source is non-negative.<br>
+    if (isKnownNonNegative(ZExt->getOperand(0), GEP))<br>
+      IndexToSplit = ZExt->getOperand(0);<br>
+  }<br>
      if (AddOperator *AO = dyn_cast<AddOperator>(IndexToSplit)) {<br>
      // If the I-th index needs sext and the underlying add is not equipped with<br>
      // nsw, we cannot split the add because<br>
      //   sext(LHS + RHS) != sext(LHS) + sext(RHS).<br>
-    if (requiresSignExtension(IndexToSplit, GEP) && !AO->hasNoSignedWrap())<br>
+    if (requiresSignExtension(IndexToSplit, GEP) && maySignOverflow(AO, GEP))<br>
        return nullptr;<br>
      Value *LHS = AO->getOperand(0), *RHS = AO->getOperand(1);<br>
      // IndexToSplit = LHS + RHS.<br>
@@ -373,10 +405,9 @@ NaryReassociate::tryReassociateGEPAtInde<br>
    return nullptr;<br>
  }<br>
  -GetElementPtrInst *<br>
-NaryReassociate::tryReassociateGEPAtIndex(GetElementPtrInst *GEP, unsigned I,<br>
-                                          Value *LHS, Value *RHS,<br>
-                                          Type *IndexedType) {<br>
+GetElementPtrInst *NaryReassociate::tryReassociateGEPAtIndex(<br>
+    GetElementPtrInst *GEP, unsigned I, Value *LHS, Value *RHS,<br>
+    Type *IndexedType) {<br>
    // Look for GEP's closest dominator that has the same SCEV as GEP except that<br>
    // the I-th index is replaced with LHS.<br>
    SmallVector<const SCEV *, 4> IndexExprs;<br>
@@ -384,6 +415,16 @@ NaryReassociate::tryReassociateGEPAtInde<br>
      IndexExprs.push_back(SE->getSCEV(*Index));<br>
    // Replace the I-th index with LHS.<br>
    IndexExprs[I] = SE->getSCEV(LHS);<br>
+  if (isKnownNonNegative(LHS, GEP) &&<br>
+      DL->getTypeSizeInBits(LHS->getType()) <<br>
+          DL->getTypeSizeInBits(GEP->getOperand(I)->getType())) {<br>
+    // Zero-extend LHS if it is non-negative. InstCombine canonicalizes sext to<br>
+    // zext if the source operand is proved non-negative. We should do that<br>
+    // consistently so that CandidateExpr more likely appears before. See<br>
+    // @reassociate_gep_assume for an example of this canonicalization.<br>
+    IndexExprs[I] =<br>
+        SE->getZeroExtendExpr(IndexExprs[I], GEP->getOperand(I)->getType());<br>
+  }<br>
    const SCEV *CandidateExpr = SE->getGEPExpr(<br>
        GEP->getSourceElementType(), SE->getSCEV(GEP->getPointerOperand()),<br>
        IndexExprs, GEP->isInBounds());<br>
<br>
Modified: llvm/trunk/test/Transforms/NaryReassociate/NVPTX/nary-gep.ll<br>
URL: <a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_viewvc_llvm-2Dproject_llvm_trunk_test_Transforms_NaryReassociate_NVPTX_nary-2Dgep.ll-3Frev-3D241139-26r1-3D241138-26r2-3D241139-26view-3Ddiff&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=mQ4LZ2PUj9hpadE3cDHZnIdEwhEBrbAstXeMaFoB9tg&m=9mIqcl8503MaMPMT6efAnKaivVtxVeda3aj6UVbu6lk&s=47u2MhOnfj0IBhOug5GWaZtOYpV0ilaRV4Xqsq2xmrg&e=" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/NaryReassociate/NVPTX/nary-gep.ll?rev=241139&r1=241138&r2=241139&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/test/Transforms/NaryReassociate/NVPTX/nary-gep.ll (original)<br>
+++ llvm/trunk/test/Transforms/NaryReassociate/NVPTX/nary-gep.ll Tue Jun 30 22:38:49 2015<br>
@@ -61,6 +61,40 @@ define void @reassociate_gep_nsw(float*<br>
    ret void<br>
  }<br>
  +; assume(j >= 0);<br>
+; foo(&a[zext(j)]);<br>
+; assume(i + j >= 0);<br>
+; foo(&a[zext(i + j)]);<br>
+;   =><br>
+; t1 = &a[zext(j)];<br>
+; foo(t1);<br>
+; t2 = t1 + sext(i);<br>
+; foo(t2);<br>
+define void @reassociate_gep_assume(float* %a, i32 %i, i32 %j) {<br>
+; CHECK-LABEL: @reassociate_gep_assume(<br>
+  ; assume(j >= 0)<br>
+  %cmp = icmp sgt i32 %j, -1<br>
+  call void @llvm.assume(i1 %cmp)<br>
+  %1 = add i32 %i, %j<br>
+  %cmp2 = icmp sgt i32 %1, -1<br>
+  call void @llvm.assume(i1 %cmp2)<br>
+<br>
+  %idxprom.j = zext i32 %j to i64<br>
+  %2 = getelementptr float, float* %a, i64 %idxprom.j<br>
+; CHECK: [[t1:[^ ]+]] = getelementptr float, float* %a, i64 %idxprom.j<br>
+  call void @foo(float* %2)<br>
+; CHECK: call void @foo(float* [[t1]])<br>
+<br>
+  %idxprom.1 = zext i32 %1 to i64<br>
+  %3 = getelementptr float, float* %a, i64 %idxprom.1<br>
+; CHECK: [[sexti:[^ ]+]] = sext i32 %i to i64<br>
+; CHECK: [[t2:[^ ]+]] = getelementptr float, float* [[t1]], i64 [[sexti]]<br>
+  call void @foo(float* %3)<br>
+; CHECK: call void @foo(float* [[t2]])<br>
+<br>
+  ret void<br>
+}<br>
+<br>
  ; Do not split the second GEP because sext(i + j) != sext(i) + sext(j).<br>
  define void @reassociate_gep_no_nsw(float* %a, i32 %i, i32 %j) {<br>
  ; CHECK-LABEL: @reassociate_gep_no_nsw(<br>
@@ -88,3 +122,5 @@ define void @reassociate_gep_128(float*<br>
  ; CHECK: call void @foo(float* [[t2]])<br>
    ret void<br>
  }<br>
+<br>
+declare void @llvm.assume(i1)<br>
<br>
<br>
_______________________________________________<br>
llvm-commits mailing list<br>
<a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>
<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" rel="noreferrer" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>
</blockquote>
<br>
</div></div></blockquote></div><br></div>