[llvm-dev] [IndVarSimplify] Narrow IV's are not eliminated resulting in inefficient code
Oleg Ranevskyy via llvm-dev
llvm-dev at lists.llvm.org
Thu Apr 28 13:51:01 PDT 2016
Hi Sanjoy,
Attached is the patch that fixes the LLVM test regressions caused by this
fix. It just adds <nsw> entries the fix has introduced.
Would you have a couple of minutes to check it, please?
I would also like to share two differences in the opt logs I noticed for
these tests.
*1.* The fixed LLVM might split sext of a sum into a sum of two sext's
while doing SCEV analysis. This does not affect the final IR however - it
is the same for all the patched tests with and w/o your fix. E.g., this can
be seen on Analysis/ScalarEvolution/flags-from-poison.ll. LLVM w/o the fix
prints this:
--------
Printing analysis 'Scalar Evolution Analysis' for function 'test-sub-nsw':
Classifying expressions for: @test-sub-nsw
...
%index64 = sext i32 %index32 to i64
--> {(sext i32 ((-1 * %halfsub)<nsw> + %start) to i64),+,1}<nsw><%loop>
U: [-2147483648,6442450943) S: [-2147483648,6442450943) Exits: ((zext i32
(-1 + (-1 * %start) + %numIterations) to i64) + (sext i32 ((-1 *
%halfsub)<nsw> + %start) to i64))
--------
The fixed one prints a sum of sext's:
--------
%index64 = sext i32 %index32 to i64
--> {((sext i32 (-1 * %halfsub)<nsw> to i64) + (sext i32 %start to
i64))<nsw>,+,1}<nsw><%loop> U: [-3221225471,7516192767) S:
[-3221225471,7516192767) Exits: ((zext i32 (-1 + (-1 * %start) +
%numIterations) to i64) + (sext i32 (-1 * %halfsub)<nsw> to i64) + (sext
i32 %start to i64))
--------
*2.* The fixed LLVM seems to be more aggressive in extracting constants
from sext. E.g., the same test flags-from-poison.ll, the test-sub-with-add
function. W/o the fix:
--------
%ptr = getelementptr inbounds float, float* %input, i32 %index32
--> {((4 * (sext i32 (1 + (-1 * %offset)) to i64)) +
%input),+,4}<nsw><%loop> U: full-set S: full-set Exits: ((4 * (zext i32 (-1
+ %numIterations) to i64)) + (4 * (sext i32 (1 + (-1 * %offset)) to i64)) +
%input)
--------
With the fix:
--------
%ptr = getelementptr inbounds float, float* %input, i32 %index32
--> {(4 + (4 * (sext i32 (-1 * %offset) to i64)) +
%input),+,4}<nsw><%loop> U: full-set S: full-set Exits: (4 + (4 * (zext i32
(-1 + %numIterations) to i64)) + (4 * (sext i32 (-1 * %offset) to i64)) +
%input)
--------
Hope this helps.
On Sat, Apr 23, 2016 at 4:48 PM, Oleg Ranevskyy <llvm.mail.list at gmail.com>
wrote:
> Hi Sanjoy,
>
> Thank you for looking into this!
> Yes, your patch does fix my larger test case too. My algorithm gets double
> performance improvement with the patch, as the loop now has a smaller
> instruction set and succeeds to unroll w/o any extra #pragma's.
>
> I also ran the LLVM tests against the patch. There are 6 new failures:
> Analysis/LoopAccessAnalysis/number-of-memchecks.ll
> Analysis/LoopAccessAnalysis/reverse-memcheck-bounds.ll
> Analysis/ScalarEvolution/flags-from-poison.ll
> Analysis/ScalarEvolution/nsw-offset-assume.ll
> Analysis/ScalarEvolution/nsw-offset.ll
> Analysis/ScalarEvolution/nsw.ll
>
> I haven't inspected these failures in detail yet, but it's likely the
> tests merely need to be adjusted to handle the new no-wrap flags the patch
> introduced. I will double-check this soon.
>
> Kind regards,
> Oleg
>
> On Fri, Apr 22, 2016 at 11:02 AM, Sanjoy Das <
> sanjoy at playingwithpointers.com> wrote:
>
>> Hi Oleg,
>>
>> I think the problem here is that SCEV forgets to propagate no-wrap
>> flags when folding "{S,+,X}+T ==> {S+T,+,X}".
>>
>> I haven't carefully thought about the implications and whether the
>> change is even correct, but the appended patch fixes the test case
>> you've attached. I'll give it some more thought and if it holds up
>> I'll check it in in the next few days. Meanwhile if you have a larger
>> test case that you extracted indvar_test.cpp from, I'd be interested
>> in hearing if this change works there as well.
>>
>> diff --git a/lib/Analysis/ScalarEvolution.cpp
>> b/lib/Analysis/ScalarEvolution.cpp
>> index 39ced1e..2e87902 100644
>> --- a/lib/Analysis/ScalarEvolution.cpp
>> +++ b/lib/Analysis/ScalarEvolution.cpp
>> @@ -2274,19 +2274,19 @@ const SCEV
>> *ScalarEvolution::getAddExpr(SmallVectorImpl<const SCEV *> &Ops,
>> }
>>
>> // If we found some loop invariants, fold them into the recurrence.
>> if (!LIOps.empty()) {
>> // NLI + LI + {Start,+,Step} --> NLI + {LI+Start,+,Step}
>> LIOps.push_back(AddRec->getStart());
>>
>> SmallVector<const SCEV *, 4> AddRecOps(AddRec->op_begin(),
>> AddRec->op_end());
>> - AddRecOps[0] = getAddExpr(LIOps);
>> + AddRecOps[0] = getAddExpr(LIOps, Flags);
>>
>> // Build the new addrec. Propagate the NUW and NSW flags if both
>> the
>> // outer add and the inner addrec are guaranteed to have no
>> overflow.
>> // Always propagate NW.
>> Flags = AddRec->getNoWrapFlags(setFlags(Flags, SCEV::FlagNW));
>> const SCEV *NewRec = getAddRecExpr(AddRecOps, AddRecLoop, Flags);
>>
>> // If all of the other operands were loop invariant, we are done.
>> if (Ops.size() == 1) return NewRec;
>>
>> Thanks!
>> -- Sanjoy
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160428/f41485df/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: scev-tests.patch
Type: application/octet-stream
Size: 8858 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160428/f41485df/attachment.obj>
More information about the llvm-dev
mailing list