[PATCH] [LSR] Generate and use zero extends
Andrew Trick
atrick at apple.com
Wed Apr 29 21:22:20 PDT 2015
> On Apr 27, 2015, at 2:05 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>
>> Does your test fail on trunk? This is what the test outputs for me:
>>
>> ok_161: ; preds = %ok_158
>>
>> %lsr.iv.next = add nuw nsw i64 %lsr.iv, 1
>> %4 = add i64 %0, %lsr.iv.next
>> %tmp1 = trunc i64 %4 to i32
>> %tmp188 = icmp slt i32 %tmp1, %tmp160
>> br i1 %tmp188, label %ok_146, label %block_81
>
> That's behavior I'm trying to avoid -- I don't want two additions on
> every iteration of the loop.
>
> To elaborate a little bit, here is the -debug-only=loop-reduce on the
> test case before the change:
You’re test makes sense. I was reading it wrong.
>
>> Would you be able to test both post-inc and pre-inc variants? We've a
>> lot of bugs because of TransformForPostIncUse.
>
> I think that is a good idea. I'm not familiar with LSR so I suspect
> it will take me some time to get there, but just that is actually a
> good reason for writing these tests. :)
>
> It also looks like (I'm not a 100% sure) that there's an inconsistency
> in how LSR treats overflow in post-inc expressions vs. how SCEV treats
> them. For instance, in ScalarEvolution.cpp line ~1500 we say that
> {S,+,X} is <nuw> if "S + MaxBECount * X" does not unsigned
> overflow. But that does not say anything about the overflow of the
> post-inc SCEV expression (e.g. if BECount == 0 then all add recs are
> <nuw>).
>
> The problem is that TransformForPostIncUse(zext({S,+,X})) =
> zext({S+X,+,X}) -- ScalarEvolutionNormalization.cpp, line ~100.
>
> Now if {S,+,1} is nuw then {zext S,+,1} = zext({S,+,1}), but
> {(zext S) + 1,+,1} is != zext({S+1,+,1}) by SCEV's definition of nuw.
> But TransformForPostIncUse({zext S,+,1}) =
> TransformForPostIncUse(zext({S,+,1})) = zext({S+1,+,1}) when it should
> really be {(zext S) + 1,+,1}. I don't have a concrete example where
> this is a problem, but this definitely looks fishy to me.
TransformForPostInc is the most confusing and bug prone parts of LSR. I haven't looked at an example, just stared at the code for a while. I would expect it to work like this:
Original SCEV for use:
zext{I,+,1}<nuw>
Reduced SCEV:
{zext(I),+,1}<nuw>
Normalized: (nuw should be dropped)
{zext(I) - 1,+,1}
I think this line should drop nuw:
+ // Conservatively use AnyWrap until/unless we need FlagNW.
+ const SCEV *Result = SE.getAddRecExpr(Operands, L, SCEV::FlagAnyWrap);
And if that line doesn't drop it, then this line should:
+ Result = SE.getMinusSCEV(Result, TransformedStep);
Denormalized:
{zext(I),+,1}
That's conservative, because we don't really need to drop NUW.
I'm not sure where this equality that you're showing came from:
TransformForPostIncUse({zext S,+,1}) =
TransformForPostIncUse(zext({S,+,1}))
Andy
> On Mon, Apr 27, 2015 at 11:14 AM, Andrew Trick <atrick at apple.com> wrote:
>> This looks reasonable. Thanks!
>>
>> Does your test fail on trunk? This is what the test outputs for me:
>>
>> ok_161: ; preds = %ok_158
>>
>> %lsr.iv.next = add nuw nsw i64 %lsr.iv, 1
>> %4 = add i64 %0, %lsr.iv.next
>> %tmp1 = trunc i64 %4 to i32
>> %tmp188 = icmp slt i32 %tmp1, %tmp160
>> br i1 %tmp188, label %ok_146, label %block_81
>>
>> Would you be able to test both post-inc and pre-inc variants? We've a lot of bugs because of TransformForPostIncUse.
>>
>>
>> ================
>> Comment at: lib/Transforms/Scalar/LoopStrengthReduce.cpp:3685
>> @@ -3669,1 +3684,3 @@
>>
>> +/// GenerateZExts - Generate reuse formulae from different IV types.
>> +void LSRInstance::GenerateZExts(LSRUse &LU, unsigned LUIdx, Formula Base) {
>> ----------------
>> Please comment this at least as well as your commit comment if not better.
>>
>> http://reviews.llvm.org/D9181
>>
>> EMAIL PREFERENCES
>> http://reviews.llvm.org/settings/panel/emailpreferences/
>>
>>
More information about the llvm-commits
mailing list