[PATCH] [IndVarSimplify] Widen signed loop compare instructions to enable additional optimizations.

Tue Sep 16 05:11:38 PDT 2014

>>! In D5333#16, @atrick wrote:
> I think the reason removing the trunc works well in practice is that LSR can easily tell that an immediate offset can be folded into the compare. I'm sure it would be possible to make LSR handle the trunc, but fixing Indvars to generate a cleaner loop test is a fine approach.

Yes, that's what I'm seeing as well.

LSR Use: Kind=Basic, Offsets={0}, widest fixup type: i64
    reg({0,+,1}<nuw><nsw><%for.body>)
    reg({-1,+,1}<nw><%for.body>) + imm(1)

The latter case is only considered once the trunc is removed.

The specific instcombine I thought might be affected was committed by David here:
http://llvm.org/viewvc/llvm-project?view=revision&revision=179316

In my original code, I was hoping to fold the extra subtract into the compare:

ldr     w11, [x10, x0, lsl #2]
cbz     w11, .LBB0_5
add     x0, x0, #1
sub     w11, w0, #1  <--- fold into compare
cmp     w11, w9
b.lt    .LBB0_2

However, r179316 couldn't handle this case because of the trunc.  After a few pointers from David M. I realized that modifying that code to handle the trunc wasn't the right approach and that's how we arrived at this patch.  Based on that limited experience, I could imagine other compare instcombines not working due to intervening truncs.  However, I haven't actually seen this in practice.
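
To make the trunc issue concrete, here is a minimal C sketch of the loop shape I have in mind. It is not the actual test case from the patch: the function names and the loop body are hypothetical, chosen only to resemble the assembly above. The first function uses a 32-bit signed induction variable, so on a 64-bit target the index is extended for addressing while the exit test stays 32-bit. The second is a hand-widened equivalent in which the compare uses the wide induction variable directly, so no trunc is needed in front of it.

```c
/* Hypothetical source loop: a 32-bit signed induction variable is used
   both as the array index and in the loop exit test. */
int first_zero_narrow(const int *a, int n) {
    for (int i = 0; i < n; ++i) {
        if (a[i] == 0)
            return i;
    }
    return -1;
}

/* Hand-widened equivalent (an illustration of the intent, not compiler
   output): the bound is sign-extended once outside the loop and the
   exit test compares 64-bit values. */
int first_zero_wide(const int *a, int n) {
    long long wide_n = n; /* sign-extend the loop bound once */
    for (long long i = 0; i < wide_n; ++i) {
        if (a[i] == 0)
            return (int)i; /* i < n here, so the value fits in int */
    }
    return -1;
}
```

The two functions return the same result for any valid input because i never exceeds n, so the signed 32-bit increment cannot overflow and the widened compare sees the same values.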

Thanks for the review, Andy.  I'm going to do a little more analysis before committing.  My performance runs were on devices that were rather unstable at the time, so I'd like to rerun everything so I can sleep easier at night. :)

 Chad

http://reviews.llvm.org/D5333