[llvm-dev] "trunc"s generated by LSR cause problem for SCEV

Ehsan Amiri via llvm-dev llvm-dev at lists.llvm.org
Fri Jul 22 13:35:33 PDT 2016


Adding a couple of points just to make sure I have been clear:

1- Without any trunc the code after LSR will directly compare %2 and
%indvars.iv.next.7 in the loop control logic.
2- The argument for why the truncs are not needed is that if we compare %2
and %indvars.iv.next.7 directly, the loop exits while the upper 32 bits of
%indvars.iv.next.7 are still all zero. So the behavior is the same as the
current behavior.
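
To make point 2 concrete, here is a small Python sketch (not LLVM code) that
models the exit test with and without the truncs. It assumes the unrolled
loop is only entered when %unroll_iter is nonzero, which the IR excerpt below
does not show. The helper names are made up for illustration.

```python
MASK32 = (1 << 32) - 1  # models i32 wraparound / trunc i64 -> i32


def unroll_iter(m):
    """Model `%xtraiter = and i32 %m, 7` and `%unroll_iter = sub i32 %m, %xtraiter`."""
    m &= MASK32
    return (m - (m & 7)) & MASK32  # low three bits cleared, so divisible by 8


def exit_point(m, use_trunc):
    """Return (trip count, final %indvars.iv.next.7) for the modeled loop.

    use_trunc=True models the compare LSR emits (both sides truncated to i32);
    use_trunc=False models comparing %2 and %indvars.iv.next.7 as i64 values.
    """
    limit = unroll_iter(m)  # %2 = zext i32 %unroll_iter to i64
    if limit == 0:
        # Assumption: the loop is guarded so that this case never enters it.
        raise ValueError("model assumes %unroll_iter != 0")
    iv = 0      # %indvars.iv starts at 0
    trips = 0
    while True:
        iv += 8  # %indvars.iv.next.7 = add nsw i64 %indvars.iv, 8
        trips += 1
        lhs = limit & MASK32 if use_trunc else limit
        rhs = iv & MASK32 if use_trunc else iv
        if lhs == rhs:  # %niter.ncmp.7
            return trips, iv


print(exit_point(20, use_trunc=True))   # (2, 16)
print(exit_point(20, use_trunc=False))  # (2, 16)
```

In this model both forms exit on the same iteration, at which point
%indvars.iv.next.7 equals %2 and therefore still fits in 32 bits.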

I am going to look into LSR a little bit to see if I can teach it not to
generate those truncs. If those truncs are needed for some reason, please
let me know.


On Fri, Jul 22, 2016 at 11:57 AM, Ehsan Amiri <ehsanamiri at gmail.com> wrote:

> Hi
>
> I am working on a bug that is caused by Scalar Evolution not being able to
> compute the iteration count of an unrolled loop (PR 28363). While I believe
> there is enough information for SCEV to do its job, I think the code that
> is generated by earlier transformations can be simpler. There is one bug in
> IndVarSimplify for which Sanjoy Das suggested a fix. With that fix applied,
> if I disable loop strength reduction (LSR), the problem goes away. Below I
> have copied the code before and after loop strength reduction.
>
> For this code pattern, it is possible to prove that truncs generated by
> LSR can be avoided (see bottom of the email). Andy Trick says that LSR
> generally thinks that trunc is free, but there might be ways to work around
> it or improve LSR target hooks.
>
> 1- Does anyone have any suggestions on how to fix this in LSR?
> 2- Is there any reason that we should not fix LSR, and instead focus on
> Scalar Evolution so it can handle more complicated code patterns properly?
>
>
> *Before LSR:*
>
> *for.body.preheader*:
> %xtraiter = and i32 %m, 7
>
> *for.body.preheader.new:*
> %unroll_iter = sub i32 %m, %xtraiter
>
> *for.body:*
> %niter = phi i32 [ %unroll_iter, %for.body.preheader.new ], [ %niter.nsub.7, %for.body ]
> %indvars.iv = phi i64 [ 0, %for.body.preheader.new ], [ %indvars.iv.next.7, %for.body ]
> %indvars.iv.next.7 = add nsw i64 %indvars.iv, 8
> %niter.nsub.7 = add nsw i32 %niter, -8
> %niter.ncmp.7 = icmp eq i32 %niter.nsub.7, 0
>
>
> *After LSR:*
>
> *for.body.preheader:*
> %xtraiter = and i32 %m, 7
>
> *for.body.preheader.new:*
> %unroll_iter = sub i32 %m, %xtraiter
> %2 = zext i32 %unroll_iter to i64
>
> *for.body:*
> %indvars.iv = phi i64 [ 0, %for.body.preheader.new ], [ %indvars.iv.next.7, %for.body ]
> %indvars.iv.next.7 = add nsw i64 %indvars.iv, 8
> %tmp = trunc i64 %indvars.iv.next.7 to i32
> %tmp80 = trunc i64 %2 to i32
> %niter.ncmp.7 = icmp eq i32 %tmp80, %tmp
>
> *Why trunc is not needed:* %indvars.iv starts from 0 and increments by 8,
> and %2 is divisible by 8. If %indvars.iv.next.7 ever reaches a value that
> has a non-zero bit in its upper 32 bits, it will repeat that pattern until
> it overflows. But the definition of %indvars.iv.next.7 is marked nsw.
>