[LLVMdev] [LoopVectorizer] Missed vectorization opportunities caused by sext/zext operations

Tue May 5 10:53:49 PDT 2015

On 29 April 2015 at 16:21, Silviu Baranga <Silviu.Baranga at arm.com> wrote:
> 1)      It would be possible for SCEV to prove that it is safe to fold the
> sext/zext nodes into an AddRec expression, but this doesn’t happen because
> either nsw/nuw flags have been lost or the code

Hi Silviu,

It's possible that this is something I spotted two years ago while
working on the stride vectorizer draft. I believe the flag loss is
still in there.

> 2)      It is actually possible for some operations to overflow, so folding
> sext/zext nodes into AddRec expressions would be incorrect.

Exactly. We need to keep the flags as much as possible.
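
To make the overflow case concrete, here is a rough sketch in Python that
models 8-bit unsigned arithmetic by hand. The helper names (add_i8,
zext_i8, fold_is_safe) are made up for illustration and are not SCEV APIs.
It only checks, by brute force, whether zext(start + i*step) matches
zext(start) + i*zext(step) for every iteration, which is the property the
fold into an AddRec relies on.

```python
BITS = 8
MASK = (1 << BITS) - 1


def add_i8(x, y):
    """Wrapping 8-bit add, modelling an `add i8` that carries no nuw flag."""
    return (x + y) & MASK


def zext_i8(x):
    """Zero-extend an 8-bit value to a wider integer (hypothetical helper)."""
    return x & MASK


def fold_is_safe(start, step, trip_count):
    """Return True if the narrow recurrence, once extended, matches the wide
    recurrence on every iteration, i.e. the narrow add never wraps on a value
    that is actually used."""
    narrow = start & MASK
    for i in range(trip_count):
        wide = zext_i8(start) + i * zext_i8(step)
        if zext_i8(narrow) != wide:
            return False
        narrow = add_i8(narrow, step)
    return True


if __name__ == "__main__":
    print(fold_is_safe(0, 1, 256))  # all values 0..255 fit in 8 bits
    print(fold_is_safe(0, 1, 257))  # iteration 256 wraps to 0 in the narrow type
```

When the add is known not to wrap (which is what nuw promises), the two
sequences agree and the fold is fine. Once the flag is gone, the analysis
has to assume the second case is possible.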

> Is there anyone working on improving the 1) aspect of SCEV? If so, maybe
> some coordination of effort might be a good idea.

I haven't heard of anyone so far.

> -          Promote values that go into the trip count calculation and memory
> access indices to the smallest type which would remove sext/zext/trunc
> operations from the loop body. This should remove the sext/zext issue, as
> SCEV wouldn’t have to deal with these operations.

You mean demote the range specifier (a in your example above), to a
more constrained type than the induction variable, right?

If possible, this would help static analysis of loop bounds and not
require run time checks.

> -          Add nsw/nuw flags where necessary

Or at least make sure they're not removed. The cases I've seen had
them at O0 but not at O3, but I haven't gone through to see (or I
can't remember) which pass removed them.

> -          Add runtime checks (outside the loop) to detect overflows in the
> original loop

I think we do some of that. Maybe the two steps above will make the
little we have today work in most cases already.
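
As a rough illustration of what such a check outside the loop could look
like, here is a Python sketch. It is not what the vectorizer emits today;
the function names are hypothetical, the narrow induction variable is
modelled as 8-bit unsigned, and the step is assumed to be non-negative.
The guard runs once before the loop: if the last index fits in the narrow
type, the index is affine in the wide type and the "fast" path (standing
in for the vectorized loop) is taken, otherwise we fall back to the
original wrapping loop.

```python
def fits_unsigned(value, bits):
    """True if `value` is representable as an unsigned integer of `bits` bits."""
    return 0 <= value < (1 << bits)


def iv_cannot_wrap(start, step, trip_count, bits=8):
    """Check done once, outside the loop. Assumes step >= 0, so it is enough
    to look at the first and last values used as an index."""
    if trip_count == 0:
        return True
    last = start + (trip_count - 1) * step
    return fits_unsigned(start, bits) and fits_unsigned(last, bits)


def sum_indexed(data, start, step, trip_count, bits=8):
    """Sum data[iv] where iv is a narrow induction variable."""
    mask = (1 << bits) - 1
    if iv_cannot_wrap(start, step, trip_count, bits):
        # Fast path: no wrap, so the index is simply start + i*step.
        return sum(data[start + i * step] for i in range(trip_count))
    # Slow path: the original scalar loop with a wrapping narrow IV.
    total = 0
    iv = start & mask
    for _ in range(trip_count):
        total += data[iv]
        iv = (iv + step) & mask
    return total


if __name__ == "__main__":
    data = list(range(300))
    print(sum_indexed(data, 0, 1, 10))    # guard passes, fast path
    print(sum_indexed(data, 250, 1, 10))  # guard fails, wrapping loop
```

The point of the sketch is only that the cost of the check is paid once,
before the loop, and that the original semantics are preserved when it
fails.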

cheers,
--renato