[LLVMdev] [LoopVectorizer] Missed vectorization opportunities caused by sext/zext operations

Silviu Baranga Silviu.Baranga at arm.com
Wed Apr 29 08:21:46 PDT 2015


Hi,

This is somewhat similar to the previous thread regarding missed vectorization
opportunities (http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-April/084765.html),
but maybe different enough to require a new thread.

I'm seeing some missed vectorization opportunities in the loop vectorizer because SCEV
is not able to fold sext/zext expressions into recurrence expressions (AddRecExpr).

This can manifest in multiple ways:

-          We cannot get the back-edge taken count, because we may have something like
           (sext {1,+,1}), which SCEV cannot evaluate since the recurrence may overflow.

-          We cannot get SCEV AddRec expressions for pointers which need runtime checks, and
           the loop vectorizer fails with a "Can't vectorize due to memory conflicts" error.



I think there are two cases:

1)      It would be possible for SCEV to prove that it is safe to fold the sext/zext nodes into
        an AddRec expression, but this doesn't happen because either the nsw/nuw flags have
        been lost or the code to infer the nsw/nuw flags in that particular case is missing.

2)      It is actually possible for some operations to overflow, so folding the sext/zext nodes
        into AddRec expressions would be incorrect.
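
To illustrate case 2), here is a minimal C sketch. The function names are hypothetical, and uint16_t/uint32_t simply stand in for the narrow and wide types. It shows why the extension cannot be moved inside the recurrence when the narrow add can wrap: extending the 16-bit increment and incrementing the extended value agree everywhere except at the wrap-around point.

```c
#include <stdint.h>

/* zext(w + 1): the add is performed in 16 bits and may wrap before extending. */
static uint32_t zext_of_narrow_add(uint16_t w) {
  uint16_t next = (uint16_t)(w + 1u); /* truncation models the 16-bit add */
  return (uint32_t)next;
}

/* zext(w) + 1: the value is extended first, so the add cannot wrap here. */
static uint32_t wide_add_of_zext(uint16_t w) {
  return (uint32_t)w + 1u;
}
```

The two functions return the same value for every w below 65535, which is the situation in case 1); they differ only for w == 65535, which is the situation in case 2).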



Here is an example where we fail to get the back-edge taken count because of sext/zext operations:


void test0(unsigned short a, unsigned short *in, unsigned short *out) {
  // w never overflows: the comparison is done in int, so w < a - 1 <= 65534
  for (unsigned short w = 1; w < a - 1; w++)
    out[w] = in[w + 7] * 2;
}
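
As a sanity check on the claim that w cannot overflow, and assuming int is wider than unsigned short (as on the usual targets with 32-bit int), the comparison w < a - 1 is carried out in int, so w is at most 65533 inside the loop and w++ never wraps. The sketch below compares the trip count obtained by running the loop with the closed form one would like SCEV to derive. Both helper names are hypothetical.

```c
/* Counts the iterations of the test0 loop by actually running its header. */
static unsigned count_iterations(unsigned short a) {
  unsigned n = 0;
  for (unsigned short w = 1; w < a - 1; w++)
    n++;
  return n;
}

/* Closed form: w runs over 1 .. a - 2, so there are a - 2 iterations
   when a - 1 > 1 and none otherwise. Assumes int is wider than short. */
static unsigned closed_form_trip_count(unsigned short a) {
  int bound = (int)a - 1; /* same value the loop compares against */
  return bound > 1 ? (unsigned)(bound - 1) : 0u;
}
```

The two functions should agree for every value of a, including a == 0, where a - 1 evaluates to -1 in int and the loop body never runs.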



Is there anyone working on improving aspect 1) of SCEV? If so, some coordination of effort might
be a good idea.



Since the issue seems to be that certain operations can overflow and SCEV cannot properly reason
about overflows and extend operations, would it make more sense to try to:

-          Promote the values that feed the trip count calculation and the memory access indices
           to the smallest type that removes the sext/zext/trunc operations from the loop body.
           This should remove the sext/zext issue, as SCEV wouldn't have to deal with these
           operations.

-          Add nsw/nuw flags where necessary.

-          Add runtime checks (outside the loop) to detect overflows in the original loop.

Would there be any fundamental issue with this approach? I think it would be preferable to point
fixes for case 1), so if anyone is working on something similar it would be good to know.
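
To make the proposal concrete, here is a hand-written C sketch of what the transformed code could look like at the source level. It is only an illustration under stated assumptions, not the output of any existing pass: the function name scale_versioned is hypothetical, the loop is a variant of test0 with a wider bound n so that the runtime check is not trivially true, and unsigned is assumed to be wider than unsigned short.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical variant of test0: the original loop uses a 16-bit induction
   variable w against a wider bound n, so w++ wraps when n > USHRT_MAX. */
void scale_versioned(unsigned n, const unsigned short *in, unsigned short *out) {
  if (n <= USHRT_MAX) {
    /* Runtime check passed: w cannot wrap, so it is promoted to a wide
       type and no extend of w is needed inside the loop body. */
    for (size_t w = 1; w < n; w++)
      out[w] = (unsigned short)(in[w + 7] * 2);
  } else {
    /* Check failed: keep the original loop, with its wrapping behaviour. */
    for (unsigned short w = 1; w < n; w++)
      out[w] = (unsigned short)(in[w + 7] * 2);
  }
}
```

Only the first loop has an induction variable that is a plain add recurrence in the wide type; the second one is kept unchanged so that the behaviour of the original loop is preserved when the check fails.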

Thanks,
Silviu
