[LLVMdev] loop vectorizer

Tue Nov 5 19:12:14 PST 2013

On Oct 30, 2013, at 11:21 PM, Renato Golin <renato.golin at linaro.org> wrote:

> On 30 October 2013 18:40, Frank Winter <fwinter at jlab.org> wrote:
>       const std::uint64_t ir0 = (i+0)%4;  // not working
> 
> I thought this would be the case when I saw the original expression. Maybe we need to teach modular arithmetic to SCEV?

I let this thread get stale, so here’s the background again:

source:

      const std::uint64_t ir0 = i%4 + 8*(i/4);
      c[ ir0 ]         = a[ ir0 ]         + b[ ir0 ];

before instcombine:

  %4 = urem i64 %i.0, 4
  %5 = udiv i64 %i.0, 4
  %6 = mul i64 8, %5
  %7 = add i64 %4, %6
  %8 = getelementptr inbounds float* %a, i64 %7

after instcombine:

  %2 = and i64 %i.04, 3
  %3 = lshr i64 %i.04, 2
  %4 = shl i64 %3, 3
  %5 = or i64 %4, %2
  %11 = getelementptr inbounds float* %c, i64 %5
  store float %10, float* %11, align 4, !tbaa !0
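
As a sanity check that the two index computations above really are the same function of i, here is a minimal C++ sketch. The function names (index_source, index_instcombine, check_index_forms) are made up for illustration and do not come from the original test case or from LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Index as written in the source: i%4 + 8*(i/4).
inline std::uint64_t index_source(std::uint64_t i) {
    return i % 4 + 8 * (i / 4);
}

// Index as computed after instcombine: ((i >> 2) << 3) | (i & 3).
inline std::uint64_t index_instcombine(std::uint64_t i) {
    return ((i >> 2) << 3) | (i & 3);
}

// Hypothetical helper: assert that both forms agree for every i below limit.
inline void check_index_forms(std::uint64_t limit) {
    for (std::uint64_t i = 0; i < limit; ++i)
        assert(index_source(i) == index_instcombine(i));
}
```

Calling check_index_forms(1024) should pass without tripping the assertion. The shifted term always has its low three bits clear and the masked term only uses the low two bits, so the 'or' never needs a carry and behaves like an 'add'.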

Honestly, I don't understand why InstCombine "anti-canonicalizes" add->or. I think that transformation should be deferred until we begin target-specific lowering (e.g. InstOptimize pass).

Given that we aren't going to change that any time soon, SCEV could probably be taught to recognize the specific pattern:

Instructions (or (and %a, C1), (shl %b, C2)) -> SCEV (add %a, %b)
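
Treating the 'or' as an 'add' is only sound when the two operands cannot have a set bit in common. A rough C++ sketch of that precondition follows. It assumes the check a matcher would want is "C1 has no bits at or above position C2". The helper names are hypothetical and this is not SCEV code.

```cpp
#include <cassert>
#include <cstdint>

// True when the mask c1 only has bits below position c2, so that
// (a & c1) and (b << c2) can never share a set bit.
// Shift amounts of 64 or more are rejected (undefined in C++).
inline bool operands_disjoint(std::uint64_t c1, unsigned c2) {
    return c2 < 64 && (c1 >> c2) == 0;
}

// The instruction pattern as instcombine emits it.
inline std::uint64_t as_or(std::uint64_t a, std::uint64_t c1,
                           std::uint64_t b, unsigned c2) {
    return (a & c1) | (b << c2);
}

// The same operands combined with an add instead of an or.
inline std::uint64_t as_add(std::uint64_t a, std::uint64_t c1,
                            std::uint64_t b, unsigned c2) {
    return (a & c1) + (b << c2);
}
```

With C1 = 3 and C2 = 3, as in the IR above, operands_disjoint holds and as_or and as_add agree for any a and b. With C1 = 0xF and C2 = 3 the mask overlaps the shifted value, and a = 8, b = 1 gives 8 for the 'or' but 16 for the 'add'.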

-Andy