[LLVMdev] loop vectorizer
Andrew Trick
atrick at apple.com
Tue Nov 5 19:12:14 PST 2013
On Oct 30, 2013, at 11:21 PM, Renato Golin <renato.golin at linaro.org> wrote:
> On 30 October 2013 18:40, Frank Winter <fwinter at jlab.org> wrote:
> const std::uint64_t ir0 = (i+0)%4; // not working
>
> I thought this would be the case when I saw the original expression. Maybe we need to teach modulo arithmetic to SCEV?
I let this thread get stale, so here’s the background again:
source:
const std::uint64_t ir0 = i%4 + 8*(i/4);
c[ ir0 ] = a[ ir0 ] + b[ ir0 ];
before instcombine:
%4 = urem i64 %i.0, 4
%5 = udiv i64 %i.0, 4
%6 = mul i64 8, %5
%7 = add i64 %4, %6
%8 = getelementptr inbounds float* %a, i64 %7
after instcombine:
%2 = and i64 %i.04, 3
%3 = lshr i64 %i.04, 2
%4 = shl i64 %3, 3
%5 = or i64 %4, %2
%11 = getelementptr inbounds float* %c, i64 %5
store float %10, float* %11, align 4, !tbaa !0
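A quick way to convince yourself that the two forms compute the same index is to compare them directly in C++. This is only a sketch: the helper names index_arith, index_bitwise and indices_agree are made up for illustration and are not part of the original test case.

```cpp
#include <cstdint>

// Index as written in the source: i%4 + 8*(i/4).
static std::uint64_t index_arith(std::uint64_t i) {
  return i % 4 + 8 * (i / 4);
}

// Index as it looks after instcombine: ((i >> 2) << 3) | (i & 3).
static std::uint64_t index_bitwise(std::uint64_t i) {
  return ((i >> 2) << 3) | (i & 3);
}

// Returns true if both forms agree for every i in [0, limit).
static bool indices_agree(std::uint64_t limit) {
  for (std::uint64_t i = 0; i < limit; ++i)
    if (index_arith(i) != index_bitwise(i))
      return false;
  return true;
}
```

The 'or' is safe here because the shifted term always has its low three bits clear, while the masked term only ever sets the low two bits, so the operands never overlap.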
Honestly, I don't understand why InstCombine "anti-canonicalizes" add->or. I think that transformation should be deferred until we begin target-specific lowering (e.g. InstOptimize pass).
Given that we aren't going to change that any time soon, SCEV could probably be taught to recognize the specific pattern:
Instructions (or (and %a, C1), (shl %b, C2)) -> SCEV (add %a, %b)
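The condition under which such an 'or' can be read as an 'add' can be sketched in C++ as follows. This assumes the pattern is meant as "the masked and shifted operands share no set bits"; the helpers or_is_add, as_or and as_add are hypothetical names for illustration, not SCEV or LLVM APIs.

```cpp
#include <cstdint>

// Hypothetical check: (a & c1) and (b << c2) can never share a set bit
// when the mask c1 fits entirely below bit position c2.
static bool or_is_add(std::uint64_t c1, unsigned c2) {
  return c2 < 64 && (c1 >> c2) == 0;
}

// The expression as instcombine leaves it. Callers must pass c2 < 64.
static std::uint64_t as_or(std::uint64_t a, std::uint64_t b,
                           std::uint64_t c1, unsigned c2) {
  return (a & c1) | (b << c2);
}

// The same operands combined with an add, which is the form an
// add-based analysis would want to see. Callers must pass c2 < 64.
static std::uint64_t as_add(std::uint64_t a, std::uint64_t b,
                            std::uint64_t c1, unsigned c2) {
  return (a & c1) + (b << c2);
}
```

With C1 = 3 and C2 = 3, as in the IR above, or_is_add holds and the two forms agree for all inputs. With an overlapping mask such as C1 = 12 and C2 = 3 they can differ, so the rewrite would not be valid.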
-Andy