[llvm-commits] PATCH: implement vector zext when vector types are legal (neon)

Nick Lewycky nlewycky at google.com
Tue Aug 17 22:56:51 PDT 2010


On 17 August 2010 18:49, Bob Wilson <bob.wilson at apple.com> wrote:

> That's pretty clever, but I've got a simpler patch that produces better
> code.  Committed as svn 111341.
>

Hah. Yeah, that is a bit simpler, isn't it? Thank you. If you don't mind, I'm
going to let you do -- hopefully the exact same thing -- for sign extend and
any extend.

To hijack my own thread, I'm also trying to add a combine for (add (mul
(zext a), (zext b)), c) --> (vmlal a, b, c). There's an intrinsic for vmlal
and I tried producing that, but ended up triggering "Could not select:
intrinsic %llvm.arm.neon.vmlalu" even though llvm handles that intrinsic
just fine when it comes from IR.
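
For reference, here is a minimal sketch of why the two forms in that combine
should be interchangeable, lane by lane, for the 4 x i16 -> 4 x i32 case. It
is a plain C++ model and not code from the attached patch; the names U16x4,
U32x4, zext, add_mul_zext and vmlal_u16_model are made up for illustration,
and the model assumes vmlal.u16 does an unsigned widening multiply of each
lane followed by an accumulate into the 32-bit lane.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for <4 x i16> and <4 x i32>; not LLVM or NEON types.
using U16x4 = std::array<std::uint16_t, 4>;
using U32x4 = std::array<std::uint32_t, 4>;

// Zero-extend every 16-bit lane to 32 bits.
U32x4 zext(const U16x4 &v) {
  U32x4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = v[i];
  return r;
}

// The pattern before the combine: (add (mul (zext a), (zext b)), c).
U32x4 add_mul_zext(const U16x4 &a, const U16x4 &b, const U32x4 &c) {
  const U32x4 za = zext(a);
  const U32x4 zb = zext(b);
  U32x4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = za[i] * zb[i] + c[i];  // 32-bit arithmetic, wraps modulo 2^32
  return r;
}

// Assumed lane-wise model of vmlal.u16: acc += widen(a) * widen(b).
U32x4 vmlal_u16_model(const U32x4 &acc, const U16x4 &a, const U16x4 &b) {
  U32x4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = acc[i] + static_cast<std::uint32_t>(a[i]) *
                        static_cast<std::uint32_t>(b[i]);
  return r;
}
```

Under that assumption, add_mul_zext(a, b, c) and vmlal_u16_model(c, a, b)
give the same result for any inputs, since the product of two 16-bit values
always fits in 32 bits and both forms wrap the final add the same way.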

Would you mind taking a look at the attached patch and telling me if there's
a much simpler way to do this too? :) I don't want to manually list out
which types vmlal is valid for, but I didn't find an API which would lower a
single intrinsic for me or else return SDValue(), for example.

Thanks!

Nick

> _test1:                                 @ @test1
> Leh_func_begin0:
> @ BB#0:
>        vmov    d0, r0, r1
>        vmov.u16        r0, d0[0]
>        mov     r1, #255
>        orr     r1, r1, #255, 24        @ 65280
>        vmov.u16        r2, d0[1]
>        vmov.u16        r3, d0[2]
>        vmov.u16        r12, d0[3]
>        and     r0, r0, r1
>        and     r2, r2, r1
>        and     r3, r3, r1
>        and     r1, r12, r1
>        vmov    s3, r1
>        vmov    s2, r3
>        vmov    s1, r2
>        vmov    s0, r0
>        vmov    r0, r1, d0
>        vmov    r2, r3, d1
>        mov     pc, lr
> Leh_func_end0:
>
> It's still pretty awful.  The ANDs to mask off the high bits are
> unnecessary, and all those extra VMOVs should be avoided.
>
> We probably need to expand SIGN_EXTEND and ANY_EXTEND as well.  I can look
> at that later, since I'm off to dinner now.
>
> On Aug 17, 2010, at 3:58 PM, Nick Lewycky wrote:
>
> > This patch implements custom lowering for ZEXT to N x i32 vector types.
> > Currently llvm just crashes.
> >
> > I'm not very qualified either in the backend or with ARM assembly. Please
> > review carefully!
> >
> > Since you're probably wondering, the code it produces is lengthy:
> >
> > test1:                                  @ @test1
> > @ BB#0:
> >         str     r11, [sp, #-4]!
> >         mov     r11, sp
> >         sub     sp, sp, #28
> >         bic     sp, sp, #15
> >         vmov.i32        d0, #0x0
> >         vmov.u16        r2, d0[0]
> >         vmov    d0, r0, r1
> >         strh    r2, [sp, #14]
> >         strh    r2, [sp, #10]
> >         strh    r2, [sp, #6]
> >         strh    r2, [sp, #2]
> >         vmov.u16        r0, d0[3]
> >         vmov.u16        r2, d0[2]
> >         vmov.u16        r1, d0[1]
> >         strh    r0, [sp, #12]
> >         strh    r2, [sp, #8]
> >         vmov.u16        r2, d0[0]
> >         strh    r1, [sp, #4]
> >         mov     r1, sp
> >         strh    r2, [sp]
> >         vldmia  r1, {d0, d1}
> >         vmov    r0, r1, d0
> >         vmov    r2, r3, d1
> >         mov     sp, r11
> >         ldr     r11, [sp], #4
> >         mov     pc, lr
> >
> > Nick
> >
> > <neon-zext.patch>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: combine-vmlal-broken.patch
Type: text/x-patch
Size: 1199 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20100817/b3807b76/attachment.bin>

