[llvm-commits] PATCH: implement vector zext when vector types are legal (neon)
Nick Lewycky
nlewycky at google.com
Tue Aug 17 22:56:51 PDT 2010
On 17 August 2010 18:49, Bob Wilson <bob.wilson at apple.com> wrote:
> That's pretty clever, but I've got a simpler patch that produces better
> code. Committed as svn 111341.
>
Hah. Yeah, that is a bit simpler, isn't it? Thank you. If you don't mind, I'm
going to let you do the same thing (hopefully exactly the same) for sign
extend and any extend.
To hijack my own thread, I'm also trying to add a combine for (add (mul
(zext a), (zext b)), c) --> (vmlal a, b, c). There's an intrinsic for vmlal
and I tried producing that from the combine, but ended up triggering "Could
not select: intrinsic %llvm.arm.neon.vmlalu", even though the backend handles
that intrinsic just fine when it comes from IR.
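
For readers following along, here is a minimal scalar sketch of what the combine is meant to preserve, assuming the 4 x i16 to 4 x i32 case. The function name vmlal_u16_model is made up for illustration and is not an LLVM or NEON API. It only models the per-lane arithmetic of an unsigned widening multiply-accumulate, which is the same value as add(mul(zext a, zext b), c).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model (not an LLVM or NEON API) of a 4-lane unsigned
// widening multiply-accumulate: each 16-bit lane of a and b is zero-extended
// to 32 bits, the widened values are multiplied, and the product is added to
// the matching 32-bit lane of acc. Arithmetic wraps modulo 2^32.
std::array<uint32_t, 4> vmlal_u16_model(const std::array<uint32_t, 4> &acc,
                                        const std::array<uint16_t, 4> &a,
                                        const std::array<uint16_t, 4> &b) {
  std::array<uint32_t, 4> out{};
  for (std::size_t i = 0; i < 4; ++i) {
    uint32_t wa = a[i]; // zext i16 -> i32
    uint32_t wb = b[i]; // zext i16 -> i32
    out[i] = acc[i] + wa * wb; // add (mul (zext a), (zext b)), c
  }
  return out;
}
```

Because both inputs are zero-extended before the multiply, the product of two 16-bit lanes always fits in 32 bits, so only the final add can wrap.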
Would you mind taking a look at the attached patch and telling me if there's
a much simpler way to do this too? :) I don't want to manually list out
which types vmlal is valid for, but I didn't find an API which would either
lower a single intrinsic for me or return SDValue() when it can't.
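
One way to avoid a hand-written type list, sketched here under assumptions: derive validity from the shape of the operand type. VecTy and widenedAccumulatorType are hypothetical names for illustration, not LLVM's EVT or any existing API. The check assumes the NEON long multiply-accumulate forms take 64-bit vectors of 8-, 16-, or 32-bit elements and produce a 128-bit vector with elements twice as wide. Returning false plays the role of returning SDValue() to decline the combine.

```cpp
// Hypothetical stand-in for a vector type; this is not LLVM's EVT.
struct VecTy {
  unsigned elemBits; // width of each element in bits
  unsigned lanes;    // number of elements
};

// Sketch of a shape-based validity check for a widening multiply-accumulate.
// Assumption: operands must be 64-bit vectors of 8-, 16-, or 32-bit elements.
// On success, writes the widened accumulator type to `out` and returns true;
// otherwise returns false and leaves `out` untouched.
bool widenedAccumulatorType(VecTy src, VecTy &out) {
  bool elemOk = src.elemBits == 8 || src.elemBits == 16 || src.elemBits == 32;
  bool is64BitVector = src.elemBits * src.lanes == 64;
  if (!elemOk || !is64BitVector)
    return false;
  out = VecTy{src.elemBits * 2, src.lanes};
  return true;
}
```

With this shape rule, a 4 x i16 operand maps to a 4 x i32 accumulator, while a 4 x i32 operand (already 128 bits) is rejected.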
Thanks!
Nick
> _test1: @ @test1
> Leh_func_begin0:
> @ BB#0:
> vmov d0, r0, r1
> vmov.u16 r0, d0[0]
> mov r1, #255
> orr r1, r1, #255, 24 @ 65280
> vmov.u16 r2, d0[1]
> vmov.u16 r3, d0[2]
> vmov.u16 r12, d0[3]
> and r0, r0, r1
> and r2, r2, r1
> and r3, r3, r1
> and r1, r12, r1
> vmov s3, r1
> vmov s2, r3
> vmov s1, r2
> vmov s0, r0
> vmov r0, r1, d0
> vmov r2, r3, d1
> mov pc, lr
> Leh_func_end0:
>
> It's still pretty awful. The ANDs to mask off the high bits are
> unnecessary, and all those extra VMOVs should be avoided.
>
> We probably need to expand SIGN_EXTEND and ANY_EXTEND as well. I can look
> at that later, since I'm off to dinner now.
>
> On Aug 17, 2010, at 3:58 PM, Nick Lewycky wrote:
>
> > This patch implements custom lowering for ZEXT to N x i32 vector types.
> > Currently llvm just crashes.
> >
> > I'm not very qualified either in the backend or with ARM assembly. Please
> > review carefully!
> >
> > Since you're probably wondering, the code it produces is lengthy:
> >
> > test1: @ @test1
> > @ BB#0:
> > str r11, [sp, #-4]!
> > mov r11, sp
> > sub sp, sp, #28
> > bic sp, sp, #15
> > vmov.i32 d0, #0x0
> > vmov.u16 r2, d0[0]
> > vmov d0, r0, r1
> > strh r2, [sp, #14]
> > strh r2, [sp, #10]
> > strh r2, [sp, #6]
> > strh r2, [sp, #2]
> > vmov.u16 r0, d0[3]
> > vmov.u16 r2, d0[2]
> > vmov.u16 r1, d0[1]
> > strh r0, [sp, #12]
> > strh r2, [sp, #8]
> > vmov.u16 r2, d0[0]
> > strh r1, [sp, #4]
> > mov r1, sp
> > strh r2, [sp]
> > vldmia r1, {d0, d1}
> > vmov r0, r1, d0
> > vmov r2, r3, d1
> > mov sp, r11
> > ldr r11, [sp], #4
> > mov pc, lr
> >
> > Nick
> >
> > <neon-zext.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: combine-vmlal-broken.patch
Type: text/x-patch
Size: 1199 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20100817/b3807b76/attachment.bin>