[PATCH] ARM NEON Lowering: Merge extractelt, bitcast, sitofp sequence
Anton Korobeynikov
anton at korobeynikov.info
Fri Feb 15 00:42:49 PST 2013
Arnold,
The patterns look OK to me, but the problem itself looks like a
regalloc deficiency. I'd expect these no-op copies to be coalesced
out.
Maybe Jakob knows more :)
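For illustration only (this is not LLVM code and not part of the patch):
on ARM, each of d0-d15 overlaps a pair of S registers, so d0 holds s0 in
its low half and s1 in its high half. That is why the lanes in the quoted
example below can be read in place once the loaded value lives in d0.
A minimal Python sketch of that aliasing, with all function names
hypothetical:

```python
import struct


def load_d_register(lane0, lane1):
    """Model vldr: one 64-bit D register holding two little-endian i32 lanes."""
    return struct.pack("<ii", lane0, lane1)


def s_subregister(d_bytes, index):
    """Model reading an S subregister of a D register.

    index 0 is the low 32 bits (e.g. s0 of d0), index 1 the high 32 bits
    (e.g. s1 of d0). No data is moved to a core register.
    """
    return d_bytes[4 * index : 4 * index + 4]


def vcvt_f64_s32(s_bytes):
    """Model vcvt.f64.s32: treat 4 bytes as a signed i32 and convert to double."""
    (value,) = struct.unpack("<i", s_bytes)
    return float(value)


def sitofp_v2i32_to_v2f64(lane0, lane1):
    """Convert both lanes by reading them directly out of the D register."""
    d0 = load_d_register(lane0, lane1)
    return [vcvt_f64_s32(s_subregister(d0, 0)),
            vcvt_f64_s32(s_subregister(d0, 1))]
```

The sketch only models the data layout; it says nothing about how the
register allocator or the coalescer decides where the value lives.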
On Fri, Feb 15, 2013 at 3:47 AM, Arnold Schwaighofer
<aschwaighofer at apple.com> wrote:
> A vectorized sitofp on doubles will get scalarized to a sequence of an
> extract_element of <2 x i32>, a bitcast to f32, and a sitofp.
> Due to the extract_element and the bitcast, we will unnecessarily generate
> moves between scalar and vector registers.
>
> The patch fixes this by using COPY_TO_REGCLASS and EXTRACT_SUBREG instead.
>
> Example:
>
> define void @vsitofp_double(<2 x i32>* %loadaddr,
> <2 x double>* %storeaddr) {
> %v0 = load <2 x i32>* %loadaddr
> %r = sitofp <2 x i32> %v0 to <2 x double>
> store <2 x double> %r, <2 x double>* %storeaddr
> ret void
> }
>
> We used to generate:
> vldr d16, [r0]
> vmov.32 r2, d16[1]
> vmov.32 r0, d16[0]
> vmov s0, r2
> vmov s2, r0
> vcvt.f64.s32 d17, s0
> vcvt.f64.s32 d16, s2
> vst1.32 {d16, d17}, [r1]
> Now we generate:
> vldr d0, [r0]
> vcvt.f64.s32 d17, s1
> vcvt.f64.s32 d16, s0
> vst1.32 {d16, d17}, [r1]
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University