[PATCH] ARM NEON Lowering: Merge extractelt, bitcast, sitofp sequence

Anton Korobeynikov anton at korobeynikov.info
Fri Feb 15 00:42:49 PST 2013


Arnold,

The patterns look OK to me, but the problem itself looks like a
regalloc deficiency: I'd expect these no-op copies to be coalesced
away.

Maybe Jakob knows more :)

On Fri, Feb 15, 2013 at 3:47 AM, Arnold Schwaighofer
<aschwaighofer at apple.com> wrote:
> A vectorized sitofp on doubles will get scalarized to a sequence of an
> extract_element of <2 x i32>, a bitcast to f32 and a sitofp.
> Due to the extract_element and the bitcast, we will unnecessarily generate
> moves between scalar and vector registers.
>
> The patch fixes this by using COPY_TO_REGCLASS and EXTRACT_SUBREG instead.
>
> Example:
>
> define void @vsitofp_double(<2 x i32>* %loadaddr,
>                             <2 x double>* %storeaddr) {
>   %v0 = load <2 x i32>* %loadaddr
>   %r = sitofp <2 x i32> %v0 to <2 x double>
>   store <2 x double> %r, <2 x double>* %storeaddr
>   ret void
> }
>
> We used to generate:
>         vldr    d16, [r0]
>         vmov.32 r2, d16[1]
>         vmov.32 r0, d16[0]
>         vmov    s0, r2
>         vmov    s2, r0
>         vcvt.f64.s32    d17, s0
>         vcvt.f64.s32    d16, s2
>         vst1.32 {d16, d17}, [r1]
> Now we generate:
>         vldr    d0, [r0]
>         vcvt.f64.s32    d17, s1
>         vcvt.f64.s32    d16, s0
>         vst1.32 {d16, d17}, [r1]
>
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
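For readers following the example above, here is a rough Python model of why
the round trip through core registers is redundant. It relies on the ARM
VFP/NEON register layout, in which d0 overlaps s0 (low half) and s1 (high
half). All function names are hypothetical and exist only for this sketch;
this is not LLVM code and not the patch itself, just an illustration that both
instruction sequences read the same bits.

```python
import struct

def d_register(lane0, lane1):
    """Pack two signed 32-bit lanes into the 8 bytes of a D register."""
    return struct.pack("<ii", lane0, lane1)

def s_subregister(d_bytes, index):
    """Bytes of the S register that aliases half `index` (0 or 1) of a D register."""
    return d_bytes[4 * index : 4 * index + 4]

def vcvt_f64_s32(s_bytes):
    """Model of vcvt.f64.s32: read the S register bits as a signed i32, convert to double."""
    (value,) = struct.unpack("<i", s_bytes)
    return float(value)

def old_sequence(d_bytes):
    """Old code: lane -> core register -> S register -> convert."""
    results = []
    for lane in (0, 1):
        # vmov.32 rN, dM[lane]: copy the lane into a core register
        (core_reg,) = struct.unpack_from("<i", d_bytes, 4 * lane)
        # vmov sK, rN: copy the core register back into an S register
        s_bytes = struct.pack("<i", core_reg)
        results.append(vcvt_f64_s32(s_bytes))
    return results

def new_sequence(d_bytes):
    """New code: convert straight from the aliasing S subregisters."""
    return [vcvt_f64_s32(s_subregister(d_bytes, lane)) for lane in (0, 1)]

d0 = d_register(-7, 42)
print(old_sequence(d0) == new_sequence(d0))
```

The two vmov steps in old_sequence only move the same 32 bits around, which is
why they can be seen as no-op copies once the value is known to already live
in the S subregister of the loaded D register.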



--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University


