[PATCH] ARM NEON Lowering: Merge extractelt, bitcast, sitofp sequence

Renato Golin renato.golin at linaro.org
Thu Feb 14 16:04:04 PST 2013


Hi Arnold,

Nice catch! I'm no expert in TableGen, especially the patterns, so I'll
wait for others to comment.

Typo in:
+// Fix scalarized uitof/sitof 2xf64 to no use intermediate scalar
registers.
-> "to NOT use"

cheers,
--renato



On 14 February 2013 23:47, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:

> A vectorized sitofp on doubles will get scalarized to a sequence of an
> extract_element of <2 x i32>, a bitcast to f32 and a sitofp.
> Due to the extract_element and the bitcast, we will unnecessarily generate
> moves between scalar and vector registers.
>
> The patch fixes this by using COPY_TO_REGCLASS and EXTRACT_SUBREG instead.
>
> Example:
>
> define void @vsitofp_double(<2 x i32>* %loadaddr,
>                             <2 x double>* %storeaddr) {
>   %v0 = load <2 x i32>* %loadaddr
>   %r = sitofp <2 x i32> %v0 to <2 x double>
>   store <2 x double> %r, <2 x double>* %storeaddr
>   ret void
> }
>
> We used to generate:
>         vldr    d16, [r0]
>         vmov.32 r2, d16[1]
>         vmov.32 r0, d16[0]
>         vmov    s0, r2
>         vmov    s2, r0
>         vcvt.f64.s32    d17, s0
>         vcvt.f64.s32    d16, s2
>         vst1.32 {d16, d17}, [r1]
> Now we generate:
>         vldr    d0, [r0]
>         vcvt.f64.s32    d17, s1
>         vcvt.f64.s32    d16, s0
>         vst1.32 {d16, d17}, [r1]
>
>
>
>
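
For readers unfamiliar with the register layout, here is a minimal C sketch
of why the new sequence can convert straight from s0 and s1. It relies on the
architectural fact that d0 overlays s0 (low half) and s1 (high half). The
names dreg_model, read_s_lane and vsitofp_double_model are hypothetical and
exist only for this illustration; this is not code from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one 64-bit D register. On ARM VFP/NEON, d0 is
   overlaid by s0 (low 32 bits) and s1 (high 32 bits). */
typedef struct { uint8_t bytes[8]; } dreg_model;

/* Read one 32-bit S lane out of the modeled D register. No move through a
   core register is involved; the lane is just a sub-register view. */
static int32_t read_s_lane(const dreg_model *d, int lane) {
    int32_t v;
    assert(lane == 0 || lane == 1);
    memcpy(&v, d->bytes + 4 * lane, sizeof v);
    return v;
}

/* Models the improved sequence from the example above. */
void vsitofp_double_model(const int32_t in[2], double out[2]) {
    dreg_model d0;
    memcpy(d0.bytes, in, sizeof d0.bytes);  /* vldr d0, [r0]          */
    out[1] = (double)read_s_lane(&d0, 1);   /* vcvt.f64.s32 d17, s1   */
    out[0] = (double)read_s_lane(&d0, 0);   /* vcvt.f64.s32 d16, s0   */
}
```

The old sequence corresponds to copying each lane out to a core register
(vmov.32 rN, d16[i]) and back into an S register (vmov sN, rN) before the
conversion, which this model shows to be redundant.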