[llvm-commits] Fix sitofp and fpextend codegen for x86/AVX[PR9473]

Thu Jun 2 09:12:49 PDT 2011

Hello Bruno,

Thanks for the advice.

> From the intel manual:
>
> VCVTSS2SD- Convert one single-precision floating-point value in
> xmm3/m32 to one double-precision floating-point value and merge with
> high bits of xmm2.
>
> And, according to your patch:
>
> +let isAsmParserOnly = 1 in {
> +  def VCVTSS2SDrm : I<0x5A, MRMSrcMem, (outs FR64:$dst),
> +                      (ins FR32:$src1, f32mem:$src2),
> +                      "vcvtss2sd\t{$src2, $src1, $dst|$dst, $src1, $src2}",
> +                      []>, XS, VEX_4V, Requires<[HasAVX, OptForSize]>;
> +}
> +
> +def VCVTSS2SDrm_alt : I<0x5A, MRMSrcMem, (outs FR64:$dst),
> +                    (ins f32mem:$src),
> +                    "vcvtss2sd\t{$src, $src, $dst|$dst, $src, $src}",
> +                    []>, XS, VEX, Requires<[HasAVX, OptForSize]>;
>
> The "alt" version is using a different encoding, this isn't correct,
> since there's only one encoding for the "rm" version, which is the
> "VEX_4V" one. There is no need for the "alt" version actually, but to
> follow the manual "merge with high bits of xmm2":
>
> Instead of doing:
>
> +def : Pat<(extloadf32 addr:$src),
> +          (VCVTSS2SDrm_alt addr:$src)>,
>
> You can do:
>
> def : Pat<(extloadf32 addr:$src2),
>          (VCVTSS2SDrm 0, addr:$src2)>,
>
> or something like that...

Ah, that's new to me.
Since I had no idea how to use a 'dummy' register in .td, I just tried to
solve the problem the way my patch does.
I'll look into rewriting my patch with the expression above ('0' as the
input register).

> A better solution, since this instruction is dealing with F32 reg
> classes and the high bits won't be touched, is to declare VCVTSS2SDrm
> as having Constraints = "$src1 = $dst", but keep printing its operands
> as usual. Also, you can do the pattern matching inline in the
> instruction definition, no need to do it as a Pat here. I believe you
> can do something similar to VCVTSI2SD_alt.
>

Okay, I'll try it and resend my (modified) patch.

--
Syoyo