[llvm-commits] Fix sitofp and fpextend codegen for x86/AVX[PR9473]
Syoyo Fujita
syoyofujita at gmail.com
Thu Jun 2 09:12:49 PDT 2011
Hello Bruno,
Thanks for the advice,
> From the intel manual:
>
> VCVTSS2SD - Convert one single-precision floating-point value in
> xmm3/m32 to one double-precision floating-point value and merge with
> high bits of xmm2.
>
> And, according to your patch:
>
> +let isAsmParserOnly = 1 in {
> + def VCVTSS2SDrm : I<0x5A, MRMSrcMem, (outs FR64:$dst),
> + (ins FR32:$src1, f32mem:$src2),
> + "vcvtss2sd\t{$src2, $src1, $dst|$dst, $src1, $src2}",
> + []>, XS, VEX_4V, Requires<[HasAVX, OptForSize]>;
> +}
> +
> +def VCVTSS2SDrm_alt : I<0x5A, MRMSrcMem, (outs FR64:$dst),
> + (ins f32mem:$src),
> + "vcvtss2sd\t{$src, $src, $dst|$dst, $src, $src}",
> + []>, XS, VEX, Requires<[HasAVX, OptForSize]>;
>
> The "alt" version uses a different encoding, which isn't correct,
> since there's only one encoding for the "rm" version, which is the
> "VEX_4V" one. There is actually no need for the "alt" version; to
> follow the manual's "merge with high bits of xmm2":
>
> Instead of doing:
>
> +def : Pat<(extloadf32 addr:$src),
> + (VCVTSS2SDrm_alt addr:$src)>,
>
> You can do:
>
> def : Pat<(extloadf32 addr:$src2),
> (VCVTSS2SDrm 0, addr:$src2)>,
>
> or something like that...
Ah, that's new to me.
Since I had no idea how to use a 'dummy' register in .td, I just tried to
solve the problem as in my patch.
I'll look into rewriting my patch using the expression above ('0' as the
input register).
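
As a rough sketch of what that rewrite might look like (not the final
patch): a literal 0 probably won't be accepted for a register operand, so
this assumes an undefined value, IMPLICIT_DEF, can stand in as the dummy
$src1. That choice is my assumption and is not something suggested in this
thread. VCVTSS2SDrm, HasAVX and OptForSize are taken from the patch quoted
above.

```
// Sketch only: select an extending f32 load to the single VEX_4V "rm"
// form. The first operand ($src1, the merge source for the high bits)
// is an undefined f32 register; IMPLICIT_DEF here is an assumption.
def : Pat<(extloadf32 addr:$src),
          (VCVTSS2SDrm (f32 (IMPLICIT_DEF)), addr:$src)>,
      Requires<[HasAVX, OptForSize]>;
```

With something like this, the separate VCVTSS2SDrm_alt definition should
no longer be needed.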
> A better solution, since this instruction is dealing with F32 reg
> classes and the high bits won't be touched, is to declare VCVTSS2SDrm
> as having Constraints = "$src1 = $dst", but keep printing its operands
> as usual. Also, you can do the pattern matching inline in the
> instruction definition, no need to do it as a Pat here. I believe you
> can do something similar to VCVTSI2SD_alt.
>
Okay, I'll try it and resend my (modified) patch.
--
Syoyo