[llvm-commits] Fix sitofp and fpextend codegen for x86/AVX[PR9473]

Bruno Cardoso Lopes bruno.cardoso at gmail.com
Tue May 31 10:23:06 PDT 2011


Hi Syoyo,

On Tue, May 31, 2011 at 1:27 PM, Syoyo Fujita <syoyofujita at gmail.com> wrote:
> Attached are series of patch to fix sitofp and fpextend instruction
> isel for x86/AVX backend(mattr=+avx).
>
> The problem is reported here
>
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-March/038913.html
>
> and in bugzilla PR9473.
>
>
> Let me briefly explain what this patches do.
>
> sitofp instruction should be mapped to vcvtsitoss(or vcvtsitosd)
> assembler, and fpextend should be vcvtss2sd.
> These AVX instruction takes 2 input and 1 output in asm form, but for
> codegen(isel) form it should be 1 input and 1 output.
> It seems impossible to define DUMMY register in .td.
>
> So my solution is to separate .td definition into asm parser
> case(isAsmParserOnly=1) and codegen case(isCodeGenOnly =1)

>From the intel manual:

VCVTSS2SD- Convert one single-precision floating-point value in
xmm3/m32 to one double-precision floating- point value and merge with
high bits of xmm2.

And, according to your patch:

+let isAsmParserOnly = 1 in {
+  def VCVTSS2SDrm : I<0x5A, MRMSrcMem, (outs FR64:$dst),
+                      (ins FR32:$src1, f32mem:$src2),
+                      "vcvtss2sd\t{$src2, $src1, $dst|$dst, $src1, $src2}",
+                      []>, XS, VEX_4V, Requires<[HasAVX, OptForSize]>;
+}
+
+def VCVTSS2SDrm_alt : I<0x5A, MRMSrcMem, (outs FR64:$dst),
+                    (ins f32mem:$src),
+                    "vcvtss2sd\t{$src, $src, $dst|$dst, $src, $src}",
+                    []>, XS, VEX, Requires<[HasAVX, OptForSize]>;

The "alt" version is using a different encoding, this isn't correct,
since there's only one encoding for the "rm" version, which is the
"VEX_4V" one. There is no need for the "alt" version actually, but to
follow the manual "merge with high bits of xmm2":

Instead of doing:

+def : Pat<(extloadf32 addr:$src),
+          (VCVTSS2SDrm_alt addr:$src)>,

You can do:

def : Pat<(extloadf32 addr:$src2),
          (VCVTSS2SDrm 0, addr:$src2)>,

or something like that...

A better solution, since this instruction is dealing with F32 reg
classes and the high bits won't be touched, is to declare VCVTSS2SDrm
as having Constraints = "$src1 = $dst", but keep printing its operands
as usual. Also, you can do the pattern matching inline in the
instruction definition, no need to do it as a Pat here. I believe you
can do something similar to VCVTSI2SD_alt.

-- 
Bruno Cardoso Lopes
http://www.brunocardoso.cc



More information about the llvm-commits mailing list