[PATCH][X86] SSE2 cvtsd2ss instruction expects two operands

Cameron McInally cameron.mcinally at nyu.edu
Tue Oct 22 14:42:52 PDT 2013


On Tue, Oct 22, 2013 at 2:40 PM, Quentin Colombet <qcolombet at apple.com> wrote:
> Hi Cameron,
>
> On Oct 22, 2013, at 10:02 AM, Cameron McInally <cameron.mcinally at nyu.edu>
> wrote:
>
> Hey Quentin,
>
> On Mon, Oct 21, 2013 at 8:05 PM, Quentin Colombet <qcolombet at apple.com>
> wrote:
>
> Hi Cameron,
>
> I do not think this is quite right.
>
> CVTSD2SS has three operands for the SSE variant.
> Indeed, the converted operand is set to the low quadword (or doubleword) of
> the destination register ($dst), and the high quadword (doublewords) are
> left unchanged ($src1).
>
> CVTSD2SS does what you want for the FP variant (one source one destination):
>
> def CVTSD2SSrr  : SDI<0x5A, MRMSrcReg, (outs FR32:$dst), (ins FR64:$src),
>                      "cvtsd2ss\t{$src, $dst|$dst, $src}",
>                      [(set FR32:$dst, (fround FR64:$src))],
>                      IIC_SSE_CVT_Scalar_RR>, Sched<[WriteCvtF2F]>;
>
>
> Am I missing something?
>
>
> This may be a difference in terminology.
>
> Indeed!
>
> I will try to be more explicit.
>
> The SSE2 cvtsd2ss instruction expects two operands according to the
> "Intel Architecture Instructions Set Extensions Programming
> Reference", i.e. 319433-016 Oct 2013. [Note: It may also be of
> interest that the VEX/EVEX encoded variant of this instruction expects
> three+ operands.]
>
> I also agree that LLVM's CVTSD2SSrr pattern is correct. However, the
> Int_CVTSD2SSrr pattern is not correct.
>
> let Constraints = "$src1 = $dst" in {
> def Int_CVTSD2SSrr: I<0x5A, MRMSrcReg,
>                       (outs VR128:$dst), (ins VR128:$src1, VR128:$src2),
>                       "cvtsd2ss\t{$src2, $src1, $dst|$dst, $src1, $src2}",
>                       [(set VR128:$dst,
>                         (int_x86_sse2_cvtsd2ss VR128:$src1, VR128:$src2))],
>                       IIC_SSE_CVT_Scalar_RR>, XD, Requires<[UseSSE2]>,
>                       Sched<[WriteCvtF2F]>;
> }
>
> This pattern will produce an assembly instruction with 3 operands.
> [Note: the third operand is necessary for the VEX-encoded instruction,
> but is superfluous for the SSE2 instruction.]
>
> Incorrect SSE2 usage: cvtsd2ss %xmm0, %xmm2, %xmm2
> Correct SSE2 usage: cvtsd2ss %xmm0, %xmm2
>
> I see, my mistake.
>
>
> My compiler currently uses a version of GNU ld, which issues an error
> with the three operand instruction:
>
> [scrubbed]/sse2-cvtsd2ss-1_1.s: Assembler messages:
> [scrubbed]/sse2-cvtsd2ss-1_1.s:209: Error: number of operands mismatch
> for `cvtsd2ss'
>
> I may have wrongly assumed that this patch was trivial, so did not
> prepare a reduced test case. And, unfortunately, my test case and IR
> is incompatible with ToT Clang/LLVM. If there is any further confusion
> surrounding this patch, I will be happy to create a small compatible
> test case for this problem.
>
> Generally speaking, if you can come up with a test case it would be best
> even for trivial fixes.
> With a test case, this would look good for me.

I've attached the .ll output for a rather thorough test case that
makes use of the llvm.x86.sse2.cvtsd2ss intrinsic with -msse2.

Clang's assembler does indeed complain about the extra operand. There
is, of course, no such complaint when directly producing the object
file. ;)

[scrubbed]-MacBook-Air:bin [scrubbed]$ ./clang -msse2 test.c -S
[scrubbed]-MacBook-Air:bin [scrubbed]$ grep cvtsd2ss test.s
cvtsd2ss %xmm1, %xmm0, %xmm0
cvtsd2ss -144(%rbp), %xmm0
[scrubbed]-MacBook-Air:bin [scrubbed]$ ./clang -msse2 test.s
test.s:37:25: error: invalid operand for instruction
 cvtsd2ss %xmm1, %xmm0, %xmm0
                        ^~~~~
[scrubbed]-MacBook-Air:bin [scrubbed]$ ./clang -msse2 test.c
[scrubbed]-MacBook-Air:bin [scrubbed]$

Unfortunately, I do not have enough experience with Clang's toolchain
and test harness to suggest a plan of attack for catching bugs of this
class. Hopefully someone with a better understanding of the process
can chime in.

-Cameron
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.ll
Type: application/octet-stream
Size: 7401 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131022/00a699f0/attachment.obj>


More information about the llvm-commits mailing list