<div dir="ltr">For this case you can probably copy the avx-intrinsics.ll test case for this intrinsic and create an sse2-intrinsics.ll. There appear to be several more intrinsics that aren't being tested on sse2 so if you want to add tests for them that would be awesome.<div>

<br></div><div>The run line should explicitly have -mattr=+sse2,-avx  This will keep it from generating AVX instructions if it lands on an AVX enabled buildbot.</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">

On Tue, Oct 22, 2013 at 5:50 PM, Quentin Colombet <span dir="ltr"><<a href="mailto:qcolombet@apple.com" target="_blank">qcolombet@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div style="word-wrap:break-word">Hi Cameron,<div><br></div><div>What you can probably do is having a very small test case like this:</div><div><br></div><div>$ cat test.ll</div><div>; RUN llc -mtriple x86_64 -matrr=+sse2 < %s -o - | FileCheck %s</div>

<div><br></div><div>define @fct(<some args>) {</div><div>; CHECK-NOT: <expected instruction> <regex op1>, <regex op2>, <regex op3></div><div><span style="white-space:pre-wrap">  </span><simple use of your instrinsic></div>

<div>}</div><div><br></div><div>Basically, you just need to write a test case that use the intrinsic you want (no need to be a real workload) and check that it does not have three operands.</div><div>It would be best if you can actually check that your intrinsic is generated but I am not sure how it would play with CHECK-NOT. You can easily figure that out with a quick test :).</div>

<div><br></div><div>Regarding the RUN command, you will probably need to adapt the triple or mattr arguments, I did not check if it is the right syntax.</div><div>As soon as you have something you think is okay to go, name the test file appropriately and move it into test/CodeGen/X86. Then 'svn add' this file and produce the new diff to send to the mailing list.</div>

<div><br></div><div>By doing this, you would have a patch with a test case ;).</div><div><br></div><div>Cheers,</div><div><div>

<div style="text-indent:0px;letter-spacing:normal;font-variant:normal;text-align:-webkit-auto;font-style:normal;font-weight:normal;line-height:normal;text-transform:none;white-space:normal;font-family:Helvetica;word-wrap:break-word;word-spacing:0px">

-Quentin</div>


</div>

<br><div><div><div class="h5"><div>On Oct 22, 2013, at 2:42 PM, Cameron McInally <<a href="mailto:cameron.mcinally@nyu.edu" target="_blank">cameron.mcinally@nyu.edu</a>> wrote:</div><br></div></div><blockquote type="cite">

<div><div class="h5">On Tue, Oct 22, 2013 at 2:40 PM, Quentin Colombet <<a href="mailto:qcolombet@apple.com" target="_blank">qcolombet@apple.com</a>> wrote:<br><blockquote type="cite">Hi Cameron,<br><br>On Oct 22, 2013, at 10:02 AM, Cameron McInally <<a href="mailto:cameron.mcinally@nyu.edu" target="_blank">cameron.mcinally@nyu.edu</a>><br>

wrote:<br><br>Hey Quentin,<br><br>On Mon, Oct 21, 2013 at 8:05 PM, Quentin Colombet <<a href="mailto:qcolombet@apple.com" target="_blank">qcolombet@apple.com</a>><br>wrote:<br><br>Hi Cameron,<br><br>I do not think this is quite right.<br>

<br>CVTSD2SS has three operands for the SSE variant.<br>Indeed, the converted operand is set to the low quadword (or doubleword) of<br>the destination register ($dst), and the high quadword (doublewords) are<br>left unchanged ($src1).<br>

<br>CVTSD2SS does what you want for the FP variant (one source one destination):<br><br>def CVTSD2SSrr  : SDI<0x5A, MRMSrcReg, (outs FR32:$dst), (ins FR64:$src),<br>                     "cvtsd2ss\t{$src, $dst|$dst, $src}",<br>

                     [(set FR32:$dst, (fround FR64:$src))],<br>                     IIC_SSE_CVT_Scalar_RR>, Sched<[WriteCvtF2F]>;<br><br><br>Am I missing something?<br><br><br>This may be a difference in terminology.<br>

<br>Indeed!<br><br>I will try to be more explicit.<br><br>The SSE2 cvtsd2ss instruction expects two operands according to the<br>"Intel Architecture Instructions Set Extensions Programming<br>Reference", i.e. 319433-016 Oct 2013. [Note: It may also be of<br>

interest that the VEX/EVEX encoded variant of this instruction expects<br>three+ operands.]<br><br>I also agree that LLVM's CVTSD2SSrr pattern is correct. However, the<br>Int_CVTSD2SSrr pattern is not correct.<br><br>

let Constraints = "$src1 = $dst" in {<br>def Int_CVTSD2SSrr: I<0x5A, MRMSrcReg,<br>                      (outs VR128:$dst), (ins VR128:$src1, VR128:$src2),<br>                      "cvtsd2ss\t{$src2, $src1, $dst|$dst, $src1, $src2}",<br>

                      [(set VR128:$dst,<br>                        (int_x86_sse2_cvtsd2ss VR128:$src1, VR128:$src2))],<br>                      IIC_SSE_CVT_Scalar_RR>, XD, Requires<[UseSSE2]>,<br>                      Sched<[WriteCvtF2F]>;<br>

}<br><br>This pattern will produce an assembly instruction with 3 operands.<br>[Note: the third operand is necessary for the VEX-encoded instruction,<br>but is superfluous for the SSE2 instruction.]<br><br>Incorrect SSE2 usage: cvtsd2ss %xmm0, %xmm2, %xmm2<br>

Correct SSE2 usage: cvtsd2ss %xmm0, %xmm2<br><br>I see, my mistake.<br><br><br>My compiler currently uses a version of GNU ld, which issues an error<br>with the three operand instruction:<br><br>[scrubbed]/sse2-cvtsd2ss-1_1.s: Assembler messages:<br>

[scrubbed]/sse2-cvtsd2ss-1_1.s:209: Error: number of operands mismatch<br>for `cvtsd2ss'<br><br>I may have wrongly assumed that this patch was trivial, so did not<br>prepare a reduced test case. And, unfortunately, my test case and IR<br>

is incompatible with ToT Clang/LLVM. If there is any further confusion<br>surrounding this patch, I will be happy to create a small compatible<br>test case for this problem.<br><br>Generally speaking, if you can come up with a test case it would be best<br>

even for trivial fixes.<br>With a test case, this would look good for me.<br></blockquote><br>I've attached the .ll output for a rather thorough test case that<br>makes use of the llvm.x86.sse2.cvtsd2ss intrinsic with -msse2.<br>

<br>Clang's assembler does indeed complain about the extra operand. There<br>is, of course, no such complaint when directly producing the object<br>file. ;)<br><br>[scrubbed]-MacBook-Air:bin [scrubbed]$ ./clang -msse2 test.c -S<br>

[scrubbed]-MacBook-Air:bin [scrubbed]$ grep cvtsd2ss test.s<br>cvtsd2ss %xmm1, %xmm0, %xmm0<br>cvtsd2ss -144(%rbp), %xmm0<br>[scrubbed]-MacBook-Air:bin [scrubbed]$ ./clang -msse2 test.s<br>test.s:37:25: error: invalid operand for instruction<br>

 cvtsd2ss %xmm1, %xmm0, %xmm0<br>                        ^~~~~<br>[scrubbed]-MacBook-Air:bin [scrubbed]$ ./clang -msse2 test.c<br>[scrubbed]-MacBook-Air:bin [scrubbed]$<br><br>Unfortunately, I do not have enough experience with Clang's toolchain<br>

and test harness to suggest a plan of attack for catching bugs of this<br>class. Hopefully someone with a better understanding of the process<br>can chime in.<br><br>-Cameron<br></div></div><span><test.ll></span></blockquote>

</div><br></div></div><br>_______________________________________________<br>

llvm-commits mailing list<br>

<a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a><br>

<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>

<br></blockquote></div><br><br clear="all"><div><br></div>-- <br>~Craig

</div>