[PATCH] D11438: Part 2 to fix x86_64 fp128 calling convention.

Thu Dec 10 19:42:10 PST 2015

chh marked 7 inline comments as done.

================
Comment at: test/CodeGen/X86/fp128-calling-conv.ll:5
@@ +4,3 @@
+; double myD = 1.0;
+ at myD = global double 1.000000e+00, align 8
+
----------------
davidxl wrote:
> Is this variable used?
No, Removed now.

================
Comment at: test/CodeGen/X86/fp128-calling-conv.ll:8
@@ +7,3 @@
+; long double myFP80 = 1.0L;  // x86_64-linux-gnu
+ at myFP80 = global x86_fp80 0xK3FFF8000000000000000, align 16
+
----------------
davidxl wrote:
> Is it used?
No. Removed now.

================
Comment at: test/CodeGen/X86/fp128-calling-conv.ll:11
@@ +10,3 @@
+; long double myFP128 = 1.0L;  // x86_64-linux-android
+ at myFP128 = global fp128 0xL00000000000000003FFF000000000000, align 16
+
----------------
davidxl wrote:
> Are the two parts swapped? GCC seems to generates: 3FFF0000000000000000000000000000 
That looked strange but correct. I copied that from clang's dump, and the llvm output assembly code is the same as gcc's. http://llvm.org/docs/LangRef.html says big-endian is used for hexadecimal floating point constants.

================
Comment at: test/CodeGen/X86/fp128-cast.ll:38
@@ +37,3 @@
+
+define void @TestCastF128_I32() {
+entry:
----------------
davidxl wrote:
> Missing a test for conversion to unsigned I32
Added conversion to uint32 and uint64.

================
Comment at: test/CodeGen/X86/fp128-cast.ll:135
@@ +134,3 @@
+  %0 = bitcast fp128 %mul to i128
+  %u.sroa.0.4.extract.shift = lshr i128 %0, 32
+  %or5 = or i128 %u.sroa.0.4.extract.shift, %0
----------------
davidxl wrote:
> Can you simplify the variable names ?
These were generated by clang for my simplified C code from libm.
They are useful to show the clang transformations.
I will add the C code example as comments.

================
Comment at: test/CodeGen/X86/fp128-cast.ll:166
@@ +165,3 @@
+; CHECK-NEXT:  movq %rsi, -24(%rsp)
+; CHECK-NEXT:  adcq $0, %rdi
+; CHECK-NEXT:  movq %rdi, -16(%rsp)
----------------
davidxl wrote:
> There is no guarantee adcq will be after movq ...
Okay, relaxed the check patterns.

================
Comment at: test/CodeGen/X86/fp128-compare.ll:33
@@ +32,3 @@
+; CHECK:       callq __lttf2
+; CHECK-NEXT:  shrl $31, %eax
+; CHECK:       retq
----------------
davidxl wrote:
> is this correct?
Yes. It's a strange optimization, which returns 1 if %cmp is negative as __lttf2 will return when %d1 < %d2.

================
Comment at: test/CodeGen/X86/fp128-i128.ll:92
@@ +91,3 @@
+if.end:                                           ; preds = %if.then, %entry
+  %u.sroa.0.0 = phi i128 [ %bf.set, %if.then ], [ %0, %entry ]
+  %2 = bitcast i128 %u.sroa.0.0 to fp128
----------------
davidxl wrote:
> var names can be cleaned up to be shorter.
These IL were copied from libm compiled code. Clang has its way to convert C union structure references. I will add original C code as comment.

================
Comment at: test/CodeGen/X86/fp128-i128.ll:125
@@ +124,3 @@
+define { i64, i64 } @TestShift128(i64 %x.coerce0, i64 %x.coerce1) #0 {
+entry:
+  %.fca.1.insert = insertvalue { i64, i64 } { i64 0, i64 undef }, i64 %x.coerce0, 1
----------------
davidxl wrote:
> seems irrelevant.
Okay, removed the i64 test.

================
Comment at: test/CodeGen/X86/fp128-i128.ll:137
@@ +136,3 @@
+
+define void @TestShift128_2() #2 {
+entry:
----------------
davidxl wrote:
> Seems irrelevant.
We need to test i128 too, since this patch put also i128 into the FR128 register class. The i128 instruction was generated from f128 C code.

================
Comment at: test/CodeGen/X86/fp128-i128.ll:205
@@ +204,3 @@
+; Test more complicated logical operations generated from copysignl.
+define void @TestCopySign({ fp128, fp128 }* noalias nocapture sret %agg.result, { fp128, fp128 }* byval nocapture readonly align 16 %z) #0 {
+entry:
----------------
davidxl wrote:
> Can this test case be simplified more?
This was from libm C code, which triggered one error related to f128 without a complete patch. So I added it. I tried to reduce the original C code but then that won't trigger the problem.

http://reviews.llvm.org/D11438