[PATCH] D107082: [X86][RFC] Enable `_Float16` type support on X86 following the psABI
LuoYuanke via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 8 23:51:48 PDT 2022
LuoYuanke added inline comments.
================
Comment at: llvm/test/CodeGen/X86/fpclamptosat.ll:569
; CHECK-NEXT: cvttss2si %xmm0, %rax
; CHECK-NEXT: ucomiss {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
+; CHECK-NEXT: movabsq $-9223372036854775808, %rcx # imm = 0x8000000000000000
----------------
I'm curious why this patch produces one more compare here.
================
Comment at: llvm/test/CodeGen/X86/fpclamptosat.ll:776
+; CHECK-NEXT: cmovael %eax, %ecx
+; CHECK-NEXT: ucomiss {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
+; CHECK-NEXT: movl $2147483647, %edx # imm = 0x7FFFFFFF
----------------
Ditto.
================
Comment at: llvm/test/CodeGen/X86/fpclamptosat_vec.ll:605
+; CHECK-NEXT: .cfi_def_cfa_offset 80
+; CHECK-NEXT: movss %xmm2, {{[-0-9]+}}(%r{{[sb]}}p) # 4-byte Spill
+; CHECK-NEXT: movss %xmm1, {{[-0-9]+}}(%r{{[sb]}}p) # 4-byte Spill
----------------
Is the <4 x half> vector split into 4 scalars and passed in XMM registers? What is the ABI for half vectors? Is there a test covering the case where we run out of registers and the parameters are passed on the stack?
================
Comment at: llvm/test/CodeGen/X86/fptosi-sat-scalar.ll:2138
+; X64-NEXT: ucomiss {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
+; X64-NEXT: movl $255, %eax
+; X64-NEXT: cmovael %ecx, %eax
----------------
This seems less efficient than the previous code for NaN and zero handling, but we can improve it later.
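For context, here is a minimal C sketch of the saturating conversion semantics documented for the `llvm.fptosi.sat` intrinsic: NaN maps to zero, out-of-range inputs clamp to the integer limits, and everything else truncates toward zero. The helper name `sat_f32_to_i32` is hypothetical, and the sketch only illustrates the semantics these checks guard; it is not the sequence this patch generates.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical helper illustrating saturating float -> int32 conversion.
 * Semantics assumed from the LLVM LangRef description of llvm.fptosi.sat. */
int32_t sat_f32_to_i32(float x) {
    if (isnan(x))
        return 0;                 /* NaN converts to zero */
    if (x <= -2147483648.0f)
        return INT32_MIN;         /* clamp at or below -2^31 */
    if (x >= 2147483648.0f)
        return INT32_MAX;         /* clamp at or above 2^31 */
    return (int32_t)x;            /* in range: truncate toward zero */
}
```

Each of the three special cases needs its own compare or select, which is why the number of `ucomiss`/`cmov` pairs in the generated code matters here.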
================
Comment at: llvm/test/CodeGen/X86/half.ll:946
+; CHECK-I686-NEXT: calll __extendhfsf2
+; CHECK-I686-NEXT: fstps {{[0-9]+}}(%esp)
+; CHECK-I686-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
----------------
Why is an x87 instruction generated here?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D107082/new/
https://reviews.llvm.org/D107082