[all-commits] [llvm/llvm-project] e61a7d: [X86][AVX512] Use comx for compare (#113567)
Mahesh-Attarde via All-commits
all-commits at lists.llvm.org
Wed Oct 30 01:17:47 PDT 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: e61a7dc256bd530a0b9551e2732e5b5b77e2cd1e
https://github.com/llvm/llvm-project/commit/e61a7dc256bd530a0b9551e2732e5b5b77e2cd1e
Author: Mahesh-Attarde <145317060+mahesh-attarde at users.noreply.github.com>
Date: 2024-10-30 (Wed, 30 Oct 2024)
Changed paths:
M llvm/lib/Target/X86/X86ISelLowering.cpp
M llvm/lib/Target/X86/X86InstrAVX10.td
A llvm/test/CodeGen/X86/avx10_2-cmp.ll
M llvm/test/TableGen/x86-fold-tables.inc
Log Message:
-----------
[X86][AVX512] Use comx for compare (#113567)
We added the AVX10.2 COMEF ISA to LLVM, but code generation does not yet
use it optimally in the scenario shown below.
Summary
Input
```
define i1 @oeq(float %x, float %y) {
  %1 = fcmp oeq float %x, %y
  ret i1 %1
}

define i1 @une(float %x, float %y) {
  %1 = fcmp une float %x, %y
  ret i1 %1
}

define i1 @ogt(float %x, float %y) {
  %1 = fcmp ogt float %x, %y
  ret i1 %1
}
// Prior to AVX10.2, default code generation
oeq: # @oeq
cmpeqss xmm0, xmm1
movd eax, xmm0
and eax, 1
ret
une: # @une
cmpneqss xmm0, xmm1
movd eax, xmm0
and eax, 1
ret
ogt: # @ogt
ucomiss xmm0, xmm1
seta al
ret
```
This patch removes `cmpeqss` and `cmpneqss` from the generated code. For
the complete transform, see the unit test
(llvm/test/CodeGen/X86/avx10_2-cmp.ll).
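For readers wondering why `oeq` and `une` did not already use `ucomiss` the way `ogt` does, the following is a small Python model of the flag semantics, not LLVM code. The `ucomiss_flags` table follows the documented UCOMISS behavior (unordered sets ZF, PF and CF). The `comx_zf` helper is an assumption inferred from the DAG in this message, where `oeq` lowers to a single `X86ISD::SETCC` with condition code 4 on `X86ISD::UCOMX`; all function names here are hypothetical.

```python
import math

def ucomiss_flags(x, y):
    """Return (ZF, PF, CF) as UCOMISS sets them."""
    if math.isnan(x) or math.isnan(y):
        return (1, 1, 1)  # unordered: ZF is set, just like for "equal"
    if x > y:
        return (0, 0, 0)
    if x < y:
        return (0, 0, 1)
    return (1, 0, 0)      # equal

def oeq_via_ucomiss(x, y):
    # Needs two flags: ZF alone cannot tell "equal" from "unordered".
    zf, pf, _ = ucomiss_flags(x, y)
    return zf == 1 and pf == 0

def une_via_ucomiss(x, y):
    # Also needs two flags: not equal, or unordered.
    zf, pf, _ = ucomiss_flags(x, y)
    return zf == 0 or pf == 1

def ogt_via_ucomiss(x, y):
    # A single "above" condition (seta): CF == 0 and ZF == 0.
    zf, _, cf = ucomiss_flags(x, y)
    return zf == 0 and cf == 0

def comx_zf(x, y):
    # ASSUMPTION (inferred from the DAG, not from the ISA manual):
    # the COMX-style compare sets ZF only for an ordered, equal result.
    ordered = not (math.isnan(x) or math.isnan(y))
    return 1 if ordered and x == y else 0

def oeq_via_comx(x, y):
    return comx_zf(x, y) == 1  # one flag test (sete)

def une_via_comx(x, y):
    return comx_zf(x, y) == 0  # one flag test (setne)
```

Under this model, `oeq` and `une` each need a two-flag test after `ucomiss`, which is consistent with the older output preferring `cmpeqss`/`cmpneqss`, while `ogt` already fits a single `seta`.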
This continues the work added in PR
https://github.com/llvm/llvm-project/pull/113098.
Previously, legalization and DAG combine expanded the `setcc oeq:ch` node
into an `and` of `setcc eq` and `setcc o`. Following suggestions from the
community, the new internal transform is:
```
Optimized type-legalized selection DAG: %bb.0 'hoeq:'
SelectionDAG has 11 nodes:
t0: ch,glue = EntryToken
t2: f16,ch = CopyFromReg t0, Register:f16 %0
t4: f16,ch = CopyFromReg t0, Register:f16 %1
t14: i8 = setcc t2, t4, setoeq:ch
t10: ch,glue = CopyToReg t0, Register:i8 $al, t14
t11: ch = X86ISD::RET_GLUE t10, TargetConstant:i32<0>, Register:i8 $al, t10:1
Optimized legalized selection DAG: %bb.0 'hoeq:'
SelectionDAG has 12 nodes:
t0: ch,glue = EntryToken
t2: f16,ch = CopyFromReg t0, Register:f16 %0
t4: f16,ch = CopyFromReg t0, Register:f16 %1
t15: i32 = X86ISD::UCOMX t2, t4
t17: i8 = X86ISD::SETCC TargetConstant:i8<4>, t15
t10: ch,glue = CopyToReg t0, Register:i8 $al, t17
t11: ch = X86ISD::RET_GLUE t10, TargetConstant:i32<0>, Register:i8 $al, t10:1
```
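To read the `TargetConstant:i8<4>` operand of `X86ISD::SETCC` in the dump above, the following is a minimal Python lookup. It assumes the operand uses the standard x86 condition-code encoding (the same order as the `setcc`/`jcc` opcode nibble); the `decode_cond` helper is hypothetical and is not part of LLVM.

```python
# Standard x86 condition codes, indexed by their 4-bit encoding.
X86_COND_NAMES = [
    "O", "NO", "B", "AE", "E", "NE", "BE", "A",
    "S", "NS", "P", "NP", "L", "GE", "LE", "G",
]

def decode_cond(imm):
    """Map a 4-bit condition-code immediate to a COND_* style name."""
    if not 0 <= imm < len(X86_COND_NAMES):
        raise ValueError(f"not a 4-bit condition code: {imm}")
    return "COND_" + X86_COND_NAMES[imm]
```

With this table, `decode_cond(4)` gives `COND_E`, so node `t17` reads as a "set if equal" on the flags produced by `X86ISD::UCOMX`.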
The earlier transform is described here:
https://github.com/llvm/llvm-project/pull/113098#discussion_r1810307663
---------
Co-authored-by: mattarde <mattarde at intel.com>