[all-commits] [llvm/llvm-project] e61a7d: [X86][AVX512] Use comx for compare (#113567)

Mahesh-Attarde via All-commits all-commits at lists.llvm.org
Wed Oct 30 01:17:47 PDT 2024


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: e61a7dc256bd530a0b9551e2732e5b5b77e2cd1e
      https://github.com/llvm/llvm-project/commit/e61a7dc256bd530a0b9551e2732e5b5b77e2cd1e
  Author: Mahesh-Attarde <145317060+mahesh-attarde at users.noreply.github.com>
  Date:   2024-10-30 (Wed, 30 Oct 2024)

  Changed paths:
    M llvm/lib/Target/X86/X86ISelLowering.cpp
    M llvm/lib/Target/X86/X86InstrAVX10.td
    A llvm/test/CodeGen/X86/avx10_2-cmp.ll
    M llvm/test/TableGen/x86-fold-tables.inc

  Log Message:
  -----------
  [X86][AVX512] Use comx for compare (#113567)

We added AVX10.2 COMEF ISA in LLVM, This does not optimize correctly in
scenario mentioned below.
Summary
Input 
```
define i1 @oeq(float %x, float %y) {
    %1 = fcmp oeq float %x, %y
    ret i1 %1
}define i1 @une(float %x, float %y) {
    %1 = fcmp une float %x, %y
    ret i1 %1
}define i1 @ogt(float %x, float %y) {
    %1 = fcmp ogt float %x, %y
    ret i1 %1
}
// Prior AVX10.2, default code generation

oeq:                                    # @oeq
        cmpeqss xmm0, xmm1
        movd    eax, xmm0
        and     eax, 1
        ret
une:                                    # @une
        cmpneqss        xmm0, xmm1
        movd    eax, xmm0
        and     eax, 1
        ret
ogt:                                    # @ogt
        ucomiss xmm0, xmm1
        seta    al
        ret 
```

This patch will remove `cmpeqss` and `cmpneqss`. For complete transform
check unit test.

Continuing on what PR https://github.com/llvm/llvm-project/pull/113098
added

Earlier Legalization and combine expanded `setcc oeq:ch` node into `and`
and `setcc eq` , `setcc o`. From suggestions in community
new internal transform
```
Optimized type-legalized selection DAG: %bb.0 'hoeq:'
SelectionDAG has 11 nodes:
  t0: ch,glue = EntryToken
      t2: f16,ch = CopyFromReg t0, Register:f16 %0
      t4: f16,ch = CopyFromReg t0, Register:f16 %1
    t14: i8 = setcc t2, t4, setoeq:ch
  t10: ch,glue = CopyToReg t0, Register:i8 $al, t14
  t11: ch = X86ISD::RET_GLUE t10, TargetConstant:i32<0>, Register:i8 $al, t10:1

Optimized legalized selection DAG: %bb.0 'hoeq:'
SelectionDAG has 12 nodes:
  t0: ch,glue = EntryToken
        t2: f16,ch = CopyFromReg t0, Register:f16 %0
        t4: f16,ch = CopyFromReg t0, Register:f16 %1
      t15: i32 = X86ISD::UCOMX t2, t4
    t17: i8 = X86ISD::SETCC TargetConstant:i8<4>, t15
  t10: ch,glue = CopyToReg t0, Register:i8 $al, t17
  t11: ch = X86ISD::RET_GLUE t10, TargetConstant:i32<0>, Register:i8 $al, t10:1
```
Earlier transform is mentioned here
https://github.com/llvm/llvm-project/pull/113098#discussion_r1810307663

---------

Co-authored-by: mattarde <mattarde at intel.com>



To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications


More information about the All-commits mailing list