[llvm-bugs] [Bug 40965] New: [X86] CMOV_GR8 is pseudo, and is expanded.

via llvm-bugs llvm-bugs at lists.llvm.org
Tue Mar 5 08:09:03 PST 2019


https://bugs.llvm.org/show_bug.cgi?id=40965

            Bug ID: 40965
           Summary: [X86] CMOV_GR8 is pseudo, and is expanded.
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86
          Assignee: unassignedbugs at nondot.org
          Reporter: lebedev.ri at gmail.com
                CC: craig.topper at gmail.com, llvm-bugs at lists.llvm.org,
                    llvm-dev at redking.me.uk, spatel+llvm at rotateright.com

mclow brought this issue up on IRC; it came up while implementing
std::midpoint in libc++.

https://godbolt.org/z/oLrHBP

If std::midpoint() is used on 'int', the final x86 asm uses CMOV.
If std::midpoint() is used on 'signed char', the final x86 asm contains
branches instead.
The IR produced is more or less the same in both cases, and it is optimal:
no branches.
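
For readers without the godbolt link at hand, here is a minimal sketch of the
kind of integer midpoint that produces this pattern. It is an assumption
reconstructed from the machine code in this report (min/max select, a +1/-1
sign, a halved difference), not the actual libc++ source or the code behind
the link; the name `midpoint_sketch` is hypothetical.

```cpp
#include <type_traits>

// Hypothetical illustration only: an overflow-free integer midpoint that
// rounds toward `a`. Requires C++14 for the relaxed constexpr rules.
template <class T>
constexpr T midpoint_sketch(T a, T b) noexcept {
    static_assert(std::is_integral<T>::value, "integer types only");
    using U = typename std::make_unsigned<T>::type;

    int sign = 1;
    U lo = static_cast<U>(a);
    U hi = static_cast<U>(b);
    if (a > b) {            // this is the select that should become a CMOV
        sign = -1;
        lo = static_cast<U>(b);
        hi = static_cast<U>(a);
    }
    // The difference is computed in the unsigned type, so it cannot overflow.
    return static_cast<T>(a + sign * static_cast<T>(static_cast<U>(hi - lo) >> 1));
}
```

Instantiating such a function with `int` and with `signed char` gives the two
cases compared above: the same select in the source, but only the 8-bit
instantiation goes through the `CMOV_GR8` pseudo.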

If we look at the output of `llc -print-after-all`:

# *** IR Dump After X86 DAG->DAG Instruction Selection ***:
# Machine code for function main: IsSSA, TracksLiveness

bb.0 (%ir-block.0):
  %0:gr8 = MOV8rm $rip, 1, $noreg, @a, $noreg :: (dereferenceable load 1 from @a, !tbaa !2)
  %1:gr8 = MOV8rm $rip, 1, $noreg, @b, $noreg :: (dereferenceable load 1 from @b, !tbaa !2)
  %2:gr8 = SUB8rr %0:gr8(tied-def 0), %1:gr8, implicit-def $eflags
  %3:gr8 = SETLEr implicit $eflags
  %4:gr8 = CMOV_GR8 %0:gr8, %1:gr8, 5, implicit $eflags
  %5:gr8 = CMOV_GR8 %1:gr8, %0:gr8, 6, implicit $eflags
  %6:gr8 = ADD8rr %3:gr8(tied-def 0), %3:gr8, implicit-def dead $eflags
  %7:gr8 = ADD8ri %6:gr8(tied-def 0), -1, implicit-def dead $eflags
  %8:gr8 = SUB8rr %5:gr8(tied-def 0), killed %4:gr8, implicit-def dead $eflags
  %9:gr8 = SHR8r1 %8:gr8(tied-def 0), implicit-def dead $eflags
  $al = COPY %9:gr8
  MUL8r killed %7:gr8, implicit-def $al, implicit-def dead $eflags, implicit-def $ax, implicit $al
  %10:gr8 = COPY $al
  %11:gr8 = ADD8rr %10:gr8(tied-def 0), %0:gr8, implicit-def dead $eflags
  %12:gr32 = MOVSX32rr8 killed %11:gr8
  $eax = COPY %12:gr32
  RET 0, $eax

# End machine code for function main.

# *** IR Dump After Expand ISel Pseudo-instructions ***:
# Machine code for function main: IsSSA, TracksLiveness

bb.0 (%ir-block.0):
  successors: %bb.1(0x40000000), %bb.2(0x40000000); %bb.1(50.00%), %bb.2(50.00%)

  %0:gr8 = MOV8rm $rip, 1, $noreg, @a, $noreg :: (dereferenceable load 1 from @a, !tbaa !2)
  %1:gr8 = MOV8rm $rip, 1, $noreg, @b, $noreg :: (dereferenceable load 1 from @b, !tbaa !2)
  %2:gr8 = SUB8rr %0:gr8(tied-def 0), %1:gr8, implicit-def $eflags
  %3:gr8 = SETLEr implicit $eflags
  JG_1 %bb.2, implicit $eflags

bb.1 (%ir-block.0):
; predecessors: %bb.0
  successors: %bb.2(0x80000000); %bb.2(100.00%)
  liveins: $eflags

bb.2 (%ir-block.0):
; predecessors: %bb.0, %bb.1
  successors: %bb.3(0x40000000), %bb.4(0x40000000); %bb.3(50.00%), %bb.4(50.00%)
  liveins: $eflags
  %4:gr8 = PHI %0:gr8, %bb.1, %1:gr8, %bb.0
  JGE_1 %bb.4, implicit $eflags

bb.3 (%ir-block.0):
; predecessors: %bb.2
  successors: %bb.4(0x80000000); %bb.4(100.00%)


bb.4 (%ir-block.0):
; predecessors: %bb.2, %bb.3

  %5:gr8 = PHI %1:gr8, %bb.3, %0:gr8, %bb.2
  %6:gr8 = ADD8rr %3:gr8(tied-def 0), %3:gr8, implicit-def dead $eflags
  %7:gr8 = ADD8ri %6:gr8(tied-def 0), -1, implicit-def dead $eflags
  %8:gr8 = SUB8rr %5:gr8(tied-def 0), killed %4:gr8, implicit-def dead $eflags
  %9:gr8 = SHR8r1 %8:gr8(tied-def 0), implicit-def dead $eflags
  $al = COPY %9:gr8
  MUL8r killed %7:gr8, implicit-def $al, implicit-def dead $eflags, implicit-def $ax, implicit $al
  %10:gr8 = COPY $al
  %11:gr8 = ADD8rr %10:gr8(tied-def 0), %0:gr8, implicit-def dead $eflags
  %12:gr32 = MOVSX32rr8 killed %11:gr8
  $eax = COPY %12:gr32
  RET 0, $eax

# End machine code for function main.


We can see that "Expand ISel Pseudo-instructions" (ExpandISelPseudos)
has expanded each `CMOV_GR8` into a conditional branch and a PHI.
That makes sense: there is no 8-bit form of CMOVcc, as per
https://www.felixcloutier.com/x86/cmovcc and the Intel SDM.

I do think it is better to avoid the branches here.
Can we instead widen to 32-bit (I think we try to avoid 16-bit on x86?) and
keep CMOV?
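
To make the proposal concrete, here is a source-level sketch of what widening
means semantically: extend both 8-bit operands to 32 bits, select there, and
truncate the result. The function name `select_widened` is hypothetical, and
this only illustrates that the transformation preserves the result; it is not
a claim about how the backend would implement it or about what any particular
compiler emits for this code.

```cpp
#include <cstdint>

// Hypothetical illustration only: an 8-bit select done at 32-bit width.
// The select operates on int32_t values, the width at which x86 has CMOVcc.
inline std::int8_t select_widened(bool cond, std::int8_t a, std::int8_t b) {
    std::int32_t wide_a = a;                  // sign-extend to 32 bits
    std::int32_t wide_b = b;                  // sign-extend to 32 bits
    std::int32_t wide_r = cond ? wide_a : wide_b;
    // Truncating back is lossless: the result is always one of the two
    // extended inputs, so it fits in 8 bits.
    return static_cast<std::int8_t>(wide_r);
}
```

The truncation never loses information, which is why doing the select at
32-bit width should be safe for 8-bit values.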

Thoughts?
