<html>

    <head>

      <base href="https://bugs.llvm.org/">

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW - [X86] CMOV_GR8 is pseudo, and is expanded."

   href="https://bugs.llvm.org/show_bug.cgi?id=40965">40965</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>[X86] CMOV_GR8 is pseudo, and is expanded.

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>libraries

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>trunk

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>Linux

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>enhancement

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>Backend: X86

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>lebedev.ri@gmail.com

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>craig.topper@gmail.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, spatel+llvm@rotateright.com

          </td>

        </tr></table>

      <p>

        <div>

<pre>mclow brought this issue up on IRC; it came up during the implementation of

libc++'s std::midpoint.

<a href="https://godbolt.org/z/oLrHBP">https://godbolt.org/z/oLrHBP</a>

If std::midpoint() is used on 'int', then the final x86 asm uses CMOV.

If std::midpoint() is used on 'signed char', then the final x86 asm contains

a branch.
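
A hypothetical reduction of the test case, as a sketch only: the actual
source is behind the godbolt link above, so the function names and the exact
shape here are assumptions. It is written to mirror the select-based midpoint
computation visible in the MIR dump in this report, where two selects pick the
smaller and larger operand and a sign factor steers the result toward `a`.

```cpp
// Hypothetical names; not the libc++ source.
// 8-bit variant: the two selects are the candidates for CMOV_GR8.
signed char midpoint_i8(signed char a, signed char b) {
    int sign = (a > b) ? -1 : 1;           // direction from a toward b
    signed char lo = (a > b) ? b : a;      // select: smaller operand
    signed char hi = (a > b) ? a : b;      // select: larger operand
    // Distance computed modulo 256, then halved (logical shift).
    unsigned char half = (unsigned char)(hi - lo) >> 1;
    return (signed char)(a + sign * half);
}

// 32-bit variant for comparison: same shape, wider type.
int midpoint_i32(int a, int b) {
    int sign = (a > b) ? -1 : 1;
    unsigned lo = (unsigned)((a > b) ? b : a);
    unsigned hi = (unsigned)((a > b) ? a : b);
    return a + sign * (int)((hi - lo) >> 1);
}
```

Both variants round toward `a` when the exact midpoint is not an integer.
Only the operand width differs, which is what makes the difference in the
generated code notable.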

The produced IR is more or less the same in both cases and is optimal, with no branches.

If we look at the output of `llc -print-after-all`:

# *** IR Dump After X86 DAG->DAG Instruction Selection ***:

# Machine code for function main: IsSSA, TracksLiveness

bb.0 (%ir-block.0):

  %0:gr8 = MOV8rm $rip, 1, $noreg, @a, $noreg :: (dereferenceable load 1 from @a, !tbaa !2)

  %1:gr8 = MOV8rm $rip, 1, $noreg, @b, $noreg :: (dereferenceable load 1 from @b, !tbaa !2)

  %2:gr8 = SUB8rr %0:gr8(tied-def 0), %1:gr8, implicit-def $eflags

  %3:gr8 = SETLEr implicit $eflags

  %4:gr8 = CMOV_GR8 %0:gr8, %1:gr8, 5, implicit $eflags

  %5:gr8 = CMOV_GR8 %1:gr8, %0:gr8, 6, implicit $eflags

  %6:gr8 = ADD8rr %3:gr8(tied-def 0), %3:gr8, implicit-def dead $eflags

  %7:gr8 = ADD8ri %6:gr8(tied-def 0), -1, implicit-def dead $eflags

  %8:gr8 = SUB8rr %5:gr8(tied-def 0), killed %4:gr8, implicit-def dead $eflags

  %9:gr8 = SHR8r1 %8:gr8(tied-def 0), implicit-def dead $eflags

  $al = COPY %9:gr8

  MUL8r killed %7:gr8, implicit-def $al, implicit-def dead $eflags, implicit-def $ax, implicit $al

  %10:gr8 = COPY $al

  %11:gr8 = ADD8rr %10:gr8(tied-def 0), %0:gr8, implicit-def dead $eflags

  %12:gr32 = MOVSX32rr8 killed %11:gr8

  $eax = COPY %12:gr32

  RET 0, $eax

# End machine code for function main.

# *** IR Dump After Expand ISel Pseudo-instructions ***:

# Machine code for function main: IsSSA, TracksLiveness

bb.0 (%ir-block.0):

  successors: %bb.1(0x40000000), %bb.2(0x40000000); %bb.1(50.00%), %bb.2(50.00%)

  %0:gr8 = MOV8rm $rip, 1, $noreg, @a, $noreg :: (dereferenceable load 1 from @a, !tbaa !2)

  %1:gr8 = MOV8rm $rip, 1, $noreg, @b, $noreg :: (dereferenceable load 1 from @b, !tbaa !2)

  %2:gr8 = SUB8rr %0:gr8(tied-def 0), %1:gr8, implicit-def $eflags

  %3:gr8 = SETLEr implicit $eflags

  JG_1 %bb.2, implicit $eflags

bb.1 (%ir-block.0):

; predecessors: %bb.0

  successors: %bb.2(0x80000000); %bb.2(100.00%)

  liveins: $eflags

bb.2 (%ir-block.0):

; predecessors: %bb.0, %bb.1

  successors: %bb.3(0x40000000), %bb.4(0x40000000); %bb.3(50.00%), %bb.4(50.00%)

  liveins: $eflags

  %4:gr8 = PHI %0:gr8, %bb.1, %1:gr8, %bb.0

  JGE_1 %bb.4, implicit $eflags

bb.3 (%ir-block.0):

; predecessors: %bb.2

  successors: %bb.4(0x80000000); %bb.4(100.00%)

bb.4 (%ir-block.0):

; predecessors: %bb.2, %bb.3

  %5:gr8 = PHI %1:gr8, %bb.3, %0:gr8, %bb.2

  %6:gr8 = ADD8rr %3:gr8(tied-def 0), %3:gr8, implicit-def dead $eflags

  %7:gr8 = ADD8ri %6:gr8(tied-def 0), -1, implicit-def dead $eflags

  %8:gr8 = SUB8rr %5:gr8(tied-def 0), killed %4:gr8, implicit-def dead $eflags

  %9:gr8 = SHR8r1 %8:gr8(tied-def 0), implicit-def dead $eflags

  $al = COPY %9:gr8

  MUL8r killed %7:gr8, implicit-def $al, implicit-def dead $eflags, implicit-def $ax, implicit $al

  %10:gr8 = COPY $al

  %11:gr8 = ADD8rr %10:gr8(tied-def 0), %0:gr8, implicit-def dead $eflags

  %12:gr32 = MOVSX32rr8 killed %11:gr8

  $eax = COPY %12:gr32

  RET 0, $eax

# End machine code for function main.

We can see that "Expand ISel Pseudo-instructions" (ExpandISelPseudos)

has expanded each `CMOV_GR8` into a conditional branch plus a PHI.

Which makes sense: is there a 1-byte version of cmov?

Not according to <a href="https://www.felixcloutier.com/x86/cmovcc">https://www.felixcloutier.com/x86/cmovcc</a> or the Intel SDM.

I do think it is better to avoid the branch here.

Can we instead widen to 32-bit (I think we try to avoid 16-bit on x86?) and

keep CMOV?
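
To illustrate the idea at the source level only (the real change would have
to happen in the X86 backend, and the function names below are hypothetical):
selecting between the widened values and truncating the result yields the
same byte as selecting between the 8-bit values directly.

```cpp
// Hypothetical names; a value-level sketch, not backend code.
// Direct 8-bit select: the form that currently becomes CMOV_GR8.
signed char select_i8(bool cond, signed char x, signed char y) {
    return cond ? x : y;
}

// Widened form: extend both inputs, select at 32 bits, truncate.
signed char select_i8_widened(bool cond, signed char x, signed char y) {
    int wx = x;               // widen to 32-bit
    int wy = y;               // widen to 32-bit
    int r = cond ? wx : wy;   // 32-bit select, where a real CMOV exists
    return (signed char)r;    // truncate back to 8-bit
}
```

Since only the low byte of the result is used, the kind of extension chosen
for the inputs does not affect the selected value.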

Thoughts?</pre>

        </div>

      </p>


    </body>

</html>