<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/63435">63435</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [VE] Inefficient code for icmp i128
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          efriedma-quic
      </td>
    </tr>
</table>

<pre>
    As discussed on https://reviews.llvm.org/D151358, the VE backend generates inefficient code for an i128 comparison that feeds a select, such as the following:

```
target triple = "ve"
define i64 @f(i64 %x, i64 %y, i128 %a, i128 %b) {
  %c = icmp ugt i128 %a, %b
  %d = select i1 %c, i64 %y, i64 %x
  ret i64 %d
}
```

This currently generates 8 instructions, but it can be done in 4. The 4 extra instructions are spent translating the result of cmpu into a boolean.

With D151358, this also impacts the efficiency of i128 smin/smax/umin/umax.
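
For reference, a rough C sketch of what the IR above computes once the i128 operands are split into 64-bit halves. This only illustrates the semantics of the compare-and-select. It is not the VE instruction sequence, and the helper name `ugt128` and the hi/lo parameter split are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: unsigned a > b for 128-bit values given as
   64-bit high/low halves. The high halves decide the result unless
   they are equal, in which case the low halves decide it. */
static int ugt128(uint64_t a_hi, uint64_t a_lo, uint64_t b_hi, uint64_t b_lo) {
    if (a_hi != b_hi)
        return a_hi > b_hi;
    return a_lo > b_lo;
}

/* Mirrors @f from the IR: returns y when a >u b, otherwise x. */
static uint64_t f(uint64_t x, uint64_t y,
                  uint64_t a_hi, uint64_t a_lo,
                  uint64_t b_hi, uint64_t b_lo) {
    return ugt128(a_hi, a_lo, b_hi, b_lo) ? y : x;
}
```

The select only needs the ordering of the two operands, so the comparison result does not have to exist as a separate 0/1 value before the selection happens.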
</pre>