<table border="1" cellspacing="0" cellpadding="8">

    <tr>

        <th>Issue</th>

        <td>

            <a href="https://github.com/llvm/llvm-project/issues/63435">63435</a>

        </td>

    </tr>

    <tr>

        <th>Summary</th>

        <td>

            [VE] Inefficient code for icmp i128

        </td>

    </tr>

    <tr>

      <th>Labels</th>

      <td>

      </td>

    </tr>

    <tr>

      <th>Assignees</th>

      <td>

      </td>

    </tr>

    <tr>

      <th>Reporter</th>

      <td>

          efriedma-quic

      </td>

    </tr>

</table>

<pre>

As discussed on https://reviews.llvm.org/D151358, the VE backend generates inefficient code for IR such as the following:

```

target triple = "ve"

define i64 @f(i64 %x, i64 %y, i128 %a, i128 %b) {

  %c = icmp ugt i128 %a, %b

  %d = select i1 %c, i64 %y, i64 %x

  ret i64 %d

}

```

This currently generates 8 instructions, but it can be done in 4. The 4 extra instructions are used only to translate the result of cmpu into a boolean.

With D151358, this also impacts the efficiency of i128 smin/smax/umin/umax.
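
For reference, a minimal C sketch of what the IR above computes, assuming each i128 operand is split into 64-bit high and low halves. The names `u64`, `ugt128`, and the `_hi`/`_lo` parameters are illustrative only; this shows the semantics of the comparison, not the instruction sequence the VE backend emits or should emit.

```c
typedef unsigned long long u64;

/* Hypothetical helper: unsigned 128-bit "a > b" on (hi, lo) halves. */
static int ugt128(u64 a_hi, u64 a_lo, u64 b_hi, u64 b_lo) {
    /* The high halves decide unless they are equal; then the low halves decide. */
    return a_hi != b_hi ? a_hi > b_hi : a_lo > b_lo;
}

/* Mirrors @f from the IR above: returns y when a > b, otherwise x. */
static u64 f(u64 x, u64 y, u64 a_hi, u64 a_lo, u64 b_hi, u64 b_lo) {
    return ugt128(a_hi, a_lo, b_hi, b_lo) ? y : x;
}
```

The select only needs the outcome of the comparison, so materializing it as a separate boolean value is not inherently required.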

</pre>
