<html>

    <head>

      <base href="https://bugs.llvm.org/">

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW - [x86] Esoteric macro-fusion opportunity: converting jumps to fuseable conditional branches"

   href="https://bugs.llvm.org/show_bug.cgi?id=38079">38079</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>[x86] Esoteric macro-fusion opportunity: converting jumps to fuseable conditional branches

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>new-bugs

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>unspecified

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>Linux

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>enhancement

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>new bugs

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>jari@kirma.fi

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>llvm-bugs@lists.llvm.org

          </td>

        </tr></table>

      <p>

        <div>

<pre>This is just a wild idea. In short, Intel macro-fusion can fuse an and/test

instruction with any conditional branch that directly follows it, for instance:

...

and ...

jnc ...

Because and always clears the carry flag, jnc is always taken, so this is the same as:

and ...

[... reorderable instructions ...]

jmp ...

but some forms of the and/jnc pair can be macro-fused as above, reducing the

pair from two uops to one.
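
The equivalence above can be checked with a small model. What follows is a
minimal Python sketch, not LLVM code: and_flags and jnc_taken are hypothetical
helper names, and only the CF/ZF behaviour of a 64-bit and is modelled.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF  # model 64-bit registers


def and_flags(a, b):
    """Model x86 `and`: return (result, CF, ZF). `and` always clears CF."""
    result = (a & b) & MASK64
    carry = 0  # architecturally defined: and clears CF (and OF)
    zero = 1 if result == 0 else 0
    return result, carry, zero


def jnc_taken(carry):
    """`jnc` branches exactly when CF is 0."""
    return carry == 0


# Whatever the operands are, the jnc that follows an and is taken.
samples = [(0, 0), (1, 1), (MASK64, MASK64), (0x1234, 0xFF00), (MASK64, 0)]
always_taken = all(jnc_taken(and_flags(a, b)[1]) for a, b in samples)
print(always_taken)
```

Since the model hard-codes CF = 0 after and, the check only illustrates why a
jnc placed right after an and behaves like an unconditional jump.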

I understand that this optimization seems odd. Multiple issues arise:

- Branch predictor load is increased

- Conditional branches may be easily mispredicted if they are not in hot loops,

and mispredictions quickly eliminate the benefit

- Code in a hot loop is less likely to contain an unconditional jump at all

(my guess).

Still, I want to bring up this esoteric corner case optimization.

This method can also be applied to add, sub, adc and sbb if an invariant is

known to hold; for instance, jnc can be used as in the example above if the

result is known not to set the carry flag (that is, it never wraps around as an

unsigned integer). Also, if the result is known to be non-zero, jnz can be used

in the case of inc/dec.</pre>
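<pre>The carry invariant for add can be illustrated the same way. This is a rough
Python sketch under stated assumptions: add_flags and jmp_can_become_jnc are
hypothetical names, operands are taken to be unsigned 64-bit values, and the
check runs on concrete values, whereas a compiler would have to prove the
invariant statically.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF  # model 64-bit registers


def add_flags(a, b):
    """Model x86 `add` on 64-bit operands: return (result, CF, ZF)."""
    full = a + b
    result = full & MASK64
    carry = 1 if full > MASK64 else 0  # CF is set on unsigned wraparound
    zero = 1 if result == 0 else 0
    return result, carry, zero


def jmp_can_become_jnc(a, b):
    """The rewrite is only sound when the add is known not to carry."""
    return add_flags(a, b)[1] == 0


print(jmp_can_become_jnc(40, 2))      # no wraparound: jnc is always taken
print(jmp_can_become_jnc(MASK64, 1))  # wraps: jnc would fall through
```

The second call shows why the invariant matters: once the add can wrap, jnc no
longer behaves like jmp.</pre>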

        </div>

      </p>


    </body>

</html>