<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - New code-gen options for retpolines and straight line speculation"
   href="https://bugs.llvm.org/show_bug.cgi?id=52323">52323</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>New code-gen options for retpolines and straight line speculation
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>clang
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>unspecified
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>C
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedclangbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>andrew.cooper3@citrix.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>blitzrakete@gmail.com, dgregor@apple.com, erik.pilkington@gmail.com, llvm-bugs@lists.llvm.org, richard-llvm@metafoo.co.uk
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Hello

[FYI, this is being cross-requested of GCC too]

Linux and other kernel-level software make use of
-mindirect-branch=thunk-extern to be able to alter the handling of indirect
branches at boot.  It turns out to be advantageous to inline the thunks when
retpoline is not in use. 
<a href="https://lore.kernel.org/lkml/20211026120132.613201817@infradead.org/">https://lore.kernel.org/lkml/20211026120132.613201817@infradead.org/</a> is some
infrastructure to make this work.

In some cases, we want to be able to inline an `lfence; jmp *%reg` thunk.  This
is fine for the low 8 registers, where it fits in the 5 bytes of the original
`call`, but not for %r{8..15}, where the REX prefix pushes the replacement size
to 6 bytes.

It would be very useful to have a code-gen option to write out
`call %cs:__x86_indirect_thunk_r{8..15}`, where the redundant %cs prefix
increases the instruction length to 6 bytes, allowing the non-retpoline form to
be inlined.
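
A rough sketch of the byte counting behind this request, in Python for
convenience.  The lengths are the standard x86-64 encodings; the constant and
function names are made up for illustration and are not part of any compiler
or kernel interface.

```python
# Encoded lengths, in bytes, of the x86-64 instructions involved.
CALL_REL32 = 5   # e8 plus a 4-byte displacement
CS_PREFIX = 1    # 2e segment override, the redundant prefix requested above
LFENCE = 3       # 0f ae e8
JMP_REG = 2      # ff plus a ModRM byte, e.g. jmp *%rax
REX = 1          # extra prefix needed to address %r8..%r15


def inline_thunk_len(reg_index):
    """Length of `lfence; jmp *%reg` for register number 0..15."""
    rex = REX if reg_index in range(8, 16) else 0
    return LFENCE + rex + JMP_REG


def call_site_len(cs_prefixed):
    """Length of the `call __x86_indirect_thunk_*` being replaced."""
    return CALL_REL32 + (CS_PREFIX if cs_prefixed else 0)


def shortfall(reg_index, cs_prefixed):
    """Bytes missing at the call site for the inline form (0 if it fits)."""
    return max(0, inline_thunk_len(reg_index) - call_site_len(cs_prefixed))


for reg in (0, 8):
    for cs in (False, True):
        print(f"reg={reg} cs_prefix={cs} shortfall={shortfall(reg, cs)}")
```

With these lengths, the only combination with a non-zero shortfall is a high
register with an unprefixed call.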


Relatedly, x86 straight line speculation has been discussed before, but without
any action taken.  It would be helpful to have a code-gen option which would
emit `int3` following any `ret` instruction and any indirect jump, as in
neither case does architectural execution continue to the following
instruction.

The reason these two are related is that if both options are in use, we want an
extra byte of replacement space to be able to inline `lfence; jmp *%reg; int3`.
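
Counting bytes for the combined case, as an illustrative Python sketch.  It
assumes the standard encodings (3-byte `lfence`, 2-byte `jmp *%reg` plus a
1-byte REX prefix for %r{8..15}, 1-byte `int3`); the names are made up for the
example.

```python
LFENCE = 3    # 0f ae e8
JMP_REG = 2   # ff plus a ModRM byte
REX = 1       # needed for %r8..%r15
INT3 = 1      # cc


def inline_len(high_reg, sls):
    """Length of `lfence; jmp *%reg`, optionally followed by `int3`."""
    rex = REX if high_reg else 0
    trap = INT3 if sls else 0
    return LFENCE + rex + JMP_REG + trap


for high_reg in (False, True):
    without_sls = inline_len(high_reg, sls=False)
    with_sls = inline_len(high_reg, sls=True)
    print(f"high_reg={high_reg} without_sls={without_sls} with_sls={with_sls}")
```

The straight line speculation variant is one byte longer in both register
classes, which is the extra byte of replacement space mentioned above.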


Third, Clang has been observed to emit conditional tail calls as `Jcc
__x86_indirect_thunk_*`.  This is a 6 byte source size, but needs up to 9 bytes
of space for inlining, including an `int3` for straight line speculation reasons
(see <a href="https://lore.kernel.org/lkml/20211026120310.359986601@infradead.org/">https://lore.kernel.org/lkml/20211026120310.359986601@infradead.org/</a> for
full details).  It might be enough to simply prohibit an optimisation like this
when trying to pad retpolines for inlineability.</pre>
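<pre>The 9 byte figure for the conditional tail call case can be reproduced with the
sketch below.  It assumes the inlined form is a short `Jcc` with the inverted
condition jumping over `lfence; jmp *%reg; int3`; that sequence is an
assumption based on the linked kernel patch, and the names are illustrative.

```python
JCC_REL32 = 6   # 0f 8x plus a 4-byte displacement, as emitted by the compiler
JCC_REL8 = 2    # 7x plus a 1-byte displacement, inverted condition
LFENCE = 3      # 0f ae e8
JMP_REG = 2     # ff plus a ModRM byte
REX = 1         # needed for %r8..%r15
INT3 = 1        # cc


def inlined_jcc_len(high_reg, sls):
    """Length of the assumed inline replacement for `Jcc thunk`."""
    rex = REX if high_reg else 0
    trap = INT3 if sls else 0
    return JCC_REL8 + LFENCE + rex + JMP_REG + trap


worst = inlined_jcc_len(high_reg=True, sls=True)
print(f"worst case {worst} bytes, {worst - JCC_REL32} more than the source")
```

Under these assumptions the worst case is 3 bytes larger than the 6 byte
`Jcc` it would replace.</pre>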
        </div>
      </p>


    </body>
</html>