[llvm-dev] Status of Intel JCC Mitigations and Next Steps
Philip Reames via llvm-dev
llvm-dev at lists.llvm.org
Tue Mar 24 15:54:44 PDT 2020
TLDR - We have a choice to make about assembler support, and a
disagreement about how to move forward. Community input needed.
Background
Intel has a hardware bug in Skylake and later whose mitigation requires
padding of branches to avoid performance degradation. Background here:
https://www.intel.com/content/dam/support/us/en/documents/processors/mitigations-jump-conditional-code-erratum.pdf
We now have in tree support for alignment of such branches via nop
padding, and limited support for padding existing instructions with
either prefixes or larger immediate values. This has survived several
days of dedicated testing and appears to be reasonably robust. The
padding support applies both to branch alignment for the mitigation, but
also normal align directives.
The original patches proposed a somewhat different approach than we've
ended up taking - primarily because of memory overhead concerns.
However, there was also a deeper disagreement on the original review
threads (D70157 and others) which was never settled, and we seem to be
at a point where this needs attention. In short, the question is how
assembler support should be handled.
The Choice
The problematic use case comes when assembling user provided .s files.
(Instead of the more restricted output of the compiler.) Our basic
choice is do we want to force a new directive syntax (and thus a source
code change to use the new feature), or attempt to automatically infer
where it's safe to insert padding?
The options as I see them:
* Assembler directives w/explicit opt in - In this model, assembler
input is assumed to only enable padding in regions where it is safe
to do so.
* Automagic assembler - In this model, the assembler is responsible
for inferring where it is legal to pad without breaking user
expectations.
(I'll stop and disclaim that I'm strongly in favor of the former. I've
tried to describe the pros/cons of each, but my perspective is
definitely biased.)
The difference between the two is a huge amount of complexity, and a
very fundamental correctness risk. The basic problem is that assemblers
have to handle unconstrained inputs, and IMO, the semantics of assembler
as used in practice is so under specified that it's really hard to infer
semantics in any useful way. As a couple of examples, is the fault
behavior of an instruction well defined? Is the label near an
instruction used by the signal handler? Is that data byte just before
an instruction actually decoded as part of the instruction?
The benefit of the later option is that existing assembly files can be
used without modification. This is a huge advantage in terms of ease of
mitigation for existing code bases. It's also the approach the original
patch sets for GCC took.
In the original review thread(s), I had taken the position that we
should reject the automagic assembler based on the correctness concerns
mentioned. I had thought the consensus in the review was clearly in
that direction as well, but this has recently come up again. Given
that, I wanted to open it to a wider audience.
Why am I pushing for a decision now?
There are two major reasons. First, there have recently been a couple
of patches posted and landed (D76176, and D76052) building towards the
automagic assembler variant. And second, I've started getting review
comments (https://reviews.llvm.org/D76398#1930383) which block forward
progress on generalized padding support assuming the automagic
interpretation. Implementing the automatic assembler variant for prefix
and immediate padding adds substantial complexity and I would very much
like not to bother with if I don't have to.
Current implementation details
We have support in the integrated assembler only for autopadding
suppression. This allows a LLVM based compiler to effectively apply
padding selectively. In particular, we've instrumented lowering from MI
to MC (X86MCInstLowering.cpp) to selectively disable padding around
constructs which are thought to be problematic. We do not have an
agreed upon syntax for this in assembler; the code that got checked in
is modeled closely around the last seriously discussed variant (see
below). This support is able to use all of the padding variants: nop,
prefix, and immediate.
We also have limited support in the assembler for not inserting nops
between fragments where doing so would break known idioms. The list of
such idioms is, IMO, ad hoc. This assembler support does not include
prefix or immediate padding.
Philip
p.s. For those interested, here's roughly what the last round of
assembler syntax I remember being discussed looked like.
.autopadding
.noautopadding
These two directives would respectively enable and disable automatic
padding of instructions within the region defined. It's presumed to be
legal to insert nops between instructions, modify encodings, or
otherwise adjust offsets of instruction boundaries within the region to
achieve target specific desired alignments. Similarly, it's presumed not
to be legal to change relative offsets outside an explicitly enabled
region. (Except for existing cases - e.g. relaxation of branches, etc...)
The assembler would provide a command line flag which conceptually
wrapped the whole file in a pair of enable/disable directives.
We'd previously discussed a variant with push/pop semantics and more
fine grained control over alignment requests, but I believe we decided
that was overkill in the end. (I walked away with that impression based
on the integrated assembler work at least.)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200324/3999fdc8/attachment.html>
More information about the llvm-dev
mailing list