[llvm-dev] Status of Intel JCC Mitigations and Next Steps

Wed Mar 25 13:08:26 PDT 2020

FWIW I'm with Eli here if you need any more data points.

-eric

On Tue, Mar 24, 2020 at 8:21 PM Eli Friedman via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Changing the length of a sequence of assembly instructions will break
> someone’s code at some point.  The length of a sequence of instructions is
> known, in general, and people will write code to take advantage of that.
> For example, I’ve seen assembly code using something like Duff’s device,
> except that instead of using a jump table, it just computed the destination
> as “base+n*caselength”.  Given that, I don’t think it’s reasonable to hide
> this mechanism from user control.
>
>
>
> We definitely should not have any undocumented or unpredictable behavior
> in the assembler.  The actual instruction bytes matter.  That said, I’m not
> sure there’s a strong line between “automagic” and “explicit”, as long as
> the rules are documented.
>
>
>
> -Eli
>
>
>
> *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org> *On Behalf Of *Philip
> Reames via llvm-dev
> *Sent:* Tuesday, March 24, 2020 3:55 PM
> *To:* llvm-dev <llvm-dev at lists.llvm.org>
> *Cc:* Luo, Yuanke <yuanke.luo at intel.com>; Zhang, Annita <
> annita.zhang at intel.com>; Craig Topper <craig.topper at intel.com>
> *Subject:* [EXT] [llvm-dev] Status of Intel JCC Mitigations and Next Steps
>
>
>
> TLDR - We have a choice to make about assembler support, and a
> disagreement about how to move forward.  Community input needed.
>
>
>
> Background
>
> Intel has a hardware bug in Skylake and later whose mitigation requires
> padding of branches to avoid performance degradation.  Background here:
> https://www.intel.com/content/dam/support/us/en/documents/processors/mitigations-jump-conditional-code-erratum.pdf
>
> We now have in tree support for alignment of such branches via nop
> padding, and limited support for padding existing instructions with either
> prefixes or larger immediate values.  This has survived several days of
> dedicated testing and appears to be reasonably robust.  The padding support
> applies both to branch alignment for the mitigation, but also normal align
> directives.
>
> The original patches proposed a somewhat different approach than we've
> ended up taking - primarily because of memory overhead concerns.  However,
> there was also a deeper disagreement on the original review threads (D70157
> and others) which was never settled, and we seem to be at a point where
> this needs attention.  In short, the question is how assembler support
> should be handled.
>
>
>
> The Choice
>
> The problematic use case comes when assembling user provided .s files.
> (Instead of the more restricted output of the compiler.)  Our basic choice
> is do we want to force a new directive syntax (and thus a source code
> change to use the new feature), or attempt to automatically infer where
> it's safe to insert padding?
>
> The options as I see them:
>
>    - Assembler directives w/explicit opt in - In this model, assembler
>    input is assumed to only enable padding in regions where it is safe to do
>    so.
>    - Automagic assembler - In this model, the assembler is responsible
>    for inferring where it is legal to pad without breaking user expectations.
>
> (I'll stop and disclaim that I'm strongly in favor of the former.  I've
> tried to describe the pros/cons of each, but my perspective is definitely
> biased.)
>
> The difference between the two is a huge amount of complexity, and a very
> fundamental correctness risk.  The basic problem is that assemblers have to
> handle unconstrained inputs, and IMO, the semantics of assembler as used in
> practice is so under specified that it's really hard to infer semantics in
> any useful way.  As a couple of examples, is the fault behavior of an
> instruction well defined?  Is the label near an instruction used by the
> signal handler?  Is that data byte just before an instruction actually
> decoded as part of the instruction?
>
> The benefit of the later option is that existing assembly files can be
> used without modification.  This is a huge advantage in terms of ease of
> mitigation for existing code bases.  It's also the approach the original
> patch sets for GCC took.
>
> In the original review thread(s), I had taken the position that we should
> reject the automagic assembler based on the correctness concerns
> mentioned.  I had thought the consensus in the review was clearly in that
> direction as well, but this has recently come up again.  Given that, I
> wanted to open it to a wider audience.
>
>
>
> Why am I pushing for a decision now?
>
> There are two major reasons.  First, there have recently been a couple of
> patches posted and landed (D76176, and D76052) building towards the
> automagic assembler variant.  And second, I've started getting review
> comments (https://reviews.llvm.org/D76398#1930383) which block forward
> progress on generalized padding support assuming the automagic
> interpretation.  Implementing the automatic assembler variant for prefix
> and immediate padding adds substantial complexity and I would very much
> like not to bother with if I don't have to.
>
>
>
> Current implementation details
>
> We have support in the integrated assembler only for autopadding
> suppression.  This allows a LLVM based compiler to effectively apply
> padding selectively.  In particular, we've instrumented lowering from MI to
> MC (X86MCInstLowering.cpp) to selectively disable padding around constructs
> which are thought to be problematic.  We do not have an agreed upon syntax
> for this in assembler; the code that got checked in is modeled closely
> around the last seriously discussed variant (see below).  This support is
> able to use all of the padding variants: nop, prefix, and immediate.
>
> We also have limited support in the assembler for not inserting nops
> between fragments where doing so would break known idioms.  The list of
> such idioms is, IMO, ad hoc.  This assembler support does not include
> prefix or immediate padding.
>
>
>
> Philip
>
> p.s. For those interested, here's roughly what the last round of assembler
> syntax I remember being discussed looked like.
>
> .autopadding
> .noautopadding
>
> These two directives would respectively enable and disable automatic
> padding of instructions within the region defined.  It's presumed to be
> legal to insert nops between instructions, modify encodings, or otherwise
> adjust offsets of instruction boundaries within the region to achieve
> target specific desired alignments.  Similarly, it's presumed not to be
> legal to change relative offsets outside an explicitly enabled region.
> (Except for existing cases - e.g. relaxation of branches, etc...)
>
> The assembler would provide a command line flag which conceptually wrapped
> the whole file in a pair of enable/disable directives.
>
> We'd previously discussed a variant with push/pop semantics and more fine
> grained control over alignment requests, but I believe we decided that was
> overkill in the end.  (I walked away with that impression based on the
> integrated assembler work at least.)
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200325/4045c800/attachment.html>