[llvm-dev] Status of Intel JCC Mitigations and Next Steps
Philip Reames via llvm-dev
llvm-dev at lists.llvm.org
Wed Mar 25 15:53:38 PDT 2020
On 3/25/20 2:22 PM, Eli Friedman wrote:
>
> I agree we shouldn’t try to guess what the user is trying to do.
> There shouldn’t be an unbounded set of heuristic rules; “documented”
> implies some sort of promise of stability in addition to the actual
> text in the manual. And we shouldn’t try to guess whether the user’s
> code cares about the length of a specific instruction.
>
> I think you’re creating a false dichotomy here, though. There’s some
> space between “the assembler can rewrite any instruction” and “the
> assembler should guess what the user is doing”. The patches you’re
> referring to are dealing with a very specific issue: there are
> multiple ways users specify prefixes for certain instructions.
> Sometimes the prefixes are integrated into the instruction. Sometimes
> they’re written separately, like “rep; stosb”. And sometimes users end
> up using raw “.byte” directives. So we add a rule: if we see
> something that looks like a prefix, followed by an instruction, we
> assume the combination might actually be a prefixed instruction, and
> don’t rewrite it. This is a heuristic in some sense: we can’t prove
> whether the prefixed instruction will ever execute at runtime. But
> adding prefixes to the instruction is likely problematic.
>
I see this as very much a slipper slope problem. As you mention, I am
trying to draw a hard line, but that's mostly because I can't figure out
what another reasonable line to draw might be. Why is one idiomatic
construct supported, but another not? Why is "label: nop; mov $(rax),
%rax" (with the expectation a fault occurs at label + 1 byte) any less
reasonable? If we say that's reasonable, and should just work, well, we
literally can't implement prefix padding at all. Unless maybe there's
something I'm missing here?
> We could take a different tack on this: instead of giving up pure
> “strictness”, we could instead give the user a diagnostic on patterns
> we recognize. But forcing users to rewrite “rep;stosb”->“rep stosb”
> doesn’t seem productive.
>
> -Eli
>
> *From:* Philip Reames <listmail at philipreames.com>
> *Sent:* Wednesday, March 25, 2020 1:34 PM
> *To:* Eric Christopher <echristo at gmail.com>; Eli Friedman
> <efriedma at quicinc.com>
> *Cc:* Luo, Yuanke <yuanke.luo at intel.com>; llvm-dev
> <llvm-dev at lists.llvm.org>; Zhang, Annita <annita.zhang at intel.com>;
> Craig Topper <craig.topper at intel.com>
> *Subject:* [EXT] Re: [llvm-dev] Status of Intel JCC Mitigations and
> Next Steps
>
> The slightly unexpected bit for me in these responses is the
> willingness to accept layout changes if documented. Let me lay out
> some options, which of these seem reasonable?
>
> 1) Default to automatic padding, provide option to disable, document
> where padding is inserted.
>
> 2) Default to not automatically padding, provide option to enable,
> document where padding would be inserted if enabled.
>
> And orthogonality to the above, two interpretations of each:
>
> a) auto padding is allowed to break common idioms (because you enabled
> it explicitly or because you can disable). If it does, you are
> expected to either simply not use it, or possibly use fine grained
> directives in source.
>
> b) auto padding conservative even when explicitly enabled. One
> implication of this scheme is that different versions of the assembler
> will almost by definition have to tweak insertion as we find new
> problematic idioms. This creates both documentation difficulties, and
> user confusion.
>
> So, four choices total. Which seem reasonable?
>
> Personally, I'd have no problem w/ 2a. Any of the other variants
> concern me.
>
> Philip
>
> On 3/25/20 1:08 PM, Eric Christopher wrote:
>
> FWIW I'm with Eli here if you need any more data points.
>
> -eric
>
> On Tue, Mar 24, 2020 at 8:21 PM Eli Friedman via llvm-dev
> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>
> Changing the length of a sequence of assembly instructions
> will break someone’s code at some point. The length of a
> sequence of instructions is known, in general, and people will
> write code to take advantage of that. For example, I’ve seen
> assembly code using something like Duff’s device, except that
> instead of using a jump table, it just computed the
> destination as “base+n*caselength”. Given that, I don’t think
> it’s reasonable to hide this mechanism from user control.
>
> We definitely should not have any undocumented or
> unpredictable behavior in the assembler. The actual
> instruction bytes matter. That said, I’m not sure there’s a
> strong line between “automagic” and “explicit”, as long as the
> rules are documented.
>
> -Eli
>
> *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org
> <mailto:llvm-dev-bounces at lists.llvm.org>> *On Behalf Of
> *Philip Reames via llvm-dev
> *Sent:* Tuesday, March 24, 2020 3:55 PM
> *To:* llvm-dev <llvm-dev at lists.llvm.org
> <mailto:llvm-dev at lists.llvm.org>>
> *Cc:* Luo, Yuanke <yuanke.luo at intel.com
> <mailto:yuanke.luo at intel.com>>; Zhang, Annita
> <annita.zhang at intel.com <mailto:annita.zhang at intel.com>>;
> Craig Topper <craig.topper at intel.com
> <mailto:craig.topper at intel.com>>
> *Subject:* [EXT] [llvm-dev] Status of Intel JCC Mitigations
> and Next Steps
>
> TLDR - We have a choice to make about assembler support, and a
> disagreement about how to move forward. Community input needed.
>
> Background
>
> Intel has a hardware bug in Skylake and later whose mitigation
> requires padding of branches to avoid performance
> degradation. Background here:
> https://www.intel.com/content/dam/support/us/en/documents/processors/mitigations-jump-conditional-code-erratum.pdf
>
> We now have in tree support for alignment of such branches via
> nop padding, and limited support for padding existing
> instructions with either prefixes or larger immediate values.
> This has survived several days of dedicated testing and
> appears to be reasonably robust. The padding support applies
> both to branch alignment for the mitigation, but also normal
> align directives.
>
> The original patches proposed a somewhat different approach
> than we've ended up taking - primarily because of memory
> overhead concerns. However, there was also a deeper
> disagreement on the original review threads (D70157 and
> others) which was never settled, and we seem to be at a point
> where this needs attention. In short, the question is how
> assembler support should be handled.
>
> The Choice
>
> The problematic use case comes when assembling user provided
> .s files. (Instead of the more restricted output of the
> compiler.) Our basic choice is do we want to force a new
> directive syntax (and thus a source code change to use the new
> feature), or attempt to automatically infer where it's safe to
> insert padding?
>
> The options as I see them:
>
> * Assembler directives w/explicit opt in - In this model,
> assembler input is assumed to only enable padding in
> regions where it is safe to do so.
> * Automagic assembler - In this model, the assembler is
> responsible for inferring where it is legal to pad without
> breaking user expectations.
>
> (I'll stop and disclaim that I'm strongly in favor of the
> former. I've tried to describe the pros/cons of each, but my
> perspective is definitely biased.)
>
> The difference between the two is a huge amount of complexity,
> and a very fundamental correctness risk. The basic problem is
> that assemblers have to handle unconstrained inputs, and IMO,
> the semantics of assembler as used in practice is so under
> specified that it's really hard to infer semantics in any
> useful way. As a couple of examples, is the fault behavior of
> an instruction well defined? Is the label near an instruction
> used by the signal handler? Is that data byte just before an
> instruction actually decoded as part of the instruction?
>
> The benefit of the later option is that existing assembly
> files can be used without modification. This is a huge
> advantage in terms of ease of mitigation for existing code
> bases. It's also the approach the original patch sets for GCC
> took.
>
> In the original review thread(s), I had taken the position
> that we should reject the automagic assembler based on the
> correctness concerns mentioned. I had thought the consensus
> in the review was clearly in that direction as well, but this
> has recently come up again. Given that, I wanted to open it
> to a wider audience.
>
> Why am I pushing for a decision now?
>
> There are two major reasons. First, there have recently been
> a couple of patches posted and landed (D76176, and D76052)
> building towards the automagic assembler variant. And second,
> I've started getting review comments
> (https://reviews.llvm.org/D76398#1930383) which block forward
> progress on generalized padding support assuming the automagic
> interpretation. Implementing the automatic assembler variant
> for prefix and immediate padding adds substantial complexity
> and I would very much like not to bother with if I don't have to.
>
> Current implementation details
>
> We have support in the integrated assembler only for
> autopadding suppression. This allows a LLVM based compiler to
> effectively apply padding selectively. In particular, we've
> instrumented lowering from MI to MC (X86MCInstLowering.cpp) to
> selectively disable padding around constructs which are
> thought to be problematic. We do not have an agreed upon
> syntax for this in assembler; the code that got checked in is
> modeled closely around the last seriously discussed variant
> (see below). This support is able to use all of the padding
> variants: nop, prefix, and immediate.
>
> We also have limited support in the assembler for not
> inserting nops between fragments where doing so would break
> known idioms. The list of such idioms is, IMO, ad hoc. This
> assembler support does not include prefix or immediate padding.
>
> Philip
>
> p.s. For those interested, here's roughly what the last round
> of assembler syntax I remember being discussed looked like.
>
> .autopadding
> .noautopadding
>
> These two directives would respectively enable and disable
> automatic padding of instructions within the region defined.
> It's presumed to be legal to insert nops between instructions,
> modify encodings, or otherwise adjust offsets of instruction
> boundaries within the region to achieve target specific
> desired alignments. Similarly, it's presumed not to be legal
> to change relative offsets outside an explicitly enabled
> region. (Except for existing cases - e.g. relaxation of
> branches, etc...)
>
> The assembler would provide a command line flag which
> conceptually wrapped the whole file in a pair of
> enable/disable directives.
>
> We'd previously discussed a variant with push/pop semantics
> and more fine grained control over alignment requests, but I
> believe we decided that was overkill in the end. (I walked
> away with that impression based on the integrated assembler
> work at least.)
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200325/51deb070/attachment.html>
More information about the llvm-dev
mailing list