[PATCH] D97268: [X86] Use correct padding when in 16-bit mode

Wed Feb 24 06:28:41 PST 2021

jyknight added a comment.

In D97268#2583366 <https://reviews.llvm.org/D97268#2583366>, @void wrote:

> That...shouldn't matter since a NOP is a NOP. Regardless, I removed that modification to simplify the patch.

It _does_ matter. To start with it can affect performance (marginally).

But more importantly, if you're doing tricky things with runtime code modification, it's a correctness issue. A trap won't leave the instruction pointer in the middle of a long-nop instruction (which you may be about to replace with another instruction), but it can leave it between two distinct nops.

================
Comment at: llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp:1128
+        "\x8d\xb4\x00\x00",
+        // nop; lea 0w(%si),%si
+        "\x90\x8d\xb4\x00\x00",
----------------
void wrote:
> jyknight wrote:
> > If you want to go the complex route, and support multi-byte nops in 16bit mode:
> > 
> > The 5 and larger-byte strings here should be removed, as they're combinations of smaller instructions. Leave only the initial 4 elements in the Nops16Bit array. Therefore, getMaximumNopSize() should return 4 in 16bit mode.
> > 
> That is needlessly complex. These patterns are from binutils, which claims that they're "efficient." The code in this patch is far easier to understand, in my opinion.
Adding an entirely distinct second method for emitting NOPs is not the answer to complexity of the existing implementation.

Using the same code with two sets of data is less complexity overall -- just need a single LOC to choose the correct table...

================
Comment at: llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp:1075

 unsigned X86AsmBackend::getMaximumNopSize() const {
   if (!STI.hasFeature(X86::FeatureNOPL) && !STI.hasFeature(X86::Mode64Bit))
----------------
MaskRay wrote:
> jyknight wrote:
> > I think we could just return 1 in 16-bit mode. There's no real reason to do nop-optimization in that mode, is there? Then the rest of the patch isn't necessary.
> > 
> I agree that performance in 16-bit mode is not important. We can just return 1 :)
Actually, changing my mind here. Supporting 4-byte NOPs in 16-bit mode is useful, in order that `.nops 4, 4` works (dunno if anyone actually does that in practice, but it is a reasonable thing to support.)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97268/new/

https://reviews.llvm.org/D97268