[LLVMdev] writeNopData and non-instructions in .text

Thu Sep 18 14:25:38 PDT 2014

> All this makes me wonder:
> (1) Why do we allow the backend to fail at all? Shouldn't the
> "pad-with-0" or so behavior be the default?

Probably, yes. I can’t think of a counterargument, anyway.

> (2) What is the expected order? Pad to instruction size first or last?

The X86 implementation specifics here were chosen simply to match the cctools as(1) implementation, as that made doing things like binary diffs of the output easier when first bringing up the integrated assembler. Now that we’re long past that, if there’s something more compelling we should do instead, let’s do that.

For example, it’s been requested from time to time that we pad (between functions) with UD2 instead on x86. That seems a reasonable thing to consider, though it would have to be measured carefully for impact on branch prediction, return address prediction, etc. Even in theoretically unreachable code, there can be interesting interactions in general with blobs of anything in between functions, and I don’t know how (or whether) UD2 interacts with any of that. This would also require distinguishing padding inside a function, for things like basic block alignment, from padding in between functions and in other places that are supposed to be unreachable for execution.

> On Sep 17, 2014, at 2:43 AM, Joerg Sonnenberger <joerg at britannica.bec.de> wrote:
> 
> On Tue, Sep 16, 2014 at 11:33:13PM -0400, David Majnemer wrote:
>> I would be in favor of the following:
>> 1. If the start is aligned *and* the length is aligned, use nops.
>> 2. If the start is aligned but the length is not aligned, insert as many
>> nops possible but pad out with zero.
>> 3. Otherwise (if the start is misaligned), use *just* zeros.
> 
> From reading MCAssembler.cpp, the function is always called before an
> aligned place to pad. As such seems the correct behavior is to assume
> that end will be aligned and "end - k * instruction size" should be
> padded with nops and [start, end - k * instruction size) should be
> padded with plain nulls?
> 
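
To make the quoted layout concrete, here is a minimal sketch of it: zeros in [start, end - k * instruction size), NOPs in the rest. It assumes a fixed-width ISA where the NOP encoding is exactly one instruction wide. The name padZerosThenNops and its signature are hypothetical and are not part of MCAssembler or any LLVM API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not an LLVM API. Fills the gap [Start, End),
// assuming End is aligned to the instruction size (taken here to be the
// size of the NOP encoding). Leading bytes that cannot hold a whole
// instruction become zeros; the rest is filled with copies of Nop.
std::vector<uint8_t> padZerosThenNops(uint64_t Start, uint64_t End,
                                      const std::vector<uint8_t> &Nop) {
  assert(!Nop.empty() && Start <= End);
  const uint64_t InstSize = Nop.size();
  assert(End % InstSize == 0 && "sketch assumes the end is aligned");

  const uint64_t Len = End - Start;
  const uint64_t K = Len / InstSize;          // whole instructions that fit
  const uint64_t Zeros = Len - K * InstSize;  // bytes before end - K * InstSize

  std::vector<uint8_t> Out(Zeros, uint8_t{0});
  for (uint64_t I = 0; I < K; ++I)
    Out.insert(Out.end(), Nop.begin(), Nop.end());
  return Out;
}
```

With a 4-byte NOP, a gap from offset 6 to offset 16 would come out as two zero bytes followed by two NOPs.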

Another option would be "Pad to minimum instruction alignment first, then NOP instructions, then pad any remaining bytes at the end."

I don’t have any strong preference on how to use zeros vs. instructions beyond agreeing that we should always fall back to zeros if the backend doesn’t implement NOP padding.
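
A rough sketch of that ordering, including the fall-back to zeros, might look like the following. It assumes the minimum instruction alignment equals the size of a single NOP encoding, and it models "the backend doesn't implement NOP padding" as an empty NOP byte sequence. The name padAlignNopsTail and its signature are hypothetical, not the writeNopData interface.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not an LLVM API. Produces Len padding bytes for a
// gap beginning at offset Start:
//   1. zeros up to the next instruction-aligned offset,
//   2. as many whole NOPs as fit,
//   3. zeros for whatever is left at the end.
// An empty Nop stands in for a backend with no NOP padding support.
std::vector<uint8_t> padAlignNopsTail(uint64_t Start, uint64_t Len,
                                      const std::vector<uint8_t> &Nop) {
  std::vector<uint8_t> Out;
  if (Nop.empty()) {
    Out.assign(Len, uint8_t{0});  // fall back to zeros only
    return Out;
  }

  const uint64_t InstSize = Nop.size();
  // Bytes needed to reach instruction alignment, capped at the gap length.
  uint64_t Head = (InstSize - Start % InstSize) % InstSize;
  if (Head > Len)
    Head = Len;
  Out.assign(Head, uint8_t{0});

  const uint64_t K = (Len - Head) / InstSize;  // whole NOPs that fit
  for (uint64_t I = 0; I < K; ++I)
    Out.insert(Out.end(), Nop.begin(), Nop.end());

  Out.resize(Len, uint8_t{0});  // zero-fill any remaining tail bytes
  assert(Out.size() == Len);
  return Out;
}
```

This version never fails: every case that cannot be covered by whole NOPs ends up as zeros, which is the one point everyone in the thread seems to agree on.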

-Jim