[PATCH] D42616: [X86] Emit 15-byte NOPs on recent AMD targets, else default to 10-byte NOPs (PR22965)

Sun Jan 28 10:11:46 PST 2018

craig.topper added a comment.

In https://reviews.llvm.org/D42616#990021, @spatel wrote:

> In https://reviews.llvm.org/D42616#989981, @RKSimon wrote:
>
> > What about the 11-byte NOPs for bdver - worth doing even though we'll either need to hijack FeatureXOP or use up another feature bit?
>
>
> Are we approaching a feature bit count limit? If not, I think it's better to make an explicit bit.

The feature bit limit is currently 192 and we're using 112. So we've got 80 more to go.

> 
> 
> > Not sure on padding through prefix/imm-extension - @craig.topper @spatel any ideas? We've recently been trying harder to reduce the size of immediates...
> 
> What's the advantage of implicit padding via immediate widening vs. actual nops? I'm not familiar with how we do the padding currently, but it should be possible to choose the padding method in whatever pass handles that transform?

Actual nops take a decoder slot and emit a uop. Widening an instruction instead would avoid that.
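To put a rough number on that cost, here is a small sketch of how many NOP instructions a given amount of padding needs for the two maximum NOP lengths this patch chooses between. The helper name `nop_count` is made up for illustration and is not an LLVM function; it only assumes the padding is filled greedily with the longest NOP available.

```python
def nop_count(padding_bytes, max_nop_len):
    """Number of NOP instructions needed to fill padding_bytes,
    assuming we always emit the longest NOP that still fits."""
    return (padding_bytes + max_nop_len - 1) // max_nop_len  # ceiling division

# Worst case for 16-byte alignment is 15 bytes of padding.
print(nop_count(15, 10))  # 2 NOPs with a 10-byte maximum
print(nop_count(15, 15))  # 1 NOP with a 15-byte maximum
```

Each of those NOPs still occupies a decoder slot, which is what widening an existing instruction would avoid.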

Unfortunately, there's not a pass that does the padding per se. Alignment is a field stored in the MachineBasicBlock until conversion to MC. At MC conversion the alignment is translated to an "alignment fragment". Instructions are immediately encoded into binary and stored in a "data fragment". The exception is relative jumps, which start out with a 1-byte offset and are stored in an "instruction fragment".

Once we have everything converted to fragments we go through an iterative process to determine the final size and offset of every fragment. During this process we have to determine whether a jump needs to be enlarged because the 1-byte offset isn't enough. If it does, it has a ripple effect on every fragment after it. This can cause later jumps with negative offsets to need to be expanded as well. Once a jump has been expanded it won't be able to shrink again, so eventually this process will terminate. Throughout the process the alignment fragment size is recalculated to satisfy the desired alignment. Once the iteration process finishes, the alignment fragment will be converted to NOPs and the jump instructions will be encoded. More information here: https://eli.thegreenplace.net/2013/01/03/assembler-relaxation
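As a rough illustration of that loop, here is a toy model in Python. It is not the MC layer's code: the `Fragment` class, the `layout` and `relax` helpers, and the 2-byte/5-byte jump sizes (x86 `jmp rel8` vs `jmp rel32`) are simplifying assumptions made for this sketch only.

```python
from dataclasses import dataclass

SHORT_JMP, LONG_JMP = 2, 5  # assumed sizes: x86 jmp rel8 / jmp rel32

@dataclass
class Fragment:
    kind: str            # "data", "align", or "jump"
    size: int = 0        # current size in bytes
    alignment: int = 1   # used by "align" fragments only
    target: int = -1     # used by "jump" fragments: index of the target fragment
    relaxed: bool = False

def layout(frags):
    """Assign an offset to every fragment, resizing alignment padding as we go."""
    offsets, pos = [], 0
    for f in frags:
        if f.kind == "align":
            f.size = -pos % f.alignment  # bytes needed to reach the next boundary
        offsets.append(pos)
        pos += f.size
    return offsets

def relax(frags):
    """Redo the layout until no short jump is out of rel8 range."""
    rounds = 0
    while True:
        rounds += 1
        offsets = layout(frags)
        changed = False
        for i, f in enumerate(frags):
            if f.kind != "jump" or f.relaxed:
                continue
            # Displacement is measured from the end of the jump instruction.
            disp = offsets[f.target] - (offsets[i] + f.size)
            if not -128 <= disp <= 127:
                f.size, f.relaxed = LONG_JMP, True  # jumps only grow, never shrink
                changed = True
        if not changed:
            return offsets, rounds

frags = [
    Fragment("data", 1),                  # 0: loop head
    Fragment("jump", SHORT_JMP, target=6),  # 1: forward jump
    Fragment("data", 122),                # 2
    Fragment("jump", SHORT_JMP, target=0),  # 3: backward jump
    Fragment("data", 10),                 # 4
    Fragment("align", alignment=16),      # 5: padding, later turned into NOPs
    Fragment("data", 4),                  # 6: aligned block
]
offsets, rounds = relax(frags)
print(offsets, rounds)  # [0, 1, 6, 128, 133, 143, 144] 3
print(frags[5].size)    # 1 byte of padding in the final layout
```

In this example the forward jump is out of range on the first pass; growing it pushes the backward jump out of range on the second pass, and the padding shrinks from 7 to 4 to 1 bytes along the way. The padding size is only known once the loop has settled.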

So I don't know where immediate widening fits into that since we don't know the size of the nops we need until after most of the instructions have been encoded.

Repository:
  rL LLVM

https://reviews.llvm.org/D42616