[PATCH] D75268: A light-weight solution to align branches within 32B boundary by prefix padding

Thu Feb 27 10:19:24 PST 2020

reames added a comment.

I think this is a step in the right direction.  I expect we'll need to iterate on the design to allow further padding without extensive memory usage, but starting with a single padding instruction seems like a reasonable starting point.  Being able to iterate in tree is obvious valuable, so I think cleaning this up and getting something in is a good idea.

I'd like to suggest a couple of design changes to simplify the code here though.

First, I think we should introduce a new MCFragment type for the padding opportunity.  Having MCBranchAlign mean both "this is a place we need to enforce alignment" and "this is a place we can optional add padding for a later alignment" is confusing.  I'd suggest something along the lines of a MCPrefixPaddingFragment.  I think splitting this will make the code a lot more readable.

Second, I think we should rebase this on D75203 <https://reviews.llvm.org/D75203>.  The advantage of doing so is that we could completely remove the MCPrefixPaddingFragment from relaxation, and only adjust it's size afterwards.  Relaxation would be responsible for figuring out how much padding is needed and recording that in the MCBoundaryAlign, and then the post pass would divide the padding between nop, prefix, and other relaxable instructions.

Third, the complexity of the state machine for inserting fragments really bothers me.  I don't have a concrete suggestion here, but I think we need to simplify this.

Hm, after reflecting a bit on points 1 and 2, I may have an alternate suggestion which eliminates (1) entirely.  The basic idea is that we treat prefix padding as a thing done to relaxable instructions.  A relaxable instruction has all the MCInst info to determine legal prefixes.  If we added an interface to MCAsmBackend along the lines of "padInstructionEncoding(MCRelaxableFragment&)" which takes a fragment and changes it's encoding to increase encoding size, this would nicely fit into the design of D75203 <https://reviews.llvm.org/D75203>, and would allow all the prefix logic to live inside a single function in the X86AsmBackend.  Then we'd just have to decide which instructions to make relaxable instead of directly adding to the data fragment.  (i.e. the state machine)

On the surface, that seems like it would work well.  I'm going to prototype that a bit on top of D75203 <https://reviews.llvm.org/D75203> and see if it works as well as it seems.

================
Comment at: llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp:385
+  // Currently don't deal with Bundle cases.
+  if (OS.getAssembler().isBundlingEnabled())
     return false;
----------------
This looks like a potentially unrelated change.  Can it be separated?

================
Comment at: llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp:493
+  case X86::CS:
+    return 0x2e;
+  case X86::SS:
----------------
Please replace constants with X86::CS_PREFIX and friends.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75268/new/

https://reviews.llvm.org/D75268