[PATCH] D70157: Align branches within 32-Byte boundary

Kan Shengchen via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 22 03:01:45 PST 2019


skan added a comment.

In D70157#1755927 <https://reviews.llvm.org/D70157#1755927>, @jyknight wrote:

> An alternative would be to simply emit NOPs before branches as needed. That would be substantially simpler, since it would only require special handling for a branch or a fused-branch. I assume things were done this substantially-more-complex way in order to reduce performance cost of inserting NOP instructions? Are there numbers for how much better it is to use segment prefixes, vs a separate nop instruction? It seems a little bit surprising to me that it would be that important, but I don't know...
>
> I'll note that the method here has the semantic issue of making it effectively impossible to ever evaluate an expression like ".if . - symbol == 24" (assuming we're emitting instructions), since every instruction can now change size. I suspect that will make it impossible to turn this on by default without breaking a lot of assembly code. Previously, only certain instructions, like branches or arithmetic ops with constant arguments of unknown value, could change size.


Thanks for your remind! Now, if `-malign-branch-prefix-size=0` is used, the method will only insert `BranchPadding` or `FusedJccPadding` before branches as needed, and insert `BranchPrefix` before the instruction which is macro fusible but not macro fused. In this condition, the operation is as simple as inserting nop only. Since more performance can be recovered by inserting prefixes than inserting nops,  I believe I should support prefix padding.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70157/new/

https://reviews.llvm.org/D70157





More information about the llvm-commits mailing list