[PATCH] D41721: [X86] Move HasNOPL to a subtarget feature bit. Plumb MCSubtargetInfo through the MCAsmBackend constructor
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 4 14:47:46 PST 2018
RKSimon added reviewers: pcordes, andreadb.
RKSimon added subscribers: pcordes, andreadb.
RKSimon added a comment.
In https://reviews.llvm.org/D41721#967576, @craig.topper wrote:
> Limit NOP length to 10 on everything but AMD fam 16h and 17h. @RKSimon you mentioned fam 15h in your previous update, but the optimization manual I found for fam 15h only talks about 11 byte.
Yes, I think you're right - the fam15 sog says 3 prefix bytes are the max to avoid a decode stall, so 11 bytes should be the limit, but going to just 10 bytes is probably OK. Sorry I was going off a comment in PR22965 without rechecking.
> We could maybe go to 11 instead of 10. But Intel documentation only lists out to 9 bytes. And the current 10-15 byte sequences we do are different than the 10-15 byte sequences in AMD's manual. We use a CS prefix that's not mentioned in the optimization manuals. The CS prefix does appear in the binutils 10 byte sequence. I believe binutils stops at 10 bytes for all CPUs.
If we're limiting the 11-15 byte options to the AMD targets, I think it'd be better to follow their recommendations (which luckily are the same in both the 16h and 17h sogs).
I feel guilty for asking for this change as it looks like the max nop length change should be part of a separate patch as it might have performance effects that need reviewing.
@andreadb @pcordes do you have any suggestions?
More information about the llvm-commits