[PATCH] D126116: [X86][BOLT][NFC] Use getOperandType to determine memory access size

Fri Jun 3 11:59:22 PDT 2022

rafauler added a comment.

Let me try to elaborate on why this diff is important and has functional changes.

Previously, we were looking at register size to try to gauge how wide is the memory access: either source or destination register for a memory access. If we couldn't find one, we would say it is a "size zero" access. Returning "size 0" essentially disables all frame analyses because you have an unrecognized access to the stack. This diff looks at memory operands, which makes it cover a lot more cases.

We should probably add to this to the comments of this function so it becomes clear:

Classifying a stack access as *not* "SIMPLE" here means we don't know how to change this instruction memory access. It will disable any changes to the stack layout, so we can't do the most aggressive form of shrink wrapping. We must do so in a way that keeps the original stack layout. Otherwise you need to adjust the offset of all instructions accessing the stack: we can't do that anymore because there is one instruction that is not simple.  There are other implications as well.  We have heuristics to detect when a register is callee-saved and thus eligible for shrink wrapping. If you are restoring a register using a non-simple stack access, then it is classified as NOT callee-saved, and it disables shrink wrapping for *that* register (but not for others).

Classifying a stack access as "size 0" or detecting an indexed memory access (to address a vector, for example) here means we know there is a stack access, but we can't quite understand how wide is the access in bytes. This is very serious because we can't understand how memory accesses alias with each other for this function. This will essentially disable not only shrink wrapping but all frame analysis, it will fail it as "we don't understand this function and we give up on it". Because this diff changes the instructions classified as "size 0", it pretty much boosts how general are all frame optimizations.

Turns out that we were failing to classify the size of quite a few instructions, that's why I have the diff on https://reviews.llvm.org/D126112 to try to cover a few of the cases of non-simple stack accesses, but whose size should be different than zero. As you can see, those are instructions that lack a register operand. I also have one extra record for an instruction that does have a register, but it was wrongly classified as non-simple.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126116/new/

https://reviews.llvm.org/D126116