[llvm-dev] Finding caller-saved registers at a function call site

Tue Jun 28 13:53:17 PDT 2016

Hi Rob,

Robert Lyerly wrote:
 > The reason I can't just run a liveness analysis over stack slots and
 > registers in the backend is that I'm trying to map live value locations
 > back up into their corresponding values in LLVM bitcode.  This is why
 > I'm using the stackmap intrinsic, as it does exactly that -- provides a
 > mapping between a bitcode value and its storage location for the
 > generated assembly.  I need this intermediate-level value because I'm
 > doing ABI translation.  I'm plucking values out of a call frame laid out
 > in one ABI and storing them in a destination stack frame that is laid
 > out according to another ABI.  The IR value is essentially the "key"
 > used to match corresponding storage locations across the two ABIs.  I'm
 > transforming a thread's current stack laid out for one ABI into one laid
 > out for another ABI.

This sounds exactly like the deoptimization[1] mechanism we use (and
that LLVM has support for), except that when deoptimizing, the code
being returned into (and the associated frame layout) is generally
"fixed", i.e. it is the interpreter or a low tier JIT.

 >     A related question is: are you interested in the *values* or the
 >     *locations* the values are in?  For instance if a specific value (say
 >     the result of a load) is spilled at 0x80(%rsp) and is also present in
 >     %r13 (callee saved register), then do you have to know both the
 >     locations or just one of the two?
 >
 >
 > I'm actually only interested in being able to find values; I don't
 > particularly care about where they're stored.  In your hypothetical, as
 > long as the compiler could tell me that the value was stored in one of
 > those locations, that'd be okay.

Again, this makes it very close to deoptimization.

 > I'm not concerned about values that are not live across the call ("live
 > on call"), only those that are live after returning from the call ("live
 > on return").  If the value is not live after the call, there's no need
 > for me to be able to recover it.  I just need to be able to resume
 > execution in that function correctly, so I'm only concerned about values
 > in caller-saved registers that are needed after the call completes, and
 > therefore have been spilled to the stack as part of the procedure call
 > standard.
 >
 > Because I'm rewriting the stack to change the ABI, I need to be able to
 > set up the stack so that execution can correctly unwind back up the call
 > chain.  This means that I need to be able to populate spill stack slots
 > for caller-saved registers, hence this is why I need their locations.

Ah, so the spill slots are not just pertinent from the POV of the
function you're translating out of, but are also pertinent for the
function you're translating *into*?

IOW, you want to translate

void foo_0() {
   spill %rax to offset 0x90
   call bar
   reload %rax from offset 0x90
   return %rax
}

to

void foo_1() {
   spill %rax to offset 0x100
   call bar
   reload %rax from offset 0x100
   return %rax
}

at the call site to bar, and want to know that the contents of 0x90
need to be copied to 0x100 if you rewrite the stack frame?
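
To make that concrete, here is a toy Python model of the copy I have in
mind (a sketch only, not LLVM code).  It assumes each compilation gives
you a per-call-site map from an IR value to its spill slot offset; the
value name "%v", the dict-based frames and copy_spill_slots are all made
up for illustration, and only the offsets come from the example above.

```python
# Hypothetical per-call-site records: IR value name -> spill slot offset.
# Offsets match foo_0 / foo_1 above; the value name "%v" is invented.
SLOTS_FOO_0 = {"%v": 0x90}
SLOTS_FOO_1 = {"%v": 0x100}


def copy_spill_slots(src_frame, src_slots, dst_slots):
    """Build a destination frame by copying each value between its slots.

    Frames are modelled as dicts from frame offset to slot contents.
    The IR value name is the key that pairs up slots across the two layouts.
    """
    dst_frame = {}
    for value, src_off in src_slots.items():
        if value not in dst_slots:
            # The destination compilation has no slot for this value.
            raise KeyError(f"{value} has no slot in the destination frame")
        dst_frame[dst_slots[value]] = src_frame[src_off]
    return dst_frame


src_frame = {0x90: 42}  # contents of the spilled %rax in foo_0's frame
dst_frame = copy_spill_slots(src_frame, SLOTS_FOO_0, SLOTS_FOO_1)
```

Here dst_frame ends up holding the spilled value at offset 0x100, which
is where foo_1 would reload it from after the call to bar returns.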

I'm not sure how much of your project you're okay discussing on a
public mailing list, but I suspect the best strategy here will depend
on how different the two functions are from each other.

If all they differ in is the physical stack slot offsets, then I'd
just look at opaquely rewriting the slot offsets and not specifically
care about live values.

If they differ at a fundamental level, then maybe you need something
like what we do for precise relocating GCs (see documentation on
gc.statepoint); otherwise, for instance, how do you know that a value
that is put in a caller-saved-register in one compilation is also in a
caller-saved-register (and not in a callee-saved-register or constant
folded or rematerialized away) in another compilation?
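
As a rough illustration of why the location *kind* matters, here is a
toy Python sketch (again not LLVM code; Location, read_value and
translate are hypothetical names, and the three kinds are a simplified
stand-in for what a real location record would distinguish).  It
assumes each compilation reports, per IR value, whether the value lives
in a register, in a stack slot, or has been folded to a constant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    kind: str      # "register", "stack" or "constant" (simplified model)
    where: object  # register name, frame offset, or the constant itself


def read_value(loc, regs, frame):
    """Fetch a value from wherever the source compilation left it."""
    if loc.kind == "register":
        return regs[loc.where]
    if loc.kind == "stack":
        return frame[loc.where]
    if loc.kind == "constant":
        return loc.where
    raise ValueError(f"unknown location kind: {loc.kind}")


def translate(src_locs, dst_locs, src_regs, src_frame):
    """Populate destination registers and slots for every value it needs."""
    dst_regs, dst_frame = {}, {}
    for value, dst_loc in dst_locs.items():
        if dst_loc.kind == "constant":
            continue  # folded into the destination code; nothing to store
        if value not in src_locs:
            raise KeyError(f"{value} cannot be recovered from the source")
        contents = read_value(src_locs[value], src_regs, src_frame)
        if dst_loc.kind == "register":
            dst_regs[dst_loc.where] = contents
        elif dst_loc.kind == "stack":
            dst_frame[dst_loc.where] = contents
        else:
            raise ValueError(f"unknown location kind: {dst_loc.kind}")
    return dst_regs, dst_frame


# %b sits in a callee-saved register in one compilation but is spilled in
# the other; %c is spilled in one and constant folded in the other.
src_locs = {"%a": Location("stack", 0x90),
            "%b": Location("register", "r13"),
            "%c": Location("stack", 0x98)}
dst_locs = {"%a": Location("stack", 0x100),
            "%b": Location("stack", 0x108),
            "%c": Location("constant", 7)}
dst_regs, dst_frame = translate(src_locs, dst_locs,
                                {"r13": 11}, {0x90: 42, 0x98: 7})
```

A plain offset-to-offset rewrite would handle %a but not %b or %c, which
is the case where you need per-value location information on both sides.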

-- Sanjoy

[1]: Hölzle, Urs, Craig Chambers, and David Ungar. “Debugging
   optimized code with dynamic deoptimization.” ACM SIGPLAN
   Notices. Vol. 27. No. 7. ACM, 1992.