[llvm-dev] RFC: XRay -- A Function Call Tracing System

Thu Apr 28 23:46:20 PDT 2016

On Fri, Apr 29, 2016 at 3:37 PM Sanjoy Das <sanjoy at playingwithpointers.com>
wrote:

> Hi Dean, Eric,
>
> This is great!  I'll wait for others from the community to chime in,
> but I'm happy to see this land upstream.
>
>
Thanks Sanjoy!

>  > details are contained in the whitepaper at:
>  >
> https://storage.googleapis.com/xray-downloads/whitepaper/XRayAFunctionCallTracingSystem.pdf
>
> In the code that you patch in, have you considered excising the "r10d
> = <function id>" bit and just having a 5 byte call to
> __xray_FunctionEntryStub?  If "<function id>" is a function of the PC
> being instrumented then you should be able to reconstruct it by
> peeking into the return PC on the stack.  This means your return
> sequence will have to call @__xray_FunctionExitStub (instead of
> branching), so that the return PC on the stack is correct.
>
>
This was a trade-off between code size and runtime cost. If we had to find
the function id from the return PC on the stack, every call into the stub
would need a lookup in a table mapping PC->function id. We chose to pay
some cost in code size to avoid that lookup cost at runtime.

This might turn out to be something that could be tuned depending on the
situation, but the current implementation favours lower runtime overhead
when tracing over a smaller binary. I'm open to adding an alternative
implementation later that optimises for code size, so that's definitely
worth exploring.
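
To make the runtime cost concrete, here is a minimal sketch of the kind of
PC->function id lookup the 5-byte-call variant would need on every traced
call. It is only an illustration under stated assumptions: the `SledEntry`
layout, `kNoFunctionId`, and `lookupFunctionId` are hypothetical names, not
part of the XRay runtime, and a sorted table with binary search is just one
possible design.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical table entry (not an XRay type): the address of an
// instrumented function's entry sled and the id assigned to that function.
struct SledEntry {
  std::uintptr_t EntryPC;
  std::int32_t FunctionId;
};

// Hypothetical sentinel for "no instrumented function covers this PC".
constexpr std::int32_t kNoFunctionId = -1;

// Maps a return PC found on the stack back to a function id by finding the
// closest entry sled at or below that PC. This costs O(log n) per call,
// whereas loading the id into a register in the sled is a constant-time
// move paid for in code size.
std::int32_t lookupFunctionId(const std::vector<SledEntry> &Table,
                              std::uintptr_t ReturnPC) {
  // Assumption: the table is sorted by EntryPC so binary search is valid.
  assert(std::is_sorted(Table.begin(), Table.end(),
                        [](const SledEntry &A, const SledEntry &B) {
                          return A.EntryPC < B.EntryPC;
                        }));
  // First entry whose sled address is strictly greater than ReturnPC.
  auto It = std::upper_bound(Table.begin(), Table.end(), ReturnPC,
                             [](std::uintptr_t PC, const SledEntry &E) {
                               return PC < E.EntryPC;
                             });
  if (It == Table.begin())
    return kNoFunctionId; // ReturnPC is below every known sled.
  return std::prev(It)->FunctionId;
}
```

With a 5-byte call in the entry sled, the return PC would sit just past the
sled address, so the nearest entry at or below it identifies the function.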

> Using a 5 byte call will let you keep one 5 byte nop instead of the 10/11
> byte nop sequence you have today (will really only help for the prologue).
>
>  > *LLVM:* Add new LLVM pseudo instructions for indicating where the
>  > instrumentation sleds should be placed, and what kind of sleds these
>  > are. For entry sleds, we can use the `PATCHABLE_OP` instruction, and a
>  > new instruction for patching exits (`PATCHABLE_RET`?) that get handled
>  > by the assembly printer per platform. Currently only implemented for
>  > x86_64.
>
> Nitpicky comment -- we can discuss this in more detail during code
> review:
>
> I'd prefer to add more functionality to `PATCHABLE_OP` instead of
> introducing yet one more pseudo instruction.  `PATCHABLE_OP` can
> already wrap an arbitrary machine instruction, teaching it a few
> different flavors of nop-insertion should not be a big deal.
>
>
Indeed -- although I'm more worried about the stack lowering code getting
affected by the fact that `PATCHABLE_OP` is neither a return instruction
nor a terminator. Whether it's possible to make `PATCHABLE_OP` count as
both a normal instruction and a terminator/return at the same time is
something I haven't considered.

Note that I tried doing this with a single pseudo instruction, and in my
prototype implementation it got really messy (mostly because stack
adjustments in prologue emission are predicated on finding a return
instruction to trigger the logic). Maybe we just need to change that part
to handle `PATCHABLE_OP` appropriately, but that's definitely something to
discuss in code review. :D
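
As a rough illustration of the problem, here is a toy model (not LLVM code)
of a pass that keys stack adjustments on return instructions. `ToyInstr`,
`ToyBlock`, and `insertEpilogues` are hypothetical names made up for this
sketch; the real logic lives in LLVM's frame lowering and is more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction and the flags a pass might query.
struct ToyInstr {
  std::string Opcode;
  bool IsReturn;
  bool IsTerminator;
};

using ToyBlock = std::vector<ToyInstr>;

// Inserts an "EPILOGUE" marker (standing in for the stack adjustment)
// before every instruction flagged as a return, and reports how many
// markers were inserted.
int insertEpilogues(ToyBlock &Block) {
  int Inserted = 0;
  for (std::size_t I = 0; I < Block.size(); ++I) {
    if (!Block[I].IsReturn)
      continue;
    Block.insert(Block.begin() + static_cast<std::ptrdiff_t>(I),
                 ToyInstr{"EPILOGUE", false, false});
    ++I; // Skip over the return that was just shifted one slot right.
    ++Inserted;
  }
  return Inserted;
}
```

In this model a plain `RET` flagged as a return gets its marker, while a
wrapper such as `PATCHABLE_OP(RET)` that carries neither flag is skipped,
which is the kind of breakage a single pseudo instruction would have to
avoid.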

Cheers