[llvm-dev] XRay: Demo on x86_64/Linux almost done; some questions.

Fri Jul 29 11:00:53 PDT 2016

On 28 July 2016 at 16:14, Serge Rogatch via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> Can I ask you why you chose to patch both function entrances and exits,
> rather than just patching the entrances and (in the patches) pushing on the
> stack the address of __xray_FunctionExit, so that the user function returns
> normally (with RETQ or POP RIP or whatever else instruction) rather than
> jumping into __xray_FunctionExit?

> This approach should also be faster because smaller code better fits in CPU
> cache, and patching itself should run faster (because there is less code to
> modify).

It may well be slower. Larger CPUs tend to track the call stack in
hardware, pairing each call with its expected return, so returning to
an address that was pushed manually is an inevitable branch mispredict
in those cases.
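
To make that concrete, here is a toy model of a hardware return
predictor. It is only a sketch: the function name
count_return_mispredicts, the event format and the symbolic addresses
("start+1", "main+1") are made up for illustration, and real predictors
are more complicated (finite depth, implementation-specific behaviour
on underflow). The only thing it assumes is that a call pushes its
return address onto a hardware-side stack and a return pops one entry
as its prediction, while a plain push is invisible to the predictor.

```python
def count_return_mispredicts(events):
    """Count mispredicted returns in a toy return-address-stack model.

    events is a list of (kind, address) pairs:
      ("call", return_address)  a call instruction; the predictor records it
      ("ret", actual_target)    a return; the predictor pops its guess
    A manual push of an address is deliberately absent from the event
    list, because in this model the predictor never sees it.
    """
    predictor_stack = []
    mispredicts = 0
    for kind, address in events:
        if kind == "call":
            predictor_stack.append(address)
        elif kind == "ret":
            # An empty predictor stack has no useful guess; count it as a miss.
            predicted = predictor_stack.pop() if predictor_stack else None
            if predicted != address:
                mispredicts += 1
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return mispredicts


# Exit sled jumps into __xray_FunctionExit, which then returns to the
# caller: every ret still matches the call that the predictor recorded.
JUMP_EXIT = [
    ("call", "start+1"),  # start calls main
    ("call", "main+1"),   # main calls the instrumented function f
    ("ret", "main+1"),    # __xray_FunctionExit returns to main
    ("ret", "start+1"),   # main returns to start
]

# Entry sled pushes the address of __xray_FunctionExit, so f's own ret
# lands there first and the predictor is one entry out of step afterwards.
PUSHED_EXIT = [
    ("call", "start+1"),             # start calls main
    ("call", "main+1"),              # main calls f
    ("ret", "__xray_FunctionExit"),  # f returns to the pushed address
    ("ret", "main+1"),               # __xray_FunctionExit returns to main
    ("ret", "start+1"),              # main returns to start
]

if __name__ == "__main__":
    print("jump-based exit:", count_return_mispredicts(JUMP_EXIT))
    print("pushed address: ", count_return_mispredicts(PUSHED_EXIT))
```

In this model the jump-based exit mispredicts no returns, while the
pushed-address variant mispredicts the return into __xray_FunctionExit
and then every return after it, because the predictor has been popped
once more than it was pushed.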

Cheers.

Tim.