[PATCH] D23931: [XRay] ARM 32-bit no-Thumb support in LLVM

Tue Aug 30 07:19:45 PDT 2016

rSerge added inline comments.

================
Comment at: include/llvm/CodeGen/AsmPrinter.h:209
@@ +208,3 @@
+    bool AlwaysInstrument;
+    const class Function *Fn;
+  };
----------------
dberris wrote:
> Do you need to spell out 'class' here? Wouldn't `const Function*` suffice?
I just moved (copy-pasted) this from X86AsmPrinter.h . Without `class` it does not compile because XRayFunctionEntry already has a member wih the same name: `const MCSymbol *Function` .

================
Comment at: include/llvm/Target/Target.td:970
@@ +969,3 @@
+  let hasSideEffects = 0; // FIXME: is this correct?
+  let isReturn = 0; // Original return instruction will follow
+}
----------------
dberris wrote:
> This one is a little harder. At least in x86, we weren't able to get this to work this way, because stack adjustments may happen later than the insertion of the marker instruction. Unless you can control exactly when this instruction is inserted and that the stack adjustment code doesn't ever move this (or add things after this instruction) then you might want to go do the same thing that we're doing in X86.
The same thing as in x86_64 is not possible for ARM because it has multiple return instructions. Furthermore, CPU allows parametrized and even conditional return instructions. In the current ARM implementation we are making use of the fact that currently LLVM doesn't seem to generate conditional return instructions. 
On ARM, the same instruction can be used for popping multiple registers from the stack and returning (it just pops `pc` register too), and LLVM generates it sometimes.  So we can't insert the sled between this stack adjustment and the return without splitting the original instruction into 2 instructions. So on ARM, rather than jumping into the exit trampoline, we call it, it does the tracing, preserves the stack and returns.

================
Comment at: include/llvm/Target/TargetOpcodes.def:161
@@ +160,3 @@
+/// for inserting instrumentation instructions at runtime.
+/// The patch here prepends the return instruction.
+HANDLE_TARGET_OPCODE(PATCHABLE_FUNCTION_EXIT)
----------------
dberris wrote:
> Is there any reason to do this instead of following the same convention used in x86 of having the nops be after the return instruction?
Yes, as I've explained above, the problem is that ARM has multiple return instructions, so we have to preserve the original return instruction and call the exit tracing trampoline instead of jumping into it. I'm adding a comment in the code too.

================
Comment at: lib/CodeGen/XRayInstrumentation.cpp:46-53
@@ +45,10 @@
+  void replaceRetWithPatchableRet(MachineFunction &MF, const TargetInstrInfo *TII);
+  // Prepend the original return instruction with the exit sled code ("patchable
+  //   function exit" pseudo-instruction), preserving the original return instruction
+  //   just after the exit sled code.
+  // This is the approach to go on CPUs which have multiple options for the return
+  //   instruction, like ARM. For such CPUs we can't just jump into the XRay trampoline
+  //   and issue a single return instruction there. We rather have to call the
+  //   trampoline and return from it to the original return instruction of the
+  //   function being instrumented.
+  void prependRetWithPatchableExit(MachineFunction &MF, const TargetInstrInfo *TII);
----------------
dberris wrote:
> This is a great explanation. Can you say something similar in the description just so it's clear why there's a difference in the approach?
I'm adding it to llvm\include\llvm\Target\TargetOpcodes.def .

https://reviews.llvm.org/D23931