[llvm-dev] How to add assembly instructions in CodeGen

Wed May 9 06:54:32 PDT 2018

Hi Dean,

I looked at XRay. I had also been thinking along similar lines: add the
assembly instructions as auxiliary template code and jump to it. However,
that may still misalign the stack, so I have to think about it. But your
XRay code does give me the courage to think about this seriously.

Thank you for your help. I also figured out that we can access certain
CodeGen features right from the IR level, as you explained when describing
your tussle with handling IR and CodeGen together. Hopefully I can work out
a convenient way.

Regards,
Soham Sinha
PhD Student, Department of Computer Science
Boston University

On Mon, May 7, 2018 at 8:38 PM, Dean Michael Berris <dean.berris at gmail.com>
wrote:

> On Tue, May 8, 2018 at 4:06 AM Soham Sinha <soham1 at bu.edu> wrote:
>
> > Hello Dean,
>
> > I looked at the XRay instrumentation. That's a nice engineering effort.
> > I am sure you had your motivation for doing this in CodeGen, just as I
> > did. I don't understand all of your code, but I get the idea that you
> > are adjusting the alignment with explicit bytes and no-op instructions.
> > My problem is closely related to yours: my stack pointer ($rsp) is
> > misaligned in printf.
>
> > Having said that, I am not sure whether I need the engineering effort
> > that you have pursued. I am trying to add function calls at some places
> > in the machine code. I followed the x86-64 calling convention to do so:
> > I saved (pushed onto the stack) all the necessary registers (I also
> > tried saving all 16 registers), filled in the 3 arguments in rdi, rsi
> > and rdx, called the desired function, and then popped the registers.
> > Mathematically, saving the 16 registers should not break the alignment
> > of the stack pointer. But when I debug with gdb, I see that the
> > alignment sometimes breaks during the push operations of the 16
> > registers, and it shows up as broken alignment in the printf function.
> > I am very confused about what can go wrong here. This is why I was
> > trying to rely on LLVM to maintain the alignment.
>
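
The push arithmetic described above can be checked with a small simulation.
What follows is a minimal sketch, not code from LLVM or XRay: it only does
integer arithmetic on a pretend stack pointer, and every name in it
(rspAfterPushes, isCallAligned, paddingBeforeCall) is hypothetical. It
assumes the SysV x86-64 rule that rsp is 16-byte aligned at the point of a
call. Sixteen 8-byte pushes move rsp by 128 bytes, a multiple of 16, so they
preserve whatever alignment rsp already had; they do not create alignment if
the insertion point was at 8 modulo 16 to begin with.

```cpp
#include <cassert>
#include <cstdint>

// Simulated stack-pointer arithmetic only; no real registers are touched.
constexpr std::uint64_t kPushBytes = 8;   // size of one 64-bit push
constexpr std::uint64_t kCallAlign = 16;  // alignment assumed at a call (SysV x86-64)

// Value of rsp after `count` pushes (the stack grows toward lower addresses).
constexpr std::uint64_t rspAfterPushes(std::uint64_t rsp, unsigned count) {
  return rsp - kPushBytes * count;
}

// True when a `call` issued at this rsp would meet the assumed requirement.
constexpr bool isCallAligned(std::uint64_t rsp) {
  return rsp % kCallAlign == 0;
}

// Extra bytes to subtract from rsp so that a following call is aligned.
inline std::uint64_t paddingBeforeCall(std::uint64_t rsp) {
  assert(rsp % kPushBytes == 0 && "rsp is expected to be 8-byte aligned");
  return rsp % kCallAlign;
}
```

With these helpers, an rsp of 0x7000 stays aligned after 16 pushes, while an
rsp of 0x7008 does not, and paddingBeforeCall reports the 8 bytes that would
have to be subtracted before the call and added back after it.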
> > Interestingly, I check the alignment of the function at the start of
> > runOnMachineFunction and again at the end of runOnMachineFunction
> > (after my push, function call and pop). The alignment stays the same,
> > 4 (16 bytes). Therefore, I guess, the BuildMI function doesn't maintain
> > the alignment and doesn't even report the broken alignment through the
> > alignment variable of MachineFunction. I access the alignment through
> > the function getAlignment. I think BuildMI should have taken care of
> > the alignment or at least updated the alignment value.
>
>
> IIRC, getAlignment() tells you the function's *code* alignment, not whether
> the stack is aligned to a certain boundary at a given point. I don't know
> whether that information is maintained per MachineBasicBlock, because the
> decision on whether to spill variables onto the stack is done on a
> per-function-call basis -- you may need to look at the way functions are
> lowered specifically in X86 to see the (complicated) logic to figure out
> whether/how to spill which registers onto the stack and how to lay out the
> stack.
>
> To address this partially, we not only insert the custom event
> pseudo-instruction, but we dispatch to a trampoline that's defined in
> compiler-rt -- that code will maintain the stack alignment before making a
> function call. It saves all the relevant registers first, aligns the stack,
> then calls the function -- upon return we restore the registers from the
> stack. Essentially we're doing a context-switch, which might be what you're
> looking to do as well. That code is in compiler-rt hand-written as x86_64
> assembly.
>
> See
> https://github.com/llvm-mirror/compiler-rt/blob/master/lib/xray/xray_trampoline_x86_64.S#L224
> for some inspiration.
>
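
The save, align, call, restore sequence described above can be modeled in a
few lines. This is a hedged sketch in C++ rather than the hand-written
assembly in compiler-rt, and it is not taken from the XRay trampoline: the
names alignDown, CallWindow and enterAlignedCall are hypothetical. It assumes
one common way of aligning, which is to mask off the low bits of the stack
pointer (the effect of `and $-16, %rsp`) after remembering the old value.

```cpp
#include <cassert>
#include <cstdint>

// Round `value` down to a multiple of `alignment` (a power of two).
// On x86-64 the same effect on the stack pointer is `and $-16, %rsp`.
inline std::uint64_t alignDown(std::uint64_t value, std::uint64_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0 &&
         "alignment must be a power of two");
  return value & ~(alignment - 1);
}

// Hypothetical record of the stack pointer around an instrumented call.
struct CallWindow {
  std::uint64_t rspToRestore;  // rsp before alignment, restored after the call
  std::uint64_t rspAtCall;     // rsp in effect while the callee runs
};

// Model of "remember rsp, align it, call": returns both values so the
// original rsp can be restored once the callee returns.
inline CallWindow enterAlignedCall(std::uint64_t rsp) {
  return CallWindow{rsp, alignDown(rsp, 16)};
}
```

The original rsp has to be kept somewhere the callee will not clobber (a
callee-saved register or a stack slot), because masking discards the low
bits and cannot be undone by arithmetic alone.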
> The custom event instrumentation points just call into the trampoline,
> setting up the arguments on the spot. We've had to do some gymnastics to
> make that happen all the way up to the IR -- i.e. we insert the
> instrumentation as calls to LLVM intrinsics at the IR, and preserve those
> all the way down to the codegen. Doing it another way seemed much too hard,
> as you may be finding out. :(
>
> > I am afraid that if I follow your path of instrumentation, I might
> > ultimately face the same issue of not being able to maintain the
> > alignment. Your effort is quite similar to what I am trying to do, but
> > I am just doing it in the MachineFunctionPass itself.
>
> > It's non-trivial and tedious to change the internals of CodeGen
> > because the LLVM MC infrastructure is very much intertwined with the
> > assembler. That makes compilation faster but instrumentation tougher.
> > This is why I wrote a MachineFunctionPass, so that my instrumentation
> > stays self-contained like a module. I add my MachineFunctionPass at the
> > end of the addPreEmitPass phase of X86.
>
> > I wish LLVM provided more modular ways of instrumenting at this level,
> > just as it does at the LLVM IR level.
>
>
> I have the same wish -- it'd be great if we could move the XRay
> instrumentation to normal MachineFunctionPass implementations.
>
> Just a thought -- have you considered using XRay instrumentation as a
> framework instead to accomplish what you're trying to do? I mean, instead
> of implementing your own pass?
>
> > Regards,
> > Soham Sinha
> > PhD Student, Department of Computer Science
> > Boston University
>
> > On Mon, May 7, 2018 at 1:20 AM, Dean Michael Berris
> > <dean.berris at gmail.com> wrote:
>
> >> On Sun, May 6, 2018 at 7:26 AM Soham Sinha via llvm-dev <
> >> llvm-dev at lists.llvm.org> wrote:
>
> >> > Hello,
>
> >> > I want to add assembly instructions at certain points in a function.
> >> > This is X86 specific, so I am working in the lib/Target/X86 folder. I
> >> > created a `MachineFunctionPass` in that folder and registered it in
> >> > X86TargetMachine.cpp in addPreEmitPass(). I use BuildMI to insert my
> >> > own assembly instructions in the MachineFunctionPass. This works, and
> >> > my assembly instructions are inserted at the desired places. However,
> >> > this breaks the alignment, so when I run the generated code I get a
> >> > segmentation fault (precisely, in printf with XMM registers). Where
> >> > should I add my pass?
>
>
> >> It sounds like you're running into stack alignment issues. If you're
> >> adding data to the stack, you may need to work a little harder at
> >> maintaining the state of the stack. This is not trivial to do,
> >> especially if you're emitting the assembly by the time you're at a
> >> MachineFunctionPass (because register spilling and stack alignment
> >> decisions would have already been made by the time you're in machine
> >> instruction lowering). What you may need to do here is one of the
> >> following:
>
> >> - hook into the preamble and stack re-alignment code specifically in
> >>   X86, so that it looks at information from your pass. This is not
> >>   trivial and I don't recommend going down this path (I tried, but I
> >>   lost the patience to do it properly).
>
> >> - when emitting the assembly instructions that involve pushing to or
> >>   popping from the stack, make sure that you're keeping track of the
> >>   alignment of the stack variables. This is what we do with XRay when
> >>   we're lowering the custom event sleds.
>
> >> - use pseudo-instructions and preserving those until lowering, where the
> >> lowering
>
> >> > My pass depends on the MachineBasicBlock information as well.
> >> > Therefore, I cannot add my pass too early, at the LLVM IR level. What
> >> > is the proper place to add my custom MachineFunctionPass? I tried
> >> > addPreRegAlloc, but it failed with an insufficient register
> >> > allocation error or something along those lines.
>
> >> > Can anybody please help me write a MachineFunctionPass where I can
> >> > insert assembly instructions without breaking the alignment? I am
> >> > doing this for X86_64.
>
>
> >> You can look at the XRay lowering of PATCHABLE_EVENT_CALL in
> >> X86AsmPrinter as a guide, but you might also want to see how we're
> >> inserting these pseudo-instructions from the
>
> >> I don't remember having to specify where the pass is defined, since
> >> it's already in the assembly printing. So you might consider inserting
> >> these pseudo-instructions in the MachineFunctionPass, which then get
> >> lowered appropriately in the assembly printer. Unfortunately I don't
> >> think there's a generic way of doing this (yet) with the X86 back-end.
> >> There might be a good case for making this easier, but so far these
> >> kinds of things haven't been important enough to fix.
>
> >> Hope this helps!
> >> --
> >> Dean
>
>
>
>
> --
> Dean
>