[LLVMdev] RFC: Machine Level IR text-based serialization format
Alex L
arphaman at gmail.com
Thu Apr 30 13:28:48 PDT 2015
2015-04-30 12:54 GMT-07:00 Adrian Prantl <aprantl at apple.com>:
> >
> > On Apr 28, 2015, at 9:56 AM, Alex L <arphaman at gmail.com> wrote:
> >
> > Hi all,
> >
> > I would like to propose a text-based, human-readable format that will be
> > used to serialize the machine level IR. The major goal of this format is
> > to allow LLVM to save the machine level IR after any code generation pass
> > and then to load it again and continue running passes on the machine
> > level IR. The primary use case of this format is to make testing the code
> > generation passes easier, by allowing developers to write tests that
> > load the IR, invoke just a specific code gen pass, and then inspect the
> > output of that pass by checking the printed IR.
> >
> >
> > The proposed format has a number of key features:
> > - It stores the machine level IR and the optional LLVM IR in one text file.
> > - The connections between the machine level IR and the LLVM IR are preserved.
> > - The format uses a YAML based container for most of the data structures.
> >   The LLVM IR is embedded in the YAML container.
> > - The format also uses a new, text-based syntax to serialize the machine
> >   instructions. The instructions are embedded in YAML.
> >
> > This is an incomplete example of a YAML file containing the LLVM IR, the
> > machine level IR and the instructions:
> >
> > ---
> > ir: |
> >   define i32 @fact(i32 %n) {
> >     %1 = alloca i32, align 4
> >     store i32 %n, i32* %1, align 4
> >     %2 = load i32, i32* %1, align 4
> >     %3 = icmp eq i32 %2, 0
> >     br i1 %3, label %10, label %4
> >
> >   ; <label>:4 ; preds = %0
> >     %5 = load i32, i32* %1, align 4
> >     %6 = sub nsw i32 %5, 1
> >     %7 = call i32 @fact(i32 %6)
> >     %8 = load i32, i32* %1, align 4
> >     %9 = mul nsw i32 %7, %8
> >     br label %10
> >
> >   ; <label>:10 ; preds = %0, %4
> >     %11 = phi i32 [ %9, %4 ], [ 1, %0 ]
> >     ret i32 %11
> >   }
> >
> > ...
> > ---
> > number: 0
> > name: fact
> > alignment: 4
> > regInfo:
> >   ....
> > frameInfo:
> >   ....
> > body:
> >   - bb: 0
> >     llbb: '%0'
> >     successors: [ 'bb#2', 'bb#1' ]
> >     liveIns: [ '%edi' ]
> >     instructions:
> >       - 'push64r undef %rax, %rsp, %rsp'
> >       - 'mov32mr %rsp, 1, %noreg, 4, %noreg, %edi'
> >       - ....
> >       ....
> >   - bb: 1
> >     llbb: '%4'
> >     successors: [ 'bb#2' ]
> >     instructions:
> >       - '%edi = mov32rm %rsp, 1, %noreg, 4, %noreg'
> >       - ....
> >       ....
> >   - ....
> >   ....
> > ...
> >
> > The example above shows a YAML file with two YAML documents (delimited by
> > `---` and `...`) containing the LLVM IR and the machine function
> > information for the function `fact`.
> >
> >
> > When a specific format is chosen, I'll start with patches that serialize
> > the embedded LLVM IR. Then I'll add support for things like machine
> > functions and machine basic blocks, and I think that an intrusive
> > implementation will work best for data structures like these. After that
> > I will continue adding support for serialization of the remaining data
> > structures.
> >
> >
> > Thanks for reading through the proposal. What are your thoughts about
> > this format?
>
> I’m really looking forward to this; it will be extremely useful for
> testing the debug info backend.
> For debug nodes referenced via DBG_VALUE intrinsics, it looks like they
> could just point to the corresponding nodes in the optional IR.
> Are there any plans to represent metadata such as the DebugLoc(ations)
> attached to the machine instructions?
>
> -- adrian
Yes, the debug location that's attached to the machine instruction will be
serialized as well. I will describe how when I send out a patch that
serializes it.
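
As a rough illustration of the two-document layout in the quoted proposal
(one YAML document for the LLVM IR, one for the machine function, delimited
by `---` and `...`), here is a small Python sketch that splits such a file
into its documents. It is not the planned LLVM implementation: the function
name `split_yaml_documents` and the shortened `SAMPLE` input are made up for
this example, and the sketch only recognizes markers that start in column 0.

```python
def split_yaml_documents(text):
    """Split text into YAML documents delimited by '---' and '...' lines.

    Hypothetical helper for illustration only; a real tool would use a
    proper YAML parser. Markers are only recognized in column 0, so an
    indented elision such as '  ....' is treated as document content.
    """
    documents = []
    current = None  # None means we are currently outside any document
    for line in text.splitlines():
        marker = line.rstrip()
        if marker == "---":
            # A new document starts; flush a previous unterminated one.
            if current is not None:
                documents.append("\n".join(current))
            current = []
        elif marker == "..." and current is not None:
            # Explicit end-of-document marker.
            documents.append("\n".join(current))
            current = None
        elif current is not None:
            current.append(line)
    if current is not None:
        documents.append("\n".join(current))
    return documents


# Shortened stand-in for the example in the proposal (assumed content).
SAMPLE = """\
---
ir: |
  define i32 @fact(i32 %n) {
    ret i32 %n
  }
...
---
number: 0
name: fact
alignment: 4
...
"""

docs = split_yaml_documents(SAMPLE)
for index, doc in enumerate(docs):
    # Show the first line of each document to tell them apart.
    print(index, doc.splitlines()[0])
```

The first document then holds the embedded LLVM IR under the `ir` key and the
second one holds the machine function fields such as `name` and `alignment`.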
Alex.