[LLVMdev] RFC: Machine Level IR text-based serialization format

Alex L arphaman at gmail.com
Tue Apr 28 11:18:35 PDT 2015


2015-04-28 10:15 GMT-07:00 Hal Finkel <hfinkel at anl.gov>:

>
> ------------------------------
>
> From: "Alex L" <arphaman at gmail.com>
> To: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Tuesday, April 28, 2015 11:56:42 AM
> Subject: [LLVMdev] RFC: Machine Level IR text-based serialization format
>
>
> Hi all,
>
>
> I would like to propose a text-based, human-readable format for serializing
> the machine level IR. The major goal of this format is to allow LLVM to save
> the machine level IR after any code generation pass and then to load it again
> and continue running passes on it. The primary use case is to make the code
> generation passes easier to test: developers can write tests that load the
> IR, invoke just one specific code gen pass, and then inspect the output of
> that pass by checking the printed IR.
>
>
>
> The proposed format has a number of key features:
>
> - It stores the machine level IR and the optional LLVM IR in one text file.
> - The connections between the machine level IR and the LLVM IR are preserved.
> - The format uses a YAML based container for most of the data structures.
>   The LLVM IR is embedded in the YAML container.
> - The format also uses a new, text-based syntax to serialize the machine
>   instructions. The instructions are embedded in YAML.
>
>
> This is an incomplete example of a YAML file containing the LLVM IR, the
> machine level IR and the instructions:
>
>
> ---
> ir: |
>   define i32 @fact(i32 %n) {
>     %1 = alloca i32, align 4
>     store i32 %n, i32* %1, align 4
>     %2 = load i32, i32* %1, align 4
>     %3 = icmp eq i32 %2, 0
>     br i1 %3, label %10, label %4
>
>   ; <label>:4                                       ; preds = %0
>     %5 = load i32, i32* %1, align 4
>     %6 = sub nsw i32 %5, 1
>     %7 = call i32 @fact(i32 %6)
>     %8 = load i32, i32* %1, align 4
>     %9 = mul nsw i32 %7, %8
>     br label %10
>
>   ; <label>:10                                      ; preds = %0, %4
>     %11 = phi i32 [ %9, %4 ], [ 1, %0 ]
>     ret i32 %11
>   }
>
> ...
> ---
> number:          0
> name:            fact
> alignment:       4
> regInfo:
>   ....
> frameInfo:
>   ....
> body:
>   - bb:              0
>     llbb:            '%0'
>     successors:      [ 'bb#2', 'bb#1' ]
>     liveIns:         [ '%edi' ]
>     instructions:
>       - 'push64r undef %rax, %rsp, %rsp'
>       - 'mov32mr %rsp, 1, %noreg, 4, %noreg, %edi'
>
>
> Hi Alex,
>
> I think this looks promising. What are the 1 and 4 above? How are you
> proposing to serialize operand flags (dead, etc.)?
>
>  -Hal
>

Hi Hal,

The 1 and 4 above are constants that are specific to x86 memory addressing.
I believe they basically compute the address RSP + 1 * 0 + 4, i.e.
base + scale * index + displacement, with %noreg as the index.
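
As a rough illustration of that arithmetic (a sketch only, not LLVM code),
the snippet below assumes the five memory operands in
'mov32mr %rsp, 1, %noreg, 4, %noreg, %edi' are base, scale, index,
displacement and segment, and that %noreg means "no register". The helper
name effective_address and the stack pointer value are made up for the
example.

```python
def effective_address(regs, base, scale, index, disp):
    """Compute base + scale * index + disp for an x86-style memory operand.

    `regs` maps register names to values. '%noreg' means "no register"
    and contributes 0. The segment operand is ignored in this sketch.
    """
    def read(reg):
        return 0 if reg == "%noreg" else regs[reg]

    return read(base) + scale * read(index) + disp


# Hypothetical stack pointer value, chosen only for the example.
regs = {"%rsp": 0x1000}
addr = effective_address(regs, "%rsp", 1, "%noreg", 4)
print(hex(addr))  # 0x1004
```
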
I haven't settled on a final version of the operand flags (for registers)
syntax, but at the moment I'm thinking of something like this:
- The IsDef flag is implied by the use of the register before the '=',
unless it's implicit.
- TiedTo and IsEarlyClobber aren't serialized, as they are defined by
the instruction description. (I believe that's true in all cases, but I'm
not 100% sure).
- IsUndef, IsImp, IsKill, IsDead, IsInternalRead, IsDebug: keywords like
'implicit', 'undef', 'kill', 'dead' are written before the register, e.g.
'undef %rax', 'implicit-def kill %eflags'.
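
To make the keyword idea concrete, here is a minimal sketch of how a
register operand written this way could be split into the register and its
flags. It is only an illustration of the tentative syntax above: the
function name parse_register_operand and the exact keyword set are
assumptions, since the syntax isn't final.

```python
# Keywords taken from the examples in this message. The exact set is an
# assumption, since the syntax has not been finalized.
FLAG_KEYWORDS = {"implicit", "implicit-def", "undef", "kill", "dead"}


def parse_register_operand(text):
    """Split e.g. 'implicit-def kill %eflags' into (register, flags)."""
    # The register is the last token; everything before it is a flag.
    *flags, reg = text.split()
    if not reg.startswith("%"):
        raise ValueError(f"expected a register, got {reg!r}")
    unknown = [f for f in flags if f not in FLAG_KEYWORDS]
    if unknown:
        raise ValueError(f"unknown flag(s): {unknown}")
    return reg, set(flags)


print(parse_register_operand("undef %rax"))
print(parse_register_operand("implicit-def kill %eflags"))
```

A register with no keywords, such as '%edi', comes back with an empty flag
set.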

I don't have a syntax for the SubReg_TargetFlags at the moment.

Alex


>
>       - ....
>         ....
>   - bb:              1
>     llbb:            '%4'
>     successors:      [ 'bb#2' ]
>     instructions:
>       - '%edi = mov32rm %rsp, 1, %noreg, 4, %noreg'
>       - ....
>         ....
>   - ....
>     ....
> ...
>
>
> The example above shows a YAML file with two YAML documents (delimited by
> `---` and `...`) containing the LLVM IR and the machine function information
> for the function `fact`.
>
>
>
> When a specific format is chosen, I'll start with patches that serialize the
> embedded LLVM IR. Then I'll add support for things like machine functions and
> machine basic blocks, and I think that an intrusive implementation will work
> best for data structures like these. After that I will continue adding support
> for serialization of the remaining data structures.
>
>
>
> Thanks for reading through the proposal. What are your thoughts about this format?
>
>
>
>
>
>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>