<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">2015-04-30 12:54 GMT-07:00 Adrian Prantl <span dir="ltr"><<a href="mailto:aprantl@apple.com" target="_blank">aprantl@apple.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">><br>

> On Apr 28, 2015, at 9:56 AM, Alex L <<a href="mailto:arphaman@gmail.com">arphaman@gmail.com</a>> wrote:<br>

><br>

</span><div><div class="h5">> Hi all,<br>

><br>

> I would like to propose a text-based, human-readable format that will be used to<br>

> serialize the machine level IR. The major goal of this format is to allow LLVM<br>

> to save the machine level IR after any code generation pass and then to load<br>

> it again and continue running passes on the machine level IR. The primary use case<br>

> of this format is to make testing the code generation passes easier,<br>

> by allowing the developers to write tests that load the IR, then invoke just a<br>

> specific code gen pass and then inspect the output of that pass by checking the<br>

> printed out IR.<br>

><br>

><br>

> The proposed format has a number of key features:<br>

> - It stores the machine level IR and the optional LLVM IR in one text file.<br>

> - The connections between the machine level IR and the LLVM IR are preserved.<br>

> - The format uses a YAML-based container for most of the data structures. The LLVM<br>

>   IR is embedded in the YAML container.<br>

> - The format also uses a new, text-based syntax to serialize the machine instructions.<br>

>   The instructions are embedded in YAML.<br>

><br>

> This is an incomplete example of a YAML file containing the LLVM IR, the machine level IR<br>

> and the instructions:<br>

><br>

> ---<br>

> ir: |<br>

>   define i32 @fact(i32 %n) {<br>

>     %1 = alloca i32, align 4<br>

>     store i32 %n, i32* %1, align 4<br>

>     %2 = load i32, i32* %1, align 4<br>

>     %3 = icmp eq i32 %2, 0<br>

>     br i1 %3, label %10, label %4<br>

><br>

>   ; &lt;label&gt;:4                                       ; preds = %0<br>

>     %5 = load i32, i32* %1, align 4<br>

>     %6 = sub nsw i32 %5, 1<br>

>     %7 = call i32 @fact(i32 %6)<br>

>     %8 = load i32, i32* %1, align 4<br>

>     %9 = mul nsw i32 %7, %8<br>

>     br label %10<br>

><br>

>   ; &lt;label&gt;:10                                      ; preds = %0, %4<br>

>     %11 = phi i32 [ %9, %4 ], [ 1, %0 ]<br>

>     ret i32 %11<br>

>   }<br>

><br>

> ...<br>

> ---<br>

> number:          0<br>

> name:            fact<br>

> alignment:       4<br>

> regInfo:<br>

>   ....<br>

> frameInfo:<br>

>   ....<br>

> body:<br>

>   - bb:              0<br>

>     llbb:            '%0'<br>

>     successors:      [ 'bb#2', 'bb#1' ]<br>

>     liveIns:         [ '%edi' ]<br>

>     instructions:<br>

>       - 'push64r undef %rax, %rsp, %rsp'<br>

>       - 'mov32mr %rsp, 1, %noreg, 4, %noreg, %edi'<br>

>       - ....<br>

>         ....<br>

>   - bb:              1<br>

>     llbb:            '%4'<br>

>     successors:      [ 'bb#2' ]<br>

>     instructions:<br>

>       - '%edi = mov32rm %rsp, 1, %noreg, 4, %noreg'<br>

>       - ....<br>

>         ....<br>

>   - ....<br>

>     ....<br>

> ...<br>

><br>

> The example above shows a YAML file with two YAML documents (delimited by `---`<br>

> and `...`) containing the LLVM IR and the machine function information for the function `fact`.<br>

><br>

><br>

> When a specific format is chosen, I'll start with patches that serialize the<br>

> embedded LLVM IR. Then I'll add support for things like machine functions and<br>

> machine basic blocks, and I think that an intrusive implementation will work best<br>

> for data structures like these. After that I will continue adding support for<br>

> serialization of the remaining data structures.<br>

><br>

><br>

> Thanks for reading through the proposal. What are your thoughts about this format?<br>

<br>

</div></div>I’m really looking forward to this; it will be extremely useful for testing the debug info backend.<br>

For debug nodes referenced via DBG_VALUE intrinsics, it looks like they could just point to the corresponding nodes in the optional IR.<br>

Are there any plans to represent metadata such as the DebugLoc(ations) attached to the machine instructions?<br>

<span class="HOEnZb"><font color="#888888"><br>

-- adrian</font></span></blockquote><div><br></div><div>Yes, the debug location that's attached to the machine instruction will be serialized as well. I will describe how when</div><div>I send out the patch that serializes it.</div><div><br></div><div>Alex.</div></div><br></div></div>
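For readers following the thread, the layout described in the proposal (a stream of YAML documents delimited by `---` and `...`, with each machine instruction stored as a quoted string) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the proposed LLVM implementation: the helper names `split_yaml_documents` and `parse_instruction` are hypothetical, the sample stream is an abbreviated stand-in for the example in the proposal, and the instruction grammar assumed here (`[defs =] opcode operand, operand, ...`) is inferred from the two instructions shown in the example rather than from a finished specification.

```python
# Abbreviated stand-in for the two-document example in the proposal.
SAMPLE = """\
---
ir: |
  define i32 @fact(i32 %n) {
    ret i32 1
  }
...
---
name:            fact
alignment:       4
body:
  - bb:              0
    instructions:
      - 'push64r undef %rax, %rsp, %rsp'
...
"""


def split_yaml_documents(text):
    """Split a YAML stream on '---' (start) and '...' (end) markers.

    Markers are only recognized at column 0, so indented placeholder
    lines such as '  ....' are kept as document content.
    """
    documents = []
    current = None  # None means we are between documents
    for line in text.splitlines():
        marker = line.rstrip()
        if marker == "---":
            if current is not None:
                documents.append("\n".join(current))
            current = []
        elif marker == "...":
            if current is not None:
                documents.append("\n".join(current))
            current = None
        elif current is not None:
            current.append(line)
    if current is not None:
        documents.append("\n".join(current))
    return documents


def parse_instruction(text):
    """Parse '[defs =] opcode operand, ...' into its parts.

    Assumed grammar only: operand flags such as 'undef' are left
    attached to the operand they precede.
    """
    defs = []
    if " = " in text:
        lhs, text = text.split(" = ", 1)
        defs = [d.strip() for d in lhs.split(",")]
    opcode, _, rest = text.strip().partition(" ")
    operands = [op.strip() for op in rest.split(",")] if rest.strip() else []
    return {"defs": defs, "opcode": opcode, "operands": operands}


docs = split_yaml_documents(SAMPLE)
load = parse_instruction("%edi = mov32rm %rsp, 1, %noreg, 4, %noreg")
push = parse_instruction("push64r undef %rax, %rsp, %rsp")
```

Here `docs` holds the embedded LLVM IR document followed by the machine function document, and `load` and `push` separate each instruction string into defined registers, opcode, and operands. A real parser would hand each document to a YAML library and would need a proper tokenizer for the instruction syntax.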