[LLVMdev] Proposal: extended MDString syntax

Wed Jun 26 16:29:17 PDT 2013

On Jun 26, 2013, at 4:18 PM, Eric Christopher <echristo at gmail.com> wrote:

> On Wed, Jun 26, 2013 at 3:59 PM, Nadav Rotem <nrotem at apple.com> wrote:
>> 
>> On Jun 26, 2013, at 3:51 PM, Chandler Carruth <chandlerc at google.com> wrote:
>> 
>> Can you suggest an alternative solution? Can you describe why you don't
>> think metadata is the right container? This alone isn't really helpful at
>> moving us toward something that there has been widespread agreement LLVM
>> needs.
>> 
>> 
>> Hi Chandler,
>> 
>> Sure, we can talk about serializing MF.  But the discussion should focus on
>> serializing MF, and not multi-line metadata support, which is only one of
>> the possible solutions.  I understand the problem that Dan mentioned (that
>> MF references IR), and I am sure that there are other problems that he did
>> not mention. I would be happy to hear more about other solutions that you
>> considered and other problems that you ran into.  Have you considered using
>> a new format that embeds LLVM IR?
>> 
> 
> (Note, this is the first I've heard of this plan and just figured it out myself)
> 
> So inverting it so that MI contains LLVM IR instead of the other way
> around? Then we'd need a serialization format for MI that happened to
> include a way of serializing LLVM IR within. From a quick "hey, this
> seems reasonable" the idea of embedding the MI into the IR rather than
> the other way around seems to make sense since we already have
> code to serialize the IR.
> 
> The only other idea I've seen was an intern project that really didn't
> go very far a few years ago of using *AML (one of them, I can't recall
> which). I think Bob had some idea of finishing the project, but I'm
> not sure where it's going.
> 
> Do you have any other ideas or some ideas as to why you'd prefer one
> direction rather than the other?

Bin Zeng worked on a project as an intern last summer to serialize machine functions to YAML.  At the time, we were unable to commit it to trunk because we were waiting for Nick's yamlio work to be committed.  I still have his patches and plan to commit them when I get a chance.  I was also considering having another intern pick up that project where it left off.

The approach is perhaps similar to what Dan is proposing, just flipped around.  In one scheme, the top-level container is YAML and the IR is embedded within it along with the machine function data.  In the other, the IR is the top-level container and the machine functions are embedded as metadata.  I prefer the YAML approach.
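
For illustration only, the first scheme (YAML on the outside, IR embedded within it) could look roughly like the output of the sketch below.  This is not the format from Bin's patches: the key names "ir", "machine_functions", "name" and "instructions", the helper embed_ir_in_yaml, and the "RET" placeholder are all assumptions made up for this example.  The IR is carried verbatim as a YAML literal block scalar, so the existing IR parser could consume it unchanged.

```python
import textwrap


def embed_ir_in_yaml(ir_text, machine_functions):
    """Build a YAML-style document with the IR embedded as a literal block.

    Hypothetical layout: every key name here is an assumption for
    illustration, not an actual LLVM serialization format.
    """
    lines = ["---", "ir: |"]
    # A literal block scalar ("|") preserves the IR text line by line;
    # it only needs to be indented under its key.
    lines.append(textwrap.indent(ir_text.strip("\n"), "  "))
    lines.append("machine_functions:")
    for mf in machine_functions:
        lines.append(f"  - name: {mf['name']}")
        lines.append("    instructions:")
        for inst in mf["instructions"]:
            # Placeholder instruction strings; how to represent real
            # machine instructions is the open design question.
            lines.append(f"      - {inst}")
    lines.append("...")
    return "\n".join(lines) + "\n"


IR = """
define i32 @f(i32 %x) {
  ret i32 %x
}
"""

doc = embed_ir_in_yaml(IR, [{"name": "f", "instructions": ["RET"]}])
print(doc)
```

The second scheme would invert this: the .ll file stays the top-level container and the machine function text is carried inside metadata strings.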

I'd be glad to reprioritize contributing the rest of Bin's patches to make those available sooner rather than later.  The more interesting part, with either scheme, is how to represent the machine functions.  We definitely want something that is readable but still easy to parse.