[LLVMdev] Proposal: extended MDString syntax

Wed Jun 26 22:43:08 PDT 2013

On Jun 26, 2013, at 4:36 PM, Chandler Carruth <chandlerc at google.com> wrote:

> 
> On Wed, Jun 26, 2013 at 4:30 PM, Eric Christopher <echristo at gmail.com> wrote:
> Off the cuff I'd think that IR containing MF seems most reasonable and
> the use of metadata to contain it seems to be good from two
> perspectives I think:
> 
> a) it already exists, 
> b) oddly enough that we could get rid of the metadata and still have a
> valid module/compilation unit seems like it might be interestingly
> useful, but I'm not sure what uses there are off the top of my head.
> 
> I'll give the reason why I like this having just thought about it a while:
> 
> I think of this as a pre-lowered hint. I.e., take some IR, and give a hint to the code generator to lower like this over here. I see a few benefits of this model:
> 
> - It makes it reasonably easy to only specify the MI for the bit you really are trying to test. You can let the normal lowering process handle any other bits. I think this will help keep test cases small and reasonable.
> 
> - It makes it easy to re-baseline when the code generator changes but the changes are acceptable -- strip metadata and run it through the existing pipeline.
> 
> - It has the potential to be "incomplete" or of varying degrees of completeness which I think will be useful in testing different layers of the system... but Dan probably has more/better thoughts on this front than I do.
> 
> 
> The one thing I don't really like about the reversed model of MI containing IR is that now the MI model has to be "complete", so we have to invent what that means. I'm not really interested in this outside of generating test cases, so anything that simplifies the space of what we have to design *really* appeals to me.

I don’t have a strong opinion either way.

I don’t understand your comment about the MI model needing to be “complete”. The yaml approach was not “MI containing IR”. In fact, the initial implementation doesn’t have good support for serializing machine instructions, but it works great for IR-level passes run by llc, e.g., codegenprepare. The yaml file is just a way of collecting the various kinds of information needed for that, and you can omit the machine instructions entirely if you want to serialize after an IR-level pass. I think all of the benefits you mention for using metadata could apply just as well to using yaml; it’s just a matter of how you stuff the data into a file.

Some other things to keep in mind:

- There are a number of different data structures that will need to be serialized to really make this work.  Besides the IR and the MachineInstructions, there are various data structures in MachineFunctions, some of which are target-specific.  Yaml works well for that because it provides a nicely structured way of organizing that data.  The same could be done with metadata, though.

- One idea that Bin implemented last summer was to stash the last pass in the yaml.  Unlike IR-level passes, llc has more constraints on the order in which it runs passes.  We decided to just accept that limitation and assume a fixed order for the passes. We added the -stop-after option to specify where in the pass sequence to stop and serialize the code out to a yaml file.  By including the name of the -stop-after pass in the yaml output, we automatically know where to start up again when processing a yaml input.  There are some cases where passes are run more than once, and I don’t think we had a good solution for handling that.
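
To make the restart idea concrete, here is a minimal sketch in Python (not the actual llc code) of how a recorded -stop-after name could be mapped back to the passes that still need to run, assuming a fixed pass order. The pass list and the helper name passes_to_resume are placeholders made up for illustration; they are not llc's real pipeline. The sketch also shows why a pass that appears more than once is a problem: the recorded name alone no longer identifies a unique restart point.

```python
# Hypothetical fixed pass order. These names are placeholders for
# illustration only and do not reflect llc's real pipeline.
PASS_ORDER = [
    "codegenprepare",
    "isel",
    "machine-cse",
    "regalloc",
    "machine-cse",   # a pass that runs a second time
    "prologepilog",
]


def passes_to_resume(pass_order, stop_after):
    """Return the passes that still have to run after `stop_after`.

    `stop_after` is the pass name recorded in the serialized file.
    Raises ValueError if the name is unknown, or if the pass occurs
    more than once so the restart point cannot be determined.
    """
    positions = [i for i, name in enumerate(pass_order) if name == stop_after]
    if not positions:
        raise ValueError(f"unknown pass: {stop_after!r}")
    if len(positions) > 1:
        raise ValueError(
            f"pass {stop_after!r} runs {len(positions)} times; "
            "the restart point is ambiguous"
        )
    return pass_order[positions[0] + 1:]
```

With this placeholder list, passes_to_resume(PASS_ORDER, "isel") returns everything after isel, while asking to resume after "machine-cse" raises an error. One possible way around that, again only as an assumption, would be to record an occurrence index alongside the pass name.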

I’m curious to find out if you have ideas for how to serialize the actual machine instructions.  That’s where it really gets interesting, IMO.