[LLVMdev] RFP: Metadata is being used poorly to paper over missing IR constructs

Thu Jan 8 14:21:08 PST 2015

> On Jan 8, 2015, at 2:06 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
> 
> Rather than bending over backwards to keep all of this working, we should try to add the IR facilities needed to avoid these problems.
> 
> This is the result of a discussion between myself, Duncan, and Eric (all of us probably relaying ideas from still other discussions) that I'm trying to write down here because none of us are going to be able to prioritize working on this soon. If anyone else ends up needing facilities related to this or having time to make this part of LLVM better, that would be awesome. Hence, a request for patches more than a request for comments. =D
Sorry, I don’t have any patches handy, but I’m full of comments :)
> 
> <braindump> (and apologies if this is poorly structured or rambly)
> 
> 
> I'm aware of at least two quite strange uses of metadata at the moment (Duncan, Eric, jump in with more I missed):
> 1) The need for an arbitrary (often target-specific) symbolic string in the IR that can be used with intrinsics and/or instructions
> 2) The need to build up a nearly arbitrary record of the "flags" used to control the code generation of the module which need to be handled correctly at link time.
> 
> Neither of these fits the model of metadata. They aren't really optional. They can't be stripped while preserving correctness. They aren't annotations at all. I'm suggesting they both need first class representation, and that this representation won't really be complex or intrusive.
> 
> 
> #1
> We need the ability to put semantic information in the IR that can be used by targets without extending the IR itself. In some cases we can do this with target-specific intrinsics, but those don't always fit the problem and have their own set of challenges.
> 
> I think it would be nice if we just had a top level IR construct for symbolic strings. These should be allowed both inline (much like immediate constants) and potentially out-of-line like attributes. It is possible that this feature could be useful to simplify attributes or #2, but it seems simple and useful enough that I'm OK with it living on its own.
> 
> These symbolic strings would definitionally have no impact on the generated code. We could make them opaque Constants if that's a useful API. If we don't need the Constant API, I would do something similar to Duncan's separation for metadata. My suspicion is that making these Constants would be convenient so they can be used as Values, with the understanding that it would be invalid to use them in arbitrary places, as they only have defined semantics in specific scenarios. The canonical example would be:
> 
>   call i64 @llvm.read_register(<sigil>"sp")
> 
> 
> #2
> We need to make module flags a first class entity of the module, just like data layout:
For #2, I agree we need module flags.  Compile an Objective-C program and you’ll see stuff like this at the bottom of the IR, which should not be metadata:

!{i32 4, !"Objective-C Garbage Collection", i32 0}

Personally I would put an AttributeSet on the Module.  It already has most of what you want here, already has parsing support, serialization, etc.  You just need to define the set of allowable flags for a module.  Targets can also use the string=string Attribute to add target-specific attributes to the Module.  And you could sink DataLayout or even the triple in there if you wanted, although I haven’t thought through this part enough to know if it’s a good idea.

This would also play nicely with LTO.  Each module stores its code gen attributes on the module itself.  When you LTO, at the point where you get a clash (say SSE vs non-SSE), you can sink the attribute from module level down to all of that module’s functions.
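
To make the sinking idea concrete, here is a rough sketch of how it might behave.  It is a toy model only: ToyModule, ToyFunction, sinkAttr and linkModules are hypothetical names invented for illustration and are not LLVM APIs, and the attributes are modelled as a plain string=string map.  The assumption is that an attribute stays at module level only when both modules agree on it; anything else is pushed down to the functions of the module that owned it.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for a function and a module; NOT the LLVM API.
using AttrMap = std::map<std::string, std::string>;

struct ToyFunction {
  std::string Name;
  AttrMap Attrs;
};

struct ToyModule {
  AttrMap Attrs; // module-level code gen attributes
  std::vector<ToyFunction> Functions;
};

// Push one module-level attribute down onto every function in M that does
// not already set it, then drop it from the module.
void sinkAttr(ToyModule &M, const std::string &Key) {
  auto It = M.Attrs.find(Key);
  if (It == M.Attrs.end())
    return;
  for (ToyFunction &F : M.Functions)
    F.Attrs.emplace(Key, It->second); // emplace keeps an existing value
  M.Attrs.erase(It);
}

// Link two modules. An attribute survives at module level only if both
// inputs carry it with the same value; every other attribute is sunk.
ToyModule linkModules(ToyModule A, ToyModule B) {
  // Collect clashing keys first so sinking does not disturb the iteration.
  std::vector<std::string> Clashes;
  for (const auto &KV : A.Attrs) {
    auto It = B.Attrs.find(KV.first);
    if (It == B.Attrs.end() || It->second != KV.second)
      Clashes.push_back(KV.first);
  }
  for (const auto &KV : B.Attrs)
    if (!A.Attrs.count(KV.first))
      Clashes.push_back(KV.first);

  for (const std::string &Key : Clashes) {
    sinkAttr(A, Key);
    sinkAttr(B, Key);
  }

  // After sinking, A and B agree on whatever is left at module level.
  ToyModule Out;
  Out.Attrs = A.Attrs;
  Out.Functions = std::move(A.Functions);
  Out.Functions.insert(Out.Functions.end(), B.Functions.begin(),
                       B.Functions.end());
  return Out;
}
```

So linking a module built with "+sse" against one built with "-sse" would leave no module-level feature attribute, and each function would carry the setting of the module it came from.  Whether a missing attribute should count as a clash is a design choice; the sketch treats it as one.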

Thanks,
Pete
> 
>   &flag = ....
> 
> (syntax shamelessly stolen from Duncan's suggestion in IRC)
> 
> We can then actually specify exactly what the requirements are on module flags and how they are linked. We might be able to sink datalayout into this, or it might be better to keep separate, unsure. But having these be top-level real entities would be great.
> 
> 
> </braindump>
> 
> Hope these thoughts are useful. Sadly I'm not going to be able to drive any of these any time soon. I get a similar impression from Duncan. But I think lots of us would be able to help if someone wanted to contribute work on this front.
> 
> If there is broad consensus about the design points above, I'll paste them into two PRs as well for tracking.
> 
> -Chandler