[PATCH] D29110: LangRef: Document the allowed metadata dropping transforms.

Tue Jan 24 18:45:03 PST 2017

pcc added a comment.

In https://reviews.llvm.org/D29110#655741, @mehdi_amini wrote:

> (Added a few people that cares about this stuff usually).
>
> So I'm quite worried about this. It slipped in with metadata attachments added to globals, and I'm not sure we really measured the consequence.
>
> I believe we had until now the contract that Metadata attachments (which were on instruction only) were always safely removable: we can always transform the IR without interpreting the Metadata attachment but we have to drop the metadata in the process (if we don't know how to update them).
>
> Adding metadata attachment to global variable is fine by itself, but at the same time it has been used in LLVM 4.0 to express *correctness* related construct: they can't any longer be ignored by the optimizer or the program may break.
>
> I believe the implication of this hasn't been considered enough, and ultimately it means that any pass that looks at a global needs to understand the associated metadata.

Only passes that rebuild globals without dropping them, and in practice there aren't that many of them. There is already support for copying metadata when rebuilding globals (see `GlobalObject::copyMetadata`). Look at its callers [0] for how often it is needed. Somehow we've been able to launch features which rely on global-metadata-for-correctness (e.g. CFI and devirtualization on Linux Chrome) with that level of support.

> It seems critical to me to put a limit on what can be expressed with the metadata (we certainly wouldn't want a metadata saying "any memory access to this variable is volatile", do we?). I'm afraid we're on a slippery slope, and https://reviews.llvm.org/D29104 is going in this direction.

Agreed that there should be limits, I think we're just trying to figure out what those limits are :)

Ultimately I care about keeping things simple. There are a number of cases where we need to attach "things" to a global for one reason or another. One example of such a "thing" is !type metadata (used for CFI among other things). That metadata is required for correctness, and takes advantage of a number of metadata features (e.g. uniquing based on the value of the isDistinct() flag, which is used to distinguish between classes with external linkage and those with internal linkage). Should we invent some other way of representing type information? I suppose we could, but that would seem to involve a significant parallel infrastructure to metadata and we wouldn't be in much of a better place to where we started, i.e. we would still need to preserve whatever we replace type metadata with, etc.

Perhaps a better way of thinking about whether to keep metadata is by analogy to linkage. We can drop unreferenced globals with internal linkage without breaking semantics, but not external or weak globals. In principle we could add a similar concept for metadata attached to globals, but I don't see a pressing need at this point.

[0] http://llvm-cs.pcc.me.uk/lib/IR/Metadata.cpp/rcopyMetadata

https://reviews.llvm.org/D29110