[llvm-dev] Conditional references between globals in LLVM IR

Mon Sep 13 16:29:32 PDT 2021

On 2021-09-13, Kuba Mracek wrote:
>Any more thoughts, suggestions, feedback on this?
>
>There's one more problem with the suggestion to use metadata instead of regular globals -- the IR emitted by Swift often does actually need to reference some of these records (not in the case of protocol conformance records, but e.g. for type metadata records) to do things like GEP into them, extract fields, etc. That won't work if the symbol is LLVM metadata instead.
>
>I'm mainly looking for some agreement on the direction here. The approach that seems most tractable to me is to have the structures / symbols in question (protocol records, type records, protocol conformance records) be emitted as they are today, and to add some way in the LLVM IR to express the "rules" under which they are allowed to be removed. This is also very non-intrusive and arguably can't break correctness of existing LLVM code -- any pass is free to ignore the "rules"; the worst case is a missed optimization opportunity.
>
>Kuba

No real objection from me, but here are some concerns which people may have.

Does the variable name llvm.used.conditional ("conditional") clearly represent the intent (the edge, a
Protocol Conformance Record, is discardable unless both vertices it references are retained)?

Is the compiler-side llvm.used.conditional size optimization sufficiently effective?  This is a
variant of llvm.used, and llvm.used means the section is a GC root under section-based linker garbage
collection (dead stripping). Would llvm.used be too conservative (wasting too much space) if the linker
can actually do a better job garbage collecting records?

If a prototype is not too much trouble, perhaps showing the benefit will help justify its value.
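
To make the logical AND condition concrete for discussion, here is a toy model of the rule as I
understand it from this thread. It is only a sketch, not the D104496 implementation and not how
GlobalDCE is written; the names live_globals, refs, roots and conditional are hypothetical.

```python
from collections import deque


def live_globals(refs, roots, conditional):
    """Toy liveness model with "conditionally used" records.

    refs:        dict mapping a global to the globals it references
    roots:       globals that are unconditionally live (e.g. used by code)
    conditional: dict mapping a record to the set of globals that must ALL
                 be live for the record to be kept (the logical AND)
    """
    live = set()
    work = deque(roots)
    while work:
        g = work.popleft()
        if g in live:
            continue
        live.add(g)
        # Ordinary references keep their targets alive (the logical OR).
        work.extend(refs.get(g, ()))
        # A conditional record only becomes live once every global it is
        # conditioned on is already live.
        for record, deps in conditional.items():
            if record not in live and deps <= live:
                work.append(record)
    return live


# @assoc references @a and @b, but is kept only if both are live anyway.
refs = {"assoc": ["a", "b"], "uses_a": ["a"], "uses_both": ["a", "b"]}
conditional = {"assoc": {"a", "b"}}

only_a = live_globals(refs, ["uses_a"], conditional)
both = live_globals(refs, ["uses_both"], conditional)
```

In this model only_a does not contain "assoc" or "b", while both contains "assoc". Ignoring the
conditional table and treating every record as a root would keep more globals alive but never
fewer, which matches the "conservatively correct" argument made below.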

>> On Aug 25, 2021, at 12:01 PM, Kuba Mracek <mracek at apple.com> wrote:
>>
>> If I understand correctly you're saying that the Swift compiler would stop emitting these Protocol Conformance Records (and many other types of global data it emits today) as regular globals, but instead emit them as this new LLVM IR metadata, and then we would add a new custom lowering pass that would materialize those into regular globals? In principle I think this would be able to solve the problem, but the complexity of doing that seems extremely high to me, and we would also need to either 1) teach this new lowering pass a lot of Swift language specific logic, or 2) pass a lot of extra description via the new metadata format so that it fully describes what the resulting global should look like (name, section, alignment, visibility, etc.).
>>
>> It seems to me that this way, we'd build a "materialization of globals" pass, but if globals are what we want to end up with, maybe a better starting point is to just have globals in the incoming IR?
>>
>> You have a great point that basically we want a "weak user" and that LLVM metadata achieves that. That's why the proposal at <https://reviews.llvm.org/D104496> also uses metadata for the !llvm.used.conditional descriptor.
>>
>> Kuba
>>
>>> On Aug 19, 2021, at 11:50 AM, Jameson Nash <vtjnash at gmail.com> wrote:
>>>
>>> I might be far off here, but I think this sounds to me somewhat like how `llvm.dbg.value` wants to treat its arguments. They are of type metadata, so I think that means they aren't typically supposed to be considered to be regular uses, but are a sort of weak user, via the ValueAsMetadata object wrapper. Then later, the debug lowering codegen pass will look for these pseudo-instructions, and try to turn them back into real data, if the content they pointed to is valid (and hasn't been replaced with undef).
>>>
>>>
>>> On Tue, Aug 17, 2021 at 5:33 PM Kuba Mracek via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>
>>>
>>> > On Aug 17, 2021, at 2:23 PM, Fangrui Song <maskray at google.com> wrote:
>>> >
>>> > On 2021-08-17, Kuba Mracek wrote:
>>> >>
>>> >>
>>> >>> On Aug 17, 2021, at 1:06 PM, Fangrui Song <maskray at google.com> wrote:
>>> >>>
>>> >>> On 2021-08-17, Kuba Mracek wrote:
>>> >>>>
>>> >>>>
>>> >>>>> On Aug 17, 2021, at 11:32 AM, Fangrui Song <maskray at google.com> wrote:
>>> >>>>>
>>> >>>>> On 2021-08-17, Kuba Mracek via llvm-dev wrote:
>>> >>>>>> I don't see how comdats solve this problem, unless you're suggesting to extend them somehow beyond how they work today? Or could you try to show what my example would look like with comdats?
>>> >>>>>>
>>> >>>>>> Maybe I'm missing something, but if I put @a, @b and @assoc in a single comdat then the behavior I get is that if anything references @a or @b, then the whole group is retained. That's not what I'm looking for -- I need to somehow be able to drop @assoc if only @a or only @b is referenced, but keep it if both are referenced.
>>> >>>>>>
>>> >>>>>> Also note that in my example, nothing is referencing @assoc -- the only thing keeping it alive is @llvm.used. So if I put @assoc in a separate comdat, it's going to be trivially dropped.
>>> >>>>>>
>>> >>>>>> Kuba
>>> >>>>>>
>>> >>>>>>> On Aug 17, 2021, at 9:51 AM, Reid Kleckner <rnk at google.com> wrote:
>>> >>>>>>>
>>> >>>>>>> Can this be expressed with comdat groups? I believe GlobalDCE should already handle those in the way that you want: if any member is referenced, the whole group is retained, if no members are referenced, the whole group is dropped.
>>> >>>>>>>
>>> >>>>>>> There is a slight wrinkle that MachO doesn't support comdats, but that is the IR feature that we use to express this idea.
>>> >>>>>>>
>>> >>>>>>> On Mon, Aug 16, 2021 at 5:27 PM Kuba Mracek via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>> >>>>>>> Hi llvm-dev!
>>> >>>>>>>
>>> >>>>>>> I'm trying to add dead-stripability of types into the Swift compiler, and for that I need some new affordance in the LLVM IR to be able to express the condition that says "this global can be removed if either of these other globals is removable" -- for a concrete example see below. As pcc has pointed out to me, we already have !associated metadata, and what I'm looking for is basically a generalization of this concept.
>>> >>>>>>>
>>> >>>>>>> @a = internal global i32 1
>>> >>>>>>> @b = internal global i32 2
>>> >>>>>>> @assoc = internal global [2 x i32*] { i32* @a, i32* @b } !1
>>> >>>>>>>
>>> >>>>>>> ... other code here possibly using or not using @a and @b ...
>>> >>>>>>>
>>> >>>>>>> ; Somehow I want to express that the references from @assoc to @a and @b are not to be taken into account
>>> >>>>>>> ; when doing global liveness analysis, and presence of @assoc in @llvm.used and @llvm.compiler.used should
>>> >>>>>>> ; be ignored as well. Instead the references from @assoc to @a and @b are conceptually supposed to be "weak"
>>> >>>>>>> ; references that will not keep targets alive if there are no "strong" (regular) references. And if either @a or @b
>>> >>>>>>> ; (or both) do get removed, then @assoc should also be removed.
>>> >>>>>>> !1 = !{ ... }
>>> >>>>>>>
>>> >>>>>>> ; We need to mention @assoc in @llvm.used or @llvm.compiler.used, otherwise @assoc is trivially removable.
>>> >>>>>>> @llvm.used = appending global [...] { @assoc }
>>> >>>>>
>>> >>>>> There is an unusual condition: "if either @a or @b (or both) do get removed, then @assoc should also be removed."
>>> >>>>>
>>> >>>>> If @assoc is to be retained if either @a or @b is live at link time, this dead
>>> >>>>> stripping relation can be expressed via Mach-O S_ATTR_LIVE_SUPPORT.
>>> >>>>
>>> >>>> That I'm afraid is not accurate -- I believe the Mach-O support for S_ATTR_LIVE_SUPPORT at least in ld64 is limited to a single reference per global.
>>> >>>
>>> >>> ld64.lld supports more references.
>>> >>>
>>> >>> From https://github.com/apple-opensource/ld64 Resolver::deadStripOptimize, I think multiple references are supported.
>>> >>> But I don't have a macOS device to confirm.
>>> >>>
>>> >>>>>
>>> >>>>> (
>>> >>>>> The existing !associated cannot express a many-to-one relationship.
>>> >>>>> (https://reviews.llvm.org/D97430 is such a (complex) case.)
>>> >>>>>
>>> >>>>> For ELF, we can use R_*_NONE relocations to express such a many-to-one relationship.
>>> >>>>> There is no LLVM IR feature expressing this yet.
>>> >>>>> )
>>> >>>>>
>>> >>>>> I don't think any linker feature in PE/COFF, ELF, or Mach-O can express the
>>> >>>>> desired garbage collection (dead stripping) semantics for the generic case.
>>> >>>>
>>> >>>> And I'm fine with that. I don't need this information to survive into object files, I'm okay to rely on GlobalDCE (on a translation-unit level, or on an LTO unit with LTO enabled) to do the work only on IR.
>>> >>>>
>>> >>>>>
>>> >>>>>>> This example above basically shows the situation of Swift's Protocol Conformance Records, which don't have any incoming static references and instead they are scanned and processed by the runtime (by looking those records up via APIs from the dynamic linker), so they need the @llvm.used marker to survive into the final binary. They reference two other globals, a type record and a protocol record, which means that today they actually keep both the type and the protocol alive transitively, and therefore prohibit dead-stripping of those. So practically, the @llvm.used marker is necessary but also too conservative at the same time, and I want to relax removal rules on these globals -- specifically these Protocol Conformance Records and some other records in similar situations.
>>> >>>>>
>>> >>>>> Can you give more information why Swift needs this unusual garbage collection
>>> >>>>> behavior?
>>> >>>>
>>> >>>> To allow effective dead stripping: When a particular type/protocol defined in user's code is not actually used, I want that type/protocol and all the code that comes with it to be removed / removable by the optimizer. Today, these Protocol Conformance Records act as liveness roots that prohibit removal of classes/structs/protocols because they actually reference them.
>>> >>>>
>>> >>>>> When @a is retained while @b is dropped, what is the @assoc content supposed to
>>> >>>>> be (a) when the compiler generates the object file and (b) when the linker
>>> >>>>> performs section based garbage collection (dead stripping)?
>>> >>>>
>>> >>>> If @b is dropped, @assoc should be dropped as well.
>>> >>>
>>> >>> This is the unusual condition to me.
>>> >>> Regular section based garbage collection/dead stripping uses the logical
>>> >>> OR condition. One section is retained if any section referencing it is
>>> >>> retained.
>>> >>>
>>> >>> Now your request is a logical AND: one section is retained if all sections referencing it are retained.
>>> >>> I still don't see why it is designed this way from your description of classes/structs/protocols. Don't you still need to retain the
>>> >>> Conformance Record if any of classes/structs/protocols is retained?
>>> >>
>>> >> I realize explaining the motivation here is critical and that I probably didn't do a great job at it so far :) Let me try again:
>>> >>
>>> >> Let's say that user's code has an internal (non-exported) class T, implementing some methods, let's say .foo() and .bar(). And let's say that the class T is completely unused by any code anywhere — the optimizer in LLVM or the linker should be able to dead strip all of T, right? That means remove the class record from the data section and remove the code of .foo() and .bar() from the code section.
>>> >>
>>> >> The problem is that as soon as you make the class T implement *any* protocol, let's say Equatable, which is a very common thing to do, a Protocol Conformance Record is created for the "T implements Equatable" conformance, and you end up in the situation I'm describing: The record itself is suddenly keeping "T" alive. But note that there are still zero users of "T" anywhere. It would be a valid dead-stripping optimization to remove T, and that Protocol Conformance Record as well, because "T" is not used. No code can rely on T implementing Equatable, because no code uses T.
>>> >>
>>> >> This reasoning works for both parts of the Protocol Conformance Record:
>>> >>
>>> >> - If no code uses "T", then no code can rely on T implementing Equatable, and thus the Protocol Conformance Record is useless and can be discarded.
>>> >> - If no code uses "Equatable", then no code can rely on T implementing Equatable, and thus the Protocol Conformance Record is useless and can be discarded.
>>> >>
>>> >> Of course, this is a special property of this particular language feature of Swift, and the implementation of the Swift compiler needs to make sure to apply this dead-stripping semantics on structures that it knows behave like this.
>>> >>
>>> >> Kuba
>>> >
>>> > Thanks for the example. The feature request is clear to me now (I know very little about Swift).
>>> >
>>> > If we consider "T" and "Equatable" as vertices, the Protocol Conformance Record is an edge
>>> > connecting both vertices and the edge is only useful when both vertices exist, hence the logical AND
>>> > condition in garbage collection (dead stripping).
>>> >
>>> > Have you considered implementing just one side of the following
>>> >
>>> >> If no code uses "T", then no code can rely on T implementing Equatable, and thus the Protocol Conformance Record is useless and can be discarded.
>>> >> If no code uses "Equatable", then no code can rely on T implementing Equatable, and thus the Protocol Conformance Record is useless and can be discarded.
>>> >
>>> > That would not need a new LLVM IR feature and can be implemented by reversing the reference edge:
>>> >
>>> > * say, let just "T" (or just "Equatable") reference the records in the object file.
>>> > * use llvm.compiler.used instead of llvm.used for the records
>>> >
>>> > If the linker finds that "T" can be dropped, the records can all be dropped. This way, you lose the
>>> > potential GC when "Equatable" is dropped.  But the additional GC could be of a lower value.
>>>
>>> Good suggestion, but yes, I've explored that and unfortunately I think it leaves too much on the table: It is both common to have an unused type, as well as have an unused protocol, and I really want to be able to effectively dead strip in both situations.
>>>
>>> >
>>> > ---
>>> >
>>> > llvm.used means the linker should retain the global object.
>>> >
>>> > I have investigated what ELF --gc-sections does in the presence of regular LTO or ThinLTO.
>>> >
>>> > For regular LTO, very few objects survive IR dead stripping but are then dropped by the linker:
>>> >
>>> > * Target specific optimizations can drop references on constants (e.g. memcpy(..., &constant, sizeof(constant));)
>>> > * Due to phase ordering issues some definitions are not discarded by the optimizer.
>>> >
>>> > For ThinLTO there are more things:
>>> >
>>> > * ThinLTO can cause a definition to be imported to other modules. The original definition may be unneeded after imports.
>>> > * The definition may survive after intra-module optimization. After imports, a round of (inter-module) IR optimizations after computeDeadSymbolsWithConstProp may make the definition unneeded.
>>> > * Symbol resolution is conservative.
>>> >
>>> > But generally the objects dropped by the linker are much fewer than the objects dropped by IR dead stripping.
>>> > So being conservative on the linker side does not lose much when LTO is enabled.
>>> >
>>> >
>>> > If llvm.used.conditional is still to be added, then the 16 llvm::collectUsedGlobalVariables
>>> > references will need scrutiny.
>>>
>>> Agreed, but here's my thinking: The scrutiny is needed not for correctness but only for better effectiveness of dead-stripping -- using llvm.used with its current semantics, and ignoring llvm.used.conditional is conservatively correct and llvm.used.conditional is more of a "hint" that allows removal of globals under some conditions, but that hint does not need to be followed.
>>>
>>> Kuba
>>>
>>> >
>>> >>>
>>> >>>> What the linker should do (or what its inputs should be) is somewhat orthogonal -- I assume none of the existing object file formats can express the semantics, so we could just be conservative and lower the globals into something that is not going to be efficient but won't eliminate any globals that it shouldn't.
>>> >>>
>>> >>> OK. I understand this part of your proposal now.
>>> >>>
>>> >>>>>>> What I'm looking for is 1) a design for how to represent this in LLVM IR, 2) and an implementation in GlobalDCE that would actually follow these rules and remove globals whenever allowed. For the latter, me and Manman have prepared a patch, see <https://reviews.llvm.org/D104496>, which does the job by adding a new specially named metadata -- !llvm.used.conditional -- that has a specific structure that can express these "conditionally used globals". But mainly I'd like to figure out the former -- what should this look like in the IR. Should we try to extend the existing !associated metadata? If yes, can we teach GlobalDCE to optimize based on !associated markers? Should we have a list like !llvm.used.conditional instead? Should it subsume !associated metadata and replace it?
>>> >>>>>>>
>>> >>>>>>> Thanks ahead for any feedback!
>>> >>>>>>>
>>> >>>>>>> Kuba
>>> >>
>>>
>>
>