[llvm-commits] [PATCH] Miscompile Fix: Ensure that we do not constant fold Globals containing ObjC References in GlobalOpt.

Mon Jan 14 11:56:36 PST 2013

On Jan 12, 2013, at 5:39 PM, Bob Wilson <bob.wilson at apple.com> wrote:
> On Jan 11, 2013, at 4:04 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>> The complete list of alternatives is as follows:
>> 
>> 1. Make llvm.used indicate that the initializer may be modified before
>> llvm.global_ctors (this patch)
>> 
>> 2. Some llvm.used-like-thing (essentially, this patch, but
>> s/llvm.used/llvm.objc_linker_superspecial_stuff/g).
>> 
>> 3. Add a new kind of linkage, so the global is
>> `@"\01L_OBJC_SELECTOR_REFERENCES_" = internal_objc_metadata global i8*
>> [...]`.
>> 
>> 4. Special-case variables whose names start with "\01L_OBJC" (the
>> original patch).
>> 
>> 5. Always assume globals can be modified before llvm.global_ctors
>> runs, and completely suppress the optimization.
>> 
>> 
>> 4 is bad because it makes the behavior of IR globals more complicated;
>> we don't want to force people writing IR optimizers to look at LangRef
>> for a special rule for globals which looks normal variables, but
>> actually aren't because the name triggers special behavior.
>> 
>> 5 is bad because it eliminates a valuable optimization.
>> 
>> 
>> The benefit of (2) over (1) is that it doesn't overload "used" quite
>> so much.  The downside of (2) is that it means adding yet another
>> "Intrinsic Global Variable" to LangRef, and the whole concept of an
>> "Intrinsic Global Variable" isn't very clean to work with in the first
>> place.
> 
> I like option (1) for its simplicity, but it still seems to be a reinterpretation of what llvm.used means.  If everyone agrees with that change, and it is clearly documented, then it would be OK.
> 
> But, I do not agree with the reinterpretation that makes llvm.used variables become essentially volatile.  (I.E., "there is an invisible reference to the variable and the value may be modified or read at any time via that reference.")  To be consistent with that interpretation, we would need to check llvm.used all over the place.

The appropriate interpretation of @llvm.used is that the address of the variable has escaped.  It does not force all accesses to be volatile:  e.g. the address having escaped does not make race conditions well-defined, nor does it permit the variable to be mapped to an I/O port.

The representation of @llvm.used is well-considered in order to promote appropriate limitations on optimizations without requiring them to understand @llvm.used specifically.

This is all why I did argue that using @llvm.used for this purpose is a possibly-inappropriate extension of the semantics.

> Alternatively, what about documenting llvm.used as specifically implying that the value may be overridden by the linker or runtime before llvm.global_ctors?  I would also want to clarify the current doc to state that the invisible reference is not used to modify the variable.

I do not think that llvm.used should be documented as permitting strange linker magic.

John.