[llvm-commits] [llvm] r150794 - in /llvm/trunk: lib/Transforms/IPO/GlobalOpt.cpp test/Transforms/GlobalOpt/invariant.ll

Sat Feb 18 03:14:37 PST 2012

On Feb 17, 2012, at 2:01 AM, Nick Lewycky wrote:
>>> +++ llvm/trunk/lib/Transforms/IPO/GlobalOpt.cpp Fri Feb 17 00:59:21 2012
>>> @@ -2295,6 +2295,7 @@
>>>                              DenseMap<Constant*, Constant*>  &MutatedMemory,
>>>                              std::vector<GlobalVariable*>  &AllocaTmps,
>>>                              SmallPtrSet<Constant*, 8>  &SimpleConstants,
>>> +                             SmallPtrSet<GlobalVariable*, 8>  &Invariants,
>> 
>> This really needs to refactoring we were just talking about.  This is getting ridiculous :)
> 
> Yes, I agree. Hopefully this weekend.

Thanks, I appreciate it.

>> Why are you checking for (and why is Richard generating) -1 for the object size here, instead of the proper size of the value?
> 
> We could do that, provided we both agree on the right size and don't get fouled up by things like padding. Thinking about it harder, I think we need to.
> 
> -1 is supposed to map to AliasAnalysis::UnknownSize. The trouble is that a pointer to the first inner member and "unknown size" should only make that member invariant, not the entire global.

Right.

>>> +  %B = call {}* @llvm.invariant.start(i64 -1, i8* %A)
>>> +  ; Why in the world does this pass the verifier?
>>> +  %C = bitcast {}* %B to i8*
>> 
>> This is a bitcast like any other.   I'm not sure what the question is, but it shouldn't be in the testcase.
> 
> The test is here to ensure we don't crash. I'll remove the comment; it was supposed to join the ranks of "it's okay if this test fails in the future because the optimizer optimizes harder" style of comments.

Sounds good.

> And what do you mean that it's a bitcast like any other?

It's just a pointer-to-pointer bitcast, a noop/copy.

> What value does invariant.start actually return?

Undefined, how about zero? :)  I don't know what selectiondag lowers it to offhand.

> What does it mean to store it to a {}*-typed global?

There is no store happening here, but no bytes would be stored because {} has zero size.

> The verifier should check that the only users of an invariant.start are invariant.end calls.

That would make sense to me!

>> Finally, most significantly, it isn't clear to me how the semantics of invariant.start/end work for staticly constructed objects.  Per lang-ref, the return value of ".start" is supposed to be passed into ".end".  Given the usual clang codegen for static constructors - which uses an atexit handler to run the dtor - I don't see how we can possibly generate this.
>> 
>> I understand that Clang (just now) only generates .start when there is no destructor, but I want to make sure that the semantics of these intrinsics are fully nailed down in langref before we start building things on them.  In this case, it might be best to introduce a new intrinsic here instead of [ab]using .begin in a case when .end isn't possible.
> 
> Okay. We can always introduce new intrinsics.
> 
> invariant.end is clearly tied to its invariant.start, but I never thought that the start needed to have an end. I guess that's one way to read the LangRef, but I thought it was assumed that "until X happens, Y is true" means that Y is true forever if X never happens.

..

>> struct foo {
>>   int x;
>>   foo(int x) : x(x) {}
>>   ~foo() { }
>>   };
>> 
>> const foo a(4);
>> 
>> const void *p =&a;
> 
> Good example! I'd have to think about that a bit.

After thinking about this myself, I think that the we should solve both problems at once as well as the "evaluate some, but not all static ctors within a file" case.  How about we add a new intrinsic specifically for static contructors.  Here's a strawman proposal.  It would have type:

{} *@llvm.static.constructor(i1 %isConst, ...)

The i1 indicates whether the static constructor is declared 'const' in the source code, and the ... would be a single pointer argument that is required to be a global variable.  I don't know if it is useful to have a size specifically, but if so, you could add that.  The return value is a token (like the invariant intrinsics).  The only allowed use would be by:

void @llvm.static.destructor({} *)

if a global has a static destructor associated with it.  In a case like the above, the generated static constructor function would look like:

define internal void @__cxx_global_var_init() section "__TEXT,__StaticInit,regular,pure_instructions" {
entry:
  %tmp = call {} *@llvm.static.constructor(i1 true, %struct.foo* @_ZL1a)
  call void @_ZN3fooC1Ei(%struct.foo* @_ZL1a, i32 4)
  call void void @llvm.static.destructor({} * %tmp)
  %0 = call i32 @__cxa_atexit(void (i8*)* bitcast (void (%struct.foo*)* @_ZN3fooD1Ev to void (i8*)*), i8* bitcast (%struct.foo* @_ZL1a to i8*), i8* bitcast (i8** @__dso_handle to i8*))
  ret void
}

Globalopt would then be enhanced to use the following logic: if a static constructor function has none of these intrinsics, then the current "all or nothing" behavior stays in effect.  Otherwise, it evaluates the static constructor function in sections, separated by llvm.static.constructor calls.  Each individual static constructor could succeed individually of the rest.  If the example had a #include of <iostream>, that would mean that we could statically evaluate "a" even though we can't handle the call to _ZNSt8ios_base4InitC1Ev.

The existing logic in globalopt for eliminating atexit calls would be kept.  If an llvm.static.destructor is immediately followed by a llvm.static.constructor call or a return, then it is dead and can be removed.

If the llvm.static.destructor call isn't present (i.e. the static.constructor intrinsic has no uses), then we know that there is no destructor.  If the constructor is evaluated and %isConst=1, then the global can be marked constant.

I think that something like this would be a pretty significant generalization to the existing code, and would allow us to optimize away a *lot* more.

What do you guys think?

-Chris