[llvm-commits] [llvm] r150794 - in /llvm/trunk: lib/Transforms/IPO/GlobalOpt.cpp test/Transforms/GlobalOpt/invariant.ll

Fri Feb 17 02:01:57 PST 2012

Chris Lattner wrote:
> On Feb 16, 2012, at 10:59 PM, Nick Lewycky wrote:
>> URL: http://llvm.org/viewvc/llvm-project?rev=150794&view=rev
>> Log:
>> Add support for invariant.start inside the static constructor evaluator. This is
>> useful to represent a variable that is const in the source but can't be constant
>> in the IR because of a non-trivial constructor. If globalopt evaluates the
>> constructor, and there was an invariant.start with no matching invariant.end
>> possible, it will mark the global constant afterwards.
>
> Very cool optimization.  Some questions though:
>
>> +++ llvm/trunk/lib/Transforms/IPO/GlobalOpt.cpp Fri Feb 17 00:59:21 2012
>> @@ -2295,6 +2295,7 @@
>>                               DenseMap<Constant*, Constant*>  &MutatedMemory,
>>                               std::vector<GlobalVariable*>  &AllocaTmps,
>>                               SmallPtrSet<Constant*, 8>  &SimpleConstants,
>> +                             SmallPtrSet<GlobalVariable*, 8>  &Invariants,
>
> This really needs to refactoring we were just talking about.  This is getting ridiculous :)

Yes, I agree. Hopefully this weekend.

>> @@ -2415,14 +2417,39 @@
>> +        if (II->getIntrinsicID() == Intrinsic::invariant_start) {
>> +          // We don't insert an entry into Values, as it doesn't have a
>> +          // meaningful return value.
>> +          if (!II->use_empty())
>> +            return false;
>> +          ConstantInt *Size = cast<ConstantInt>(II->getArgOperand(0));
>> +          if (Size->isAllOnesValue()) {
>> +            Value *PtrArg = getVal(Values, II->getArgOperand(1));
>> +            Value *Ptr = PtrArg->stripPointerCasts();
>> +            if (GlobalVariable *GV = dyn_cast<GlobalVariable>(Ptr))
>> +              Invariants.insert(GV);
>> +          }
>
> Why are you checking for (and why is Richard generating) -1 for the object size here, instead of the proper size of the value?

We could do that, provided we both agree on the right size and don't get 
fouled up by things like padding. Thinking about it harder, I think we 
need to.

-1 is supposed to map to AliasAnalysis::UnknownSize. The trouble is that 
a pointer to the first inner member and "unknown size" should only make 
that member invariant, not the entire global.

>> +  %B = call {}* @llvm.invariant.start(i64 -1, i8* %A)
>> +  ; Why in the world does this pass the verifier?
>> +  %C = bitcast {}* %B to i8*
>
> This is a bitcast like any other.   I'm not sure what the question is, but it shouldn't be in the testcase.

The test is here to ensure we don't crash. I'll remove the comment; it 
was supposed to join the ranks of "it's okay if this test fails in the 
future because the optimizer optimizes harder" style of comments.

And what do you mean that it's a bitcast like any other? What value does 
invariant.start actually return? What does it mean to store it to a 
{}*-typed global?

The verifier should check that the only users of an invariant.start are 
invariant.end calls.

> Finally, most significantly, it isn't clear to me how the semantics of invariant.start/end work for staticly constructed objects.  Per lang-ref, the return value of ".start" is supposed to be passed into ".end".  Given the usual clang codegen for static constructors - which uses an atexit handler to run the dtor - I don't see how we can possibly generate this.
>
> I understand that Clang (just now) only generates .start when there is no destructor, but I want to make sure that the semantics of these intrinsics are fully nailed down in langref before we start building things on them.  In this case, it might be best to introduce a new intrinsic here instead of [ab]using .begin in a case when .end isn't possible.

Okay. We can always introduce new intrinsics.

invariant.end is clearly tied to its invariant.start, but I never 
thought that the start needed to have an end. I guess that's one way to 
read the LangRef, but I thought it was assumed that "until X happens, Y 
is true" means that Y is true forever if X never happens.

You could similarly have a .start, then a conditional branch which means 
that you may or may not reach the .end at run time.

> Also, as requested on cfe-commits, please only generate optimization hints when the optimizer is enabled to avoid penalizing -O0 compile time.
>
> Finally, we miss cases with explicit-but-trivial destructors, which is a pretty common pattern in some codebases.  Is there any way to generalize this optimization?  It seems worthwhile to special case this sort of thing in clang, but it would be even better for the optimizer to be able to handle it when it (successfully) removes atexit call.

> struct foo {
>    int x;
>    foo(int x) : x(x) {}
>    ~foo() { }
>    };
>
> const foo a(4);
>
> const void *p =&a;

Good example! I'd have to think about that a bit.

Nick