[llvm-commits] patch: CXAGuardElimination pass.

Mon May 25 19:37:02 PDT 2009

Eli Friedman wrote:
> On Mon, May 25, 2009 at 5:17 PM, Nick Lewycky <nicholas at mxc.ca> wrote:
>> Eli Friedman wrote:
>>> On Mon, May 25, 2009 at 3:59 PM, Nick Lewycky <nicholas at mxc.ca> wrote:
>>>> Eli Friedman wrote:
>>>>> On Sun, May 24, 2009 at 11:25 PM, Nick Lewycky <nicholas at mxc.ca> wrote:
>>>>>> Eli Friedman wrote:
>>>>>>> On Sun, May 24, 2009 at 9:34 PM, Nick Lewycky <nicholas at mxc.ca> wrote:
>>>>>>>> However, it doesn't simplify it down all the way. See llvm.org/PR4261 for an
>>>>>>>> example of what happens after this optimization is applied on the above
>>>>>>>> program. We may decide that PR4261 is too hard to fix in general and just
>>>>>>>> add some extra logic to this pass, but I'd rather have this committed for a
>>>>>>>> start.
>>>>>>> Couldn't you just change AcquireRet from a constant 1 to a constant 0?
>>>>>>>  If it's safe to remove the guard, I don't see how the chosen path
>>>>>>> could make a difference.
>>>>>> Release has to run. It has the visible effect of changing the guard
>>>>>> variable to a 1. Nowhere in this pass do we prove those two calls are
>>>>>> the only places looking at the guard variable.
>>>>> Assuming it's actually a guard, nothing besides the guard should care
>>>>> whether the guard variable initialized or not in the current call. The
>>>>> pass as written already makes assumptions that agree with that: for
>>>>> example, it doesn't bother to check for instructions with side-effects
>>>>> after the call to __cxa_guard_release.
>>>> I'm going to change my mind on this one. That's not safe. Here's my example.
>>>>
>>>> We have a function that calls __cxa_guard_acquire/release on a single
>>>> guard variable. Then we decide to clone it as part of partial
>>>> specialization.
>>>>
>>>> The initialization being guarded depends on some global variable -- if
>>>> it's zero it does nothing, else it prints "foo". When cloneA is run we
>>>> know that the global is always 0 so the initialization does nothing and
>>>> we eliminate the guards. In cloneB the global is always non-zero, so it
>>>> always prints "foo" to the screen and we keep the guards.
>>>>
>>>> When cloneA runs it needs to mark the guard variable as being "already
>>>> initialized" or else when cloneB runs it will go ahead and print "foo"
>>>> and it's not supposed to. You've changed the behaviour of the program.
>>> Ooh, that's right.  Actually, it gets nastier than that: suppose
>>> something like the following:
>>>
>>> static int gval1 = 1;
>>> int gval2 = 1;
>>> struct A { A() { if (gval1) { printf("%d\n", gval2); } };
>>> void a() { static A x(); gval2 = 0;}
>>> void b() { a(); }
>>> void c() { gval1 = 0; a(); }
>>>
>>> Then suppose b() and c() are called at the same time on separate
>>> threads, and we've inlined/optimized everything; if the guard is
>>> removed from c(), we can end up printing out "0", which shouldn't be
>>> possible because setting gval2 to zero only happens after the
>>> constructor for A finishes.
>> Right. deadGuardElim checks for memory *reads* as well as writes and
>> unwinds in order to prevent this sort of thing from happening.
> 
> If a() gets inlined into c(), the load inside A() of gval1 can be
> eliminated; I'm not sure if we do that particular optimization at the
> moment, though.

Boo. We'll continue to have problems so long as we try to eliminate one 
pair at a time, we need to either prove them all dead for a given guard 
variable or not.

Excellent review, by the way!

Nick