[llvm-commits] patch: CXAGuardElimination pass.

Mon May 25 17:51:29 PDT 2009

On Mon, May 25, 2009 at 5:17 PM, Nick Lewycky <nicholas at mxc.ca> wrote:
> Eli Friedman wrote:
>> On Mon, May 25, 2009 at 3:59 PM, Nick Lewycky <nicholas at mxc.ca> wrote:
>>> Eli Friedman wrote:
>>>> On Sun, May 24, 2009 at 11:25 PM, Nick Lewycky <nicholas at mxc.ca> wrote:
>>>>> Eli Friedman wrote:
>>>>>> On Sun, May 24, 2009 at 9:34 PM, Nick Lewycky <nicholas at mxc.ca> wrote:
>>>>>>> However, it doesn't simplify it down all the way. See llvm.org/PR4261 for an
>>>>>>> example of what happens after this optimization is applied on the above
>>>>>>> program. We may decide that PR4261 is too hard to fix in general and just
>>>>>>> add some extra logic to this pass, but I'd rather have this committed for a
>>>>>>> start.
>>>>>> Couldn't you just change AcquireRet from a constant 1 to a constant 0?
>>>>>>  If it's safe to remove the guard, I don't see how the chosen path
>>>>>> could make a difference.
>>>>> Release has to run. It has the visible effect of changing the guard
>>>>> variable to a 1. Nowhere in this pass do we prove those two calls are
>>>>> the only places looking at the guard variable.
>>>> Assuming it's actually a guard, nothing besides the guard should care
>>>> whether the guard variable initialized or not in the current call. The
>>>> pass as written already makes assumptions that agree with that: for
>>>> example, it doesn't bother to check for instructions with side-effects
>>>> after the call to __cxa_guard_release.
>>> I'm going to change my mind on this one. That's not safe. Here's my example.
>>>
>>> We have a function that calls __cxa_guard_acquire/release on a single
>>> guard variable. Then we decide to clone it as part of partial
>>> specialization.
>>>
>>> The initialization being guarded depends on some global variable -- if
>>> it's zero it does nothing, else it prints "foo". When cloneA is run we
>>> know that the global is always 0 so the initialization does nothing and
>>> we eliminate the guards. In cloneB the global is always non-zero, so it
>>> always prints "foo" to the screen and we keep the guards.
>>>
>>> When cloneA runs it needs to mark the guard variable as being "already
>>> initialized" or else when cloneB runs it will go ahead and print "foo"
>>> and it's not supposed to. You've changed the behaviour of the program.
>>
>> Ooh, that's right.  Actually, it gets nastier than that: suppose
>> something like the following:
>>
>> static int gval1 = 1;
>> int gval2 = 1;
>> struct A { A() { if (gval1) { printf("%d\n", gval2); } };
>> void a() { static A x(); gval2 = 0;}
>> void b() { a(); }
>> void c() { gval1 = 0; a(); }
>>
>> Then suppose b() and c() are called at the same time on separate
>> threads, and we've inlined/optimized everything; if the guard is
>> removed from c(), we can end up printing out "0", which shouldn't be
>> possible because setting gval2 to zero only happens after the
>> constructor for A finishes.
>
> Right. deadGuardElim checks for memory *reads* as well as writes and
> unwinds in order to prevent this sort of thing from happening.

If a() gets inlined into c(), the load inside A() of gval1 can be
eliminated; I'm not sure if we do that particular optimization at the
moment, though.

-Eli