[llvm-commits] patch: CXAGuardElimination pass.

Mon May 25 16:46:38 PDT 2009

On Mon, May 25, 2009 at 3:59 PM, Nick Lewycky <nicholas at mxc.ca> wrote:
> Eli Friedman wrote:
>> On Sun, May 24, 2009 at 11:25 PM, Nick Lewycky <nicholas at mxc.ca> wrote:
>>> Eli Friedman wrote:
>>>> On Sun, May 24, 2009 at 9:34 PM, Nick Lewycky <nicholas at mxc.ca> wrote:
>>>>> However, it doesn't simplify it down all the way. See llvm.org/PR4261 for an
>>>>> example of what happens after this optimization is applied on the above
>>>>> program. We may decide that PR4261 is too hard to fix in general and just
>>>>> add some extra logic to this pass, but I'd rather have this committed for a
>>>>> start.
>>>> Couldn't you just change AcquireRet from a constant 1 to a constant 0?
>>>>  If it's safe to remove the guard, I don't see how the chosen path
>>>> could make a difference.
>>> Release has to run. It has the visible effect of changing the guard
>>> variable to a 1. Nowhere in this pass do we prove those two calls are
>>> the only places looking at the guard variable.
>>
>> Assuming it's actually a guard, nothing besides the guard should care
>> whether the guard variable was initialized or not in the current call. The
>> pass as written already makes assumptions that agree with that: for
>> example, it doesn't bother to check for instructions with side-effects
>> after the call to __cxa_guard_release.
>
> I'm going to change my mind on this one. That's not safe. Here's my example.
>
> We have a function that calls __cxa_guard_acquire/release on a single
> guard variable. Then we decide to clone it as part of partial
> specialization.
>
> The initialization being guarded depends on some global variable -- if
> it's zero it does nothing, else it prints "foo". When cloneA is run we
> know that the global is always 0 so the initialization does nothing and
> we eliminate the guards. In cloneB the global is always non-zero, so it
> always prints "foo" to the screen and we keep the guards.
>
> When cloneA runs it needs to mark the guard variable as being "already
> initialized" or else when cloneB runs it will go ahead and print "foo"
> and it's not supposed to. You've changed the behaviour of the program.

Ooh, that's right.  Actually, it gets nastier than that: suppose
something like the following:

#include <cstdio>

static int gval1 = 1;
int gval2 = 1;
struct A { A() { if (gval1) { printf("%d\n", gval2); } } };
void a() { static A x; gval2 = 0; }  // guarded init of x, then clear gval2
void b() { a(); }
void c() { gval1 = 0; a(); }

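Then suppose b() and c() are called at the same time on separate
threads, and we've inlined/optimized everything. In c(), gval1 is known
to be 0, so the constructor does nothing and the guard looks removable.
But if the guard is removed from c(), c() can set gval2 to zero while
b() is still inside the constructor, and we can end up printing out "0".
That shouldn't be possible, because setting gval2 to zero only happens
after the constructor for A finishes.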
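
For the simpler cloning case quoted above, here is a rough single-threaded
sketch of why the clone with the empty initializer still has to flip the
guard. Guard, guard_acquire and guard_release are hypothetical stand-ins
for the real guard variable and the __cxa_guard_acquire/release calls, and
the output string stands in for the screen; it is a model of the argument,
not what the pass or the ABI library actually does, and it does not
reproduce the threaded race.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Hypothetical stand-in for the ABI guard variable (an assumption made
// for illustration; the real guard is an opaque object owned by the ABI).
struct Guard {
    std::mutex lock;
    bool initialized = false;
};

// Returns true if the caller must run the initializer; the lock stays
// held until guard_release() is called.
bool guard_acquire(Guard &g) {
    g.lock.lock();
    if (g.initialized) {
        g.lock.unlock();
        return false;
    }
    return true;
}

// Marks the guard as "already initialized" and drops the lock.
void guard_release(Guard &g) {
    g.initialized = true;
    g.lock.unlock();
}

static Guard guard;         // one guard shared by both clones
static std::string output;  // stands in for what gets printed

// cloneB: the global is known non-zero, so the initializer prints "foo".
void cloneB() {
    if (guard_acquire(guard)) {
        output += "foo";
        guard_release(guard);
    }
}

// cloneA with an empty initializer, but the guard is still flipped.
void cloneA_keep_release() {
    if (guard_acquire(guard)) {
        guard_release(guard);
    }
}

// cloneA with the guard calls dropped entirely (the unsafe version).
void cloneA_drop_guard() {}

// Runs the given cloneA, then cloneB, from a fresh guard and returns
// whatever was "printed".
std::string run(void (*cloneA)()) {
    guard.initialized = false;
    output.clear();
    cloneA();
    cloneB();
    return output;
}
```

With cloneA_keep_release, the later cloneB sees the guard already set and
prints nothing; with cloneA_drop_guard, cloneB prints "foo", which is the
behaviour change described above.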

-Eli