[LLVMdev] GlobalsModRef (and thus LTO) is completely broken
Chandler Carruth
chandlerc at gmail.com
Mon Jul 13 22:37:34 PDT 2015
On Mon, Jul 13, 2015 at 10:21 PM Chris Lattner <clattner at apple.com> wrote:
>
> > On Jul 13, 2015, at 8:19 PM, Chandler Carruth <chandlerc at gmail.com>
> wrote:
> >
> > Ok folks,
> >
> > I wrote up the general high-level thoughts I have about stateful AA in a
> separate thread. But we need to sort out the completely and horribly broken
> aspects of GlobalsModRef today, and the practical steps forward. This email
> is totally about the practical stuff.
> >
> > Now, as to why I emailed this group of people and with this subject, the
> only pass pipeline that includes GlobalsModRef, is the LTO pipeline.
>
> Ah, so it is just an LTO enabled benchmark hack then.
>
> > It is also relying on a precomputed set of global variables whose
> address is never used by an instruction other than some very small set
> (gep, bitcast) as "non-address-taken". It then runs GetUnderlyingObject on
> the two pointers in alias queries, and if that finds one of these
> "non-address-taken" globals for one of the memory locations but not the
> other, it concludes no-alias! This is broken for a number of reasons.
> >
> > a) If the two locations merely have a different *depth* of instruction
> stack, because GetUnderlyingObject has a recursion cap, one side can fail
> while the other succeeds, and we erroneously produce no-alias.
>
> Interesting. I’m sure it is no consolation, but GlobalsModRef probably
> predated the recursion cap :-)
>
> > b) If instcombine or any other pass for any reason introduces on one
> path an instruction that GetUnderlyingObject can't look through (select,
> phi, load, ....), we incorrectly conclude no-alias. This is what
> addEscapingUse was intended to solve, but we would literally have to call
> it from every pass because we can't rely on analysis invalidation!
> >
> > c) If any pass actually escapes a pointer from one function into
> another, we invalidate the underlying assumption of 'non-address-taken'
> that it relies upon.
>
> Yep, all of these are pretty nasty.
>
> > Now, as I argued in my general AA thread, I think we might be able to
> assume that (c) doesn't happen today. But both (a) and (b) seem like active
> nightmares to try to fix. I can see hacky ways to avoid (a) where we detect
> *why* GetUnderlyingObject fails, but I don't see how to fix both (a) and
> (b) (or to fix (a) well) without just disabling this specific aspect of
> GloblasModRef.
>
> Ok, but we need performance information to make sure this doesn’t cause a
> regression in practice for LTO builds. For example, Spec 2K and 2K6 are a
> reasonable place to start.
>
> > 1) Fix obvious issues with GloblasModRef and switch it to ValueHandles
> > 2) Mail out a patch to disable this part of GlobalsModRef. I can put it
> behind a flag or however folks would like it to work.
> > 3) Remove addEscapingUse() update API, which without #2 may regress some
> LTO test case I don't have (because I don't have any other than bootstrap)
>
> Sounds great if we can do this without causing a regression in practice.
> Are you aware of any miscompiles that might be attributed to this, or are
> these “theoretical" concerns?
>
I don't really have any way of benchmarking SPEC with LTO. If that's a
configuration that is super important to people, I'm hoping they'll let me
know about any particular performance issues.
Ultimately, I'm happy making no performance impacting changes here. The
thing is, I need to remove a broken abstraction (addEscapingUse) that was
half-way added without any test case (literally, I have no test case that
fails if I delete it. The regression tests don't fail if I assert(0) in
it). =/ I don't want to do that if it causes miscompiles for folks that
*are* relying on LTO. So my plan is to essentially post the patch that will
fix the miscompiles but may regress performance. If the miscompiles show up
for those relying on LTO and this (somewhat unsound) pass, they are in the
best position to evaluate the cost/benefit of a performance reducing patch.
Honsetly, my hope is that it won't even impact performance. But I have no
realistic way to measure it in any useful way. And for *my* platforms, I
would be perfectly happy to trade some performance for correctness here.
That's one of the reasons we're not relying on LTO yet. But I understand
that others in the community have more strict performance needs and so I'm
doing what I can to give them options.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150714/2ca6df08/attachment.html>
More information about the llvm-dev
mailing list