[LLVMdev] GlobalsModRef (and thus LTO) is completely broken

Tue Jul 14 12:02:30 PDT 2015

> On Jul 13, 2015, at 10:37 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
> 
> On Mon, Jul 13, 2015 at 10:21 PM Chris Lattner <clattner at apple.com <mailto:clattner at apple.com>> wrote:
> 
> > On Jul 13, 2015, at 8:19 PM, Chandler Carruth <chandlerc at gmail.com <mailto:chandlerc at gmail.com>> wrote:
> >
> > Ok folks,
> >
> > I wrote up the general high-level thoughts I have about stateful AA in a separate thread. But we need to sort out the completely and horribly broken aspects of GlobalsModRef today, and the practical steps forward. This email is totally about the practical stuff.
> >
> > Now, as to why I emailed this group of people and with this subject, the only pass pipeline that includes GlobalsModRef, is the LTO pipeline.
> 
> Ah, so it is just an LTO enabled benchmark hack then.
> 
> > It is also relying on a precomputed set of global variables whose address is never used by an instruction other than some very small set (gep, bitcast) as "non-address-taken". It then runs GetUnderlyingObject on the two pointers in alias queries, and if that finds one of these "non-address-taken" globals for one of the memory locations but not the other, it concludes no-alias! This is broken for a number of reasons.
> >
> > a) If the two locations merely have a different *depth* of instruction stack, because GetUnderlyingObject has a recursion cap, one side can fail while the other succeeds, and we erroneously produce no-alias.
> 
> Interesting.  I’m sure it is no consolation, but GlobalsModRef probably predated the recursion cap :-)
Why not return the conservative result when the cap is hit?
> 
> > b) If instcombine or any other pass for any reason introduces on one path an instruction that GetUnderlyingObject can't look through (select, phi, load, ....), we incorrectly conclude no-alias. This is what addEscapingUse was intended to solve, but we would literally have to call it from every pass because we can't rely on analysis invalidation!
Dito
> >
> > c) If any pass actually escapes a pointer from one function into another, we invalidate the underlying assumption of 'non-address-taken' that it relies upon.
> Yep, all of these are pretty nasty.
> 
> > Now, as I argued in my general AA thread, I think we might be able to assume that (c) doesn't happen today. But both (a) and (b) seem like active nightmares to try to fix. I can see hacky ways to avoid (a) where we detect *why* GetUnderlyingObject fails, but I don't see how to fix both (a) and (b) (or to fix (a) well) without just disabling this specific aspect of GloblasModRef.
> 
> Ok, but we need performance information to make sure this doesn’t cause a regression in practice for LTO builds.  For example, Spec 2K and 2K6 are a reasonable place to start.
> 
> > 1) Fix obvious issues with GloblasModRef and switch it to ValueHandles
> > 2) Mail out a patch to disable this part of GlobalsModRef. I can put it behind a flag or however folks would like it to work.
> > 3) Remove addEscapingUse() update API, which without #2 may regress some LTO test case I don't have (because I don't have any other than bootstrap)
> 
> Sounds great if we can do this without causing a regression in practice.  Are you aware of any miscompiles that might be attributed to this, or are these “theoretical" concerns?
> 
> I don't really have any way of benchmarking SPEC with LTO. If that's a configuration that is super important to people, I'm hoping they'll let me know about any particular performance issues. 

We’ll get that data for ARM.
> 
> Ultimately, I'm happy making no performance impacting changes here. The thing is, I need to remove a broken abstraction (addEscapingUse) that was half-way added without any test case (literally, I have no test case that fails if I delete it. The regression tests don't fail if I assert(0) in it). =/ I don't want to do that if it causes miscompiles for folks that *are* relying on LTO. So my plan is to essentially post the patch that will fix the miscompiles but may regress performance. If the miscompiles show up for those relying on LTO and this (somewhat unsound) pass, they are in the best position to evaluate the cost/benefit of a performance reducing patch.
> 

> Honsetly, my hope is that it won't even impact performance. But I have no realistic way to measure it in any useful way. And for *my* platforms, I would be perfectly happy to trade some performance for correctness here. That's one of the reasons we're not relying on LTO yet. But I understand that others in the community have more strict performance needs and so I'm doing what I can to give them options.
I wouldn’t be willing to give up performance for hypothetical issues. Please protect all your changes with options. For some of your concerns it is probably hard to provide a test case that shows an/the actual issue.
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>         http://llvm.cs.uiuc.edu <http://llvm.cs.uiuc.edu/>
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev <http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150714/cd02eb63/attachment.html>