[LLVMdev] GlobalsModRef (and thus LTO) is completely broken
Chandler Carruth
chandlerc at gmail.com
Tue Jul 14 23:25:51 PDT 2015
Replying here, but several of the questions raised boil down to "couldn't
you make the usage of GetUnderlyingObject conservatively correct?". I'll
try and address that.
I think this *is* the right approach, but I think it is very hard to do
without effectively disabling this part of GlobalsModRef. That is, the easy
ways are likely to fire very frequently IMO.
The core idea is to detect a "no information" state coming out of
GetUnderlyingObject (likely be providing a custom version just for
GlobalsModRef and tailored to its use cases). This is particularly
effective at avoiding the problems with the recursion limit. But let's look
at what cases we *wouldn't* return that. Here are the cases I see when I
thought about this last night with Hal, roughly in descending likelihood I
would guess:
1) We detect some global or an alloca. In that case, even BasicAA would be
sufficient to provide no-alias. GMR shouldn't be relevant.
2) We detect a phi, select, or inttoptr, and stop there.
3) We detect a load and stop there.
4) We detect a return from a function.
5) We detect an argument to the function.
I strongly suspect the vast majority of queries hit #1. That's why BasicAA
is *so* effective. Both #4 and #5 I think are actually reasonable places
for GMR to potentially say "no-alias" and provide useful definitive
information. But I also suspect these are the least common.
So let's look at #2 and #3 because I think they're interesting. For these,
I think it is extremely hard to return "no-alias". It seems extremely easy
for a reasonable and innocuous change to the IR to introduce a phi or a
select into one side of the GetUnderlyingObject but not the other. If that
ever happens, we can't return "no-alias" for #2, or we need to add really
expensive updates. It also seems reasonable (if not as likely) to want
adding a store and load to the IR to not trigger a miscompile. If it is
possible for a valid optimization pass to do reg2mem on an SSA value, then
that could happen to only one side of the paired GetUnderlyingObject and
break GMR with #3. If that seems like an unreasonable thing to do,
consider loop re-rolling or other transformations which may need to take
things in SSA form at move them out of SSA form. Even if we then try
immediately to put it back *into* SSA form, before we do that we create a
point where GMR cannot correctly return no-alias.
So ultimately, I don't think we want to rely on GMR returning "no-alias"
for either #2 or #3 because of the challenge of actually updating it in all
of the circumstances that could break them. That means that *only* #4 and
#5 are going to return "no-alias" usefully. And even then, function
inlining and function outlining both break #4 and #5, so you have to
preclude those transforms while GMR is active. And I have serious doubts
about these providing enough value to be worth the cost.
I think the better way to approach this is the other way around. Rather
than doing a backwards analysis to see if one location reaches and global
and the other location doesn't reach a global, I think it would be much
more effective to re-cast this as a forward analysis that determines all
the memory locations in a function that come from outside the function, and
use that to drive the no-alias responses.
On Tue, Jul 14, 2015 at 12:12 PM Gerolf Hoflehner <ghoflehner at apple.com>
wrote:
> I wouldn’t be willing to give up performance for hypothetical issues.
> Please protect all your changes with options. For some of your concerns it
> is probably hard to provide a test case that shows an/the actual issue.
>
I certainly agree that it will be very hard to provide a test case and
extremely rare to see this in the wild for most of these issues. As long as
I can remove the problematic update API we currently have (which as its an
API change can't really be put behind flags), I'm happy to have flags
control whether or not GMR uses the unsound / stale information to try to
answer alias queries. Do you have any opinion about what the default value
of the flags should be?
I'll go ahead and prepare the patches, as it seems like we're all ending up
in the same position, and just wondering about the precise tradeoffs we
want to settle on.
> _______________________________________________
> LLVM Developers mailing list
>
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150715/805ee4b8/attachment.html>
More information about the llvm-dev
mailing list