[llvm-dev] Alias analysis results

Thu May 11 18:48:43 PDT 2017

+hal

Hi,

I'm not an expert on all of our AA bits, but I've been in the area some. If
I'm wrong, I'm sure someone will jump in and we'll both end up learning
something. :)

> Why the current TBAA implementation returns NoAlias for pairs of
> accesses for which it has no idea if they really alias?

For the same reason that LLVM assumes many other things it can't prove. The
language says "if you do ${x}, the behavior is undefined," so optimizing on
the assumption that the user isn't doing ${x} is OK.

That said, optimizing on aliasing assumptions has been contentious in the
past, so frontends that emit TBAA data (e.g. clang) generally support flags
like -fno-strict-aliasing to relax this behavior.

> that I would expect to a) generate [...] warnings and [...]

FWIW, it's generally nontrivial to emit helpful warnings from LLVM. This is
because LLVM doesn't have the AST that the IR was generated from, so it has
no source-level information to speak of (except the bits we encode in
debuginfo, but that doesn't exactly make for great diags).

In other words, a warning emitted by LLVM would probably be as helpful as
"hey, somewhere in the function '_Z3fooi', or one of the functions that we
happened to inline into it, we found some pretty obvious type punning."

> it's hard to imagine how a pair of accesses can be
> aliasing and non-aliasing at the same time

As you've noted, this is an artifact of taking two logically independent
pieces of information and merging them into one. While this might not be
ideal from a theoretical point of view, I don't see why an optimization
would care about the difference between "proven NoAlias" and "proven
illegal to alias." In either case, it's allowed to optimize based on the
assumption that those things don't alias. So, from a practical standpoint,
I don't understand what making that distinction would buy us.

> we are trying to use BasicAA as a faster replacement for
> TBAA in simple cases

AIUI, the idea of putting BasicAA in front of TBAA isn't an efficiency
hack: its purpose is to catch "obvious" type punning, and handle it
gracefully for the user (your 'b)' above).

You might find Hal's recent work on TySan (https://reviews.llvm.org/D32197)
interesting; it's a sanitizer that tries to catch and complain about type
punning at run-time. :)

George

On Thu, May 11, 2017 at 1:50 AM, Ivan A. Kosarev via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Hello,
>
> I'm trying to get the whole picture of what we are doing in terms
> of alias analysis in LLVM. Being new to this I may be missing
> some well known things so my apologies if I seem to ask obvious
> things.
>
> There are lots of questions arise in my head as I read related
> bug reports, sources and documentation, of which looks to be the
> most simple is,
>
> Why the current TBAA implementation returns NoAlias for pairs of
> accesses for which it has no idea if they really alias?
>
> Let me elaborate on this. My understanding of what the TBAA
> implementation is supposed to respond is whether a given pair of
> accesses is allowed to alias by the rules of the input language.
> (Which in turn leads me to another question, that is, whether
> we are really trying to perform C/C++ alias analysis in a
> [language-neutral] codegen pass, but let's postpone this for
> later.)
>
> This consideration assumes that what the TBAA implementation
> actually returns as a result is orthogonal to the
> Must/No/MayAlias set, that used to be the result type of various
> AA, including TBAA. I would expect that the result of the
> complete alias analysis includes both the information on whether
> given accesses alias and, as a separate element, whether they are
> allowed to alias by the rules of the language. Then we may have
> combinations like (MustAlias, AllowedAlias) that seem to be the
> common case and combinations like (MustAlias, NotAllowedAlias)
> that I would expect to a) generate the breaks-alias-rules kind of
> warnings and b) proceed further as any other MustAlias case.
>
> The latter combination is what I would expect for illegal type
> puns, for whatever definition of "illegal". Note that for such
> cases we currently get MustAlias from BasicAA and NoAlias from
> TBAA. In some comments this is considered okay whereas I have to
> admit that I don't understand how that may possibly be okay,
> meaning it's hard to imagine how a pair of accesses can be
> aliasing and non-aliasing at the same time. To me it appears
> like that the root cause if this inconsistency is not that any of
> these analyses is not correct, but that we are trying to
> represent their responses in an inadequate form and we are trying
> to use BasicAA as a faster replacement for TBAA in simple cases.
>
> Any thoughts are highly appreciated.
>
> Thanks,
>
> --
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170511/2223ed47/attachment-0001.html>