[llvm-dev] Understand alias-analysis results

Shuai Wang via llvm-dev llvm-dev at lists.llvm.org
Fri Jul 10 04:03:47 PDT 2020


Dear Matt,



On Fri, Jul 10, 2020 at 4:43 PM Matt P. Dziubinski <matdzb at gmail.com> wrote:

> Hi!
>
> On 7/10/2020 07:17, Shuai Wang wrote:
> > Hello!
> >
> > Thank you very much! Yes, that makes a lot of sense to me. However, just
> > want to point out two things that are still unclear:
> >
> > 1. The output contains a alias set of only one element, for instance:
> > "must alias, Mod       Pointers: (i32* %y, LocationSize::precise(4))"
> >
> > This one really confused me. I would expect to have an alias set of at
> > least *two* elements (e.g., two pointers point to the same memory
> > region), otherwise for the above case, $y$ is aliased to whom? I
> > searched all outputs that are related to %y from
> > https://llvm.godbolt.org/z/9njGqx but all I can find is "NoAlias".
> > Overall, to understand why an alias set can have only one pointer looks
> > quite strange to me..
>
> This seems correct: Note that `int y` is a distinct object in memory,
> not a pointer, and shares no alias sets with anything (we happen to
> assign the int _value_ we obtain from pointer `x` by dereferencing it
> `*x`, but that bears no relevance to aliasing here). Perhaps this can
> help illustrate the scenario (assuming the URL doesn't get mangled):
>
> http://www.pythontutor.com/cpp.html#code=void%20MUSTALIAS%28void%20*p,%20void%20*q%29%20%7B%7D%0A%0Aint%20main%28%29%7B%0A%0A%20%20int%20**a,%20*b,%20*x%20,c%3B%0A%20%20c%20%3D%2010%3B%0A%20%20a%20%3D%20%26b%3B%0A%20%20b%20%3D%20%26c%3B%0A%20%20x%20%3D%20*a%3B%0A%20%20int%20y%20%3D%20*x%3B%0A%20%20MUSTALIAS%28x,%26c%29%3B%0A%20%20MUSTALIAS%28x,b%29%3B%0A%20%20return%200%3B%0A%7D&curInstr=12&mode=display&origin=opt-frontend.js&py=cpp&rawInputLstJSON=%5B%5D
>
> Note (in step 13 of 13) how `y` does not alias (it is just an `int`
> itself) anything (agreeing with NoAlias results you're getting).
>
>
Yes, I think I fully understand that 'y' is not a pointer. However, again,
what confuses me and seems incorrect is the output of LLVM alias analysis:

"*must alias, Mod*       Pointers: (i32* %y, LocationSize::precise(4))"

Isn't it literally indicating that "i32* y", denoting a pointer, is *must
alias* with some other pointers?

If so, then why does this set only have one pointer instead of at least
two? If not (which makes more sense), then why is "i32* %y" reported and
annotated as "*must alias, mod*"? Both ways do not make sense to me.

Best,
Shuai




> >
> > 2. For the following code chunks:
> >
> >    b = &c;
> >    x = *a;
> >    int y = *x;
> >    MUSTALIAS(x,&c);
> >    MUSTALIAS(x,b);
> >
> > I don't get it why (quote from your previous email) "the desired
> > inference of "x and b are "must alias"" cannot be correct--these are not
> > the same objects in memory". x and b are both pointing to c, right? Am I
> > missing anything here?
>
> Correct, both `x` and `b` point to `c` (the pointers themselves are
> distinct, but after dereferencing what they point to is the same location).
> Consider the slice on the (first call) to `MUSTALIAS(x, &c);`:
> https://llvm.godbolt.org/z/YaW1Mb
>
> %a = alloca i32**, align 8
> %b = alloca i32*, align 8
> %x = alloca i32*, align 8
> %c = alloca i32, align 4
> . . .
> %0 = load i32**, i32*** %a, align 8
> %1 = load i32*, i32** %0, align 8
> . . .
> %4 = load i32*, i32** %x, align 8
> %5 = bitcast i32* %4 to i8*
> %6 = bitcast i32* %c to i8*
> ; MUSTALIAS(x, &c);
> call void @MUSTALIAS(i8* %5, i8* %6)
>
> Notice the following results:
> NoAlias:        i32* %4, i32** %x
> MayAlias:       i32* %4, i32* %c
> MustAlias:      i32* %4, i8* %5
> MayAlias:       i32* %4, i8* %6
> . . .
> MayAlias:       i32** %b, i8* %5
> NoAlias:        i32** %x, i8* %5
> MayAlias:       i32* %c, i8* %5
> MayAlias:       i8* %5, i8* %6
> MustAlias:      i32* %c, i8* %6
> . . .
> MustAlias:   %4 = load i32*, i32** %x, align 8 <->   store i32* %1,
> i32** %x, align 8
>
> The location in memory of a pointer object `b` may alias location of the
> first argument `%5` but not the (other memory object) pointer `x`;
> however, `i32* %c` aliases second argument `i8* %6` (the same memory
> object) as well as may alias `i8* %5` (which we've obtained from loading
> the address from pointer `x`; the difference here being between the
> memory locations storing the pointers themselves, say, on the stack--not
> aliased, and the memory objects the addresses stored in the pointers
> refer to, i.e., i32** vs. i32* or i8*).
>
> Best,
> Matt
>
> >
> > Sorry for the potential trouble this may have caused.. And thank you
> > in advance!
> >
> > Best,
> > Shuai
> >
> > On Thu, Jul 9, 2020 at 9:58 PM Matt P. Dziubinski <matdzb at gmail.com
> > <mailto:matdzb at gmail.com>> wrote:
> >
> >     Hi again!
> >
> >     Replying in chronological order:
> >
> >      > On Thu, Jul 9, 2020 at 6:51 PM Shuai Wang <wangshuai901 at gmail.com
> >     <mailto:wangshuai901 at gmail.com>
> >      > <mailto:wangshuai901 at gmail.com <mailto:wangshuai901 at gmail.com>>>
> >     wrote:
> >      >
> >      >     Hey Matt,
> >      >
> >      >     That's awesome. Thank you very much for all the information
> and
> >      >     clarification! Just a few follow up questions. Could you
> >     kindly shed
> >      >     some lights on it? Thank you!
> >      >
> >      >     1. I tried to tweak the code in the following way: [...]
> >      >
> >      >     I am trying to interpret the outputs, so if I understand
> >     correctly,
> >      >     the output indicates that we have an alias set of 4 pointers
> >     which
> >      >     "potentially" point to the same memory region, correct? Then
> is
> >      >     there any more accurate analysis pass that I could use to
> >     somewhat
> >      >     infer that "there are two must alias sets, each set has two
> >      >     pointers"? Correct me if I was wrong here.. Using my local opt
> >      >     (version 6.0), I tried to iterate all feasible alias analysis
> >     passes
> >      >     but the results are not changed.
> >
> >     Seems correct, I don't think you'll get more precise results out of
> the
> >     basic-aa pass, note that it has limited context sensitivity:
> >     https://llvm.org/docs/AliasAnalysis.html#the-basic-aa-pass
> >
> >     Compare the results for `test_simple`, `test_in_array`, and
> >     `test_same_underlying_object_different_indices`:
> >
> https://github.com/llvm/llvm-project/blob/release/10.x/llvm/test/Analysis/BasicAA/struct-geps.ll
> >
> >      >     Also, what is the "must alias, Mod/Ref forwarding to
> >     0x563faa6c6260"?
> >
> >     If alias sets have been merged, you'll get the attached node
> forwarding
> >     to the root node; note the comment for `getForwardedTarget` making a
> >     reference to union-find:
> >
> https://github.com/llvm/llvm-project/blob/release/10.x/llvm/include/llvm/Analysis/AliasSetTracker.h#L281
> >     (with "merge" corresponding to the union-find collapsing for "union",
> >     https://en.wikipedia.org/wiki/Disjoint-set_data_structure#Union).
> >
> >     You can see how `AliasSet::mergeSetIn` (called, e.g., by
> >     `AliasSetTracker::mergeAliasSetsForPointer`) sets up forwarding for
> >     `AS.Forward`:
> >
> https://github.com/llvm/llvm-project/blob/release/10.x/llvm/lib/Analysis/AliasSetTracker.cpp#L51
> ,
> >
> >
> https://github.com/llvm/llvm-project/blob/release/10.x/llvm/lib/Analysis/AliasSetTracker.cpp#L301
> >     FWIW, you can use a tracer or manually step through `opt` in a
> debugger
> >     to follow the function calls.
> >
> >      >     And how to interpret that we have "2 must alias responses"?
> Where
> >      >     does it come from? And why do we have "0 may alias response"?
> I
> >      >     would expect to have at least "4 may alias responses" as well?
> >
> >     No, "MayAlias" and "MustAlias" are distinct elements in the lattice,
> cf.
> >     https://llvm.org/docs/AliasAnalysis.html#must-may-or-no
> >     There's a good explanation of the alias analysis queries and
> responses
> >     in the following talk (particularly the part starting with "AA Query"
> >     around the 22 min. mark):
> >     “Pointers, Alias & ModRef Analyses” (2018 EuroLLVM Developers’
> Meeting:
> >     A. Sbirlea & N. Lopes)
> >     https://www.youtube.com/watch?v=r0XVS4Atl3U
> >
> >     When you return `AliasResult` from your analysis you choose one:
> >
> https://llvm.org/doxygen/namespacellvm.html#ae1738272abcf2ac638b97e7dc6360cfd
> >
> >     You can see a simple example here (`TARAAResult::alias`):
> >     https://blog.tartanllama.xyz/llvm-alias-analysis/
> >
> >      >     2. I note that using the latest opt (version 11.0?) gives
> >     different
> >      >     outputs with my local opt (version 6.0). For opt (version
> >     6.0), it
> >      >     reports: 2 alias sets for 2 pointer values.
> >      >
> >      >     More importantly, can I expect to get generally better alias
> >      >     analysis results when switching to version 11.0?
> >
> >     I'd assume that generally it shouldn't get worse :-)
> >
> >      >
> >      >     Thank you very much!
> >
> >     On 7/9/2020 13:24, Shuai Wang wrote:
> >       > And another case:
> >       >
> >       > - Clang [-> LLVM-IR]: https://llvm.godbolt.org/z/SGeJZw
> >       > - [LLVM-IR ->] opt: https://llvm.godbolt.org/z/dNi-k2
> >       >
> >       > Is there any chance that we can smoothly infer that:
> >       > - x and &c are "must alias"
> >       > - x and b are "must alias"
> >       >
> >       > I don't know how to interpret the current results, in particular
> the
> >       > following outputs:
> >       >
> >       > AliasSet[0x5584ab7e5f30, 1] must alias, Mod/Ref   Pointers:
> >     (i32** %x,
> >       > LocationSize::precise(8))
> >       > AliasSet[0x5584ab7e6020, 1] must alias, Mod       Pointers:
> >     (i32* %y,
> >       > LocationSize::precise(4))
> >       >
> >       > It means we have two "must alias" sets, each of which contain
> >     only one
> >       > pointer? That seems quite confusing to me..
> >
> >     You can add -print-all-alias-modref-info for more detailed
> information:
> >     https://llvm.godbolt.org/z/9njGqx -- you'll notice "MustAlias:  i32*
> >     %c,
> >     i8* %6".
> >
> >     Adding `-evaluate-aa-metadata` for `load` and `store` instructions,
> >     https://llvm.godbolt.org/z/YaW1Mb, you'll notice "MustAlias:   %0 =
> >     load
> >     i32**, i32*** %a, align 8 <->   store i32** %b, i32*** %a, align 8"
> >
> >     However, from your results we can already note:
> >
> >     AliasSet[0x5584ab7e5d00, 5] may alias, Mod/Ref   Pointers: (i32* %c,
> >     LocationSize::precise(4)), (i32** %b, LocationSize::precise(8)),
> (i32**
> >     %0, LocationSize::precise(8)), (i32* %2, LocationSize::precise(4))
> >
> >     Note how in the C source code pointer `b` points to int `c` (`b =
> &c;`)
> >     corresponding to the memory locations (same object in memory when
> >     loading from `c` or `*b`). However, pointers `x` and `y` are distinct
> >     objects in memory themselves. In general, remember not to confuse
> >     pointers with what they point to--here also distinct, since `x`
> points
> >     `b` but `y` points to `c` (I mention this specifically since the
> >     desired
> >     inference of "x and b are "must alias"" cannot be correct--these are
> >     not
> >     the same objects in memory).
> >
> >     Best,
> >     Matt
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200710/58c6850f/attachment.html>


More information about the llvm-dev mailing list