[llvm-dev] Understand alias-analysis results

Matt P. Dziubinski via llvm-dev llvm-dev at lists.llvm.org
Fri Jul 10 05:58:53 PDT 2020


On 7/10/2020 13:03, Shuai Wang wrote:
> On Fri, Jul 10, 2020 at 4:43 PM Matt P. Dziubinski <matdzb at gmail.com> wrote:
>     Note (in step 13 of 13) how `y` does not alias (it is just an `int`
>     itself) anything (agreeing with NoAlias results you're getting).
> 
> Yes, I think I fully understand that 'y' is not a pointer. However, 
> again, what confuses me and seems incorrect is the output of LLVM alias 
> analysis:
> 
> "*must alias, Mod*       Pointers: (i32* %y, LocationSize::precise(4))"
> 
> Isn't it literally indicating that "i32* y", denoting a pointer, is 
> *must alias* with some other pointers?
> 
> If so, then why does this set only have one pointer instead of at least 
> two? If not (which makes more sense), then why is "i32* %y" reported and 
> annotated as "*must alias, mod*"? Both ways do not make sense to me.

This seems analogous to the following results:
- `allmust`: 
https://github.com/llvm/llvm-project/blob/release/10.x/llvm/test/Analysis/AliasSet/saturation.ll#L4
- `test1`: 
https://github.com/llvm/llvm-project/blob/release/10.x/llvm/test/Analysis/AliasSet/intrinsics.ll#L3

Note how we obtain four singleton alias sets for `allmust`, containing 
`i32* %a`, `i32* %b`, `i32* %c`, and `i32* %d`, respectively.
This is similar to how we obtain the singleton alias set with `i32* %y`: 
a singleton (single-element) set is reported as "must alias" because the 
aliasing relation is reflexive; here, `i32* %y` "must alias" itself.
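
A toy model may make the bookkeeping easier to picture. The sketch below is 
not LLVM's AliasSetTracker; it is a small union-find table written only for 
illustration, and every name in it (`toy_alias_sets`, `toy_add`, `toy_merge`, 
`toy_is_must`, `toy_label`) is hypothetical. It assumes a deliberately 
simplified rule: a set with one pointer is labelled "must alias", and any 
merged set is labelled "may alias" (the real analysis can keep a merged set 
"must alias" when all of its pointers must-alias each other).

```c
#include <assert.h>

#define MAX_PTRS 8

/* Toy alias-set table: parent[] is a union-find forest over pointer ids. */
struct toy_alias_sets {
    int parent[MAX_PTRS];
    int size[MAX_PTRS];   /* number of pointers in the set rooted at this id */
    int count;            /* number of pointers added so far */
};

void toy_init(struct toy_alias_sets *s) { s->count = 0; }

/* Add a pointer in its own singleton set and return its id. */
int toy_add(struct toy_alias_sets *s) {
    assert(s->count < MAX_PTRS);
    int id = s->count++;
    s->parent[id] = id;
    s->size[id] = 1;
    return id;
}

/* Find the representative of the set containing id (with path halving). */
int toy_find(struct toy_alias_sets *s, int id) {
    while (s->parent[id] != id) {
        s->parent[id] = s->parent[s->parent[id]];
        id = s->parent[id];
    }
    return id;
}

/* Merge the sets containing a and b (toy stand-in for "a may alias b"). */
void toy_merge(struct toy_alias_sets *s, int a, int b) {
    int ra = toy_find(s, a);
    int rb = toy_find(s, b);
    if (ra == rb)
        return;
    s->parent[rb] = ra;
    s->size[ra] += s->size[rb];
}

/* Simplifying assumption: only singleton sets count as "must alias". */
int toy_is_must(struct toy_alias_sets *s, int id) {
    return s->size[toy_find(s, id)] == 1;
}

const char *toy_label(struct toy_alias_sets *s, int id) {
    return toy_is_must(s, id) ? "must alias" : "may alias";
}
```

In this model a freshly added pointer such as `%y` sits alone in its set and 
is labelled "must alias"; once it is merged with another pointer's set, the 
whole set is labelled "may alias".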

Suppose that right after `int y = *x;` we add a statement `x = &y;` to 
the C source code: https://llvm.godbolt.org/z/PP6b75
We then obtain the corresponding analysis from opt: 
https://llvm.godbolt.org/z/WEMzvs

   AliasSet[0x55e1b009c5a0, 6] may alias, Mod/Ref   Pointers: (i32* %c, LocationSize::precise(4)), (i32** %b, LocationSize::precise(8)), (i32** %0, LocationSize::precise(8)), (i32* %2, LocationSize::precise(4)), (i32* %y, LocationSize::precise(4))

whereas previously (i.e., before we added `x = &y;`) we had

   AliasSet[0x55633b7fc150, 5] may alias, Mod/Ref   Pointers: (i32* %c, LocationSize::precise(4)), (i32** %b, LocationSize::precise(8)), (i32** %0, LocationSize::precise(8)), (i32* %2, LocationSize::precise(4))
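
For orientation, here is a minimal C sketch of a function that would be 
consistent with the statements quoted in this thread. It is a hypothetical 
reconstruction, not the source behind the godbolt links: the function name 
`alias_demo`, the initial value 10, and the way `x` is initialized are all 
assumptions. Only the names `c`, `b`, `x`, `y` and the statements 
`int y = *x;` and `x = &y;` come from the discussion.

```c
#include <assert.h>

/* Hypothetical reconstruction; the real test case may differ. */
int alias_demo(void) {
    int c = 10;    /* assumed initial value; `c` gets its own stack slot */
    int *b = &c;   /* at -O0 this is the `store i32* %c, i32** %b` */
    int *x = b;    /* assumption: x starts out pointing where b points */
    int y = *x;    /* y is a plain int holding a copy of c's value */
    x = &y;        /* the added statement: y's address is now stored in x */
    assert(x == &y);
    return *x;     /* reads y through x */
}
```

Before `x = &y;` is added, nothing in such a function stores the address of 
`y` anywhere, which fits `%y` sitting in a set of its own; afterwards the 
address of `y` is written through a pointer-typed slot, which fits `%y` 
showing up next to the other pointers.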

The way I think about it:

- (i32* %c, LocationSize::precise(4)) - LLVM IR level pointer (i32* %c) 
to runtime stack slot containing the value (integer) of the C source 
level `int c` (32-bit, or 4 bytes); this is also the LLVM IR level 
pointer to `i32` returned by `alloca`,

- (i32** %b, LocationSize::precise(8)) - LLVM IR level pointer (i32** 
%b) to the runtime stack slot containing the value (an address) of the C 
source level `int *b` (64-bit, or 8 bytes); this is also the LLVM IR 
level pointer to `i32*` returned by `alloca`.

`(i32* %c, LocationSize::precise(4))` and `(i32** %b, 
LocationSize::precise(8))` appear together in both alias sets because of 
the following LLVM IR statement, which stores the address `%c` into the 
slot that `%b` points to:

   store i32* %c, i32** %b, align 8

After adding the aforementioned C source level statement `x = &y;`, we 
can observe that `(i32* %y, LocationSize::precise(4))` has been merged 
into that alias set as well (see the first of the two sets shown above).

Does this seem reasonable?

Best,
Matt

