[llvm-dev] Understand alias-analysis results
Matt P. Dziubinski via llvm-dev
llvm-dev at lists.llvm.org
Fri Jul 10 05:58:53 PDT 2020
On 7/10/2020 13:03, Shuai Wang wrote:
> On Fri, Jul 10, 2020 at 4:43 PM Matt P. Dziubinski <matdzb at gmail.com
> <mailto:matdzb at gmail.com>> wrote:
> Note (in step 13 of 13) how `y` does not alias (it is just an `int`
> itself) anything (agreeing with NoAlias results you're getting).
>
> Yes, I think I fully understand that 'y' is not a pointer. However,
> again, what confuses me and seems incorrect is the output of LLVM alias
> analysis:
>
> "*must alias, Mod* Pointers: (i32* %y, LocationSize::precise(4))"
>
> Isn't it literally indicating that "i32* y", denoting a pointer, is
> *must alias* with some other pointers?
>
> If so, then why does this set only have one pointer instead of at least
> two? If not (which makes more sense), then why is "i32* %y" reported and
> annotated as "*must alias, mod*"? Both ways do not make sense to me.
This seems analogous to the following results:
- `allmust`:
https://github.com/llvm/llvm-project/blob/release/10.x/llvm/test/Analysis/AliasSet/saturation.ll#L4
- `test1`:
https://github.com/llvm/llvm-project/blob/release/10.x/llvm/test/Analysis/AliasSet/intrinsics.ll#L3
Note how we obtain four (singleton) alias sets for `allmust`, containing
`i32* %a`, `i32* %b`, `i32* %c`, and `i32* %d`, respectively.
This is similar to how we obtain the (singleton) alias set with `i32* %y`:
a singleton (i.e., single-element) set reflects the reflexivity of the
aliasing relation; here, `i32* %y` "must alias" itself.
Suppose that right after `int y = *x;` we add a statement `x = &y;` to
the C source code: https://llvm.godbolt.org/z/PP6b75
We then obtain the corresponding analysis from opt:
https://llvm.godbolt.org/z/WEMzvs
AliasSet[0x55e1b009c5a0, 6] may alias, Mod/Ref Pointers: (i32* %c,
LocationSize::precise(4)), (i32** %b, LocationSize::precise(8)), (i32**
%0, LocationSize::precise(8)), (i32* %2, LocationSize::precise(4)),
(i32* %y, LocationSize::precise(4))
whereas previously (i.e., before we added `x = &y;`) we had
AliasSet[0x55633b7fc150, 5] may alias, Mod/Ref Pointers: (i32*
%c, LocationSize::precise(4)), (i32** %b, LocationSize::precise(8)),
(i32** %0, LocationSize::precise(8)), (i32* %2, LocationSize::precise(4))
The way I think about it:
- (i32* %c, LocationSize::precise(4)) - LLVM IR level pointer (i32* %c)
to the runtime stack slot containing the value (an integer) of the C
source level `int c` (32-bit, or 4 bytes); this is also the LLVM IR level
pointer to `i32` returned by `alloca`,
- (i32** %b, LocationSize::precise(8)) - LLVM IR level pointer (i32**
%b) to the runtime stack slot containing the value (an address) of the C
source level `int *b` (64-bit, or 8 bytes); this is also the LLVM IR level
pointer to `i32*` returned by `alloca`.
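To make the two entries above concrete, here is a minimal C sketch. It is my own illustration, not the code behind the godbolt links, and the function names are made up. It assumes an unoptimized (-O0) build, where clang typically gives each local variable its own `alloca`, and a typical 64-bit target, where `int` is 4 bytes and a pointer is 8 bytes.

```c
#include <stddef.h>

/* Hypothetical example function; all names here are illustrative only. */
int pointee_through_b(void) {
    int c = 42;   /* stack slot holding an integer; in IR: i32* %c */
    int *b = &c;  /* stack slot holding an address; in IR: i32** %b.
                     This initialization corresponds to a store of
                     %c into %b. */
    return *b;    /* load the address from b's slot, then load c through it */
}

/* Sizes of the two kinds of stack slot on the current target. */
size_t size_of_int_slot(void) { return sizeof(int); }
size_t size_of_ptr_slot(void) { return sizeof(int *); }
```

On such a target the two size helpers should line up with the `LocationSize::precise(4)` and `LocationSize::precise(8)` annotations, respectively.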
The reason `(i32* %c, LocationSize::precise(4))` and `(i32** %b,
LocationSize::precise(8))` appear together in both alias sets is the
following LLVM IR statement:
store i32* %c, i32** %b, align 8
After adding the aforementioned C source level statement `x = &y;`, we
can observe that `(i32* %y, LocationSize::precise(4))` got merged into
this alias set, too.
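As a mental model only, the singleton-then-merge behaviour can be sketched with a tiny union-find in C. This is not LLVM's AliasSetTracker implementation: the real tracker decides what to merge by querying alias analysis, whereas here the merges are driven by hand. The `PTR_*` names are hypothetical stand-ins for the IR values `%c`, `%b`, `%0`, `%2`, and `%y`.

```c
/* Toy model: each "pointer" is an index; sets are union-find trees. */
enum { PTR_C, PTR_B, PTR_0, PTR_2, PTR_Y, NUM_PTRS };

static int parent[NUM_PTRS];

/* Every pointer starts in its own singleton set (it aliases itself). */
void reset_sets(void) {
    for (int i = 0; i < NUM_PTRS; ++i)
        parent[i] = i;
}

/* Find the representative of the set containing p (path halving). */
int find_set(int p) {
    while (parent[p] != p) {
        parent[p] = parent[parent[p]];
        p = parent[p];
    }
    return p;
}

/* Merge the sets containing p and q into one. */
void merge_sets(int p, int q) {
    parent[find_set(p)] = find_set(q);
}

/* Nonzero if p and q are currently in the same set. */
int same_set(int p, int q) {
    return find_set(p) == find_set(q);
}
```

In this model, after `reset_sets()` the entry for `PTR_Y` sits alone in its set, like the singleton set with `i32* %y`; calling `merge_sets(PTR_Y, PTR_B)` plays the role of adding `x = &y;`, after which `same_set(PTR_Y, PTR_C)` holds once `PTR_C` and `PTR_B` have been merged.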
Does this seem reasonable?
Best,
Matt