[llvm-dev] Understand alias-analysis results

Matt P. Dziubinski via llvm-dev llvm-dev at lists.llvm.org
Thu Jul 9 03:14:05 PDT 2020


On 7/9/2020 10:15, Shuai Wang via llvm-dev wrote:
> Hello,
> 
> I am performing alias analysis toward the following simple code:
> 
> [...]
> 
> I checked the generated .ll code, and it shows that within the main 
> function and NOALIAS functions, there is only a "ret" statement, with no 
> global or local variables used. Could anyone shed some lights on where 
> the "1 may alias" come from?  And is there a way that I can force the 
> alias analysis algorithm to focus only the "main" function? Thank you 
> very much.

Hi!

Here's more information after initializing the variables (assuming the 
intent in the source code was, e.g., to initialize `a` and `b` to `0` 
and the pointers `f1` and `f2` to `NULL`, using aggregate initialization 
for `s`):
- Clang [-> LLVM-IR]: https://llvm.godbolt.org/z/WT7V3E
- [LLVM-IR ->] opt: https://llvm.godbolt.org/z/Veswa4

Alias sets for function 'main':
Alias Set Tracker: 1 alias sets for 2 pointer values.
AliasSet[0x55ec7f9a23e0, 3] may alias, Mod/Ref   Pointers: (i8* %0, LocationSize::precise(4)), (i32* %a, LocationSize::precise(4))

Note that in the original source code `a` and `b` are uninitialized; 
consequently, attempting to access `s[a].f1` and `s[b].f2` is undefined 
behavior (we're using the automatic storage duration objects `a` and `b` 
while their values are indeterminate): 
https://taas.trust-in-soft.com/tsnippet/t/acff56c8

Cf. https://cigix.me/c17#6.7.9.p10 ("If an object that has automatic 
storage duration is not initialized explicitly, its value is 
indeterminate.") & https://cigix.me/c17#J.2.p1
("The behavior is undefined in the following circumstances: [...] The 
value of an object with automatic storage duration is used while it is 
indeterminate").

As such, you can see that most of the code is optimized away between 
mem2reg and dead argument elimination:
https://llvm.godbolt.org/z/iEdKE_

(Similarly, even if `a` and `b` were initialized to `0`, we only wrote 
to `f1` for `s[0]` and `s[1]`, so accessing `s[b].f2` again uses an 
object while its value is indeterminate, which is undefined behavior.)
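
For illustration, here is a minimal sketch of a variant in which nothing 
indeterminate is read. It is not the original poster's code: only the 
names `MyStruct`, `f1`, `f2`, `s`, `a` and `b` come from the thread and 
the IR dump; the function name `sum_initialized_fields`, the variables 
`x` and `y`, and the choice to point the fields at real objects (rather 
than `NULL`) so they can be dereferenced are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Field names follow the IR dump (%struct.MyStruct with f1 and f2);
   the exact original definition is not shown in the thread. */
struct MyStruct {
    int *f1;
    int *f2;
};

/* Hypothetical helper: every object is initialized before it is read. */
int sum_initialized_fields(void) {
    int x = 1, y = 2;
    int a = 0, b = 1; /* indices are explicitly initialized */

    /* Aggregate initialization sets both f1 and f2 of both elements. */
    struct MyStruct s[2] = {
        { .f1 = &x, .f2 = &y },
        { .f1 = &y, .f2 = &x },
    };

    assert(s[a].f1 != NULL && s[b].f2 != NULL);
    return *s[a].f1 + *s[b].f2; /* reads x twice */
}
```

With every member written before use, neither the index loads nor the 
pointer loads involve an indeterminate value, so the optimizer has no 
license to replace them with `undef`.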

*** IR Dump After Promote Memory to Register ***

; the following corresponds to loading `s[a].f1`
%3 = load i32, i32* %a, align 4, !tbaa !7
%idxprom = sext i32 %3 to i64
%arrayidx3 = getelementptr inbounds [2 x %struct.MyStruct], [2 x %struct.MyStruct]* %s, i64 0, i64 %idxprom
%f14 = getelementptr inbounds %struct.MyStruct, %struct.MyStruct* %arrayidx3, i32 0, i32 0
%4 = load i32*, i32** %f14, align 16, !tbaa !2
%5 = bitcast i32* %4 to i8*

; the following corresponds to loading `s[b].f2`
%6 = load i32, i32* %b, align 4, !tbaa !7
%idxprom5 = sext i32 %6 to i64
%arrayidx6 = getelementptr inbounds [2 x %struct.MyStruct], [2 x %struct.MyStruct]* %s, i64 0, i64 %idxprom5
%f2 = getelementptr inbounds %struct.MyStruct, %struct.MyStruct* %arrayidx6, i32 0, i32 1
%7 = load i32*, i32** %f2, align 8, !tbaa !9
%8 = bitcast i32* %7 to i8*
call void @NOALIAS(i8* %5, i8* %8)

*** IR Dump After Dead Argument Elimination ***
; note how the arguments have been rewritten to `undef` in the following:
call void @NOALIAS(i8* undef, i8* undef)

> And is there a way that I can force the alias analysis algorithm to 
> focus only the "main" function?

One way is to make the definition of `NOALIAS` unavailable (as if 
external) by only providing the declaration (as in the above examples).
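
A rough sketch of that setup, under some assumptions: the `NOALIAS` 
signature is inferred from the `call void @NOALIAS(i8*, i8*)` line in the 
dump, while `call_site` and the `EMIT_IR_ONLY` macro are hypothetical 
names made up for this example. The stub at the bottom exists only so the 
file links on its own; defining `EMIT_IR_ONLY` when emitting IR (e.g., 
`clang -DEMIT_IR_ONLY -S -emit-llvm`) hides it, leaving just the 
declaration.

```c
#include <stddef.h>

/* Declaration only: with no body visible, the callee is opaque to the
   optimizer and is treated like any other external function. */
void NOALIAS(void *p, void *q);

/* Hypothetical caller standing in for `main`. */
int call_site(void) {
    int x = 1, y = 2;
    NOALIAS(&x, &y); /* the pointer arguments under inspection */
    return x + y;
}

#ifndef EMIT_IR_ONLY
/* Stub definition so this file can be linked and run by itself.
   Compile with -DEMIT_IR_ONLY to keep only the declaration above. */
void NOALIAS(void *p, void *q) {
    (void)p;
    (void)q;
}
#endif
```

Keeping the definition in a separate translation unit would have the 
same effect without the macro.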

Best,
Matt

