[llvm-dev] Inlining + CSE + restrict pointers == funtimes

Neil Henning via llvm-dev llvm-dev at lists.llvm.org
Wed Jan 22 05:02:13 PST 2020


So I've been narrowing down a very fun issue in our Burst compiler stack
with respect to noalias support, and I've managed to basically boil this
down to the following failure (see https://godbolt.org/z/-mdjPV):

int called(int* __restrict__ a, int* b, int* c) {
  return *a + *b + *c;
}

int foo(int* x, int* y) {
  return *x + *y + called(x, x, y);
}

int bar(int* x, int* y) {
  return called(x, x, y) + *x + *y;
}

Which becomes:

define dso_local i32 @called(i32* noalias nocapture readonly %0, i32* nocapture readonly %1, i32* nocapture readonly %2) local_unnamed_addr #0 !dbg !7 {
  %4 = load i32, i32* %0, align 4, !dbg !19, !tbaa !20
  %5 = load i32, i32* %1, align 4, !dbg !24, !tbaa !20
  %6 = add nsw i32 %5, %4, !dbg !25
  %7 = load i32, i32* %2, align 4, !dbg !26, !tbaa !20
  %8 = add nsw i32 %6, %7, !dbg !27
  ret i32 %8, !dbg !28
}

define dso_local i32 @foo(i32* nocapture readonly %0, i32* nocapture readonly %1) local_unnamed_addr #0 !dbg !29 {
  %3 = load i32, i32* %0, align 4, !dbg !36, !tbaa !20
  %4 = load i32, i32* %1, align 4, !dbg !37, !tbaa !20
  %5 = add i32 %4, %3
  %6 = shl i32 %5, 1
  %7 = add i32 %6, %3, !dbg !38
  ret i32 %7, !dbg !39
}

define dso_local i32 @bar(i32* nocapture readonly %0, i32* nocapture readonly %1) local_unnamed_addr #0 !dbg !40 {
  %3 = load i32, i32* %0, align 4, !dbg !47, !tbaa !20, !alias.scope !48
  %4 = load i32, i32* %1, align 4, !dbg !51, !tbaa !20, !noalias !48
  %5 = add i32 %4, %3
  %6 = shl i32 %5, 1
  %7 = add i32 %6, %3, !dbg !52
  ret i32 %7, !dbg !53
}
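
As a sanity check on the arithmetic, the folded body that @foo and @bar share can be transcribed back into C and compared with the source functions. This is only a sketch: folded is a hypothetical helper written by hand from the IR above, not compiler output, and it uses a multiply by 2 in place of the shl.

```c
/* Source functions from the example above. */
int called(int* __restrict__ a, int* b, int* c) {
  return *a + *b + *c;
}

int foo(int* x, int* y) {
  return *x + *y + called(x, x, y);
}

int bar(int* x, int* y) {
  return called(x, x, y) + *x + *y;
}

/* Hypothetical hand transcription of the optimized body:
     %5 = add %4, %3    ; y + x
     %6 = shl %5, 1     ; (y + x) * 2
     %7 = add %6, %3    ; (y + x) * 2 + x                      */
int folded(int x, int y) {
  int sum = y + x;
  int doubled = sum * 2;  /* stands in for the shl by 1 */
  return doubled + x;
}
```

For any x and y whose sums do not overflow, foo(&x, &y), bar(&x, &y) and folded(x, y) should all give 3*x + 2*y, so the two functions agree on the value computed and differ only in the metadata left on the loads.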

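The issue is that CSE just looks at two loads from the same location and
goes 'hey I can combine them!' without taking into account whether either
load carries extra aliasing information. Judging by the output above, it
keeps whichever load comes first. In foo the surviving loads are the plain
ones from foo itself, so the metadata from the inlined call is dropped: a
noalias pointer has been turned into an aliasing one. In bar the surviving
loads are the ones inlined from called, so the later plain loads of *x and
*y now carry !alias.scope / !noalias: an aliasing pointer has been turned
into a non-aliasing one.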
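
The difference between foo and bar can be modelled with a toy "keep the first load of each pointer" pass. Everything here is made up for illustration: the Load struct, the bitmask encoding of scopes, SCOPE_A and cse_keep_first are not LLVM's data structures or its actual CSE code, and the metadata assumed on the inlined load of *b is a guess, since that load does not survive in the IR above.

```c
#include <stddef.h>

#define SCOPE_A 1u  /* hypothetical scope created when inlining called() */

/* Toy model of a load: which pointer it reads plus its alias metadata.
   A mask of 0 means "no metadata attached". */
typedef struct {
  int ptr;               /* 0 = x, 1 = y */
  unsigned alias_scope;  /* stands in for !alias.scope */
  unsigned noalias;      /* stands in for !noalias */
} Load;

/* Loads in program order after inlining into foo: plain loads first. */
static const Load FOO_ORDER[5] = {
  {0, 0u, 0u},       /* *x in foo */
  {1, 0u, 0u},       /* *y in foo */
  {0, SCOPE_A, 0u},  /* *a (== x), inlined */
  {0, 0u, SCOPE_A},  /* *b (== x), inlined; metadata assumed */
  {1, 0u, SCOPE_A},  /* *c (== y), inlined */
};

/* Loads in program order after inlining into bar: inlined loads first. */
static const Load BAR_ORDER[5] = {
  {0, SCOPE_A, 0u},  /* *a (== x), inlined */
  {0, 0u, SCOPE_A},  /* *b (== x), inlined; metadata assumed */
  {1, 0u, SCOPE_A},  /* *c (== y), inlined */
  {0, 0u, 0u},       /* *x in bar */
  {1, 0u, 0u},       /* *y in bar */
};

/* Keep the first load of each pointer and drop the rest, leaving the
   survivor's metadata untouched. Returns the number of survivors. */
size_t cse_keep_first(const Load *in, size_t n, Load *out) {
  size_t kept = 0;
  for (size_t i = 0; i < n; ++i) {
    int seen = 0;
    for (size_t j = 0; j < kept; ++j) {
      if (out[j].ptr == in[i].ptr) {
        seen = 1;
        break;
      }
    }
    if (!seen) {
      out[kept++] = in[i];
    }
  }
  return kept;
}
```

Running cse_keep_first over FOO_ORDER leaves two loads with no metadata, while BAR_ORDER leaves a load of x tagged with the scope and a load of y tagged noalias, which matches the shape of the @foo and @bar output.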

I'm not sure what the C spec says (if anything) about this, but for us we'd
like the behaviour to be defined.

Does anyone have any opinions on solving this before I drop a patch?
Should we perhaps make the choice (prefer aliasing over non-aliasing, or
vice versa) controllable via a hidden CSE option?
And what should we do when the two loads carry conflicting sets of noalias
information?
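
To make the last question concrete, here is one possible conservative merge, sketched as a toy model rather than as a proposal for the actual patch. AliasInfo and merge_conservative are hypothetical names, bitmasks stand in for scope lists, and 0 means "no metadata". The rule assumes my reading of the scoped noalias semantics: a smaller !noalias list and a larger !alias.scope list can only make alias analysis prove less, and a load with no !alias.scope at all must stay that way.

```c
/* Hypothetical toy type: bitmasks stand in for metadata scope lists,
   and a mask of 0 means "no metadata attached". */
typedef struct {
  unsigned alias_scope;  /* stands in for !alias.scope */
  unsigned noalias;      /* stands in for !noalias */
} AliasInfo;

/* Merge the metadata of two loads that are being combined into one.
   - noalias: keep only scopes both loads agree on (intersection).
   - alias_scope: if either load has none, the result has none;
     otherwise keep every scope either load belongs to (union). */
AliasInfo merge_conservative(AliasInfo a, AliasInfo b) {
  AliasInfo merged;
  if (a.alias_scope != 0u && b.alias_scope != 0u) {
    merged.alias_scope = a.alias_scope | b.alias_scope;
  } else {
    merged.alias_scope = 0u;
  }
  merged.noalias = a.noalias & b.noalias;
  return merged;
}
```

Under this rule, merging a plain load with a tagged one yields a plain load whichever comes first, so foo and bar would end up with the same (metadata-free) loads. The opposite policy, keeping the tagged load, would instead need an argument for why the extra information is valid for the plain load too.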

Sidenote: I'm aware of the 'full restrict' patch that has been circulated,
but irrespective of whether that lands or not we'd still like to have some
defined behaviour for the above case.

Cheers,
-Neil.
-- 
Neil Henning
Senior Software Engineer Compiler
unity.com