[llvm-dev] Inlining + CSE + restrict pointers == funtimes

Neil Henning via llvm-dev llvm-dev at lists.llvm.org
Wed Jan 22 07:27:24 PST 2020


It's EarlyCSE that is doing the damage in this case:

; Function Attrs: norecurse nounwind readonly uwtable
define dso_local i32 @foo(i32* nocapture readonly %0, i32* nocapture readonly %1) local_unnamed_addr #0 {
  %3 = load i32, i32* %0, align 4, !tbaa !3
  %4 = load i32, i32* %1, align 4, !tbaa !3
  %5 = add nsw i32 %4, %3
  %6 = load i32, i32* %0, align 4, !tbaa !3, !alias.scope !7
  %7 = load i32, i32* %0, align 4, !tbaa !3, !noalias !7
  %8 = add nsw i32 %7, %6
  %9 = load i32, i32* %1, align 4, !tbaa !3, !noalias !7
  %10 = add nsw i32 %8, %9
  %11 = add nsw i32 %5, %10
  ret i32 %11
}
*** IR Dump After Early CSE w/ MemorySSA ***
; Function Attrs: norecurse nounwind readonly uwtable
define dso_local i32 @foo(i32* nocapture readonly %0, i32* nocapture readonly %1) local_unnamed_addr #0 {
  %3 = load i32, i32* %0, align 4, !tbaa !3
  %4 = load i32, i32* %1, align 4, !tbaa !3
  %5 = add nsw i32 %4, %3
  %6 = add nsw i32 %3, %3
  %7 = add nsw i32 %6, %4
  %8 = add nsw i32 %5, %7
  ret i32 %8
}

The problem is that EarlyCSE is not aware of the aliasing metadata at all -
it just sees two loads that look the same and chooses one to die.
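To make the "which metadata survives" question concrete, here is a toy model in C of one conservative policy: when two identical loads are merged, keep only the aliasing facts that both of them carried. This is a sketch under stated assumptions, not LLVM code. ToyLoad and merge_loads are hypothetical names, and representing scope lists as bitmasks is a simplification of the real !alias.scope / !noalias metadata.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a load instruction. Each bit in the masks
 * names one alias scope; this is an assumption made for illustration. */
typedef struct {
    int      ptr_id;      /* which pointer is being loaded from */
    uint32_t alias_scope; /* scopes this load belongs to (0 = no info) */
    uint32_t noalias;     /* scopes this load is asserted not to alias */
} ToyLoad;

/* Merge two loads of the same pointer into one surviving load.
 * Only facts that were true of BOTH loads are kept:
 *  - noalias: intersection, so no new no-alias claim is invented;
 *  - alias_scope: kept only when identical, otherwise dropped to
 *    "no info", which can only make later queries more cautious. */
static ToyLoad merge_loads(ToyLoad kept, ToyLoad dead) {
    assert(kept.ptr_id == dead.ptr_id);
    ToyLoad out = kept;
    out.alias_scope =
        (kept.alias_scope == dead.alias_scope) ? kept.alias_scope : 0u;
    out.noalias = kept.noalias & dead.noalias;
    return out;
}
```

Under this model, merging a load tagged only with !alias.scope !7 and a load tagged only with !noalias !7 (both present on loads of %0 in the dump above) yields a load with no aliasing metadata at all. That matches what foo ends up with, and differs from bar, where the surviving loads keep their metadata.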

Cheers,
-Neil.

On Wed, Jan 22, 2020 at 2:14 PM Finkel, Hal J. <hfinkel at anl.gov> wrote:

> Hi, Neil,
>
> When you say CSE, do you mean EarlyCSE or GVN?
>
> Is the metadata being combined using MDNode::intersect or
> AAMDNodes::intersect?
>
>  -Hal
> On 1/22/20 7:02 AM, Neil Henning via llvm-dev wrote:
>
> So I've been narrowing down a very fun issue in our Burst compiler stack
> with respect to noalias support, and I've managed to basically boil this
> down to the following failure (see https://godbolt.org/z/-mdjPV):
>
> int called(int* __restrict__ a, int* b, int* c) {
> return *a + *b + *c;
> }
>
> int foo(int * x, int * y) {
> return *x + *y + called(x, x, y);
> }
>
> int bar(int * x, int * y) {
> return called(x, x, y) + *x + *y;
> }
>
> Which becomes:
>
> define dso_local i32 @called(i32* noalias nocapture readonly %0, i32* nocapture readonly %1, i32* nocapture readonly %2) local_unnamed_addr #0 !dbg !7 {
> %4 = load i32, i32* %0, align 4, !dbg !19, !tbaa !20
> %5 = load i32, i32* %1, align 4, !dbg !24, !tbaa !20
> %6 = add nsw i32 %5, %4, !dbg !25
> %7 = load i32, i32* %2, align 4, !dbg !26, !tbaa !20
> %8 = add nsw i32 %6, %7, !dbg !27
> ret i32 %8, !dbg !28
> }
>
> define dso_local i32 @foo(i32* nocapture readonly %0, i32* nocapture readonly %1) local_unnamed_addr #0 !dbg !29 {
> %3 = load i32, i32* %0, align 4, !dbg !36, !tbaa !20
> %4 = load i32, i32* %1, align 4, !dbg !37, !tbaa !20
> %5 = add i32 %4, %3
> %6 = shl i32 %5, 1
> %7 = add i32 %6, %3, !dbg !38
> ret i32 %7, !dbg !39
> }
>
> define dso_local i32 @bar(i32* nocapture readonly %0, i32* nocapture readonly %1) local_unnamed_addr #0 !dbg !40 {
> %3 = load i32, i32* %0, align 4, !dbg !47, !tbaa !20, !alias.scope !48
> %4 = load i32, i32* %1, align 4, !dbg !51, !tbaa !20, !noalias !48
> %5 = add i32 %4, %3
> %6 = shl i32 %5, 1
> %7 = add i32 %6, %3, !dbg !52
> ret i32 %7, !dbg !53
> }
>
> The issue is that CSE just looks at two loads from the same location and
> goes 'hey I can combine them!' but it doesn't take into account whether
> either load has extra aliasing information or not. So in foo it has turned
> a noalias pointer into an aliasing one, but in bar it has turned an
> aliasing pointer into a non-aliasing one.
>
> I'm not sure what the C spec says (if anything) about this, but for us
> we'd like the behaviour to be defined.
>
> Does anyone have any opinions on solving this before I drop a patch?
> Should we perhaps make the behaviour to choose (alias over non-alias, or
> vice versa) controllable via a hidden CSE option?
> What should we do in the presence of two conflicting sets of noalias
> information?
>
> Sidenote: I'm aware of the 'full restrict' patch that has been circulated,
> but irrespective of whether that lands or not we'd still like to have some
> defined behaviour for the above case.
>
> Cheers,
> -Neil.
> --
> Neil Henning
> Senior Software Engineer Compiler
> unity.com
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
>
>

-- 
Neil Henning
Senior Software Engineer Compiler
unity.com