[PATCH] D47854: [LangRef] Clarify semantics of load metadata.

Thu Jun 7 07:14:02 PDT 2018

hfinkel added a comment.

In https://reviews.llvm.org/D47854#1124963, @nlopes wrote:

> I just got a scary ah-ah moment..
>  Take this code:
>
>   int f(int &p, int n, int a) {
>       int r = 0;
>       for (int i = 0; i < n; ++i) {
>         if (a) {
>           r = r * p + 3;
>         } else {
>           ++a;
>         }
>       }
>       return r;
>   }
>
>
> This code doesn't load p if `f` is called with `a = false` and `n = 1` (i.e., only 1 iteration of the loop and goes to the else branch).
>
> clang -O1 (not -O2 to avoid hard-to-read-unrolling):
>
>   define i32 @f(i32* dereferenceable(4), i32, i32) {
>     %4 = icmp sgt i32 %1, 0
>     br i1 %4, label %5, label %7
>  
>   ; <label>:5:
>     %6 = load i32, i32* %0, align 4   ; <---
>     br label %9
>  
>   ; <label>:7:
>     %8 = phi i32 [ 0, %3 ], [ %17, %9 ]
>     ret i32 %8
>  
>   ; <label>:9:
>     %10 = phi i32 [ 0, %5 ], [ %18, %9 ]
>     %11 = phi i32 [ 0, %5 ], [ %17, %9 ]
>     %12 = phi i32 [ %2, %5 ], [ %16, %9 ]
>     %13 = icmp eq i32 %12, 0
>     %14 = mul nsw i32 %6, %11
>     %15 = add nsw i32 %14, 3
>     %16 = select i1 %13, i32 1, i32 %12
>     %17 = select i1 %13, i32 %11, i32 %15
>     %18 = add nuw nsw i32 %10, 1
>     %19 = icmp eq i32 %18, %1
>     br i1 %19, label %7, label %9
>   }
>
>
> https://godbolt.org/g/SQh8EJ
>
> The load was moved to the loop preheader, but it's executed irrespective of `a`.
>  Hence if p is not 4-byte aligned, then LLVM introduce undefined behavior. Being dereferenceable doesn't imply it's so with a 4-byte alignment load.
>
> This implies we need to remove the !align attribute (as well as all the other) when hoisting loads. Lowering 1-byte loads is not particularly efficient..
>  We can't really define load !align as returning poison or anything other than UB since some chips trap with unaligned memory accesses. So this is a real bug we need to fix for those folks using SPARC & friends.

Yes, although hopefully align(4) would have also been specified on the function argument. As the reference must have been bound to a valid object, from C++ semantics, we should know that it's properly aligned. Do we do that now?

Repository:
  rL LLVM

https://reviews.llvm.org/D47854