[PATCH] D9403: llvm.noalias - Clang CodeGen for local restrict-qualified pointers

hfinkel at anl.gov hfinkel at anl.gov
Thu Jul 16 21:42:26 PDT 2015


hfinkel added inline comments.

================
Comment at: lib/CodeGen/CGExpr.cpp:1286
@@ -1283,1 +1285,3 @@
+    Value = Builder.CreateNoAlias(Value, NAI->second);
+
   if (Ty->isAtomicType() ||
----------------
hfinkel wrote:
> hfinkel wrote:
> > rjmccall wrote:
> > > hfinkel wrote:
> > > > rjmccall wrote:
> > > > > This is very strange.  Why do the aliasing properties of an address cause metadata to be attached to the value stored into it?  Shouldn't the metadata be on the address (in which case, perhaps it should have been attached to the load from the restrict pointer?) or the store?
> > > > > 
> > > > > Also, please find some way to avoid doing a DenseMap lookup on every single store.  I would guess that the address only has an entry in that table if it was restrict-qualified; you can probably find a way to pass that information down.
> > > > > This is very strange. Why do the aliasing properties of an address cause metadata to be attached to the value stored into it? Shouldn't the metadata be on the address (in which case, perhaps it should have been attached to the load from the restrict pointer?) or the store?
> > > > 
> > > > 
> > > > To be clear, this is not the metadata, this is the intrinsic; the metadata needs to be attached to all access in the scope. The intrinsic wraps the 'noalias' address. And this is "wrapping" the restrict-qualified address with the llvm.noalias intrinsic before it is stored into the stack space allocated for the local variable.
> > > > 
> > > > > Also, please find some way to avoid doing a DenseMap lookup on every single store. I would guess that the address only has an entry in that table if it was restrict-qualified; you can probably find a way to pass that information down.
> > > > 
> > > > You are right: some (indirect) caller of this function should see a restrict-qualified variable. Maybe this information should end up in the LValue structure?
> > > > 
> > > > Maybe it is worth now thinking about that should be the follow-on patch, because that also might influence the design. This patch handles direct restrict-qualified local variables, but does not handle the case where the variables are inside a structure. Specifically, I mean this:
> > > > 
> > > >   struct V {
> > > >     double * restrict x;
> > > >     double * restrict y;
> > > >     double * restrict z;
> > > >   };
> > > > 
> > > >   void foo() {
> > > >     V v = { getX(), getY(), getZ() };
> > > >     // The stores that compose this initializer need to be wrapped in @llvm.noalias also.
> > > >   }
> > > > 
> > > > So I'd like to end up with a design for this patch that is easy to extend to handle the restrict-qualified-local-structure-member case as well.
> > > "Value" is the value being stored to the address.
> > > 
> > > If I've got code like this:
> > >   double * restrict addr = get_addr();
> > >   double value = get_value();
> > >   *addr = value;
> > > 
> > > you are producing IR like
> > >   %addr = call double* @get_addr()
> > >   %value = call double get_value()
> > >   %value2 = call @llvm.noalias(double %value)
> > >   store %value2, %addr
> > > 
> > > This is what I'm saying does not make any sense, because there's nothing special about %value; the restrictions are on %addr.  I think you probably just have a bug in your patch and meant to use Addr.
> > > 
> > > > You are right: some (indirect) caller of this function should see a restrict-qualified variable. Maybe this information should end up in the LValue structure?
> > > 
> > > Yes, we already store when l-values are volatile, so storing that they're restrict should be easy.
> > No, this is not correct. If I have code like this:
> > 
> >   double * restrict addr = get_addr();
> >   double value = get_value();
> >   *addr = value;
> > 
> > then we don't get code like this:
> > 
> >   %addr = call double* @get_addr()
> >   %value = call double get_value()
> >   %value2 = call @llvm.noalias(double %value)
> >   store %value2, %addr
> > 
> > The code that Clang generates does not look like that: it does not generate an SSA variable for %addr, it generates a local stack location for it, and then loads/stores from that location when accessed. The only address that appear in NoAliasAddrMap are the address of the local allocas for the restrict-qualified variables themselves. It is only the pointer values being stored into the local restrict-qualified variables themselves that get wrapped.
> > 
> > If you look at the regression test, I think it is clear that the code does the right thing. (If that's not clear, I should improve the test.)
> > Yes, we already store when l-values are volatile, so storing that they're restrict should be easy.
> 
> FWIW, LValue already keeps track of whether or not the value is restrict qualified (along with whether or not it is volatile). I'll see if I can predicate the map lookup on that, which may be the best solution (does not increase the size of LValue for an uncommon case, but also does not perform a map lookup for every store).
> 
Before I forget ;) -- Here's the actual IR for this case you've provided above...

  $ cat /tmp/ta.c
  double *get_addr(void);
  double get_value(void);

  void foo() {
    double * restrict addr = get_addr();
    double value = get_value();
    *addr = value;
  }

The optimized IR is:

  $ clang -O3 -S -emit-llvm -o - /tmp/ta.c 
  ; ModuleID = '/tmp/ta.c'
  target datalayout = "E-m:e-i64:64-n32:64"
  target triple = "powerpc64-unknown-linux-gnu"
  
  ; Function Attrs: nounwind
  define void @foo() #0 {
  entry:
    %call = tail call double* @get_addr() #2, !noalias !1
    %0 = tail call double* @llvm.noalias.p0f64(double* %call, metadata !1) #2
    %call1 = tail call double @get_value() #2, !noalias !1
    store double %call1, double* %0, align 8, !tbaa !4, !noalias !1
    ret void
  }

and the original IR is:

  $ clang -O3 -S -emit-llvm -o - /tmp/ta.c -mllvm -disable-llvm-optzns
  ; ModuleID = '/tmp/ta.c'
  target datalayout = "E-m:e-i64:64-n32:64"
  target triple = "powerpc64-unknown-linux-gnu"
  
  ; Function Attrs: nounwind
  define void @foo() #0 {
  entry:
    %addr = alloca double*, align 8
    %value = alloca double, align 8
    %0 = bitcast double** %addr to i8*
    call void @llvm.lifetime.start(i64 8, i8* %0) #1, !noalias !1
    %call = call double* @get_addr(), !noalias !1
    %1 = call double* @llvm.noalias.p0f64(double* %call, metadata !1) #1
    store double* %1, double** %addr, align 8, !tbaa !4, !noalias !1
    %2 = bitcast double* %value to i8*
    call void @llvm.lifetime.start(i64 8, i8* %2) #1, !noalias !1
    %call1 = call double @get_value(), !noalias !1
    store double %call1, double* %value, align 8, !tbaa !8, !noalias !1
    %3 = load double, double* %value, align 8, !tbaa !8, !noalias !1
    %4 = load double*, double** %addr, align 8, !tbaa !4, !noalias !1
    store double %3, double* %4, align 8, !tbaa !8, !noalias !1
    %5 = bitcast double* %value to i8*
    call void @llvm.lifetime.end(i64 8, i8* %5) #1
    %6 = bitcast double** %addr to i8*
    call void @llvm.lifetime.end(i64 8, i8* %6) #1
    ret void
  }

Thanks again!



http://reviews.llvm.org/D9403







More information about the cfe-commits mailing list