[PATCH] D27034: [AliasAnalysis] Teach BasicAA about memcpy.

bryant via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Nov 26 09:07:08 PST 2016


bryant added inline comments.


================
Comment at: lib/Analysis/BasicAliasAnalysis.cpp:773
   }
 
+  if (auto *Inst = dyn_cast<MemCpyInst>(CS.getInstruction())) {
----------------
hfinkel wrote:
> bryant wrote:
> > hfinkel wrote:
> > > bryant wrote:
> > > > reames wrote:
> > > > > This entire change should be unnecessary.  We have readonly and writeonly attributes with these semantics and the intrinsic should already be declared with them.  
> > > > > 
> > > > > Reading further, it looks like you're change is mainly aimed at the no alias property right?  If so, take a look at the noalias parameter attribute.  It might be we've forgotten to tag the intrinsic correctly.
> > > > > This entire change should be unnecessary. We have readonly and writeonly attributes with these semantics and the intrinsic should already be declared with them.
> > > > 
> > > > Including write-/readonly inside memcpy's declaration with my initial example doesn't work, at least not with `-basicaa`:
> > > > 
> > > >     declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture writeonly, i8* nocapture readonly, i64, i32, i1) nounwind
> > > > 
> > > >     define void @source_clobber(i8* %a, i8* %b) {
> > > >     ; CHECK-LABEL: @source_clobber(
> > > >     ; CHECK-NEXT:  ; 1 = MemoryDef(liveOnEntry)
> > > >     ; CHECK-NEXT:    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %a, i8* %b, i64 128, i32 1, i1 false)
> > > >     ; CHECK-NEXT:  ; MemoryUse(liveOnEntry)
> > > >     ; CHECK-NEXT:    [[X:%.*]] = load i8, i8* %b
> > > >     ; CHECK-NEXT:    ret void
> > > >     ;
> > > >       call void @llvm.memcpy.p0i8.p0i8.i64(i8* %a, i8* %b, i64 128, i32 1, i1 false)
> > > >       %x = load i8, i8* %b
> > > >       ret void
> > > >     }
> > > > 
> > > >     09:57:21 ~> opt -disable-output -basicaa -aa-eval -print-all-alias-modref-info memcpy-test.ll
> > > >     Function: source_clobber: 2 pointers, 1 call sites
> > > >       MayAlias:     i8* %a, i8* %b
> > > >       Both ModRef:  Ptr: i8* %a     <->  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %a, i8* %b, i64 128, i32 1, i1 false)
> > > >       Both ModRef:  Ptr: i8* %b     <->  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %a, i8* %b, i64 128, i32 1, i1 false)
> > > > 
> > > > > Reading further, it looks like you're change is mainly aimed at the no alias property right? If so, take a look at the noalias parameter attribute. It might be we've forgotten to tag the intrinsic correctly.
> > > > 
> > > > It's not always the case that parameters can satisfy the `noalias` attribute. Suppose, for instance, that the previous function was instead
> > > > 
> > > >     define void @f(i8* %a, i8* %b, i8* %c, i8* %d)
> > > > 
> > > > where a and b could alias c and d, respectively. Then neither a nor b fits noalias requirement. Yet, even in such a case, the LangRef constraint on memcpy should be sufficient to deduce that a no-alias b.
> > > > 
> > > I don't understand what your comment means. How are %c and %d accessed?
> > > 
> > > Just in case it is not clear, noalias's semantics are similar in this regard to C's restrict semantics. noalias means that memory accessed through the attributed pointer, or any pointers based on that pointer, are not accessed through a pointer not based on the attributed pointer. In C, memcpy certainly has restrict on its pointer arguments, and we could add noalias on ours as well.
> > > I don't understand what your comment means. How are %c and %d accessed?
> > 
> > I took reames' original suggestion to mean that in my original example, `source_clobber`'s parameters %a and %b should have been tagged `noalias`, in which case the unpatched BasicAA indeed deduces the correct mod/ref info. 
> > 
> > > Just in case it is not clear, noalias's semantics are similar in this regard to C's restrict semantics. noalias means that memory accessed through the attributed pointer, or any pointers based on that pointer, are not accessed through a pointer not based on the attributed pointer.
> > 
> > So this implies that a parameter `i8* %a noalias` would no-alias every other pointer parameter, right? Because that was my point: Calling memcpy with %a and %b only implies that %a no-alias %b, whereas `%a noalias` implies a much stronger condition. Suppose, instead of `source_clobber`, we had the function:
> > 
> > ```
> > define void @f(i8* %a, i8* %b, i8* %c, i8* %d) {
> >   call void @llvm.memcpy.p0i8.p0i8.i64(i8* %a, i8* %b, i64 128, i32 1, i1 false)
> >   call void @llvm.memcpy.p0i8.p0i8.i64(i8* %c, i8* %d, i64 128, i32 1, i1 false)
> >   %x = load i8, i8* %b
> > }
> > ```
> > 
> > It's totally okay for %a to may-alias %c, and %b to may-alias %d, right? An `i8* %a noalias` would forbid this possibility.
> > 
> > > In C, memcpy certainly has restrict on its pointer arguments, and we could add noalias on ours as well.
> > 
> > Do you mean tagging the memcpy IR prototype's parameters `noalias`? I've tried that and it doesn't seem to work, either. Here's a quick patch that adds `Attribute::NoAlias` to the memcpy intrinsic:
> > 
> > ```diff
> > diff --git a/include/llvm/IR/Intrinsics.td b/include/llvm/IR/Intrinsics.td
> > index be0ae3e..4a240ae 100644
> > --- a/include/llvm/IR/Intrinsics.td
> > +++ b/include/llvm/IR/Intrinsics.td
> > @@ -78,6 +78,10 @@ class ReadNone<int argNo> : IntrinsicProperty {
> >    int ArgNo = argNo;
> >  }
> >  
> > +class NoAlias<int argNo> : IntrinsicProperty {
> > +  int ArgNo = argNo;
> > +}
> > +
> >  def IntrNoReturn : IntrinsicProperty;
> >  
> >  // IntrNoduplicate - Calls to this intrinsic cannot be duplicated.
> > @@ -368,7 +372,8 @@ def int_memcpy  : Intrinsic<[],
> >                               [llvm_anyptr_ty, llvm_anyptr_ty, llvm_anyint_ty,
> >                                llvm_i32_ty, llvm_i1_ty],
> >                              [IntrArgMemOnly, NoCapture<0>, NoCapture<1>,
> > -                             WriteOnly<0>, ReadOnly<1>]>;
> > +                             WriteOnly<0>, ReadOnly<1>, NoAlias<0>,
> > +                             NoAlias<1>]>;
> >  def int_memmove : Intrinsic<[],
> >                              [llvm_anyptr_ty, llvm_anyptr_ty, llvm_anyint_ty,
> >                               llvm_i32_ty, llvm_i1_ty],
> > diff --git a/utils/TableGen/CodeGenIntrinsics.h b/utils/TableGen/CodeGenIntrinsics.h
> > index ea3ec67..239e37d 100644
> > --- a/utils/TableGen/CodeGenIntrinsics.h
> > +++ b/utils/TableGen/CodeGenIntrinsics.h
> > @@ -108,7 +108,7 @@ struct CodeGenIntrinsic {
> >    /// True if the intrinsic is marked as convergent.
> >    bool isConvergent;
> >  
> > -  enum ArgAttribute { NoCapture, Returned, ReadOnly, WriteOnly, ReadNone };
> > +  enum ArgAttribute { NoCapture, Returned, ReadOnly, WriteOnly, ReadNone, NoAlias };
> >    std::vector<std::pair<unsigned, ArgAttribute>> ArgumentAttributes;
> >  
> >    CodeGenIntrinsic(Record *R);
> > diff --git a/utils/TableGen/CodeGenTarget.cpp b/utils/TableGen/CodeGenTarget.cpp
> > index 13c8c12..431611e 100644
> > --- a/utils/TableGen/CodeGenTarget.cpp
> > +++ b/utils/TableGen/CodeGenTarget.cpp
> > @@ -610,6 +610,9 @@ CodeGenIntrinsic::CodeGenIntrinsic(Record *R) {
> >      } else if (Property->isSubClassOf("ReadNone")) {
> >        unsigned ArgNo = Property->getValueAsInt("ArgNo");
> >        ArgumentAttributes.push_back(std::make_pair(ArgNo, ReadNone));
> > +    } else if (Property->isSubClassOf("NoAlias")) {
> > +      unsigned ArgNo = Property->getValueAsInt("ArgNo");
> > +      ArgumentAttributes.push_back(std::make_pair(ArgNo, NoAlias));
> >      } else
> >        llvm_unreachable("Unknown property!");
> >    }
> > diff --git a/utils/TableGen/IntrinsicEmitter.cpp b/utils/TableGen/IntrinsicEmitter.cpp
> > index 60cd2da..5fe75ad 100644
> > --- a/utils/TableGen/IntrinsicEmitter.cpp
> > +++ b/utils/TableGen/IntrinsicEmitter.cpp
> > @@ -587,6 +587,12 @@ void IntrinsicEmitter::EmitAttributes(const CodeGenIntrinsicTable &Ints,
> >              OS << "Attribute::ReadNone";
> >              addComma = true;
> >              break;
> > +          case CodeGenIntrinsic::NoAlias:
> > +            if (addComma)
> > +              OS << ",";
> > +            OS << "Attribute::NoAlias";
> > +            addComma = true;
> > +            break;
> >            }
> >  
> >            ++ai;
> > ```
> > 
> > And re-running `source_clobber` through opt:
> > 
> > ```
> > 06:55:57 ~/3rd/llvm> opt -basicaa -aa-eval -print-all-alias-modref-info -S test.ll
> > Function: source_clobber: 2 pointers, 1 call sites
> >   MayAlias:     i8* %a, i8* %b
> >   Both ModRef:  Ptr: i8* %a     <->  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %a, i8* %b, i64 128, i32 1, i1 false)
> >   Both ModRef:  Ptr: i8* %b     <->  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %a, i8* %b, i64 128, i32 1, i1 false)
> > ; ModuleID = 'test.ll'
> > source_filename = "test.ll"
> > 
> > ; Function Attrs: argmemonly nounwind
> > declare void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nocapture writeonly, i8* noalias nocapture readonly, i64, i32, i1) #0
> > 
> > define void @source_clobber(i8* %a, i8* %b) {
> >   call void @llvm.memcpy.p0i8.p0i8.i64(i8* %a, i8* %b, i64 128, i32 1, i1 false)
> >   %x = load i8, i8* %b
> >   ret void
> > }
> > 
> > attributes #0 = { argmemonly nounwind }
> > ```
> > 
> > Notice that the intrinsic prototype now contains `noalias`, yet the `-aa-eval` results are still over-conservative.
> > Because that was my point: Calling memcpy with %a and %b only implies that %a no-alias %b, whereas %a noalias implies a much stronger condition. 
> 
> No, it is actually a weaker condition. That's the problem. noalias does not say anything about the aliasing properties of the function arguments themselves. It only has a meaning if you know how the pointers are used within the function. So, yes, we could add noalias to memcpy. No, we'd not be able to optimize the caller of memcpy based only on that information. In the case of memcpy, we know how the memory is accessed so we can say more about the aliasing properties.
> No, we'd not be able to optimize the caller of memcpy based only on that information. In the case of memcpy, we know how the memory is accessed so we can say more about the aliasing properties.

Why not? Why can't we assume that two pointers don't alias as soon as they're fed into memcpy, and that if they do, it's undefined behavior?

>No, it is actually a weaker condition. That's the problem. noalias does not say anything about the aliasing properties of the function arguments themselves. It only has a meaning if you know how the pointers are used within the function.

LangRef says that "objects accessed via pointer values based on the argument or return value are not also accessed, **during the execution of the function**, via pointer values not based on the argument or return value." "During execution of the function" tells me that regardless of how the function uses a noalias pointer argument, it's not allowed to alias any other pointer argument. Correct?

And it's UB if I pass in aliased pointers to **any** function whose arguments are noalias because "the caller shares the responsibility with the callee for ensuring that these requirements are met."


Repository:
  rL LLVM

https://reviews.llvm.org/D27034





More information about the llvm-commits mailing list