[PATCH] Allow loop vectorization with llvm.lifetime calls

Arnold Schwaighofer aschwaighofer at apple.com
Wed Jul 31 08:54:30 PDT 2013


Hi Marc,

Thanks for looking into this!


On Jul 31, 2013, at 9:58 AM, Jessome, Marc <marc.jessome at intel.com> wrote:

--- lib/Transforms/Vectorize/LoopVectorize.cpp	(revision 187444)
+++ lib/Transforms/Vectorize/LoopVectorize.cpp	(working copy)
@@ -2488,15 +2490,27 @@
       CallInst *CI = cast<CallInst>(it);
       Intrinsic::ID ID = getIntrinsicIDForCall(CI, TLI);
       assert(ID && "Not an intrinsic call!");
-      for (unsigned Part = 0; Part < UF; ++Part) {
-        SmallVector<Value*, 4> Args;
-        for (unsigned i = 0, ie = CI->getNumArgOperands(); i != ie; ++i) {
-          VectorParts &Arg = getVectorValue(CI->getArgOperand(i));
-          Args.push_back(Arg[Part]);
+      switch (ID) {
+      case Intrinsic::lifetime_end:
+        Builder.CreateLifetimeEnd(
+            CI->getArgOperand(1), dyn_cast<ConstantInt>(CI->getArgOperand(0)));
+        break;
+      case Intrinsic::lifetime_start:
+        Builder.CreateLifetimeStart(
+            CI->getArgOperand(1), dyn_cast<ConstantInt>(CI->getArgOperand(0)));
+        break;


I don’t think we can assume that the value returned by CI->getArgOperand(...) will always be defined outside the loop, although in sane code one should expect so (alloca and all). What if, for some weird reason, the bitcast was sunk into the loop:


+entry:
+  %arr = alloca [1024 x i32], align 16
+  %0 = bitcast [1024 x i32]* %arr to i8*
+  call void @llvm.lifetime.start(i64 4096, i8* %0) #1
+  br label %for.body
+
+for.body:
+  %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
+  %1 = bitcast [1024 x i32]* %arr to i8*
+  call void @llvm.lifetime.end(i64 4096, i8* %1) #1

To be on the safe side, I think we have to use “getVectorValue” and extract the first element out of the returned value. 
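
A toy model of that suggestion, in plain C++ rather than LLVM code: Vec, VectorParts, broadcast and firstLane are made-up stand-ins, and VF = 4 is an arbitrary choice. It only shows why lane 0 is enough when the operand is uniform across iterations. In the vectorizer itself the equivalent is presumably something along the lines of Builder.CreateExtractElement(getVectorValue(Op)[0], Builder.getInt32(0)), but that is an untested assumption.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not LLVM types.
constexpr unsigned VF = 4;                  // lanes per widened value (assumed)
using Vec = std::array<const void *, VF>;   // one widened value
using VectorParts = std::vector<Vec>;       // one widened value per unroll part

// Models widening a pointer that is the same in every iteration:
// each lane of each unroll part holds the same pointer.
VectorParts broadcast(const void *Ptr, unsigned UF) {
  Vec V;
  V.fill(Ptr);
  return VectorParts(UF, V);
}

// Models "extract the first element": lane 0 of unroll part 0.
const void *firstLane(const VectorParts &Parts) {
  assert(!Parts.empty() && "expected at least one unroll part");
  return Parts.front().front();
}
```

Since every lane holds the same pointer, the scalar lifetime call can take lane 0 and never refers to a value that only exists in the original scalar loop body.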

As to the cost of the “lifetime_start” and “lifetime_end” intrinsics, I think we should just teach “BasicTTI.getIntrinsicInstrCost” that those have a cost of 0.

  case Intrinsic::fma:     ISD = ISD::FMA;    break;
  case Intrinsic::fmuladd: ISD = ISD::FMA;    break; // FIXME: mul + add?
+ case Intrinsic::lifetime_start:
+ case Intrinsic::lifetime_end:   return 0;
  }
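
As a standalone sketch of the same idea (a toy cost hook, not the real BasicTTI code; IntrinsicID and intrinsicCost are hypothetical names, and the cost of 1 for the other intrinsics is a placeholder):

```cpp
// Hypothetical stand-in for Intrinsic::ID, for illustration only.
enum class IntrinsicID { fma, fmuladd, lifetime_start, lifetime_end };

// Toy cost hook: lifetime markers are free, everything else gets a
// placeholder cost of 1 (the real code derives it from the ISD opcode).
unsigned intrinsicCost(IntrinsicID ID) {
  switch (ID) {
  case IntrinsicID::lifetime_start:
  case IntrinsicID::lifetime_end:
    return 0; // markers only, no instructions are emitted for them
  case IntrinsicID::fma:
  case IntrinsicID::fmuladd:
    return 1; // placeholder, not a real cost
  }
  return 1; // unreachable for the IDs above; keeps compilers quiet
}
```

Handling the two IDs in the cost hook means callers such as the loop vectorizer's cost model need no special case of their own.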


@@ -4613,11 +4627,23 @@
     CallInst *CI = cast<CallInst>(I);
     Intrinsic::ID ID = getIntrinsicIDForCall(CI, TLI);
     assert(ID && "Not an intrinsic call!");
-    Type *RetTy = ToVectorTy(CI->getType(), VF);
-    SmallVector<Type*, 4> Tys;
-    for (unsigned i = 0, ie = CI->getNumArgOperands(); i != ie; ++i)
-      Tys.push_back(ToVectorTy(CI->getArgOperand(i)->getType(), VF));
-    return TTI.getIntrinsicInstrCost(ID, RetTy, Tys);
+    switch (ID) {
+    case Intrinsic::lifetime_start:
+    case Intrinsic::lifetime_end: {
+      Type *RetTy = CI->getType();
+      SmallVector<Type*, 4> Tys;
+      for (unsigned i = 0, ie = CI->getNumArgOperands(); i != ie; ++i)
+        Tys.push_back(CI->getArgOperand(i)->getType());
+      return TTI.getIntrinsicCost(ID, RetTy, Tys);

With BasicTTI returning 0 for these intrinsics, this special case is not needed: just keep the existing “TTI.getIntrinsicInstrCost(ID, RetTy, Tys)” call above.




