[PATCH] D16664: [CUDA] Generate CUDA's printf alloca in its function's entry block.

Reid Kleckner via cfe-commits cfe-commits at lists.llvm.org
Thu Jan 28 11:50:15 PST 2016


rnk added inline comments.

================
Comment at: lib/CodeGen/CGCUDABuiltin.cpp:105-108
@@ -104,2 +104,6 @@
   } else {
-    BufferPtr = Builder.Insert(new llvm::AllocaInst(
+    // Insert our alloca not into the current BB, but into the function's entry
+    // block.  This is important because nvvm doesn't support alloca -- if we
+    // put the alloca anywhere else, llvm may eventually output
+    // stacksave/stackrestore intrinsics, which cause our nvvm backend to choke.
+    auto *Alloca = new llvm::AllocaInst(
----------------
The fact that allocas for local variables should always go in the entry block is pretty widespread cultural knowledge in LLVM and clang. Most readers aren't going to need this comment, unless you expect that people working on CUDA won't have that background. Plus, if you use CreateTempAlloca, there won't be any question about which insert point should be used.

================
Comment at: lib/CodeGen/CGCUDABuiltin.cpp:109
@@ -106,1 +108,3 @@
+    // stacksave/stackrestore intrinsics, which cause our nvvm backend to choke.
+    auto *Alloca = new llvm::AllocaInst(
         llvm::Type::getInt8Ty(Ctx), llvm::ConstantInt::get(Int32Ty, BufSize),
----------------
You can still use CreateTempAlloca by making an `[N x i8]` LLVM array type. You'll have to use CreateStructGEP below for forming GEPs. Overall I think that'd be nicer, since you don't need to worry about insertion at all.
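
As a rough illustration of the point in both comments, here is a toy model of a helper that always places allocas at the top of the entry block, whatever the builder's current insertion point is. This is only a sketch: `Instr`, `BasicBlock`, `Function`, `ToyBuilder`, and `buildExample` are hypothetical stand-ins written for this example, not LLVM or clang classes, and the instruction strings are pseudo-IR. The real CreateTempAlloca lives in clang's CodeGenFunction and is not reproduced here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-ins for IR concepts. None of these are real LLVM classes.
struct Instr {
  std::string Text;
};

struct BasicBlock {
  std::string Name;
  std::vector<Instr> Instrs;
};

struct Function {
  // Assumption of this sketch: Blocks[0] is the entry block.
  std::vector<BasicBlock> Blocks;
  BasicBlock &entry() { return Blocks.front(); }
};

// Hypothetical builder with a "current" insertion block.
class ToyBuilder {
public:
  ToyBuilder(Function &Fn, std::size_t CurrentBlock)
      : F(Fn), Cur(CurrentBlock) {}

  // Appends to whichever block the builder currently points at.
  void insert(const std::string &Text) {
    F.Blocks[Cur].Instrs.push_back(Instr{Text});
  }

  // Ignores the current insertion point: the alloca goes into the entry
  // block, after any allocas this helper created earlier and before
  // everything else, so callers never have to pick an insertion point.
  void createTempAlloca(const std::string &Name, std::size_t NumBytes) {
    std::vector<Instr> &Entry = F.entry().Instrs;
    Instr Alloca{"%" + Name + " = alloca [" + std::to_string(NumBytes) +
                 " x i8]"};
    Entry.insert(Entry.begin() + static_cast<std::ptrdiff_t>(NumEntryAllocas),
                 Alloca);
    ++NumEntryAllocas;
  }

private:
  Function &F;
  std::size_t Cur;
  std::size_t NumEntryAllocas = 0;
};

// Builds a two-block function and emits a printf-style call from the second
// block. The buffer is requested while the builder points at "loop.body",
// yet the alloca still lands in "entry".
Function buildExample() {
  Function F;
  F.Blocks.push_back(BasicBlock{"entry", {Instr{"br label %loop.body"}}});
  F.Blocks.push_back(BasicBlock{"loop.body", {}});

  ToyBuilder B(F, /*CurrentBlock=*/1);
  B.createTempAlloca("buf", 16);
  B.insert("call @vprintf(%fmt, %buf)");
  return F;
}
```

Calling `buildExample()` leaves the alloca as the first instruction of `entry` and only the call in `loop.body`, which is the property the review asks for: the caller never touches the insertion point.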


http://reviews.llvm.org/D16664




