[PATCH] D16664: [CUDA] Generate CUDA's printf alloca in its function's entry block.

Thu Jan 28 14:32:17 PST 2016

jlebar marked 3 inline comments as done.
jlebar added a comment.

Thank you for the reviews.

Please have another look; I switched to using a struct proper.  It's a lot cleaner!  We're now assuming that the struct is aligned in the same way as vprintf wants, but if anything I expect this new code is more likely to match what it wants.


================
Comment at: lib/CodeGen/CGCUDABuiltin.cpp:105-108
@@ -104,2 +104,6 @@
   } else {
-    BufferPtr = Builder.Insert(new llvm::AllocaInst(
+    // Insert our alloca not into the current BB, but into the function's entry
+    // block.  This is important because nvvm doesn't support alloca -- if we
+    // put the alloca anywhere else, llvm may eventually output
+    // stacksave/stackrestore intrinsics, which cause our nvvm backend to choke.
+    auto *Alloca = new llvm::AllocaInst(
----------------
rnk wrote:
> The fact that allocas for local variables should always go in the entry block is pretty widespread cultural knowledge in LLVM and clang. Most readers aren't going to need this comment, unless you expect that people working on CUDA won't have that background. Plus, if you use CreateTempAlloca, there won't be any question about which insert point should be used.
OK, yeah, I also don't like comments that explain something that everyone other than the author knows.  Thanks.


http://reviews.llvm.org/D16664