[PATCH] D150427: [AMDGPU] Non hostcall printf support for HIP

Matt Arsenault via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri May 12 10:42:35 PDT 2023


arsenm added inline comments.


================
Comment at: clang/include/clang/Basic/LangOptions.def:274
 LANGOPT(OffloadingNewDriver, 1, 0, "use the new driver for generating offloading code.")
+ENUM_LANGOPT(AMDGPUPrintfKindVal, AMDGPUPrintfKind, 2, AMDGPUPrintfKind::Buffered, "printf lowering scheme to be used, hostcall or buffer based")
 
----------------
yaxunl wrote:
> This should be a target option like https://reviews.llvm.org/D91546 instead of a language option since it is target specific.
Should be -m option 


================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:4673
+ 
+  // unconditionally claim the pritnf option now to avoid unused diagnostic.
+  // TODO: OpenCL targets will should use this option to switch between
----------------
Capitalize, typo 'pritnf'


================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:4674
+  // unconditionally claim the pritnf option now to avoid unused diagnostic.
+  // TODO: OpenCL targets will should use this option to switch between
+  // hostcall and buffered printf schemes.
----------------
Typo 'will should'

Don't really understand the TODO, this should trigger for OpenCL as it is


================
Comment at: clang/test/CodeGenHIP/printf_nonhostcall.cpp:5
+// RUN:   -o - %s | FileCheck --enable-var-scope %s
+
+#define __device__ __attribute__((device))
----------------
Do we need a test with -fno-builtin?


================
Comment at: clang/test/CodeGenHIP/printf_nonhostcall.cpp:69
+  const char *s = "hello world";
+  return printf("%.*f %*.*s %p\n", 8, 3.14159, 8, 4, s, s);
+}
----------------
Doesn't cover the full range of printable types. need some other non-string pointers and different address spaces, some FP promotions, 16 and 64 bit integers 


================
Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:259-261
+      // This is a tradeoff. we might end up taking more compile
+      // time to calculate string contents if possible, but the generated
+      // code would be better runtime wise.
----------------
Don't understand the point of the comment, I would assume anything involving analysis of a constant string has ignorable compile time 


================
Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:371-384
+        Function *TheFn = Intrinsic::getDeclaration(
+            Builder.GetInsertBlock()->getModule(), Intrinsic::memcpy, Tys);
+        SmallVector<Value *, 1> BuffOffset;
+
+        Value *FnArgs[] = {
+            PtrToStore, Args[i], val,
+            ConstantInt::get(Type::getInt1Ty(Builder.getContext()), false)};
----------------
Builder.CreateMemCpy


================
Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:402
+      SmallVector<Value *, 1> BuffOffset;
+      uint offsetVal = toStore->getType()->getIntegerBitWidth() == 32 ? 4 : 8;
+      BuffOffset.push_back(ConstantInt::get(Builder.getInt32Ty(), offsetVal));
----------------
Just use the type store size


================
Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:403
+      uint offsetVal = toStore->getType()->getIntegerBitWidth() == 32 ? 4 : 8;
+      BuffOffset.push_back(ConstantInt::get(Builder.getInt32Ty(), offsetVal));
+
----------------
You don't need a SmallVector to push back a single entry 


================
Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:428
+    SmallVector<StringData, 8> StringContents;
+    llvm::Module *M = Builder.GetInsertBlock()->getModule();
+    LLVMContext &Ctx = Builder.getContext();
----------------
Don't need llvm::


================
Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:430
+    LLVMContext &Ctx = Builder.getContext();
+    auto Int1Ty = Builder.getInt1Ty();
+    auto Int8Ty = Builder.getInt8Ty();
----------------
No auto 


================
Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:458
+    auto CreateControlDWord = M->getOrInsertFunction(
+        StringRef("__ockl_create_control_dword"), Builder.getInt32Ty(),
+        Builder.getInt32Ty(), Int1Ty, Int1Ty);
----------------
Do we really need another ockl control variable for this? Why isn't it a parameter? printf=stdout always 


================
Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:473
+    if (!FmtStr.empty()) {
+      llvm::MD5 Hasher;
+      llvm::MD5::MD5Result Hash;
----------------
Don't need llvm::


================
Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:498
+      NamedMDNode *metaD = M->getOrInsertNamedMetadata("llvm.printf.fmts");
+      if(0 == metaD->getNumOperands()) {
+        MDString *fmtStrArray = MDString::get(Ctx, "0:0:deadbeef,\"\"");
----------------
Backwards conditional 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150427/new/

https://reviews.llvm.org/D150427



More information about the cfe-commits mailing list