[PATCH] D150427: [AMDGPU] Non hostcall printf support for HIP
Matt Arsenault via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri May 12 10:42:35 PDT 2023
arsenm added inline comments.
================
Comment at: clang/include/clang/Basic/LangOptions.def:274
LANGOPT(OffloadingNewDriver, 1, 0, "use the new driver for generating offloading code.")
+ENUM_LANGOPT(AMDGPUPrintfKindVal, AMDGPUPrintfKind, 2, AMDGPUPrintfKind::Buffered, "printf lowering scheme to be used, hostcall or buffer based")
----------------
yaxunl wrote:
> This should be a target option like https://reviews.llvm.org/D91546 instead of a language option since it is target specific.
Should be -m option
================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:4673
+
+ // unconditionally claim the pritnf option now to avoid unused diagnostic.
+ // TODO: OpenCL targets will should use this option to switch between
----------------
Capitalize, typo 'pritnf'
================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:4674
+ // unconditionally claim the pritnf option now to avoid unused diagnostic.
+ // TODO: OpenCL targets will should use this option to switch between
+ // hostcall and buffered printf schemes.
----------------
Typo 'will should'
Don't really understand the TODO, this should trigger for OpenCL as it is
================
Comment at: clang/test/CodeGenHIP/printf_nonhostcall.cpp:5
+// RUN: -o - %s | FileCheck --enable-var-scope %s
+
+#define __device__ __attribute__((device))
----------------
Do we need a test with -fno-builtin?
================
Comment at: clang/test/CodeGenHIP/printf_nonhostcall.cpp:69
+ const char *s = "hello world";
+ return printf("%.*f %*.*s %p\n", 8, 3.14159, 8, 4, s, s);
+}
----------------
Doesn't cover the full range of printable types. need some other non-string pointers and different address spaces, some FP promotions, 16 and 64 bit integers
================
Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:259-261
+ // This is a tradeoff. we might end up taking more compile
+ // time to calculate string contents if possible, but the generated
+ // code would be better runtime wise.
----------------
Don't understand the point of the comment, I would assume anything involving analysis of a constant string has ignorable compile time
================
Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:371-384
+ Function *TheFn = Intrinsic::getDeclaration(
+ Builder.GetInsertBlock()->getModule(), Intrinsic::memcpy, Tys);
+ SmallVector<Value *, 1> BuffOffset;
+
+ Value *FnArgs[] = {
+ PtrToStore, Args[i], val,
+ ConstantInt::get(Type::getInt1Ty(Builder.getContext()), false)};
----------------
Builder.CreateMemCpy
================
Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:402
+ SmallVector<Value *, 1> BuffOffset;
+ uint offsetVal = toStore->getType()->getIntegerBitWidth() == 32 ? 4 : 8;
+ BuffOffset.push_back(ConstantInt::get(Builder.getInt32Ty(), offsetVal));
----------------
Just use the type store size
================
Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:403
+ uint offsetVal = toStore->getType()->getIntegerBitWidth() == 32 ? 4 : 8;
+ BuffOffset.push_back(ConstantInt::get(Builder.getInt32Ty(), offsetVal));
+
----------------
You don't need a SmallVector to push back a single entry
================
Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:428
+ SmallVector<StringData, 8> StringContents;
+ llvm::Module *M = Builder.GetInsertBlock()->getModule();
+ LLVMContext &Ctx = Builder.getContext();
----------------
Don't need llvm::
================
Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:430
+ LLVMContext &Ctx = Builder.getContext();
+ auto Int1Ty = Builder.getInt1Ty();
+ auto Int8Ty = Builder.getInt8Ty();
----------------
No auto
================
Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:458
+ auto CreateControlDWord = M->getOrInsertFunction(
+ StringRef("__ockl_create_control_dword"), Builder.getInt32Ty(),
+ Builder.getInt32Ty(), Int1Ty, Int1Ty);
----------------
Do we really need another ockl control variable for this? Why isn't it a parameter? printf=stdout always
================
Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:473
+ if (!FmtStr.empty()) {
+ llvm::MD5 Hasher;
+ llvm::MD5::MD5Result Hash;
----------------
Don't need llvm::
================
Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:498
+ NamedMDNode *metaD = M->getOrInsertNamedMetadata("llvm.printf.fmts");
+ if(0 == metaD->getNumOperands()) {
+ MDString *fmtStrArray = MDString::get(Ctx, "0:0:deadbeef,\"\"");
----------------
Backwards conditional
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D150427/new/
https://reviews.llvm.org/D150427
More information about the cfe-commits
mailing list