[PATCH] D138702: support for HIP non hostcall printf

Matt Arsenault via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Nov 28 07:21:21 PST 2022


arsenm requested changes to this revision.
arsenm added a comment.
Herald added a subscriber: wdng.

I have a few questions. First, why surface this to users? If we really need to, I don't think this is the right flag name/design. A named argument to some kind of printf lowering flag would be better. Second, I thought we were moving towards hostcall printf, not away from it.



================
Comment at: clang/include/clang/Basic/LangOptions.def:275
 LANGOPT(OffloadingNewDriver, 1, 0, "use the new driver for generating offloading code.")
+LANGOPT(DelayedPrintf, 1, 0, "version onf printf function to be used, hostcall or buffer based")
 
----------------
Typo 'onf'


================
Comment at: clang/include/clang/Driver/Options.td:985
+def fdelayed_printf : Flag<["-"], "fdelayed-printf">,
+  HelpText<"Specifies which version of printf is to be used while CodeGen">,
+  Flags<[CC1Option]>,
----------------
Help text reads weird


================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:4655-4667
+
+    if (JA.isDeviceOffloading(Action::OFK_HIP)) {
+      // Device side compilation printf
+      if (Args.getLastArg(options::OPT_fdelayed_printf))
+        CmdArgs.push_back("-fdelayed-printf");
+    }
   }
----------------
Missing clang side tests


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp:135-136
 
-bool AMDGPUPrintfRuntimeBindingImpl::shouldPrintAsStr(char Specifier,
-                                                      Type *OpType) const {
-  if (Specifier != 's')
-    return false;
-  const PointerType *PT = dyn_cast<PointerType>(OpType);
-  if (!PT || PT->getAddressSpace() != AMDGPUAS::CONSTANT_ADDRESS)
-    return false;
-  Type *ElemType = PT->getContainedType(0);
-  if (ElemType->getTypeID() != Type::IntegerTyID)
-    return false;
-  IntegerType *ElemIType = cast<IntegerType>(ElemType);
-  return ElemIType->getBitWidth() == 8;
+// This function is essentially a copy from the file
+// Transforms/Utils/AMDGPUEmitPrintf.cpp
+static Value *getStrlenWithNull(IRBuilder<> &Builder, Value *Str) {
----------------
Why not share these? They should be in the same place. This should also be a separate change from any flag changes


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp:141
+
+  auto CharZero = Builder.getInt8(0);
+  auto One = Builder.getInt64(1);
----------------
auto *


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp:150
+  BasicBlock *Join = nullptr;
+  if (Prev->getTerminator()) {
+    Join = Prev->splitBasicBlock(Builder.GetInsertPoint(), "strlen.join");
----------------
Why would the block be missing a terminator here?


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp:183-185
+  auto Begin = Builder.CreatePtrToInt(Str, Int64Ty);
+  auto End = Builder.CreatePtrToInt(PtrPhi, Int64Ty);
+  auto Len = Builder.CreateSub(End, Begin);
----------------
Really should try hard to avoid introducing ptrtoint. Do you really need to do a pointer subtract?


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp:215
+      StringRef Str;
+      llvm::Value *RealSize;
+      llvm::Value *AlignedSize;
----------------
Don't need llvm::


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp:439
 
-      Type *idPointer = PointerType::get(I32Ty, AMDGPUAS::GLOBAL_ADDRESS);
+      Type *idPointer = PointerType::get(Int32Ty, AMDGPUAS::GLOBAL_ADDRESS);
       Value *id_gep_cast =
----------------
Broken for opaque pointers?


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp:443
 
-      new StoreInst(ConstantInt::get(I32Ty, UniqID), id_gep_cast, Brnch);
+      new StoreInst(ConstantInt::get(Int32Ty, UniqID), id_gep_cast, Brnch);
 
----------------
Why isn't this using the IRBuilder?


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp:491-492
+              if (S[0]) {
+                char *MyNewStr = new char[NSizeStr]();
+                strcpy(MyNewStr, S.str().c_str());
+                int NumInts = NSizeStr / 4;
----------------
Why do we have raw new and strcpy here? We have better string classes


================
Comment at: llvm/test/CodeGen/AMDGPU/hip-delayed-printf.ll:1
+; RUN: opt -mtriple=amdgcn--amdhsa -passes=amdgpu-printf-runtime-binding -S < %s | FileCheck --check-prefix=FUNC --check-prefix=GCN --check-prefix=METADATA %s
+
----------------
Don't need these separate prefixes


================
Comment at: llvm/test/CodeGen/AMDGPU/hip-delayed-printf.ll:3
+
+; FUNC-LABEL: @test_kernel(
+; GCN-LABEL: entry
----------------
Should use generated checks


================
Comment at: llvm/test/CodeGen/AMDGPU/hip-delayed-printf.ll:39
+  store i32 25, ptr %p.ascast, align 4
+  %0 = load i32, ptr %p.ascast, align 4
+  %cmp = icmp sgt i32 %0, 30
----------------
Use named values in tests


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138702/new/

https://reviews.llvm.org/D138702



More information about the cfe-commits mailing list