[PATCH] D150427: [AMDGPU] Non hostcall printf support for HIP

Matt Arsenault via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Jun 1 10:08:21 PDT 2023


arsenm added inline comments.


================
Comment at: clang/include/clang/Driver/Options.td:1030
   NegFlag<SetFalse>>;
+def mprintf_kind_EQ : Joined<["-"], "mprintf-kind=">, Group<m_Group>,
+  HelpText<"Specify the printf lowering scheme (AMDGPU only), allowed values are "
----------------
vikramRH wrote:
> arsenm wrote:
> > I'm a bit worried this is introducing a side-ABI option not captured in the triple or at least module flags 
> I did not understand the concern here. I agree The usage of option itself is not the most robust solution and it would have been better if we could do this without user intervention. but I do not see ways to do that now (atleast not immediately).
This introduces a way to produce binaries that will silently not work correctly depending on the host system. Without somehow tracking this property in the binary, there's no way to diagnose trying to load a binary on an incompatible system.


================
Comment at: clang/test/CodeGenHIP/printf_nonhostcall.cpp:228
+  return printf(s, 10);
+}
----------------
vikramRH wrote:
> arsenm wrote:
> > Need some vector tests, especially 3 x vectors 
> vectors are not yet supported as in the hostcall case since the code currently runs for HIP only. I plan to port the OpenCL lowering also here along with hostcall support . we would need to extend hostcall and this implementation to support vectors as part of the porting activity.
Something still needs to happen for vectors, it can't just crash. The different types of vectors may appear regardless of language 


================
Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:484
+
+      Builder.CreateStore(ConstantInt::get(Builder.getInt64Ty(), Hash.low()),
+                          Ptr);
----------------
Builder.getInt64(Hash.low())


================
Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:211
+struct StringData {
+  std::string Str = "";
+  bool isConst = true;
----------------
vikramRH wrote:
> arsenm wrote:
> > arsenm wrote:
> > > Don't need = ""
> > Can't you just use the raw StringRef out of getConstantStringInfo
> The structure just caches the string calculated from getConstantStringInfo
But you can just use returned StringRef for that. The actual constant contents persists in the LLVMContext, you don't need to copy it


================
Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:458
+    auto CreateControlDWord = M->getOrInsertFunction(
+        StringRef("__ockl_create_control_dword"), Builder.getInt32Ty(),
+        Builder.getInt32Ty(), Int1Ty, Int1Ty);
----------------
vikramRH wrote:
> arsenm wrote:
> > sameerds wrote:
> > > vikramRH wrote:
> > > > arsenm wrote:
> > > > > vikramRH wrote:
> > > > > > arsenm wrote:
> > > > > > > Do we really need another ockl control variable for this? Why isn't it a parameter? printf=stdout always 
> > > > > > There are certain HIP API's such as "hip_assert" that output to stderr. currently such API's are supported via hostcalls. Although this implementation does not currently support the API's ,its kept as an option. 
> > > > > Right but the way to handle that would be a parameter for where to output, not an externally set global 
> > > > I am not clear here, you expect additional inputs to device lib function ?
> > > @arsenm, this "control word" written into the buffer. In that sense, it is indeed a parameter passed from device to host as part of the printf packet. It is not yet another global variable.
> > I'm not following why this part requires introducing another call into device libs. It's expanding this not-really defined API. I'd expect this to be a trivial function we can expand inline here and write directly into the parameter buffer?
> I started the implementation keeping in mind the device side asserts currenlty supported via hostcall. They use explicit calls to such device lib functions in their definitions. I could not support them now but these would be required if I ever decide to. also I never ended up inlining this call. do you think its really necessary ?
But that implementation is in the library itself? What does the function actually do?

I think we're better off the more defined/documented ABI we have in terms of size-and-offset from base pointer than call into unstable library calls.

Can we get some documentation on the ABI in AMDGPUUsage?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150427/new/

https://reviews.llvm.org/D150427



More information about the cfe-commits mailing list