[PATCH] D121951: [AMDGPU][OpenCL] Add "amdgpu-no-hostcall-ptr" in Clang codegen pre-COV_5

Sameer Sahasrabuddhe via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Mar 29 09:13:27 PDT 2022


sameerds added a comment.

In D121951#3411856 <https://reviews.llvm.org/D121951#3411856>, @scott.linder wrote:

> @yaxunl Does excluding device-libs via COV_None make sense?
>
> @sameerds Would you still rather we just not add this attribute in the frontend at all? I'm OK with it if that is the consensus

Yes, I still think there is no need to emit that attribute in the frontend. It will always be inferred by the Attributor when optimization is enabled. This also eliminates the check for COV_None and there seems to be some uncertainty about COV_None anyway. This also eliminates the updates to all the tests where the no-hostcall-ptr attribute does not actually matter. If ever we need to check if hostcall is being used on OpenCL and COV < 5, we should do it per feature and inform the user appropriately.



================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp:406-408
     if (Func.getParent()->getNamedMetadata("llvm.printf.fmts"))
       emitKernelArg(DL, Int8PtrTy, Align(8), ValueKind::HiddenPrintfBuffer);
+    else if (!Func.hasFnAttribute("amdgpu-no-hostcall-ptr"))
----------------
scott.linder wrote:
> sameerds wrote:
> > I would structure this differently: If this is code-object-v4 or lower, then if  "llvm.printf.fmts" is present, then this kernel clearly contains OpenCL bits, and cannot use hostcall. So it's okay to just assume that the no-hostcall-ptr attribute is always present in this situation, which means the only metadata generated is for ValueKind::HiddenPrintfBuffer. Else if this is code-object-v5, then proceed to emit both metadata.
> > 
> > 
> I'm not sure I follow; doesn't the code as-is implement what you're describing?
> 
> If the printf metadata is present, this will only ever emit the `HiddenPrintfBuffer`, irrespective of the hostcall attribute. Otherwise, this respects `amdgpu-no-hostcall-ptr` (i.e. for HIP and other languages).
> 
> The "if this is code-object-v4 or lower" portion is implicit, as this is just the `MetadataStreamerV2` impl. The `MetadataStreamerV3` (below) and `MetadataStreamerV4` (inherits from V3) impls below are similar. The `MetadataStreamerV5` impl is already correct for V5 (i.e. supports both argument kinds for the same kernel).
Your last para about different streamers cleared the confusion. So, this looks good.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121951/new/

https://reviews.llvm.org/D121951



More information about the cfe-commits mailing list