[PATCH] D76772: [AMDGPU] Add __builtin_amdgcn_workgroup_size_x/y/z

Wed Mar 25 10:48:11 PDT 2020

yaxunl added inline comments.

================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:13428
+    return Call;
+  return CGF.Builder.CreateAddrSpaceCast(Call, RetTy);
+}
----------------
arsenm wrote:
> Why is this necessary? The builtin always has the same return type?
This is due to https://github.com/llvm/llvm-project/commit/c65f966d76aa5412920b3f14d199e764135bd5ec

Pointers returned by builtin functions are in the default address space for HIP, so the call result is address-space cast to RetTy when needed.

================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:13435
+  auto *DP = EmitAMDGPUDispatchPtr(CGF);
+  auto *Offset = llvm::ConstantInt::get(CGF.Int32Ty, XOffset + Index * 2);
+  auto *GEP = CGF.Builder.CreateGEP(DP, Offset);
----------------
arsenm wrote:
> Comment that this is indexing the hsa_kernel_dispatch_packet struct?
done
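
For readers following the offset arithmetic in the hunk above, here is a minimal sketch of how `XOffset + Index * 2` would select the workgroup size fields, assuming the dispatch pointer is treated as a byte pointer. The struct `DispatchPacketPrefix` and the helper `workgroupSizeOffset` are hypothetical names for illustration only, and the field layout is an assumption based on the leading fields of the HSA kernel dispatch packet; it is not the code from the patch.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the leading fields of the HSA kernel dispatch
// packet (assumed layout). Only the fields up to workgroup_size_z matter
// for this sketch.
struct DispatchPacketPrefix {
  std::uint16_t header;
  std::uint16_t setup;
  std::uint16_t workgroup_size_x;
  std::uint16_t workgroup_size_y;
  std::uint16_t workgroup_size_z;
};

// Assumed meaning of XOffset: byte offset of workgroup_size_x in the packet.
constexpr unsigned XOffset = offsetof(DispatchPacketPrefix, workgroup_size_x);

// Byte offset of the workgroup size for dimension Index (0 = x, 1 = y, 2 = z).
// Each field is 16 bits wide, hence the stride of 2 bytes.
constexpr unsigned workgroupSizeOffset(unsigned Index) {
  return XOffset + Index * 2;
}

// The computed offsets line up with the struct fields under this layout.
static_assert(workgroupSizeOffset(1) ==
                  offsetof(DispatchPacketPrefix, workgroup_size_y),
              "y offset mismatch");
static_assert(workgroupSizeOffset(2) ==
                  offsetof(DispatchPacketPrefix, workgroup_size_z),
              "z offset mismatch");
```

Under this assumed layout, a 16-bit load at `workgroupSizeOffset(Index)` bytes past the dispatch pointer would read the x, y, or z workgroup size.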

================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:13442
+  llvm::MDBuilder MDHelper(CGF.getLLVMContext());
+  llvm::MDNode *RNode = MDHelper.createRange(APInt(16, 1), APInt(16, 1025));
+  LD->setMetadata(llvm::LLVMContext::MD_range, RNode);
----------------
arsenm wrote:
> I thought I had a patch to include the maximum group size in AMDGPUTargetInfo to avoid hardcoding it, but I guess it was never committed
Added getMaxOpenCLWorkGroupSize() to TargetInfo so the upper bound is no longer hardcoded.
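
As a rough illustration of why the hardcoded upper bound in the hunk above is 1025 rather than 1024: LLVM `!range` metadata describes a half-open interval [lo, hi), so the exclusive bound is the maximum size plus one. The sketch below assumes a maximum workgroup size of 1024; `kMaxWorkGroupSize`, `HalfOpenRange`, `workGroupSizeRange`, and `contains` are hypothetical names, not part of the patch or of any LLVM API.

```cpp
#include <cstdint>

// Hypothetical constant standing in for whatever a target hook would return.
constexpr std::uint32_t kMaxWorkGroupSize = 1024;

// Half-open interval [Lo, Hi), mirroring how !range metadata is interpreted.
struct HalfOpenRange {
  std::uint32_t Lo;
  std::uint32_t Hi;
};

// A workgroup size is at least 1 and at most MaxSize, so the exclusive
// upper bound is MaxSize + 1.
constexpr HalfOpenRange workGroupSizeRange(std::uint32_t MaxSize) {
  return HalfOpenRange{1, MaxSize + 1};
}

// True when V lies inside the half-open interval R.
constexpr bool contains(HalfOpenRange R, std::uint32_t V) {
  return R.Lo <= V && V < R.Hi;
}
```

With a maximum of 1024, `workGroupSizeRange` yields the pair (1, 1025), matching the two `APInt` values in the quoted code; deriving the bound from a target query would let it follow the target instead of a literal.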

================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:13443
+  llvm::MDNode *RNode = MDHelper.createRange(APInt(16, 1), APInt(16, 1025));
+  LD->setMetadata(llvm::LLVMContext::MD_range, RNode);
+  return LD;
----------------
arsenm wrote:
> Also set it as invariant
done

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76772/new/

https://reviews.llvm.org/D76772