[PATCH] D86097: [OpenMP][AMDGCN] Generate global variables and attributes for AMDGCN

Saiyedul Islam via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Nov 26 04:27:04 PST 2020


saiislam marked 3 inline comments as done.
saiislam added inline comments.


================
Comment at: clang/lib/CodeGen/CGOpenMPRuntime.cpp:1344
 
+FieldDecl *clang::CodeGen::CodeGenUtil::CodeGenUtil::addFieldToRecordDecl(
+    ASTContext &C, DeclContext *DC, QualType FieldTy) {
----------------
JonChesterfield wrote:
> This appears to be the same as the free function we had before, except now all the call sites are prefixed CodegenUtil. Is there a functional change I'm missing?
> 
> The rest of this patch would be easier to read with this part split off.
addFieldToRecordDecl and createGlobalStruct methods had file static scope. To make them callable from other files, from amdgcn specific file in this case, they were put in this utility class.

D92167 puts this change into a separate patch. Will update this patch once D92167 gets accepted.


================
Comment at: clang/lib/CodeGen/CGOpenMPRuntimeAMDGCN.cpp:77
+
+void CGOpenMPRuntimeAMDGCN::emitNonSPMDKernelWrapper(
+    const OMPExecutableDirective &D, StringRef ParentName,
----------------
JonChesterfield wrote:
> This is a very verbose way to say that amdgcn calls emitmetatdata at the end of emitkernel and nvptx doesn't. Suggest unconditionally calling emitmetatdata, and having emitmetatdata be a no-op for nvptx.
Won't the no-op approach be less extensible? Current way, though verbose, leaves scope for attaching prefix/suffix code as and when required around emitkernel. While in case of no-op, every implementing arch might have to use the exact same pattern of methods with and without code.


================
Comment at: clang/lib/CodeGen/CGOpenMPRuntimeAMDGCN.cpp:89
+//
+void CGOpenMPRuntimeAMDGCN::setPropertyWorkGroupSize(CodeGenModule &CGM,
+                                                     StringRef Name,
----------------
JonChesterfield wrote:
> I think there's a credible chance this is useful to nvptx, so doesn't have to be amdgcn specific
You are right, it can be useful for nvptx as well. May be we can club its generalization with the nvptx's use-case when it arrives in the future?


================
Comment at: clang/lib/CodeGen/CGOpenMPRuntimeAMDGCN.cpp:108
+  bool FlatAttrEmitted = false;
+  unsigned DefaultWorkGroupSz =
+      CGM.getTarget().getGridValue(llvm::omp::GVIDX::GV_Default_WG_Size);
----------------
JonChesterfield wrote:
> I think this is about computing a maximum workgroup size which the runtime uses to limit the number of threads it launches. If so, this is probably useful for nvptx and amdgcn. I'm having trouble working out what the conditions are though. Maybe it's based on an openmp clause?
Yes, the if block in 111-147 corresponds to "number of threads" for thread_limit and num_threads clauses in teams and parallel directives.


================
Comment at: clang/lib/CodeGen/CGOpenMPRuntimeAMDGCN.cpp:169
+  // Create all device images
+  llvm::Constant *AttrData[] = {
+      llvm::ConstantInt::get(CGM.Int16Ty, 2), // Version
----------------
JonChesterfield wrote:
> HostServices is unused. Mode is redundant with exec_mode. wg_size is redundant with the other wg_size symbol added above. This kern_desc object should be deleted, not upstreamed.
Ok, thanks. Will update in next revision.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86097/new/

https://reviews.llvm.org/D86097



More information about the cfe-commits mailing list