[Openmp-commits] [PATCH] D71384: [libomptarget] Implement hsa plugin

Fri Dec 13 11:19:32 PST 2019

jdoerfert added inline comments.

================
Comment at: openmp/libomptarget/plugins/hsa/CMakeLists.txt:134
+# We need the AOMP specific build of ATMI that has HSA_INTEROP turned on. 
+# Also, the AOMP specific build of ATMI has seperate release and debug builds. 
+target_link_libraries(
----------------
JonChesterfield wrote:
> jdoerfert wrote:
> > What are all the reasons we need AOMP here and not the LLVM/Clang we ship this with?
> The fundamental reason is the library depends on an ATMI library which ships with aomp, and I don't have a great handle on under what conditions one can bundle runtime libraries with llvm.
> 
> Some incidental reasons are that aomp clang has patches related to making OpenMP work on amdgcn that aren't upsteam, e.g. some constant handling that is partly inlined here.
OK, but people that have ATMI have aomp, correct? I mean, either they can build upstream LLVM with amdgcn offloading and everything is there or they can't build the offloading part.

Long term, as you mentioned at some point earlier, we have to minimize such LLVM duplication ;)

================
Comment at: openmp/libomptarget/plugins/hsa/src/rtl.cpp:162
+  // static int EnvNumThreads;
+  static const int HardTeamLimit = 1 << 20; // 1 Meg
+  static const int DefaultNumTeams = 128;
----------------
JonChesterfield wrote:
> jdoerfert wrote:
> > For the record, no need to fix this now: As with the team limit env variable (see below), I think this has to be handled differently in 5.1. We need to deal with arbitrary team sizes, if the hardware doesn't we need to have a software loop.
> Do you happen to know what arbitrary is in this context? UINT32_MAX, UINT64_MAX? It's hopefully documented in the spec.
The spec is generally vague on the types of expressions but anything that fits in a uint64_t should be acceptable for us.

================
Comment at: openmp/libomptarget/plugins/hsa/src/rtl.cpp:655
+  __tgt_offload_entry *HostBegin = image->EntriesBegin;
+  __tgt_offload_entry *HostEnd = image->EntriesEnd;
+
----------------
JonChesterfield wrote:
> jdoerfert wrote:
> > Assuming this is the same for all targets, I would think this stuff should live in a plugins/helper file. I mean, a way to iterate over all offload entries.  What do you think?
> Sure - there's probably a number of common patterns between this and some other plugins that should be extracted. This particular one looks OK to me - it's only just shy of iterators.
I am mostly interested in the ones that have a cross dependence on the driver/linker/embedding part. Full refactoring might be a long term goal but nothing required right now.

================
Comment at: openmp/libomptarget/plugins/hsa/src/rtl.cpp:1186
+    fprintf(stderr, "loop_tripcount: %ld\n", loop_tripcount);
+  }
+  DP("Final %d num_groups and %d threadsPerGroup\n", num_groups,
----------------
JonChesterfield wrote:
> jdoerfert wrote:
> > Can we have a macro for this `if (print_kernel_trace > 1) { fprint ... }` scheme? Less control flow will make this easier to read.
> Reluctant to go for a macro but agree with the sentiment. How do you feel about
> ```
> int errprint(const char *fmt, ...) {
>   int rc = 0;
>   if (print_kernel_trace > 1) {
>     va_list args;
>     va_start(args, fmt);
>     rc = vfprintf(stderr, fmt, args);
>     va_end(args);
>   }
>   return rc;
> }
> ```
Sure. Macro, variadic function, variadic template (if this wasn't C), all fine with me.

I'd call it dbg_trace, kernel_trace, or sth similar, errprint sounds like it is an error once it is hit.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71384/new/

https://reviews.llvm.org/D71384