[PATCH] D59474: [OpenMP 5.0] Codegen support for user-defined mappers

Alexey Bataev via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Jun 25 10:54:03 PDT 2019


ABataev added a comment.

Sorry for the delay, Lingda. I tried to find some better solution for this.



================
Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:8739
+///     // For each component specified by this mapper:
+///     if (currentComponent.hasMapper())
+///       (*currentComponent.Mapper())(rt_mapper_handle, arg_base, arg_begin,
----------------
lildmh wrote:
> ABataev wrote:
> > lildmh wrote:
> > > ABataev wrote:
> > > > Currently `currentComponent` is generated by the compiler. But can we instead pass this data as an extra parameter to this `omp_mapper` function.
> > > Emm, I think this scheme will be very difficult and inefficient. If we pass components as an argument of `omp_mapper` function, it means that the runtime needs to generate all components related to a map clause. I don't think the runtime is able to do that efficiently. On the other hand, in the current scheme, these components are naturally generated by the compiler, and the runtime only needs to know the base pointer, pointer, type, size. etc.
> > With the current scheme, we may end with the code blowout. We need to generate very similar code for different types and variables. The worst thing here is that we will be unable to optimize this huge amount of code because the codegen relies on the runtime functions and the code cannot be inlined. That's why I would like to move as much as possible code to the runtime rather than to emit it in the compiler. 
> I understand your concerns. I think this is the best we can do right now.
> 
> The most worrisome case will be when we have nested mappers within each other. In this case, a mapper function will call another mapper function. We can inline the inner mapper functions in this scenario, so that these mapper function can be properly optimized. As a result, I think the performance should be fine.
Instead, we can use indirect function calls passed in the array to the runtime. Do you think it is going to be slower? In your current scheme, we generate many runtime calls instead. Could you try to estimate the number of calls in cases if we'll call the mappers through the indirect function calls and in your cuurent scheme, where we need to call the runtime functions many times in each particular mapper?


================
Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:744
   OMPRTL__tgt_target_data_update_nowait,
+  // Call to int64_t __tgt_mapper_num_components(void *rt_mapper_handle);
+  OMPRTL__tgt_mapper_num_components,
----------------
Do we really need to use `int64_t` for number of elements? `size_t` must be enough.


================
Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:8907
+    llvm::Value *CombinedMember =
+        MapperCGF.Builder.CreateAdd(OriMapType, ShiftedPreviousSize);
+    // Do nothing if it is not a member of previous components.
----------------
You can use `nuw` attribute here, I think


================
Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:9033
+  llvm::Value *IsArray = MapperCGF.Builder.CreateICmpSGE(
+      Size, MapperCGF.Builder.getIntN(C.getTypeSize(SizeTy), 1),
+      "omp.arrayinit.isarray");
----------------
You can use `C.toBits(CGM.getSizeSize())`


================
Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:9055
+  // Size).
+  unsigned SizeTyWidth = C.getTypeSize(C.getSizeType());
+  llvm::Value *ArraySize = MapperCGF.Builder.CreateMul(
----------------
Just `CGM.getSizeSize()`


================
Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:9056
+  unsigned SizeTyWidth = C.getTypeSize(C.getSizeType());
+  llvm::Value *ArraySize = MapperCGF.Builder.CreateMul(
+      Size, MapperCGF.Builder.getIntN(SizeTyWidth, ElementSize.getQuantity()));
----------------
Add `nuw` attribute


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59474/new/

https://reviews.llvm.org/D59474





More information about the cfe-commits mailing list