[PATCH] D93439: [OpenMP][NFC] Provide a new remark and documentation

Johannes Doerfert via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Dec 16 18:32:22 PST 2020


jdoerfert created this revision.
jdoerfert added a reviewer: tianshilei1992.
Herald added subscribers: guansong, bollu, hiraditya, yaxunl.
jdoerfert requested review of this revision.
Herald added subscribers: llvm-commits, sstefan1.
Herald added projects: OpenMP, LLVM.

If a GPU function is externally reachable, we give up trying to find the
(unique) kernel it is called from, which can hinder optimizations. Emit a
remark in that case and document mitigation strategies.
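
A minimal sketch of the mitigation strategies described in the documentation
part of this patch, assuming a C++ source built with OpenMP offloading. The
names `scale_impl`, `scale`, and `scale_on_device` are hypothetical and do not
appear in the patch; they only illustrate giving the device code local linkage
and splitting off a host-only interface.

```cpp
#include <cstddef>

// Hypothetical internal implementation. `static` gives it local linkage, so
// all of its callers are visible within this translation unit.
static void scale_impl(double *data, std::size_t n, double factor) {
#pragma omp parallel for
  for (std::size_t i = 0; i < n; ++i)
    data[i] *= factor;
}

// Hypothetical externally visible interface for host callers in other
// translation units. It is never called from a target region, so it does not
// have to be compiled for the device.
void scale(double *data, std::size_t n, double factor) {
  scale_impl(data, n, factor);
}

// Hypothetical offload entry point: the target region calls the internal
// implementation directly instead of the external interface.
void scale_on_device(double *data, std::size_t n, double factor) {
#pragma omp target map(tofrom : data[0 : n])
  scale_impl(data, n, factor);
}
```

Without OpenMP enabled the pragmas are ignored and both entry points simply
run on the host, so the sketch can be checked with any C++ compiler. Remarks
from this pass are expected to show up with a flag along the lines of
`-Rpass=openmp-opt`; treat the exact flag as an assumption and check the
OpenMP remark documentation for your toolchain.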


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D93439

Files:
  llvm/lib/Transforms/IPO/OpenMPOpt.cpp
  openmp/docs/remarks/OptimizationRemarks.rst


Index: openmp/docs/remarks/OptimizationRemarks.rst
===================================================================
--- openmp/docs/remarks/OptimizationRemarks.rst
+++ openmp/docs/remarks/OptimizationRemarks.rst
@@ -1,2 +1,30 @@
 OpenMP Optimization Remarks
 ===========================
+
+
+.. _omp100:
+.. _omp_no_external_caller_in_target_region:
+
+`[OMP100]` Potentially unknown OpenMP target region caller
+----------------------------------------------------------
+
+A function remark that indicates the function, when compiled for a GPU, is
+potentially called from outside the translation unit. Note that a remark is
+only issued if we tried to perform an optimization which would require us to
+know all callers on the GPU.
+
+To facilitate OpenMP semantics on GPUs we provide a runtime mechanism through
+which the code that makes up the body of a parallel region is shared with the
+threads in the team. Generally we use the address of the outlined parallel
+region to identify the code that needs to be executed. If we know all target
+regions that reach the parallel region we can avoid this function pointer
+passing scheme and often improve the register usage on the GPU. However, if a
+parallel region on the GPU is in a function with external linkage we may not
+know all callers statically. If there are outside callers within target
+regions, this remark is to be ignored. If there are no such callers, users can
+modify the linkage and thereby help optimization with a `static` or
+`__attribute__((internal_linkage))` function annotation. If changing the linkage is
+impossible, e.g., because there are outside callers on the host, one can split
+the function into an externally visible interface which is not compiled for
+the target and an internal implementation which is compiled for the target
+and should be called from within the target region.
Index: llvm/lib/Transforms/IPO/OpenMPOpt.cpp
===================================================================
--- llvm/lib/Transforms/IPO/OpenMPOpt.cpp
+++ llvm/lib/Transforms/IPO/OpenMPOpt.cpp
@@ -1469,8 +1469,16 @@
     }
 
     CachedKernel = nullptr;
-    if (!F.hasLocalLinkage())
+    if (!F.hasLocalLinkage()) {
+
+      // See https://openmp.llvm.org/remarks/OptimizationRemarks.html
+      auto Remark = [&](OptimizationRemark OR) {
+        return OR << "[OMP100] Potentially unknown OpenMP target region caller";
+      };
+      emitRemarkOnFunction(&F, "OMP100", Remark);
+
       return nullptr;
+    }
   }
 
   auto GetUniqueKernelForUse = [&](const Use &U) -> Kernel {

