[Openmp-commits] [PATCH] D14031: [OpenMP] Initial implementation of OpenMP offloading library - libomptarget.

George Rokos via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Fri Jan 13 15:15:18 PST 2017


grokos added inline comments.


================
Comment at: libomptarget/Build_With_CMake.txt:102-115
+-DLIBOMPTARGET_NVPTX_ENABLE_BCLIB=false|true
+Enable CUDA LLVM bitcode offloading device RTL. This is used for
+link time optimization of the omp runtime and application code.
+
+-DLIBOMPTARGET_NVPTX_CUDA_COMPILER=<CUDA compiler name>
+Location of a CUDA compiler capable of emitting LLVM bitcode.
+Currently only the Clang compiler is supported. This is only used
----------------
Hahnfeld wrote:
> This should probably go under `NVPTX device RTL specific`
We only have one instruction file for the whole library. After all, cmake is invoked once from libomptarget's root directory, all -D definitions are passed to the "root" cmake.

I am happy to revise this scheme if you think it makes more sense to split the instructions, although I strongly prefer the current implementation.


================
Comment at: libomptarget/CMakeLists.txt:96-97
+  
+  # Install libomptarget under the lib destination folder.
+  install(TARGETS omptarget LIBRARY DESTINATION "lib")
+  
----------------
sfantao wrote:
> Hahnfeld wrote:
> > There is `LLVM_LIBDIR_SUFFIX` which should probably be respected (compare to `LIBOMP_LIBDIR_SUFFIX`)
> Right, following what we selected before we can use `lib${LIBOMPTARGET_LIBDIR_SUFFIX}` instead of just `lib`.
OK, I incorporated the changes in the new diff. Thanks for the suggestion!


================
Comment at: libomptarget/CMakeLists.txt:106-108
+  # Build offloading plugins and device RTLs if they are available.
+  add_subdirectory(plugins)
+  add_subdirectory(deviceRTLs)
----------------
Hahnfeld wrote:
> These are not part of this patch and should therefore not be added here
OK, I'll make them part of the corresponding patches for plugins and RTLs.


================
Comment at: libomptarget/README.txt:60
+this RTL: 
+  - clang (from the OpenMP development branch at http://clang-omp.github.io/ )
+  - clang (development branch at http://clang.llvm.org - several features still 
----------------
sfantao wrote:
> Hahnfeld wrote:
> > Is that true? I remember there were some changes to the section names in the fatbinary?
> Good catch. The library is compatible with what is `https://github.com/clang-ykt`, we should probably use that instead. Unlike `http://clang-omp.github.io/`, it contains the latest Sema/CodeGen reimplementation that has been happening in the trunk. 
> 
> Also, as Jonas mentioned, some section naming is different and the offload entries descriptors contain more information.
I updated the documentation in the new diff.


================
Comment at: libomptarget/src/omptarget.cpp:122-130
+    DeviceID = d.DeviceID;
+    RTL = d.RTL;
+    RTLDeviceID = d.RTLDeviceID;
+    IsInit = d.IsInit;
+    HasPendingGlobals = d.HasPendingGlobals;
+    HostDataToTargetMap = d.HostDataToTargetMap;
+    PendingCtorsDtors = d.PendingCtorsDtors;
----------------
Hahnfeld wrote:
> Could this reuse the assignment operator? Or otherwise do this in the initializer list
It's not efficient to reuse the assignment operator. I've done this in the initializer list instead.


================
Comment at: libomptarget/src/omptarget.cpp:255-257
+/// Map between Device ID (i.e. openmp device id) and its DeviceTy.
+typedef std::vector<DeviceTy> DevicesTy;
+static DevicesTy Devices;
----------------
Hahnfeld wrote:
> Please move this up, directly below `struct DeviceTy`
Done in new diff.


================
Comment at: libomptarget/src/omptarget.cpp:283
+  // Parse environment variable OMP_TARGET_OFFLOAD (if set)
+  char *envStr = getenv("OMP_TARGET_OFFLOAD");
+  if (envStr) {
----------------
Hahnfeld wrote:
> This is not in the standard, should this really start with `OMP_`?
Our buildbot already uses this env var name, but I am happy to change it to whatever name you propose. However, I don't see any problem with `OMP_`, especially if there is any chance that future revisions of the standard introduce such an env var.


================
Comment at: libomptarget/src/omptarget.cpp:421-422
+  // Init the device if not done before
+  if (!Device.IsInit) {
+    if (Device.initOnce() != OFFLOAD_SUCCESS) {
+      DP("Failed to init device %d\n", device_num);
----------------
Hahnfeld wrote:
> can be collapsed
Done.


================
Comment at: libomptarget/src/omptarget.cpp:494
+
+  DeviceTy &Device = Devices[device_num];
+  Device.RTL->data_delete(Device.RTLDeviceID, (void *)device_ptr);
----------------
Hahnfeld wrote:
> Do we want to check `device_is_ready` here?
Right, thanks for pointing it out!


================
Comment at: libomptarget/src/omptarget.cpp:508-511
+  if (device_num == omp_get_initial_device()) {
+    DP("Call to omp_target_is_present on host, returning true\n");
+    return true;
+  }
----------------
Hahnfeld wrote:
> Should we have a list of all allocated addresses above?
No, `omp_target_is_present` checks whether some host address has corresponding storage on the device, it's got nothing to do with memory allocated via `omp_target_alloc`. So, in order to check whether a host address has been mapped we just use the device's `HostDataToTargetMap`.


================
Comment at: libomptarget/src/omptarget.cpp:513
+
+  DeviceTy& Device = Devices[device_num];
+  long IsLast; // not used
----------------
Hahnfeld wrote:
> Do we want to check `device_is_ready` here?
It's not necessary. `omp_target_is_present` only accesses the device's `HostDataToTargetMap` to determine whether an address has been mapped. If the device is not initialized, `HostDataToTargetMap` has zero contents and `omp_target_is_present` correctly returns false. The function does not make any call to RTL functions (like `data_alloc` or `data_free`), so even if the device has not been initialized there is no problem.


================
Comment at: libomptarget/src/omptarget.cpp:532-533
+
+  if (src_device != omp_get_initial_device())
+    if (!device_is_ready(src_device)) {
+      DP("omp_target_memcpy returns OFFLOAD_FAIL\n");
----------------
Hahnfeld wrote:
> can be collapsed
Done.


================
Comment at: libomptarget/src/omptarget.cpp:538-539
+
+  if (dst_device != omp_get_initial_device())
+    if (!device_is_ready(dst_device)) {
+      DP("omp_target_memcpy returns OFFLOAD_FAIL\n");
----------------
Hahnfeld wrote:
> likewise
Done.


================
Comment at: libomptarget/src/omptarget.cpp:910
+
+int DeviceTy::deallocTgtPtr(void *HstPtrBegin, long Size, long ForceDelete) {
+  // Check if the pointer is contained in any sub-nodes.
----------------
Hahnfeld wrote:
> `bool ForceDelete`?
I've changed most instances where `long` served as a `bool`, i.e. variables `IsNew`, `IsLast`, `Forcedelete`, `IsImplicit` etc.

I also realized that `long` was used for array sizes, whereas the correct datatype should be `int64_t` (that's the datatype used by clang to communicate with the interface functions). I fixed that too.


================
Comment at: libomptarget/src/omptarget.cpp:1106
+
+    // if an RTL was found we are done - proceed to register the next image
+    if (!FoundRTL) {
----------------
Hahnfeld wrote:
> `if no RTL was found`?
Thanks for spotting this.


================
Comment at: libomptarget/src/omptarget.cpp:1112-1136
+    // Load ctors/dtors for static objects
+    for (int32_t i = 0; i < FoundRTL->NumberOfDevices; ++i) {
+      DeviceTy &Device = Devices[i];
+      Device.PendingGlobalsMtx.lock();
+      Device.HasPendingGlobals = true;
+      for (__tgt_offload_entry *entry = img->EntriesBegin;
+          entry != img->EntriesEnd; ++entry) {
----------------
Hahnfeld wrote:
> Can't this be done immediately after the corresponding RTL was found the same way we register the image into the translation table? Maybe also refactor the code into another static function?
Done. I refactored the code into a new static function which is called right after registering the image into the translation table.


================
Comment at: libomptarget/src/omptarget.cpp:1114
+    for (int32_t i = 0; i < FoundRTL->NumberOfDevices; ++i) {
+      DeviceTy &Device = Devices[i];
+      Device.PendingGlobalsMtx.lock();
----------------
Hahnfeld wrote:
> I think this should either be `Devices[FoundRTL->Idx + i]` or `*FoundRTL->Devices[i]`?
Correct, fixed in new diff.


================
Comment at: libomptarget/src/omptarget.cpp:1177
+      for (int32_t i = 0; i < FoundRTL->NumberOfDevices; ++i) {
+        DeviceTy &Device = Devices[i];
+        Device.PendingGlobalsMtx.lock();
----------------
Hahnfeld wrote:
> I think this should either be `Devices[FoundRTL->Idx + i]` or `*FoundRTL->Devices[i]`?
Fixed.


================
Comment at: libomptarget/src/omptarget.cpp:1193-1205
+      // Remove translation table for this image.
+      TrlTblMtx.lock();
+      auto tt = HostEntriesBeginToTransTable.find(desc->EntriesBegin);
+      if (tt != HostEntriesBeginToTransTable.end()) {
+        HostEntriesBeginToTransTable.erase(tt);
+        DP("Unregistering image " DPxMOD " from RTL " DPxMOD "!\n",
+            DPxPTR(img->ImageStart), DPxPTR(R->LibraryHandler));
----------------
Hahnfeld wrote:
> I think this is only needed once per `desc` and not per `img`, isn't it?
Correct.


================
Comment at: libomptarget/src/omptarget.cpp:1354-1355
+static int CheckDevice(int32_t device_id) {
+  // Get device info.
+  DeviceTy &Device = Devices[device_id];
+
----------------
Hahnfeld wrote:
> The `Device` should only be fetched after the `device_id` has been proved valid
Fixed.


================
Comment at: libomptarget/src/omptarget.cpp:1357-1377
+  // No devices available?
+  // Devices.size() can only change while registering a new
+  // library, so try to acquire the lock of RTLs' mutex.
+  RTLsMtx.lock();
+  size_t Devices_size = Devices.size();
+  RTLsMtx.unlock();
+  if (!(device_id >= 0 && (size_t)device_id < Devices_size)) {
----------------
Hahnfeld wrote:
> This looks the same as `device_is_ready`, can this be reused?
Correct, I've reused the former function in the new diff.


================
Comment at: libomptarget/src/omptarget.cpp:1383-1384
+  Device.PendingGlobalsMtx.unlock();
+  if (hasPendingGlobals) {
+    if (InitLibrary(Device) != OFFLOAD_SUCCESS) {
+      DP("Failed to init globals on device %d\n", device_id);
----------------
Hahnfeld wrote:
> can be collapsed
Done.


================
Comment at: libomptarget/src/omptarget.cpp:1393-1395
+// Following datatypes and functions (tgt_oldmap_type, combined_entry_t,
+// translate_map, cleanup_map) will be removed once the compiler starts using
+// the new map types.
----------------
Hahnfeld wrote:
> Backwards-compatible on the first check-in? That sounds weird...
Backwards compatible with internal, unreleased versions of clang. Oh well....


================
Comment at: libomptarget/src/omptarget.cpp:2116
+            DPxPTR(HstPtrBegin));
+        rc = OFFLOAD_FAIL;
+      } else {
----------------
Hahnfeld wrote:
> `break` afterwards?
Thanks for spotting this, obviously if offload fails at some point there is no need to continue with the loop.


================
Comment at: libomptarget/src/omptarget.cpp:2131
+            DP ("Copying data to device failed.\n");
+            rc = OFFLOAD_FAIL;
+          }
----------------
Hahnfeld wrote:
> likewise?
Done.


================
Comment at: libomptarget/src/omptarget.h:86
+struct __tgt_bin_desc {
+  int32_t NumDevices;                // Number of device types supported
+  __tgt_device_image *DeviceImages;  // Array of device images (1 per dev. type)
----------------
Hahnfeld wrote:
> Maybe rename this to `NumDeviceImages`?
You're right, some variable names caused confusion, I've changed them :)


================
Comment at: libomptarget/src/omptarget.h:88-89
+  __tgt_device_image *DeviceImages;  // Array of device images (1 per dev. type)
+  __tgt_offload_entry *EntriesBegin; // Begin of table with all host entries
+  __tgt_offload_entry *EntriesEnd;   // End of table (non inclusive)
+};
----------------
Hahnfeld wrote:
> Can those be renamed to have `Host` in the same so that I'm not that much confused? :D
Done.


================
Comment at: libomptarget/src/omptarget.h:166
+// is non-zero after the region execution is done it also performs the
+// same action as data_update and data_end above. The following types are
+// used; this function returns 0 if it was able to transfer the execution
----------------
Hahnfeld wrote:
> Only as `data_end`, not as `data_update`?
Right, it's only data_end.


================
Comment at: libomptarget/test/CMakeLists.txt:70-76
+  if(NOT MSVC)
+    set(LIBOMPTARGET_TEST_C_COMPILER ${LLVM_RUNTIME_OUTPUT_INTDIR}/clang)
+    set(LIBOMPTARGET_TEST_CXX_COMPILER ${LLVM_RUNTIME_OUTPUT_INTDIR}/clang++)
+    set(LIBOMPTARGET_FILECHECK ${LLVM_RUNTIME_OUTPUT_INTDIR}/FileCheck)
+  else()
+    libomptarget_error_say("Not prepared to run tests on Windows systems.")
+  endif()
----------------
sfantao wrote:
> Hahnfeld wrote:
> > 1. This will abort CMake which probably isn't a good idea.
> > 2. `MSVC` is possible for standalone builds?
> 1. Ok, we should probably replace this by a warning instead of using an error.
> 
> 2. It should be possible. The reason of the error/warning is that it is untested. We try not to do anything that precludes MSVC builds - we also try to pave the way for that to happen by preparing some MSVC options as currently used by other components of LLVM and libomp. However, the goal of our contribution is to add support for linux machines which are the machines we and our users have access to. This is true for this library but is also true for the compiler support - it will need tuning/features to fully work on Windows. We expect someone with interest in having support for Windows to complete the work by contributing the features and testing infrastructure for that.
OK, I replaced the fatal error with a warning.


https://reviews.llvm.org/D14031





More information about the Openmp-commits mailing list