[Openmp-commits] [PATCH] D14031: [OpenMP] Initial implementation of OpenMP offloading library - libomptarget.
C Bergström via Openmp-commits
openmp-commits at lists.llvm.org
Sat Dec 3 11:30:16 PST 2016
Guys before you go too crazy - is there a single bot setup to test
this? If there isn't.. I hope that someone sets one up. Otherwise you
may break stuff for others, but if I need to fix it.. I won't know if
I break other things..
On Sun, Dec 4, 2016 at 1:14 AM, Jonas Hahnfeld via Phabricator
<reviews at reviews.llvm.org> wrote:
> Hahnfeld requested changes to this revision.
> Hahnfeld added a reviewer: Hahnfeld.
> Hahnfeld added a comment.
> This revision now requires changes to proceed.
>
> Full review and comments all over
>
>
>
> ================
> Comment at: libomptarget/Build_With_CMake.txt:102-115
> +-DLIBOMPTARGET_NVPTX_ENABLE_BCLIB=false|true
> +Enable CUDA LLVM bitcode offloading device RTL. This is used for
> +link time optimization of the omp runtime and application code.
> +
> +-DLIBOMPTARGET_NVPTX_CUDA_COMPILER=<CUDA compiler name>
> +Location of a CUDA compiler capable of emitting LLVM bitcode.
> +Currently only the Clang compiler is supported. This is only used
> ----------------
> This should probably go under `NVPTX device RTL specific`
>
>
> ================
> Comment at: libomptarget/CMakeLists.txt:96-97
> +
> + # Install libomptarget under the lib destination folder.
> + install(TARGETS omptarget LIBRARY DESTINATION "lib")
> +
> ----------------
> There is `LLVM_LIBDIR_SUFFIX` which should probably be respected (compare to `LIBOMP_LIBDIR_SUFFIX`)
>
>
> ================
> Comment at: libomptarget/CMakeLists.txt:106-108
> + # Build offloading plugins and device RTLs if they are available.
> + add_subdirectory(plugins)
> + add_subdirectory(deviceRTLs)
> ----------------
> These are not part of this patch and should therefore not be added here
>
>
> ================
> Comment at: libomptarget/README.txt:60
> +this RTL:
> + - clang (from the OpenMP development branch at http://clang-omp.github.io/ )
> + - clang (development branch at http://clang.llvm.org - several features still
> ----------------
> Is that true? I remember there were some changes to the section names in the fatbinary?
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:122-130
> + DeviceID = d.DeviceID;
> + RTL = d.RTL;
> + RTLDeviceID = d.RTLDeviceID;
> + IsInit = d.IsInit;
> + HasPendingGlobals = d.HasPendingGlobals;
> + HostDataToTargetMap = d.HostDataToTargetMap;
> + PendingCtorsDtors = d.PendingCtorsDtors;
> ----------------
> Could this reuse the assignment operator? Or otherwise do this in the initializer list
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:255-257
> +/// Map between Device ID (i.e. openmp device id) and its DeviceTy.
> +typedef std::vector<DeviceTy> DevicesTy;
> +static DevicesTy Devices;
> ----------------
> Please move this up, directly below `struct DeviceTy`
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:283
> + // Parse environment variable OMP_TARGET_OFFLOAD (if set)
> + char *envStr = getenv("OMP_TARGET_OFFLOAD");
> + if (envStr) {
> ----------------
> This is not in the standard, should this really start with `OMP_`?
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:284-285
> + char *envStr = getenv("OMP_TARGET_OFFLOAD");
> + if (envStr) {
> + if (!strcmp(envStr, "DISABLED")) {
> + DP("Target offloading disabled by environment\n");
> ----------------
> can be collapsed
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:421-422
> + // Init the device if not done before
> + if (!Device.IsInit) {
> + if (Device.initOnce() != OFFLOAD_SUCCESS) {
> + DP("Failed to init device %d\n", device_num);
> ----------------
> can be collapsed
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:494
> +
> + DeviceTy &Device = Devices[device_num];
> + Device.RTL->data_delete(Device.RTLDeviceID, (void *)device_ptr);
> ----------------
> Do we want to check `device_is_ready` here?
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:508-511
> + if (device_num == omp_get_initial_device()) {
> + DP("Call to omp_target_is_present on host, returning true\n");
> + return true;
> + }
> ----------------
> Should we have a list of all allocated addresses above?
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:513
> +
> + DeviceTy& Device = Devices[device_num];
> + long IsLast; // not used
> ----------------
> Do we want to check `device_is_ready` here?
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:532-533
> +
> + if (src_device != omp_get_initial_device())
> + if (!device_is_ready(src_device)) {
> + DP("omp_target_memcpy returns OFFLOAD_FAIL\n");
> ----------------
> can be collapsed
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:538-539
> +
> + if (dst_device != omp_get_initial_device())
> + if (!device_is_ready(dst_device)) {
> + DP("omp_target_memcpy returns OFFLOAD_FAIL\n");
> ----------------
> likewise
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:910
> +
> +int DeviceTy::deallocTgtPtr(void *HstPtrBegin, long Size, long ForceDelete) {
> + // Check if the pointer is contained in any sub-nodes.
> ----------------
> `bool ForceDelete`?
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:1106
> +
> + // if an RTL was found we are done - proceed to register the next image
> + if (!FoundRTL) {
> ----------------
> `if no RTL was found`?
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:1112-1136
> + // Load ctors/dtors for static objects
> + for (int32_t i = 0; i < FoundRTL->NumberOfDevices; ++i) {
> + DeviceTy &Device = Devices[i];
> + Device.PendingGlobalsMtx.lock();
> + Device.HasPendingGlobals = true;
> + for (__tgt_offload_entry *entry = img->EntriesBegin;
> + entry != img->EntriesEnd; ++entry) {
> ----------------
> Can't this be done immediately after the corresponding RTL was found the same way we register the image into the translation table? Maybe also refactor the code into another static function?
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:1114
> + for (int32_t i = 0; i < FoundRTL->NumberOfDevices; ++i) {
> + DeviceTy &Device = Devices[i];
> + Device.PendingGlobalsMtx.lock();
> ----------------
> I think this should either be `Devices[FoundRTL->Idx + i]` or `*FoundRTL->Devices[i]`?
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:1177
> + for (int32_t i = 0; i < FoundRTL->NumberOfDevices; ++i) {
> + DeviceTy &Device = Devices[i];
> + Device.PendingGlobalsMtx.lock();
> ----------------
> I think this should either be `Devices[FoundRTL->Idx + i]` or `*FoundRTL->Devices[i]`?
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:1193-1205
> + // Remove translation table for this image.
> + TrlTblMtx.lock();
> + auto tt = HostEntriesBeginToTransTable.find(desc->EntriesBegin);
> + if (tt != HostEntriesBeginToTransTable.end()) {
> + HostEntriesBeginToTransTable.erase(tt);
> + DP("Unregistering image " DPxMOD " from RTL " DPxMOD "!\n",
> + DPxPTR(img->ImageStart), DPxPTR(R->LibraryHandler));
> ----------------
> I think this is only needed once per `desc` and not per `img`, isn't it?
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:1354-1355
> +static int CheckDevice(int32_t device_id) {
> + // Get device info.
> + DeviceTy &Device = Devices[device_id];
> +
> ----------------
> The `Device` should only be fetched after the `device_id` has been proved valid
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:1357-1377
> + // No devices available?
> + // Devices.size() can only change while registering a new
> + // library, so try to acquire the lock of RTLs' mutex.
> + RTLsMtx.lock();
> + size_t Devices_size = Devices.size();
> + RTLsMtx.unlock();
> + if (!(device_id >= 0 && (size_t)device_id < Devices_size)) {
> ----------------
> This looks the same as `device_is_ready`, can this be reused?
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:1383-1384
> + Device.PendingGlobalsMtx.unlock();
> + if (hasPendingGlobals) {
> + if (InitLibrary(Device) != OFFLOAD_SUCCESS) {
> + DP("Failed to init globals on device %d\n", device_id);
> ----------------
> can be collapsed
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:1393-1395
> +// Following datatypes and functions (tgt_oldmap_type, combined_entry_t,
> +// translate_map, cleanup_map) will be removed once the compiler starts using
> +// the new map types.
> ----------------
> Backwards-compatible on the first check-in? That sounds weird...
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:2116
> + DPxPTR(HstPtrBegin));
> + rc = OFFLOAD_FAIL;
> + } else {
> ----------------
> `break` afterwards?
>
>
> ================
> Comment at: libomptarget/src/omptarget.cpp:2131
> + DP ("Copying data to device failed.\n");
> + rc = OFFLOAD_FAIL;
> + }
> ----------------
> likewise?
>
>
> ================
> Comment at: libomptarget/src/omptarget.h:86
> +struct __tgt_bin_desc {
> + int32_t NumDevices; // Number of device types supported
> + __tgt_device_image *DeviceImages; // Array of device images (1 per dev. type)
> ----------------
> Maybe rename this to `NumDeviceImages`?
>
>
> ================
> Comment at: libomptarget/src/omptarget.h:88-89
> + __tgt_device_image *DeviceImages; // Array of device images (1 per dev. type)
> + __tgt_offload_entry *EntriesBegin; // Begin of table with all host entries
> + __tgt_offload_entry *EntriesEnd; // End of table (non inclusive)
> +};
> ----------------
> Can those be renamed to have `Host` in the same so that I'm not that much confused? :D
>
>
> ================
> Comment at: libomptarget/src/omptarget.h:166
> +// is non-zero after the region execution is done it also performs the
> +// same action as data_update and data_end above. The following types are
> +// used; this function returns 0 if it was able to transfer the execution
> ----------------
> Only as `data_end`, not as `data_update`?
>
>
> ================
> Comment at: libomptarget/test/CMakeLists.txt:70-76
> + if(NOT MSVC)
> + set(LIBOMPTARGET_TEST_C_COMPILER ${LLVM_RUNTIME_OUTPUT_INTDIR}/clang)
> + set(LIBOMPTARGET_TEST_CXX_COMPILER ${LLVM_RUNTIME_OUTPUT_INTDIR}/clang++)
> + set(LIBOMPTARGET_FILECHECK ${LLVM_RUNTIME_OUTPUT_INTDIR}/FileCheck)
> + else()
> + libomptarget_error_say("Not prepared to run tests on Windows systems.")
> + endif()
> ----------------
> 1. This will abort CMake which probably isn't a good idea.
> 2. `MSVC` is possible for standalone builds?
>
>
> https://reviews.llvm.org/D14031
>
>
>
More information about the Openmp-commits
mailing list