<div class="socmaildefaultfont" dir="ltr" style="font-family:Arial, Helvetica, sans-serif;font-size:10.5pt" ><div dir="ltr" >Yes, we working in getting a bot to test this. We want to have it in place before push this work.</div>
<div dir="ltr" > </div>
<div dir="ltr" >Thanks,</div>
<div dir="ltr" >Samuel</div>
<div dir="ltr" > </div>
<blockquote data-history-content-modified="1" dir="ltr" style="border-left:solid #aaaaaa 2px; margin-left:5px; padding-left:5px; direction:ltr; margin-right:0px" >----- Original message -----<br>From: C Bergström <cbergstrom@pathscale.com><br>To: reviews+D14031+public+7ce63fa100576b3a@reviews.llvm.org<br>Cc: Georgios Rokos/Watson/IBM@IBMUS, Hal Finkel <hfinkel@anl.gov>, "Peyton, Jonathan L" <jonathan.l.peyton@intel.com>, "Cownie, James H" <james.h.cownie@intel.com>, Alexey Bataev <a.bataev@hotmail.com>, Chandler Carruth <chandlerc@gmail.com>, "Narayanaswamy, Ravi" <ravi.narayanaswamy@intel.com>, Samuel F Antao/Watson/IBM@IBMUS, Jason Henline <jhen@google.com>, hyviquel@gmail.com, "Hahnfeld, Jonas" <Hahnfeld@itc.rwth-aachen.de>, Michał Górny <mgorny@gentoo.org>, Konstantin.S.Bobrovsky@intel.com, pawel.osmialowski@arm.com, John Leidel <john.leidel@gmail.com>, mkuron@icp.uni-stuttgart.de, Tom Deakin <tom.deakin@bristol.ac.uk>, James Beyer <jbeyer@nvidia.com>, j.price@bristol.ac.uk, "Zhang, Guansong" <guansong.zhang@amd.com>, f.brygidyn@samsung.com, Alexandre Eichenberger/Watson/IBM@IBMUS, openmp-commits@lists.llvm.org, Kevin K O'Brien/Watson/IBM@IBMUS, Carlo Bertolli/Watson/IBM@IBMUS, Andrey Bokhanko <andreybokhanko@gmail.com>, Sergey Ostanevich <sergos.gnu@gmail.com><br>Subject: Re: [PATCH] D14031: [OpenMP] Initial implementation of OpenMP offloading library - libomptarget.<br>Date: Sat, Dec 3, 2016 7:30 PM<br>
<div><font face="Default Monospace,Courier New,Courier,monospace" size="2" >Guys before you go too crazy - is there a single bot setup to test<br>this? If there isn't.. I hope that someone sets one up. Otherwise you<br>may break stuff for others, but if I need to fix it.. I won't know if<br>I break other things..<br><br>On Sun, Dec 4, 2016 at 1:14 AM, Jonas Hahnfeld via Phabricator<br><reviews@reviews.llvm.org> wrote:<br>> Hahnfeld requested changes to this revision.<br>> Hahnfeld added a reviewer: Hahnfeld.<br>> Hahnfeld added a comment.<br>> This revision now requires changes to proceed.<br>><br>> Full review and comments all over<br>><br>><br>><br>> ================<br>> Comment at: libomptarget/Build_With_CMake.txt:102-115<br>> +-DLIBOMPTARGET_NVPTX_ENABLE_BCLIB=false|true<br>> +Enable CUDA LLVM bitcode offloading device RTL. This is used for<br>> +link time optimization of the omp runtime and application code.<br>> +<br>> +-DLIBOMPTARGET_NVPTX_CUDA_COMPILER=<CUDA compiler name><br>> +Location of a CUDA compiler capable of emitting LLVM bitcode.<br>> +Currently only the Clang compiler is supported. This is only used<br>> ----------------<br>> This should probably go under `NVPTX device RTL specific`<br>><br>><br>> ================<br>> Comment at: libomptarget/CMakeLists.txt:96-97<br>> +<br>> + # Install libomptarget under the lib destination folder.<br>> + install(TARGETS omptarget LIBRARY DESTINATION "lib")<br>> +<br>> ----------------<br>> There is `LLVM_LIBDIR_SUFFIX` which should probably be respected (compare to `LIBOMP_LIBDIR_SUFFIX`)<br>><br>><br>> ================<br>> Comment at: libomptarget/CMakeLists.txt:106-108<br>> + # Build offloading plugins and device RTLs if they are available.<br>> + add_subdirectory(plugins)<br>> + add_subdirectory(deviceRTLs)<br>> ----------------<br>> These are not part of this patch and should therefore not be added here<br>><br>><br>> ================<br>> Comment at: libomptarget/README.txt:60<br>> +this RTL:<br>> + - clang (from the OpenMP development branch at <a href="http://clang-omp.github.io" target="_blank" >http://clang-omp.github.io</a>/ )<br>> + - clang (development branch at <a href="http://clang.llvm.org" target="_blank" >http://clang.llvm.org</a> - several features still<br>> ----------------<br>> Is that true? I remember there were some changes to the section names in the fatbinary?<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:122-130<br>> + DeviceID = d.DeviceID;<br>> + RTL = d.RTL;<br>> + RTLDeviceID = d.RTLDeviceID;<br>> + IsInit = d.IsInit;<br>> + HasPendingGlobals = d.HasPendingGlobals;<br>> + HostDataToTargetMap = d.HostDataToTargetMap;<br>> + PendingCtorsDtors = d.PendingCtorsDtors;<br>> ----------------<br>> Could this reuse the assignment operator? Or otherwise do this in the initializer list<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:255-257<br>> +/// Map between Device ID (i.e. openmp device id) and its DeviceTy.<br>> +typedef std::vector<DeviceTy> DevicesTy;<br>> +static DevicesTy Devices;<br>> ----------------<br>> Please move this up, directly below `struct DeviceTy`<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:283<br>> + // Parse environment variable OMP_TARGET_OFFLOAD (if set)<br>> + char *envStr = getenv("OMP_TARGET_OFFLOAD");<br>> + if (envStr) {<br>> ----------------<br>> This is not in the standard, should this really start with `OMP_`?<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:284-285<br>> + char *envStr = getenv("OMP_TARGET_OFFLOAD");<br>> + if (envStr) {<br>> + if (!strcmp(envStr, "DISABLED")) {<br>> + DP("Target offloading disabled by environment\n");<br>> ----------------<br>> can be collapsed<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:421-422<br>> + // Init the device if not done before<br>> + if (!Device.IsInit) {<br>> + if (Device.initOnce() != OFFLOAD_SUCCESS) {<br>> + DP("Failed to init device %d\n", device_num);<br>> ----------------<br>> can be collapsed<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:494<br>> +<br>> + DeviceTy &Device = Devices[device_num];<br>> + Device.RTL->data_delete(Device.RTLDeviceID, (void *)device_ptr);<br>> ----------------<br>> Do we want to check `device_is_ready` here?<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:508-511<br>> + if (device_num == omp_get_initial_device()) {<br>> + DP("Call to omp_target_is_present on host, returning true\n");<br>> + return true;<br>> + }<br>> ----------------<br>> Should we have a list of all allocated addresses above?<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:513<br>> +<br>> + DeviceTy& Device = Devices[device_num];<br>> + long IsLast; // not used<br>> ----------------<br>> Do we want to check `device_is_ready` here?<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:532-533<br>> +<br>> + if (src_device != omp_get_initial_device())<br>> + if (!device_is_ready(src_device)) {<br>> + DP("omp_target_memcpy returns OFFLOAD_FAIL\n");<br>> ----------------<br>> can be collapsed<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:538-539<br>> +<br>> + if (dst_device != omp_get_initial_device())<br>> + if (!device_is_ready(dst_device)) {<br>> + DP("omp_target_memcpy returns OFFLOAD_FAIL\n");<br>> ----------------<br>> likewise<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:910<br>> +<br>> +int DeviceTy::deallocTgtPtr(void *HstPtrBegin, long Size, long ForceDelete) {<br>> + // Check if the pointer is contained in any sub-nodes.<br>> ----------------<br>> `bool ForceDelete`?<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:1106<br>> +<br>> + // if an RTL was found we are done - proceed to register the next image<br>> + if (!FoundRTL) {<br>> ----------------<br>> `if no RTL was found`?<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:1112-1136<br>> + // Load ctors/dtors for static objects<br>> + for (int32_t i = 0; i < FoundRTL->NumberOfDevices; ++i) {<br>> + DeviceTy &Device = Devices[i];<br>> + Device.PendingGlobalsMtx.lock();<br>> + Device.HasPendingGlobals = true;<br>> + for (__tgt_offload_entry *entry = img->EntriesBegin;<br>> + entry != img->EntriesEnd; ++entry) {<br>> ----------------<br>> Can't this be done immediately after the corresponding RTL was found the same way we register the image into the translation table? Maybe also refactor the code into another static function?<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:1114<br>> + for (int32_t i = 0; i < FoundRTL->NumberOfDevices; ++i) {<br>> + DeviceTy &Device = Devices[i];<br>> + Device.PendingGlobalsMtx.lock();<br>> ----------------<br>> I think this should either be `Devices[FoundRTL->Idx + i]` or `*FoundRTL->Devices[i]`?<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:1177<br>> + for (int32_t i = 0; i < FoundRTL->NumberOfDevices; ++i) {<br>> + DeviceTy &Device = Devices[i];<br>> + Device.PendingGlobalsMtx.lock();<br>> ----------------<br>> I think this should either be `Devices[FoundRTL->Idx + i]` or `*FoundRTL->Devices[i]`?<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:1193-1205<br>> + // Remove translation table for this image.<br>> + TrlTblMtx.lock();<br>> + auto tt = HostEntriesBeginToTransTable.find(desc->EntriesBegin);<br>> + if (tt != HostEntriesBeginToTransTable.end()) {<br>> + HostEntriesBeginToTransTable.erase(tt);<br>> + DP("Unregistering image " DPxMOD " from RTL " DPxMOD "!\n",<br>> + DPxPTR(img->ImageStart), DPxPTR(R->LibraryHandler));<br>> ----------------<br>> I think this is only needed once per `desc` and not per `img`, isn't it?<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:1354-1355<br>> +static int CheckDevice(int32_t device_id) {<br>> + // Get device info.<br>> + DeviceTy &Device = Devices[device_id];<br>> +<br>> ----------------<br>> The `Device` should only be fetched after the `device_id` has been proved valid<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:1357-1377<br>> + // No devices available?<br>> + // Devices.size() can only change while registering a new<br>> + // library, so try to acquire the lock of RTLs' mutex.<br>> + RTLsMtx.lock();<br>> + size_t Devices_size = Devices.size();<br>> + RTLsMtx.unlock();<br>> + if (!(device_id >= 0 && (size_t)device_id < Devices_size)) {<br>> ----------------<br>> This looks the same as `device_is_ready`, can this be reused?<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:1383-1384<br>> + Device.PendingGlobalsMtx.unlock();<br>> + if (hasPendingGlobals) {<br>> + if (InitLibrary(Device) != OFFLOAD_SUCCESS) {<br>> + DP("Failed to init globals on device %d\n", device_id);<br>> ----------------<br>> can be collapsed<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:1393-1395<br>> +// Following datatypes and functions (tgt_oldmap_type, combined_entry_t,<br>> +// translate_map, cleanup_map) will be removed once the compiler starts using<br>> +// the new map types.<br>> ----------------<br>> Backwards-compatible on the first check-in? That sounds weird...<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:2116<br>> + DPxPTR(HstPtrBegin));<br>> + rc = OFFLOAD_FAIL;<br>> + } else {<br>> ----------------<br>> `break` afterwards?<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.cpp:2131<br>> + DP ("Copying data to device failed.\n");<br>> + rc = OFFLOAD_FAIL;<br>> + }<br>> ----------------<br>> likewise?<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.h:86<br>> +struct __tgt_bin_desc {<br>> + int32_t NumDevices; // Number of device types supported<br>> + __tgt_device_image *DeviceImages; // Array of device images (1 per dev. type)<br>> ----------------<br>> Maybe rename this to `NumDeviceImages`?<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.h:88-89<br>> + __tgt_device_image *DeviceImages; // Array of device images (1 per dev. type)<br>> + __tgt_offload_entry *EntriesBegin; // Begin of table with all host entries<br>> + __tgt_offload_entry *EntriesEnd; // End of table (non inclusive)<br>> +};<br>> ----------------<br>> Can those be renamed to have `Host` in the same so that I'm not that much confused? :D<br>><br>><br>> ================<br>> Comment at: libomptarget/src/omptarget.h:166<br>> +// is non-zero after the region execution is done it also performs the<br>> +// same action as data_update and data_end above. The following types are<br>> +// used; this function returns 0 if it was able to transfer the execution<br>> ----------------<br>> Only as `data_end`, not as `data_update`?<br>><br>><br>> ================<br>> Comment at: libomptarget/test/CMakeLists.txt:70-76<br>> + if(NOT MSVC)<br>> + set(LIBOMPTARGET_TEST_C_COMPILER ${LLVM_RUNTIME_OUTPUT_INTDIR}/clang)<br>> + set(LIBOMPTARGET_TEST_CXX_COMPILER ${LLVM_RUNTIME_OUTPUT_INTDIR}/clang++)<br>> + set(LIBOMPTARGET_FILECHECK ${LLVM_RUNTIME_OUTPUT_INTDIR}/FileCheck)<br>> + else()<br>> + libomptarget_error_say("Not prepared to run tests on Windows systems.")<br>> + endif()<br>> ----------------<br>> 1. This will abort CMake which probably isn't a good idea.<br>> 2. `MSVC` is possible for standalone builds?<br>><br>><br>> <a href="https://reviews.llvm.org/D14031" target="_blank" >https://reviews.llvm.org/D14031</a><br>><br>><br>></font><br> </div></blockquote>
<div dir="ltr" > </div></div><BR>