[PATCH] D105981: [AMDGPU][OpenMP] Support linking of math libraries

Jon Chesterfield via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Jul 14 08:34:49 PDT 2021


JonChesterfield added a reviewer: t-tye.
JonChesterfield added inline comments.


================
Comment at: clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp:267
+    llvm::SmallVector<std::string, 12> BCLibs;
+    BCLibs.append(RocmInstallation.getCommonBitcodeLibs(
+        DriverArgs, LibDeviceFile, Wave64, DAZ, FiniteOnly, UnsafeMathOpt,
----------------
Gone searching. This stuff has already been copied & pasted between AMDGPU.cpp and HIP.cpp. And diverged, looks like HIP has gained a bunch of inverse flags that AMDGPU has not, and some flags are duplicated (OPT_cl_fp32_correctly_rounded_divide_sqrt, OPT_fhip_fp32_correctly_rounded_divide_sqrt).

@b-sumner / @t-tye / @yaxunl / @scchan 

I'd like to suggest that we use the same handling of these flags on opencl / hip / c++  openmp. Since the names have diverged and we don't want to break backwards compatibility, I'd like to check for all three (hip, no_hip, cl) flags for each one, with last one wins semantics, and do that from a single method called by all the amdgpu language drivers.

That way hip and opencl will continue working as they do today, except hip will additionally accept some opencl flags and do the right thing, and likewise opencl.

If that is unacceptable, the near future involves OPT_fopenmp_fp32_correctly_rounded_device_sqrt and similar, and I'd much rather use the same path for all the languages than copy&paste again.

@pdhaliwal getCommonBitcodeLibs will splice in ockl as well as ocml and dependencies. OpenMP doesn't call into ockl so I'd like to avoid that, preferably by moving the ockl append out of getCommonBitcodeLibs

@amdgpu in general can we get rid of this oclc_isa_version_xxx.bc file, which only defines a single constant in each case
`@__oclc_ISA_version = linkonce_odr protected local_unnamed_addr addrspace(4) constant i32 9006, align 4`
in favour of emitting that linkonce_odr symbol from clang? Clang already knows which gpu it is compiling for, and uses that to find the file with the corresponding name on disk, so it could instead emit that symbol, letting us drop the O(N) tiny files from the install tree and slightly improving compile time?





Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105981/new/

https://reviews.llvm.org/D105981



More information about the cfe-commits mailing list