[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

Thu Apr 19 19:25:48 PDT 2018

yaxunl marked 21 inline comments as done.
yaxunl added inline comments.

================
Comment at: include/clang/Driver/Options.td:552-553
+def : Joined<["--"], "offload-arch=">, Alias<cuda_gpu_arch_EQ>;
+def offload_archs : Joined<["--"], "offload-archs=">, Flags<[DriverOption]>,
+  HelpText<"List of offload architectures for CUDA/HIP/OpenMP (e.g. sm_35,gfx803).">;
 def no_cuda_gpu_arch_EQ : Joined<["--"], "no-cuda-gpu-arch=">, Flags<[DriverOption]>,
----------------
tra wrote:
> The discussion you've mentioned appears to be unfinished. @jlebar has raised a valid point regarding -no-offload-arch[s] and his email went unanswered AFAICT.
> 
> If you propose to generalize a way to specify offload targets, it should probably be done in a separate patch.
> 
> I'd just stick with `--cuda-gpu-arch` for the time being until we figure out how we're going to handle target specification in general. It works well enough for the moment and the new options are not required for the functionality implemented by this patch.
Will use `--cuda-gpu-arch` for this patch.

================
Comment at: lib/Driver/Driver.cpp:2333-2339
+      if (HostTC->getTriple().isNVPTX() ||
+          HostTC->getTriple().getArch() == llvm::Triple::amdgcn) {
         // We do not support targeting NVPTX for host compilation. Throw
         // an error and abort pipeline construction early so we don't trip
         // asserts that assume device-side compilation.
         C.getDriver().Diag(diag::err_drv_cuda_nvptx_host);
         return true;
----------------
tra wrote:
> You really want to either have your own error message or change existing one to something more generic. `unsupported use of NVPTX for host compilation.` will sound very confusing tof someone who tries to compile for AMD GPU.
Will generalise this error message.

================
Comment at: lib/Driver/Driver.cpp:3319
+  static bool isAMDGPUCUDAOffloading(const Action *A, llvm::Triple T) {
+    return A->isOffloading(Action::OFK_Cuda) &&
+           (StringRef(A->getOffloadingArch()).startswith("gfx") ||
----------------
tra wrote:
> Would it make sense for HIP to have its own offloading kind? You seem to be adding similar checks in few other places.
Yes it makes sense to let HIP have its own offloading kind. Will do.

================
Comment at: lib/Driver/Driver.cpp:3472-3474
+        const char *Value = A->getValue();
+        if (const char *Ext = strrchr(Value, '.'))
+          InputType = TC.LookupTypeForExtension(Ext + 1);
----------------
tra wrote:
> `llvm::sys::path::extension` ?
will do

================
Comment at: lib/Driver/Driver.cpp:3905-3912
+    SmallString<128> fname(Split.first.str().c_str());
+    if (!BoundArch.empty()) {
+      fname += "-";
+      fname.append(BoundArch);
+    }
     std::string TmpName = GetTemporaryPath(
+        fname, types::getTypeTempSuffix(JA.getType(), IsCLMode()));
----------------
tra wrote:
> I'm not sure why we do this here. Nor does it seem relevant to HIP. 
will remove it

================
Comment at: lib/Driver/Driver.cpp:3982-3985
+    if (((StringRef)BaseInput).endswith(".a"))
+      Suffixed += "a";
+    else
+      Suffixed += Suffix;
----------------
tra wrote:
> Same here -- I'm not sure why we need this nor how it's related to HIP.
will remove

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:352
+  CmdArgs.push_back("-o");
+  std::string TmpName = C.getDriver().GetTemporaryPath("LC_OUTPUT", "o");
+  const char *llcOutputFile =
----------------
tra wrote:
> `L*L*C_OUTPUT` ?
will change to LLC_OUTPUT

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:356
+  CmdArgs.push_back(llcOutputFile);
+  const char *Exec = Args.MakeArgString(C.getDriver().Dir + "/llc");
+  C.addCommand(llvm::make_unique<Command>(JA, *this, Exec, CmdArgs, Inputs));
----------------
tra wrote:
> Please use routines in`llvm::sys::path` here and in other places where you manipulate paths.
will do

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:360-363
+  CmdArgs2.push_back("-flavor");
+  CmdArgs2.push_back("gnu");
+  CmdArgs2.push_back("--no-undefined");
+  CmdArgs2.push_back("-shared");
----------------
tra wrote:
> You could use .append({...})
> ```
> CmdArgs2.append({"foo","bar","buz"});
> ```
will do

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:436-439
+    if (Arg->getSpelling() == "-L") {
+      LibraryPaths.push_back(Args.MakeArgString(
+          std::string(Arg->getValue()) + "/libdevice/" + std::string(GFXNAME)));
+      LibraryPaths.push_back(Args.MakeArgString(Arg->getValue()));
----------------
tra wrote:
> This is rather awkward -- you're looking for /libdevice under all paths specified by -L, but there's no way to explicitly point to the directory with the bitcode library. If device library may be in a non-canonical location, I'd rather there was an explicit option to specify it. Furthermore, my understanding is that you will need to find these bitcode libraries during `-c` compilation. Using `-L` to derive bitcode search path during compilation looks wrong to me.
Will use `--hip-device-lib-path=` and drop /libdevice.

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:445
+  LibraryPaths.push_back(Args.MakeArgString(
+      C.getDriver().Dir + "/../lib/libdevice/" + std::string(GFXNAME)));
+
----------------
tra wrote:
> This (and other places where you're calculating libdevice path relative to driver dir) should probably be derived from the resource dir.  Driver's path may not be the 'root' the compiler has been told to use.
will remove this

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:447
+
+  addDirectoryList(Args, LibraryPaths, "-L", "LIBRARY_PATH");
+
----------------
tra wrote:
> LIBRARY_PATH sounds rather generic. Perhaps it should have HIP somewhere in its name.
will use HIP_DEVICE_LIB_PATH

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:462
+
+  CmdArgs.push_back("-suppress-warnings");
+
----------------
tra wrote:
> Why do we need to silence the warnings?
will remove this

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:498-501
+    OptArgs.push_back(mcpustr);
+    OptArgs.push_back("-dce");
+    OptArgs.push_back("-sroa");
+    OptArgs.push_back("-globaldce");
----------------
tra wrote:
> This does not look like the right place to disable particular passes. Shouldn't it be done somewhere in LLVM?
These are not disabling the passes but adding these passes. They are some optimizations which are usually improving performance for amdgcn.

https://reviews.llvm.org/D45212