[PATCH] D42800: Let CUDA toolchain support amdgpu target

Mon Feb 5 14:18:47 PST 2018

gregrodgers added a comment.

Here my replys to the inline comments.   Everything should be fixed in the next revision.

================
Comment at: include/clang/Basic/Cuda.h:79
   COMPUTE_72,
+  COMPUTE_GCN,
 };
----------------
t-tye wrote:
> Suggest using amdgcn which matches the architecture name in https://llvm.org/docs/AMDGPUUsage.html#processors .
Yes,  I will add them in the update.

================
Comment at: include/clang/Basic/Cuda.h:79
   COMPUTE_72,
+  COMPUTE_GCN,
 };
----------------
gregrodgers wrote:
> t-tye wrote:
> > Suggest using amdgcn which matches the architecture name in https://llvm.org/docs/AMDGPUUsage.html#processors .
> Yes,  I will add them in the update.
Done in next update

================
Comment at: lib/Basic/Targets/AMDGPU.cpp:437
+      case CudaArch::UNKNOWN:
+        assert(false && "No GPU arch when compiling CUDA device code.");
+        return "";
----------------
arsenm wrote:
> llvm_unreachable
Fixed in next update

================
Comment at: lib/Basic/Targets/AMDGPU.h:85
     return TT.getEnvironmentName() == "amdgiz" ||
+           TT.getOS() == llvm::Triple::CUDA ||
            TT.getEnvironmentName() == "amdgizcl";
----------------
yaxunl wrote:
> t-tye wrote:
> > We still want to use the amdhsa OS for amdgpu which currently supports the different environments. So can cuda simply support the same environments? Is the plan is to eliminate the environments and simply always use the default address space for generic so this code is no longer needed?
> Currently we already use amdgiz by default. This is no longer needed.
removed in next update

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:359-361
+  addBCLib(C, Args, CmdArgs, LibraryPaths, "opencl.amdgcn.bc");
+  addBCLib(C, Args, CmdArgs, LibraryPaths, "ockl.amdgcn.bc");
+  addBCLib(C, Args, CmdArgs, LibraryPaths, "irif.amdgcn.bc");
----------------
arsenm wrote:
> Why is this done under an NVPTX:: class
Because we are not creating a new toolchain for AMDGCN.  We modify the logic in the tool constructor as needed for AMDGCN, keeping the ability to provide a set of mixed targets. 

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:403-415
+  // If Verbose, save input for llc in /tmp and print all symbols
+  if (Args.hasArg(options::OPT_v)) {
+    ArgStringList nmArgs;
+    nmArgs.push_back(ResultingBitcodeF);
+    nmArgs.push_back("-debug-syms");
+    const char *nmExec = Args.MakeArgString(C.getDriver().Dir + "/llvm-nm");
+    C.addCommand(llvm::make_unique<Command>(JA, *this, nmExec, nmArgs, Inputs));
----------------
Hahnfeld wrote:
> This never gets cleaned up!
OK, Deleted in revision. 

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:531-534
+  SmallString<256> OutputFileName(Output.getFilename());
+  if (JA.isOffloading(Action::OFK_OpenMP))
+    llvm::sys::path::replace_extension(OutputFileName, "cubin");
+  CmdArgs.push_back(Args.MakeArgString(OutputFileName));
----------------
Hahnfeld wrote:
> That is already done in `TC.getInputFilename(Output)` (since rC318763), the same function call that you are removing here...
Fixed in next update

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:639-640
+    CmdArgs2.push_back(Args.MakeArgString(Output.getFilename()));
+    const char *Exec2 =
+        Args.MakeArgString(C.getDriver().Dir + "/clang-fixup-fatbin");
+    C.addCommand(
----------------
Hahnfeld wrote:
> `clang-fixup-fatbin` is not upstreamed and won't work. Sounds like a horrible name btw...
Major cleanup here in the next update.  It is not a bad name when you see the update and the comments in the update. 

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:788-793
+  // Do not add -link-cuda-bitcode or ptx42 features if gfx
+  for (Arg *A : DriverArgs)
+    if (A->getOption().matches(options::OPT_cuda_gpu_arch_EQ) &&
+        StringRef(A->getValue()).startswith("gfx"))
+      return;
+
----------------
Hahnfeld wrote:
> You should use `GpuArch` which comes from `DriverArgs.getLastArgValue`: The last `-march` overrides previous arguments.
Nice catch.  I will fix this in the next update. 

https://reviews.llvm.org/D42800