[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

Wed May 23 07:59:58 PDT 2018

yaxunl marked 19 inline comments as done.
yaxunl added a comment.

In https://reviews.llvm.org/D45212#1105177, @tra wrote:

> Hi,
>
> Sorry about the long silence. I'm back to continue the reviews. I'll handle what I can today and will continue with the rest on Tuesday.
>
> It looks like patch description needs to be updated:
>
> > Use clang-offload-bindler to create binary for device ISA.
>
> I don't see anything related to offload-bundler in this patch any more.

You are right. Using clang-offload-bundler to create binary for device ISA has been moved to another patch. Will update the description of this patch.

In https://reviews.llvm.org/D45212#1105282, @tra wrote:

> One more thing -- it would be really good to add some tests to make sure your commands are constructed the way you want.

will do

================
Comment at: include/clang/Driver/Options.td:582
 def fno_cuda_rdc : Flag<["-"], "fno-cuda-rdc">;
+def hip_device_lib_path_EQ : Joined<["--"], "hip-device-lib-path=">, Group<i_Group>,
+  HelpText<"HIP device library path">;
----------------
tra wrote:
> I'm not sure about `i_Group`? This will cause this option to be passed to all preprocessor jobs. It will also be passed to host and device side compilations, while you probably only want/need it on device side only.
will change to Link_Group

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:323
+    C.getDriver().Diag(diag::err_drv_no_such_file) << BCName;
+  CmdArgs.push_back(Args.MakeArgString(FullName));
+  return FoundLibDevice;
----------------
tra wrote:
> FullName is already result of Args.MakeArgString. You only need to do it once.
will fix

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:329
+// object file. It calls llvm-link, opt, llc, then lld steps.
+void AMDGCN::Linker::ConstructJob(Compilation &C, const JobAction &JA,
+                                  const InputInfo &Output,
----------------
tra wrote:
> This function is too large to easily see that we're actually constructing sequence of commands.
> I'd probably split construction of individual tool's command line into its own function.
will do

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:336
+  assert(StringRef(JA.getOffloadingArch()).startswith("gfx") &&
+         " unless gfx processor, backend should be clang");
+  const auto &TC =
----------------
tra wrote:
> No need for the leading space in the message.
will fix.

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:344-345
+  // Add the input bc's created by compile step.
+  for (InputInfoList::const_iterator it = Inputs.begin(), ie = Inputs.end();
+       it != ie; ++it) {
+    const InputInfo &II = *it;
----------------
tra wrote:
> `for (const InputInfo &it : Inputs)` ?
will fix

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:350
+
+  std::string GFXNAME = JA.getOffloadingArch();
+
----------------
tra wrote:
> All-caps name looks like a macro. Rename to `GfxName` ?
will fix

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:354-359
+  // Find in --hip-device-lib-path and HIP_LIBRARY_PATH.
+  for (auto Arg : Args) {
+    if (Arg->getSpelling() == "--hip-device-lib-path=") {
+      LibraryPaths.push_back(Args.MakeArgString(Arg->getValue()));
+    }
+  }
----------------
tra wrote:
> ```
> for (path : Args.getAllArgValues(...)) {
>    LibraryPaths.push_back(Args.MakeArgString(path));
> }
> 
> ```
will fix

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:375-378
+  addBCLib(C, Args, CmdArgs, LibraryPaths,
+           (Twine("oclc_isa_version_") + StringRef(GFXNAME).drop_front(3) +
+            ".amdgcn.bc")
+               .str());
----------------
tra wrote:
> This is somewhat unreadable. Perhaps you could construct the name in a temp variable.
will do

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:384
+  const char *ResultingBitcodeF =
+      C.addTempFile(C.getArgs().MakeArgString(TmpName.c_str()));
+  CmdArgs.push_back(ResultingBitcodeF);
----------------
tra wrote:
> You don't need to use c_str() for MakeArgString. It will happily accept std::string.
will fix

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:394
+  // The input to opt is the output from llvm-link.
+  OptArgs.push_back(ResultingBitcodeF);
+  // Pass optimization arg to opt.
----------------
tra wrote:
> `BitcodeOutputFile`?
will change

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:417
+  const char *mcpustr = Args.MakeArgString("-mcpu=" + GFXNAME);
+  OptArgs.push_back(mcpustr);
+  OptArgs.push_back("-o");
----------------
tra wrote:
> I think you can get rid of the temp var here without hurting readability.
will do

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:420
+  std::string OptOutputFileName =
+      C.getDriver().GetTemporaryPath("OPT_OUTPUT", "bc");
+  const char *OptOutputFile =
----------------
tra wrote:
> I wonder if we could derive temp file name from the input's name. This may make it easier to find relevant temp files if/when we need to debug the compilation process.
will do

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:422
+  const char *OptOutputFile =
+      C.addTempFile(C.getArgs().MakeArgString(OptOutputFileName.c_str()));
+  OptArgs.push_back(OptOutputFile);
----------------
tra wrote:
> No need for c_str() here.
will do

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:439
+  const char *LlcOutputFile =
+      C.addTempFile(C.getArgs().MakeArgString(LlcOutputFileName.c_str()));
+  LlcArgs.push_back(LlcOutputFile);
----------------
tra wrote:
> c_str(), again.
will fix

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:764
+  if (DriverArgs.hasArg(options::OPT_nocudalib) ||
+      DeviceOffloadingKind == Action::OFK_HIP)
     return;
----------------
tra wrote:
> I'd  just add something like this and leave existing if unchanged:
> ```
> // There's no need for CUDA-specific bitcode linking with HIP.
> if( DeviceOffloadingKind == Action::OFK_HIP)
>   return;
> ```
> 
> 
will change

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:926-938
 Tool *CudaToolChain::buildAssembler() const {
-  return new tools::NVPTX::Assembler(*this);
+  if (getTriple().isNVPTX())
+    return new tools::NVPTX::Assembler(*this);
+  return ToolChain::buildAssembler();
 }

 Tool *CudaToolChain::buildLinker() const {
----------------
tra wrote:
> All these should be in the derived toolchain.
will do

================
Comment at: lib/Driver/ToolChains/Cuda.h:137-138
+  Linker(const ToolChain &TC)
+      : Tool("AMDGCN::Linker", "amdgcn-link", TC, RF_Full,
+             llvm::sys::WEM_UTF8, "--options-file") {}
+
----------------
tra wrote:
> Where does amdgcn-link come from? Does it accept --options-file ?
amdgcn-link is the short name of the amdgcn bitcode linker. It is not a real program and does not support response file. Will remove the arguments about response file.

================
Comment at: lib/Driver/ToolChains/Cuda.h:153-154

 class LLVM_LIBRARY_VISIBILITY CudaToolChain : public ToolChain {
 public:
   CudaToolChain(const Driver &D, const llvm::Triple &Triple,
----------------
tra wrote:
> You may want to derive your own HIPToolChain as you're unlikely to reuse NVPTX-specific tools.
Will do

================
Comment at: lib/Driver/ToolChains/Cuda.h:172

-  // Never try to use the integrated assembler with CUDA; always fork out to
-  // ptxas.
-  bool useIntegratedAs() const override { return false; }
+  bool useIntegratedAs() const override;
   bool isCrossCompiling() const override { return true; }
----------------
tra wrote:
> In HIPToolchain this one would become an inline function returning true.
will do

https://reviews.llvm.org/D45212