[PATCH] D16082: [CUDA] Invoke ptxas and fatbinary during compilation.
Artem Belevich via cfe-commits
cfe-commits at lists.llvm.org
Mon Jan 11 14:32:10 PST 2016
tra added a comment.
Make sure it works with -save-temps and -fintegrated-as/-fno-integrated-as. They tend to throw wrenches into pipeline construction.
================
Comment at: lib/Driver/Driver.cpp:1380
@@ +1379,3 @@
+ C.MakeAction<LinkJobAction>(DeviceActions, types::TY_CUDA_FATBIN),
+ /* GpuArchName = */ nullptr,
+ /* AtTopLevel = */ false);
----------------
So, you're treating GpuArchName==nullptr as a special case of DeviceAction for fatbin?
Perhaps it warrants its own CudaDeviceLinkAction?
================
Comment at: lib/Driver/Driver.cpp:1620-1625
@@ -1602,3 +1619,8 @@
case phases::Assemble:
- return C.MakeAction<AssembleJobAction>(Input, types::TY_Object);
+ // Assembling generates an object file, except when we're assembling CUDA,
+ // in which case we get a cubin file.
+ return C.MakeAction<AssembleJobAction>(
+ std::move(Input), TC.getTriple().getOS() == llvm::Triple::CUDA
+ ? types::TY_CUDA_CUBIN
+ : types::TY_Object);
}
----------------
cubin *is* an ELF object file. Do we really need a new type here or can we get by with TY_Object?
================
Comment at: lib/Driver/ToolChains.cpp:4193
@@ +4192,3 @@
+ : Linux(D, Triple, Args) {
+ if (CudaInstallation.isValid()) {
+ getProgramPaths().push_back(CudaInstallation.getBinPath());
----------------
unneeded {} here and in few more places throughout the patch.
================
Comment at: lib/Driver/ToolChains.cpp:4230
@@ -4224,3 +4229,3 @@
// Skip this argument unless the architecture matches BoundArch
- if (A->getValue(0) != StringRef(BoundArch))
+ if (!BoundArch || A->getValue(0) != StringRef(BoundArch))
continue;
----------------
You may as well move it out of the loop and return early if BoundArch is nullptr.
================
Comment at: lib/Driver/Tools.cpp:10621-10625
@@ +10620,7 @@
+ ArgStringList CmdArgs;
+ if (TC.getTriple().isArch64Bit()) {
+ CmdArgs.push_back("-m64");
+ } else {
+ CmdArgs.push_back("-m32");
+ }
+
----------------
CmdArgs.push_back(TC.getTriple().isArch64Bit() ? "-m64" : "-m32");
or, even
ArgStringList CmdArgs = {TC.getTriple().isArch64Bit() ? "-m64" : "-m32"};
Same in Linker::ConstructJob below.
================
Comment at: lib/Driver/Tools.cpp:10677-10678
@@ +10676,4 @@
+ std::string Arch = A->getGpuArchName();
+ // We need to name pass an Arch of the form "sm_XX" for cubin files and
+ // "compute_XX" for ptx. CudaDeviceAction's getGpuArchName() is guaranteed
+ // to match "sm_XX".
----------------
First line does not parse.
In general compute_XX does not necessarily match sm_XX.
[[ http://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/#ptxas-options | ptxas options ]] says that
> Allowed values for this option: compute_20, compute_30, compute_35, compute_50, compute_52; and sm_20, sm_21, sm_30, sm_32, sm_35, sm_50 and sm_52
Note that there's no compute_21, compute_32. You'll need sm_XX -> compute_YY map.
================
Comment at: lib/Driver/Tools.h:923
@@ +922,3 @@
+
+// Runs fatbinary, which is sort of a linker for NVPTX.
+class LLVM_LIBRARY_VISIBILITY Linker : public Tool {
----------------
Please add more details about what fatbin does.
".. which combines GPU object files and, optionally, PTX assembly into a single output file."
http://reviews.llvm.org/D16082
More information about the cfe-commits
mailing list