[PATCH] D34784: [OpenMP] Add flag for specifying the target device architecture for OpenMP device offloading

Fri Jun 30 17:51:23 PDT 2017

hfinkel added inline comments.

================
Comment at: include/clang/Driver/Options.td:453
+def Xopenmp_target : Separate<["-"], "Xopenmp-target">,
+  HelpText<"Pass arguments to target offloading toolchain.">;
+def Xopenmp_target_EQ : JoinedAndSeparate<["-"], "Xopenmp-target=">,
----------------
Can this be?

    HelpText<"Pass <arg> to the target offloading toolchain.">,
    MetaVarName<"<arg>">;

================
Comment at: include/clang/Driver/Options.td:455
+def Xopenmp_target_EQ : JoinedAndSeparate<["-"], "Xopenmp-target=">,
+  HelpText<"Pass arguments to target offloading toolchain. First entry is a triple that identifies the toolchain.">;
 def z : Separate<["-"], "z">, Flags<[LinkerInput, RenderAsInput]>,
----------------
  HelpText<"Pass <arg> to the specified target offloading toolchain. The triple that identifies the toolchain must be provided after the equals sign.">, MetaVarName<"<arg>">;

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:443
+
+    // Get the compute capability from the -fopenmp-targets flag.
+    // The default compute capability is sm_20 since this is a CUDA
----------------
Is this first sentence accurate?

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:444
+    // Get the compute capability from the -fopenmp-targets flag.
+    // The default compute capability is sm_20 since this is a CUDA
+    // tool chain.
----------------
This comment should be moved down to where the sm_20 default is added.

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:446
+    // tool chain.
+    auto OptList = Args.getAllArgValues(options::OPT_Xopenmp_target_EQ);
+
----------------
Why is this logic in this function? Don't you need the same logic in Generic_GCC::TranslateArgs to handle non-CUDA offloading toolchains?

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:469
+    // toolchain specified).
+    assert(Args.getAllArgValues(options::OPT_fopenmp_targets_EQ).size() == 1 &&
+        "Target toolchain not specified on -Xopenmp-target and cannot be deduced.");
----------------
A user can trigger this assert, right? Please make this a diagnostic error instead.

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:474
+    for (StringRef Opt : OptList) {
+      AddMArchOption(DAL, Opts, Opt);
+    }
----------------
Shouldn't you be adding all of the options, not just the -march= ones?

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:478
+    auto MArchList = DAL->getAllArgValues(options::OPT_march_EQ);
+    assert(MArchList.size() < 2 && "At most one GPU arch allowed.");
+    if (MArchList.empty())
----------------
Can a user hit this? If so, it must be an actual diagnostic.

================
Comment at: test/Driver/openmp-offload.c:607
+
+// CHK-FOPENMP-EQ-TARGET: clang{{.*}} argument unused during compilation: '-Xopenmp-target=powerpc64le-ibm-linux-gnu -march=pwr8'
+
----------------
I don't see why you'd check that the arguments are unused. They should be used. One exception might be that you might want to force -Xopenmp-target=foo to be unused if foo is not a currently-targeted offloading triple. There could be a separate test case for that.

Otherwise, I think you should be able to check the relevant backend commands, no? (something like where CHK-COMMANDS is used above in this file).

https://reviews.llvm.org/D34784