[PATCH] D18671: [CUDA] Add --cuda-flush-denormals-to-zero.

Justin Lebar via cfe-commits cfe-commits at lists.llvm.org
Thu Mar 31 14:40:15 PDT 2016


jlebar added a comment.

Thank you for the review, Art!


================
Comment at: include/clang/Driver/Options.td:385
@@ -384,1 +384,3 @@
   HelpText<"CUDA installation path">;
+def cuda_flush_denormals_to_zero : Flag<["--"], "cuda-flush-denormals-to-zero">,
+  HelpText<"Flush denormal floating point values to zero in CUDA device mode.">;
----------------
tra wrote:
> We need to provide a way to both enable and disable this. We either need a "-no" variant or make it an option with value.
> 
> Also, can we shorten it to --cuda-ftz? I would probably mistype current name more often than not.
> 
> We need to provide a way to both enable and disable this. We either need a "-no" variant or make it an option with value.

Many (most) of the -f flags don't have -fno variants -- how do we decide which ones get an -fno and which don't?
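
For reference, here is a rough sketch of how an on/off pair is usually resolved: the last occurrence on the command line wins, and a default applies when neither is given. This is the same convention that `ArgList::hasFlag(Pos, Neg, Default)` follows. The sketch is standalone and is not the driver code; `lastFlagWins` is a hypothetical helper, and the "-fno-" spelling in the usage note is only an assumed name for the negative variant.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of clang): resolve a positive/negative
// flag pair the way "last one wins" option handling usually works.
// Returns Default when neither spelling appears in Args.
bool lastFlagWins(const std::vector<std::string> &Args,
                  const std::string &Pos, const std::string &Neg,
                  bool Default) {
  bool Result = Default;
  for (const std::string &A : Args) {
    if (A == Pos)
      Result = true;   // a later positive flag overrides earlier ones
    else if (A == Neg)
      Result = false;  // a later negative flag overrides earlier ones
  }
  return Result;
}
```

With this shape, a build system that passes "-fcuda-flush-denormals-to-zero" by default can still be overridden by appending an (assumed) "-fno-cuda-flush-denormals-to-zero" later on the command line.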

> Also, can we shorten it to --cuda-ftz? I would probably mistype current name more often than not.

Well, you and I both were calling it "ctz" about 50% of the time, so I'm not sure --cuda-ftz would solve the problem!  :)  (In all seriousness, that was one of the reasons I chose not to abbreviate it.)

Maybe "ftz" is a well-known acronym.  Doesn't quite look like it from googling, though.

I looked through the flags and concluded that "ftz" was more abbreviated than most of them, although "flush-denormals-to-zero" is at the verbose end of the spectrum.  I considered "flush-denormals", but thought that was a bit ambiguous -- flush them how?
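
To spell out what "to zero" means here, a small host-side sketch: a denormal (subnormal) input is replaced by a zero of the same sign, and every other value passes through unchanged. This only emulates the semantics for illustration. `flushDenormalToZero` is a hypothetical name, and the flag itself would affect how device code is generated rather than call anything like this.

```cpp
#include <cmath>

// Hypothetical illustration of flush-to-zero semantics on a single float.
// Subnormal values become a signed zero; normal values, zeros, infinities
// and NaNs are returned unchanged.
float flushDenormalToZero(float X) {
  if (std::fpclassify(X) == FP_SUBNORMAL)
    return std::copysign(0.0f, X); // keep the sign, drop the magnitude
  return X;
}
```

For example, 1e-40f is below the smallest normal float (about 1.18e-38), so it is flushed to 0.0f, while 1.0f is returned as-is.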

================
Comment at: include/clang/Driver/Options.td:386
@@ -385,1 +385,3 @@
+def cuda_flush_denormals_to_zero : Flag<["--"], "cuda-flush-denormals-to-zero">,
+  HelpText<"Flush denormal floating point values to zero in CUDA device mode.">;
 def dA : Flag<["-"], "dA">, Group<d_Group>;
----------------
tra wrote:
> Is there an equivalent for ftz for host-side FP operations? It would be good to keep identical host and device side calculations as close as we can.
> Is there an equivalent for ftz for host-side FP operations?

Not that I can tell.  The only other one I saw was OpenCL's equivalent flag, which does nothing at the moment.

================
Comment at: lib/Driver/ToolChains.cpp:4212
@@ +4211,3 @@
+  if (DriverArgs.hasArg(options::OPT_cuda_flush_denormals_to_zero))
+    CC1Args.push_back("-fcuda-flush-denormals-to-zero");
+
----------------
tra wrote:
> Perhaps we don't need different flags at driver and CC1 levels. Top-level "-f*" options in OPT_f_group are passed to CC1 automatically.
Aha, much better, thank you!


http://reviews.llvm.org/D18671




