[PATCH] D78020: clang/AMDGPU: Assume denormals are enabled for the default target.

Matt Arsenault via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Apr 13 08:02:16 PDT 2020


arsenm created this revision.
arsenm added reviewers: yaxunl, rampitec, b-sumner, msearles.
Herald added subscribers: kerbowa, t-tye, tpr, dstuttard, nhaehnle, wdng, jvesely, kzhuravl.
arsenm added a parent revision: D78019: HIP: Fix handling of denormal mode.

Since the default logic was based on having fast denormal/fma
features, and the default target has no features, we assumed flushing by
default. This fixes incorrectly assuming flushing in builds for
"generic" IR libraries.

      

The handling for no specified --cuda-gpu-arch in HIP is kind of
broken. Somewhere else forces a default target of gfx803, which does
not enable denormal handling by default. We don't see this default
switching here, so you'll end up with a different denormal mode
depending on whether you explicitly requested gfx803, or used it by
default.


https://reviews.llvm.org/D78020

Files:
  clang/lib/Driver/ToolChains/AMDGPU.cpp
  clang/test/Driver/cl-denorms-are-zero.cl
  clang/test/Driver/cuda-flush-denormals-to-zero.cu


Index: clang/test/Driver/cuda-flush-denormals-to-zero.cu
===================================================================
--- clang/test/Driver/cuda-flush-denormals-to-zero.cu
+++ clang/test/Driver/cuda-flush-denormals-to-zero.cu
@@ -22,6 +22,10 @@
 // RUN: %clang -x hip -no-canonical-prefixes -### -target x86_64-linux-gnu -c -march=haswell --cuda-gpu-arch=gfx803 -nocudainc -nogpulib %s 2>&1 | FileCheck -check-prefix=FTZ %s
 // RUN: %clang -x hip -no-canonical-prefixes -### -target x86_64-linux-gnu -c -march=haswell --cuda-gpu-arch=gfx900 -nocudainc -nogpulib %s 2>&1 | FileCheck -check-prefix=NOFTZ %s
 
+// Test no subtarget
+// FIXME: This defaults to gfx803, but gets a different denormal mode than explicitly specifying it.
+// RUN: %clang -x hip -no-canonical-prefixes -### -target x86_64-linux-gnu -c -march=haswell -nocudainc -nogpulib %s 2>&1 | FileCheck -check-prefix=NOFTZ %s
+
 
 // Test multiple offload archs with different defaults.
 // RUN: %clang -x hip -no-canonical-prefixes -### -target x86_64-linux-gnu -c -march=haswell --cuda-gpu-arch=gfx803 --cuda-gpu-arch=gfx900 -nocudainc -nogpulib %s 2>&1 | FileCheck -check-prefix=MIXED-DEFAULT-MODE %s
Index: clang/test/Driver/cl-denorms-are-zero.cl
===================================================================
--- clang/test/Driver/cl-denorms-are-zero.cl
+++ clang/test/Driver/cl-denorms-are-zero.cl
@@ -1,20 +1,24 @@
 // Slow FMAF and slow f32 denormals
-// RUN: %clang -### -target amdgcn--amdhsa -c -mcpu=pitcairn %s 2>&1 | FileCheck -check-prefixes=AMDGCN,AMDGCN-FLUSH %s
+// RUN: %clang -### -target amdgcn--amdhsa -nogpulib -c -mcpu=pitcairn %s 2>&1 | FileCheck -check-prefixes=AMDGCN,AMDGCN-FLUSH %s
 // RUN: %clang -### -cl-denorms-are-zero -o - -target amdgcn--amdhsa -c -mcpu=pitcairn %s 2>&1 | FileCheck -check-prefixes=AMDGCN,AMDGCN-FLUSH %s
 
 // Fast FMAF, but slow f32 denormals
-// RUN: %clang -### -target amdgcn--amdhsa -c -mcpu=tahiti %s 2>&1 | FileCheck -check-prefixes=AMDGCN,AMDGCN-FLUSH %s
+// RUN: %clang -### -target amdgcn--amdhsa -nogpulib -c -mcpu=tahiti %s 2>&1 | FileCheck -check-prefixes=AMDGCN,AMDGCN-FLUSH %s
 // RUN: %clang -### -cl-denorms-are-zero -o - -target amdgcn--amdhsa -c -mcpu=tahiti %s 2>&1 | FileCheck -check-prefixes=AMDGCN,AMDGCN-FLUSH %s
 
 // Fast F32 denormals, but slow FMAF
-// RUN: %clang -### -target amdgcn--amdhsa -c -mcpu=fiji %s 2>&1 | FileCheck -check-prefixes=AMDGCN,AMDGCN-FLUSH %s
+// RUN: %clang -### -target amdgcn--amdhsa -nogpulib -c -mcpu=fiji %s 2>&1 | FileCheck -check-prefixes=AMDGCN,AMDGCN-FLUSH %s
 // RUN: %clang -### -cl-denorms-are-zero -o - -target amdgcn--amdhsa -c -mcpu=fiji %s 2>&1 | FileCheck -check-prefixes=AMDGCN,AMDGCN-FLUSH %s
 
 // Fast F32 denormals and fast FMAF
-// RUN: %clang -### -target amdgcn--amdhsa -c -mcpu=gfx900 %s 2>&1 | FileCheck -check-prefixes=AMDGCN,AMDGCN-DENORM %s
-// RUN: %clang -### -cl-denorms-are-zero -o - -target amdgcn--amdhsa -c -mcpu=gfx900 %s 2>&1 | FileCheck -check-prefixes=AMDGCN,AMDGCN-FLUSH %s
+// RUN: %clang -### -target amdgcn--amdhsa -nogpulib -c -mcpu=gfx900 %s 2>&1 | FileCheck -check-prefixes=AMDGCN,AMDGCN-DENORM %s
+// RUN: %clang -### -cl-denorms-are-zero -o - -target amdgcn--amdhsa -nogpulib -c -mcpu=gfx900 %s 2>&1 | FileCheck -check-prefixes=AMDGCN,AMDGCN-FLUSH %s
+
+// Default target is artificial, but should assume a conservative default.
+// RUN: %clang -### -target amdgcn--amdhsa -nogpulib -c %s 2>&1 | FileCheck -check-prefixes=AMDGCN,AMDGCN-DENORM %s
+// RUN: %clang -### -cl-denorms-are-zero -o - -target amdgcn--amdhsa -nogpulib -c %s 2>&1 | FileCheck -check-prefixes=AMDGCN,AMDGCN-FLUSH %s
 
 // AMDGCN-FLUSH: "-fdenormal-fp-math-f32=preserve-sign,preserve-sign"
 
 // This should be omitted and default to ieee
-// AMDGCN-DENORM-NOT: "-fdenormal-fp-math-f32"
+// AMDGCN-DENORM-NOT: denormal-fp-math
Index: clang/lib/Driver/ToolChains/AMDGPU.cpp
===================================================================
--- clang/lib/Driver/ToolChains/AMDGPU.cpp
+++ clang/lib/Driver/ToolChains/AMDGPU.cpp
@@ -262,6 +262,11 @@
 
 bool AMDGPUToolChain::getDefaultDenormsAreZeroForTarget(
     llvm::AMDGPU::GPUKind Kind) {
+
+  // Assume nothing without a specific target.
+  if (Kind == llvm::AMDGPU::GK_NONE)
+    return false;
+
   const unsigned ArchAttr = llvm::AMDGPU::getArchAttrAMDGCN(Kind);
 
   // Default to enabling f32 denormals by default on subtargets where fma is


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D78020.256975.patch
Type: text/x-patch
Size: 4403 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20200413/a66fd0f1/attachment.bin>


More information about the cfe-commits mailing list