[PATCH] D98783: [AMDGPU] Add GlobalDCE before internalization pass

Yaxun Liu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Apr 15 10:54:05 PDT 2021


yaxunl updated this revision to Diff 337825.
yaxunl retitled this revision from "[CUDA][HIP] Remove unused addr space casts" to "[AMDGPU] Add GlobalDCE before internalization pass".
yaxunl edited the summary of this revision.
yaxunl added a reviewer: rampitec.
yaxunl added a comment.
Herald added subscribers: llvm-commits, kerbowa, hiraditya, t-tye, tpr, dstuttard, nhaehnle, wdng, jvesely, kzhuravl, arsenm.
Herald added a project: LLVM.

Add GlobalDCE before internalization pass.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98783/new/

https://reviews.llvm.org/D98783

Files:
  clang/test/CodeGenCUDA/unused-global-var.cu
  llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp


Index: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -578,6 +578,9 @@
         PM.addPass(AMDGPUPrintfRuntimeBindingPass());
 
         if (InternalizeSymbols) {
+          // Global variables may have dead uses which need to be removed.
+          // Otherwise these useless global variables will not get internalized.
+          PM.addPass(GlobalDCEPass());
           PM.addPass(InternalizePass(mustPreserveGV));
         }
         PM.addPass(AMDGPUPropagateAttributesLatePass(*this));
Index: clang/test/CodeGenCUDA/unused-global-var.cu
===================================================================
--- /dev/null
+++ clang/test/CodeGenCUDA/unused-global-var.cu
@@ -0,0 +1,50 @@
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fcuda-is-device -x hip %s \
+// RUN:   -std=c++11 -O3 -mllvm -amdgpu-internalize-symbols -emit-llvm -o - \
+// RUN:   -target-cpu gfx906 | FileCheck %s
+
+#include "Inputs/cuda.h"
+
+// AMDGPU internalize unused global variables for whole-program compilation
+// (-fno-gpu-rdc for each TU, or -fgpu-rdc for LTO), which are then
+// eliminated by global DCE. If there are invisible unused address space casts
+// for global variables, the internalization and elimination of unused global
+// variales will be hindered. This test makes sure no such address space
+// casts.
+
+// Check unused device/constant variables are eliminated.
+
+// CHECK-NOT: @v1
+__device__ int v1;
+
+// CHECK-NOT: @v2
+__constant__ int v2;
+
+// CHECK-NOT: @_ZL2v3
+constexpr int v3 = 1;
+
+// Check managed variables are always kept.
+
+// CHECK: @v4
+__managed__ int v4;
+
+// Check used device/constant variables are not eliminated.
+// CHECK: @u1
+__device__ int u1;
+
+// CHECK: @u2
+__constant__ int u2;
+
+// Check u3 is kept because its address is taken.
+// CHECK: @_ZL2u3
+constexpr int u3 = 2;
+
+// Check u4 is not kept because it is not ODR-use.
+// CHECK-NOT: @_ZL2u4
+constexpr int u4 = 3;
+
+__device__ int fun1(const int& x);
+
+__global__ void kern1(int *x) {
+  *x = u1 + u2 + fun1(u3) + u4;
+}


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D98783.337825.patch
Type: text/x-patch
Size: 2239 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20210415/4e567459/attachment.bin>


More information about the cfe-commits mailing list