[llvm] [AMDGPU] Cgscc amdgpu attributor (PR #179719)

Wed Feb 4 10:48:24 PST 2026

llvmbot wrote:



@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-llvm-globalisel

Author: Yoonseo Choi (yoonseoch)

<details>
<summary>Changes</summary>

This PR is adding a boilerplate of CGSCC AMDGPUAttributor pass (amdgpu-attributor-cgscc) by doing refactoring from the existing Module AMDGPUAttributor pass (amdgpu-attributor). 

CGSCC AMDGPUAttributor pass sets `AttributorConfig.IsModulePass = false`,  and make Attributor's `Functions` set contain only functions in a SCC. 
The main implementations of abstract attributes have not changed. 

Lit tests running amdgpu-attributor were made to run amdgpu-attributor-cgscc as well. The same prefix for FileCheck is used to compare the module and the cgscc passes. 
Fails from CGSCC passes are expected as shown by github-actions. 

I looked over a couple of failed tests. 
`propagate-waves-per-eu.ll` failed in updating invalid waves-per-eu attributes at the kernel, which has one device callee. When CGSCC pass was running on the SCC of the device function, its caller, kernel's attributes couldn't be updated as the SCC doesn't contain the kernel (`Functions` doesn't contain kernel) and `AttributorConfig.IsModulePass = false`. When SCC pass was running on the SCC of the kernel, it stopped right away as attributor is not assuming a closed world. 

`uniform-work-group-multistep.ll` failed similarly. A call graph looks like kernel->internal1->internal2. SCC0, SCC1, SCC2 contains kernel, internal1, and internal2, respectively. CGSCC pass runs three times at SCC2, then SCC1, and finally SCC0. internel2's uniform-work-group-size attribute cannot be updated as while CGSCC pass runs for its SCC2, because CGSCC pass cannot update attributes of internal1 (Attributor's `Functions` only contains internal2 and configured as non-module pass). Without internal1's attribute, internal2's attribute is not changed. Next CGSCC passes on internal1 doesn't look at its callee, internal2, but only look at its caller, kernel. 


---

Patch is 55.99 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/179719.diff


63 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/AMDGPU.h (+13) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp (+48-9) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def (+6) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp (+10) 
- (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/flat-scratch-init.ll (+3) 
- (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-amdgpu_kernel-system-sgprs.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.workgroup.id.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.workitem.id.ll (+2) 
- (modified) llvm/test/CodeGen/AMDGPU/aa-as-infer.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/addrspacecast-constantexpr.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/addrspacecast.ll (+2) 
- (modified) llvm/test/CodeGen/AMDGPU/amdgpu-attributor-accesslist-offsetbins-out-of-sync.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/amdgpu-attributor-flat-scratch-init-asan.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/amdgpu-attributor-intrinsic-missing-nocallback.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/amdgpu-attributor-min-agpr-alloc.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/amdgpu-attributor-nocallback-intrinsics.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/annotate-existing-abi-attributes.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/annotate-kernel-features-hsa-call.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/annotate-kernel-features-hsa.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/annotate-kernel-features.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/attr-amdgpu-align.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/attr-amdgpu-max-num-workgroups-propagate.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/attributor-flatscratchinit-undefined-behavior.ll (+2) 
- (modified) llvm/test/CodeGen/AMDGPU/attributor-flatscratchinit.ll (+2) 
- (modified) llvm/test/CodeGen/AMDGPU/attributor-loop-issue-58639.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/attributor-noalias-addrspace.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/callee-special-input-sgprs-fixed-abi.ll (+2) 
- (modified) llvm/test/CodeGen/AMDGPU/callee-special-input-vgprs-packed.ll (+2) 
- (modified) llvm/test/CodeGen/AMDGPU/callee-special-input-vgprs.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/direct-indirect-call.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/duplicate-attribute-indirect.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/hsa-metadata-heap-v5.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/hsa-metadata-hostcall-v4.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/hsa-metadata-hostcall-v5.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/hsa-metadata-multigrid-sync-arg-v5.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/hsa-metadata-queue-ptr-v4.ll (+3) 
- (modified) llvm/test/CodeGen/AMDGPU/hsa-metadata-queue-ptr-v5.ll (+3) 
- (modified) llvm/test/CodeGen/AMDGPU/hsa-metadata-queueptr-v5.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/implicitarg-attributes.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/implicitarg-offset-attributes.ll (+3) 
- (modified) llvm/test/CodeGen/AMDGPU/indirect-call-set-from-other-function.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/issue120256-annotate-constexpr-addrspacecast.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.cluster.id.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.workgroup.id.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.workitem.id.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/pal-simple-indirect-call.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/preload-implicit-kernargs-IR-lowering.ll (+2) 
- (modified) llvm/test/CodeGen/AMDGPU/preload-implicit-kernargs-debug-info.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/preload-kernargs-IR-lowering.ll (+4) 
- (modified) llvm/test/CodeGen/AMDGPU/propagate-amdgpu-cluster-dims.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/propagate-flat-work-group-size.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/propagate-waves-per-eu.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/recursive_global_initializer.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/remove-no-kernel-id-attribute.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/simple-indirect-call-2.ll (+3) 
- (modified) llvm/test/CodeGen/AMDGPU/simple-indirect-call.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/uniform-work-group-attribute-missing.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/uniform-work-group-multistep.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/uniform-work-group-nested-function-calls.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/uniform-work-group-prevent-attribute-propagation.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/uniform-work-group-propagate-attribute.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/uniform-work-group-recursion-test.ll (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/uniform-work-group-test.ll (+1) 


``````````diff

diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.h b/llvm/lib/Target/AMDGPU/AMDGPU.h
index 5df11a45b4889..f3992ff19566a 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.h
@@ -10,6 +10,7 @@
 #ifndef LLVM_LIB_TARGET_AMDGPU_AMDGPU_H
 #define LLVM_LIB_TARGET_AMDGPU_AMDGPU_H
 
+#include "llvm/Analysis/CGSCCPassManager.h"
 #include "llvm/CodeGen/MachinePassManager.h"
 #include "llvm/IR/PassManager.h"
 #include "llvm/Pass.h"
@@ -19,6 +20,7 @@
 namespace llvm {
 
 class AMDGPUTargetMachine;
+class LazyCallGraph;
 class GCNTargetMachine;
 class TargetMachine;
 
@@ -367,6 +369,17 @@ class AMDGPUAttributorPass : public PassInfoMixin<AMDGPUAttributorPass> {
   PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);
 };
 
+class AMDGPUAttributorCGSCCPass
+    : public PassInfoMixin<AMDGPUAttributorCGSCCPass> {
+private:
+  TargetMachine &TM;
+
+public:
+  AMDGPUAttributorCGSCCPass(TargetMachine &TM) : TM(TM) {}
+  PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM,
+                        LazyCallGraph &CG, CGSCCUpdateResult &UR);
+};
+
 class AMDGPUPreloadKernelArgumentsPass
     : public PassInfoMixin<AMDGPUPreloadKernelArgumentsPass> {
   const TargetMachine &TM;
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp b/llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
index 0b2ee6371da06..2222ce9723765 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
@@ -1575,14 +1575,10 @@ AAAMDGPUClusterDims::createForPosition(const IRPosition &IRP, Attributor &A) {
   llvm_unreachable("AAAMDGPUClusterDims is only valid for function position");
 }
 
-static bool runImpl(Module &M, AnalysisGetter &AG, TargetMachine &TM,
+static bool runImpl(SetVector<Function *> &Functions, bool IsModulePass,
+                    Module &M, AnalysisGetter &AG, TargetMachine &TM,
                     AMDGPUAttributorOptions Options,
                     ThinOrFullLTOPhase LTOPhase) {
-  SetVector<Function *> Functions;
-  for (Function &F : M) {
-    if (!F.isIntrinsic())
-      Functions.insert(&F);
-  }
 
   CallGraphUpdater CGUpdater;
   BumpPtrAllocator Allocator;
@@ -1599,7 +1595,7 @@ static bool runImpl(Module &M, AnalysisGetter &AG, TargetMachine &TM,
   AttributorConfig AC(CGUpdater);
   AC.IsClosedWorldModule = Options.IsClosedWorld;
   AC.Allowed = &Allowed;
-  AC.IsModulePass = true;
+  AC.IsModulePass = IsModulePass;
   AC.DefaultInitializeLiveInternals = false;
   AC.IndirectCalleeSpecializationCallback =
       [](Attributor &A, const AbstractAttribute &AA, CallBase &CB,
@@ -1671,7 +1667,50 @@ PreservedAnalyses llvm::AMDGPUAttributorPass::run(Module &M,
       AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();
   AnalysisGetter AG(FAM);
 
+  SetVector<Function *> Functions;
+  for (Function &F : M) {
+    if (!F.isIntrinsic())
+      Functions.insert(&F);
+  }
+
+  if (Functions.empty())
+    return PreservedAnalyses::all();
+
   // TODO: Probably preserves CFG
-  return runImpl(M, AG, TM, Options, LTOPhase) ? PreservedAnalyses::none()
-                                               : PreservedAnalyses::all();
+  return runImpl(Functions, true /*IsModulePass*/, M, AG, TM, Options, LTOPhase)
+             ? PreservedAnalyses::none()
+             : PreservedAnalyses::all();
 }
+
+PreservedAnalyses llvm::AMDGPUAttributorCGSCCPass::run(LazyCallGraph::SCC &C,
+                                                       CGSCCAnalysisManager &AM,
+                                                       LazyCallGraph &CG,
+                                                       CGSCCUpdateResult &UR) {
+
+  FunctionAnalysisManager &FAM =
+      AM.getResult<FunctionAnalysisManagerCGSCCProxy>(C, CG).getManager();
+  AnalysisGetter AG(FAM);
+
+  SetVector<Function *> Functions;
+  for (LazyCallGraph::Node &N : C) {
+    Function *F = &N.getFunction();
+    if (!F->isIntrinsic())
+      Functions.insert(F);
+  }
+
+  LLVM_DEBUG(dbgs() << "AMDGPUAttributorCGSCCPass: SCC size: "
+                    << Functions.size() << "\n");
+  LLVM_DEBUG(for (LazyCallGraph::Node &N : C) {
+    dbgs() << "  SCC function: " << N.getFunction().getName() << "\n";
+  });
+
+  if (Functions.empty())
+    return PreservedAnalyses::all();
+
+  AMDGPUAttributorOptions Options;
+  Module *M = C.begin()->getFunction().getParent();
+  return runImpl(Functions, false /*IsModulePass*/, *M, AG, TM, Options,
+                 ThinOrFullLTOPhase::None)
+             ? PreservedAnalyses::none()
+             : PreservedAnalyses::all();
+}
\ No newline at end of file
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
index f464fbf31c754..c2d2fa000ef5a 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
@@ -50,6 +50,12 @@ MODULE_PASS_WITH_PARAMS(
     parseAMDGPUAttributorPassOptions, "closed-world")
 #undef MODULE_PASS_WITH_PARAMS
 
+#ifndef CGSCC_PASS
+#define CGSCC_PASS(NAME, CREATE_PASS)
+#endif
+CGSCC_PASS("amdgpu-attributor-cgscc", AMDGPUAttributorCGSCCPass(*this))
+#undef CGSCC_PASS
+
 #ifndef FUNCTION_PASS
 #define FUNCTION_PASS(NAME, CREATE_PASS)
 #endif
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index d25b22b2b96dc..d639c09b999b4 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -857,6 +857,16 @@ void AMDGPUTargetMachine::registerPassBuilderCallbacks(PassBuilder &PB) {
 #define GET_PASS_REGISTRY "AMDGPUPassRegistry.def"
 #include "llvm/Passes/TargetPassRegistry.inc"
 
+  PB.registerPipelineParsingCallback(
+      [this](StringRef Name, CGSCCPassManager &PM,
+             ArrayRef<PassBuilder::PipelineElement> Pipeline) {
+        if (Name == "amdgpu-attributor-cgscc") {
+          PM.addPass(AMDGPUAttributorCGSCCPass(*this));
+          return true;
+        }
+        return false;
+      });
+
   PB.registerScalarOptimizerLateEPCallback(
       [](FunctionPassManager &FPM, OptimizationLevel Level) {
         if (Level == OptimizationLevel::O0)
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/flat-scratch-init.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/flat-scratch-init.ll
index e207d95287783..d173f04d963f6 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/flat-scratch-init.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/flat-scratch-init.ll
@@ -1,6 +1,9 @@
 ; RUN: opt -passes=amdgpu-attributor -mcpu=gfx900 < %s | llc -mcpu=gfx900 | FileCheck -check-prefixes=GCN,RW-FLAT %s
+; RUN: opt -passes=amdgpu-attributor-cgscc -mcpu=gfx900 < %s | llc -mcpu=gfx900 | FileCheck -check-prefixes=GCN,RW-FLAT %s
 ; RUN: opt -passes=amdgpu-attributor -mcpu=gfx900 -mattr=+architected-flat-scratch < %s | llc | FileCheck -check-prefixes=GCN,RO-FLAT %s
+; RUN: opt -passes=amdgpu-attributor-cgscc -mcpu=gfx900 -mattr=+architected-flat-scratch < %s | llc | FileCheck -check-prefixes=GCN,RO-FLAT %s
 ; RUN: opt -passes=amdgpu-attributor -mcpu=gfx942 < %s | llc | FileCheck -check-prefixes=GCN,RO-FLAT %s
+; RUN: opt -passes=amdgpu-attributor-cgscc -mcpu=gfx942 < %s | llc | FileCheck -check-prefixes=GCN,RO-FLAT %s
 
 target triple = "amdgcn-amd-amdhsa"
 
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-amdgpu_kernel-system-sgprs.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-amdgpu_kernel-system-sgprs.ll
index eb7227c489b95..a3671e139eae1 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-amdgpu_kernel-system-sgprs.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-amdgpu_kernel-system-sgprs.ll
@@ -1,4 +1,5 @@
 ; RUN: opt -mtriple=amdgcn-amd-amdhsa -mcpu=fiji -passes=amdgpu-attributor < %s | llc -global-isel -mtriple=amdgcn-amd-amdhsa -mcpu=fiji -amdgpu-ir-lower-kernel-arguments=0 -stop-after=irtranslator -o - | FileCheck -check-prefix=HSA %s
+; RUN: opt -mtriple=amdgcn-amd-amdhsa -mcpu=fiji -passes=amdgpu-attributor-cgscc < %s | llc -global-isel -mtriple=amdgcn-amd-amdhsa -mcpu=fiji -amdgpu-ir-lower-kernel-arguments=0 -stop-after=irtranslator -o - | FileCheck -check-prefix=HSA %s
 
 ; HSA-LABEL: name: default_kernel
 ; HSA: liveins:
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.workgroup.id.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.workgroup.id.ll
index f491df8448a7a..98f70bed0bfa0 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.workgroup.id.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.workgroup.id.ll
@@ -1,4 +1,5 @@
 ; RUN: opt -mtriple=amdgcn-amd-amdhsa -passes=amdgpu-attributor %s -o %t.bc
+; RUN: opt -mtriple=amdgcn-amd-amdhsa -passes=amdgpu-attributor-cgscc %s -o %t.bc
 ; RUN: llc -global-isel -new-reg-bank-select -mtriple=amdgcn-- -mcpu=hawaii < %t.bc | FileCheck --check-prefixes=ALL,UNKNOWN-OS %s
 ; RUN: llc -global-isel -new-reg-bank-select -mtriple=amdgcn-- -mcpu=tonga  < %t.bc | FileCheck --check-prefixes=ALL,UNKNOWN-OS %s
 ; RUN: llc -global-isel -new-reg-bank-select -mtriple=amdgcn-unknown-mesa3d -mcpu=hawaii < %t.bc | FileCheck -check-prefixes=ALL,MESA3D %s
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.workitem.id.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.workitem.id.ll
index 7b923f4c93281..8685e050b3a3e 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.workitem.id.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.workitem.id.ll
@@ -1,5 +1,7 @@
 ; RUN: sed 's/CODE_OBJECT_VERSION/400/g' %s | opt -S -mtriple=amdgcn-amd-amdhsa -passes=amdgpu-attributor -o %t.v4.ll
+; RUN: sed 's/CODE_OBJECT_VERSION/400/g' %s | opt -S -mtriple=amdgcn-amd-amdhsa -passes=amdgpu-attributor-cgscc -o %t.v4.ll
 ; RUN: sed 's/CODE_OBJECT_VERSION/600/g' %s | opt -S -mtriple=amdgcn-amd-amdhsa -passes=amdgpu-attributor -o %t.v6.ll
+; RUN: sed 's/CODE_OBJECT_VERSION/600/g' %s | opt -S -mtriple=amdgcn-amd-amdhsa -passes=amdgpu-attributor-cgscc -o %t.v6.ll
 ; RUN: llc -global-isel -new-reg-bank-select -mtriple=amdgcn-unknown-amdhsa < %t.v4.ll | FileCheck --check-prefixes=ALL,HSA,UNPACKED %s
 ; RUN: llc -global-isel -new-reg-bank-select -mtriple=amdgcn-unknown-amdhsa < %t.v4.ll | FileCheck --check-prefixes=ALL,HSA,UNPACKED %s
 ; RUN: llc -global-isel -new-reg-bank-select -mtriple=amdgcn-- -mcpu=hawaii -mattr=+flat-for-global < %t.v4.ll | FileCheck --check-prefixes=ALL,MESA,UNPACKED %s
diff --git a/llvm/test/CodeGen/AMDGPU/aa-as-infer.ll b/llvm/test/CodeGen/AMDGPU/aa-as-infer.ll
index 78766b4e8eb08..2fa0a0f7fc6d4 100644
--- a/llvm/test/CodeGen/AMDGPU/aa-as-infer.ll
+++ b/llvm/test/CodeGen/AMDGPU/aa-as-infer.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
 ; RUN: opt -mtriple=amdgcn-amd-amdhsa -passes=amdgpu-attributor -S %s -o - | FileCheck %s
+; RUN: opt -mtriple=amdgcn-amd-amdhsa -passes=amdgpu-attributor-cgscc -S %s -o - | FileCheck %s
 
 @g1 = protected addrspace(1) externally_initialized global i32 0, align 4
 @g2 = protected addrspace(1) externally_initialized global i32 0, align 4
diff --git a/llvm/test/CodeGen/AMDGPU/addrspacecast-constantexpr.ll b/llvm/test/CodeGen/AMDGPU/addrspacecast-constantexpr.ll
index 98fbbe1a515ed..5c0e999fff1bb 100644
--- a/llvm/test/CodeGen/AMDGPU/addrspacecast-constantexpr.ll
+++ b/llvm/test/CodeGen/AMDGPU/addrspacecast-constantexpr.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --check-globals
 ; RUN: opt -mtriple=amdgcn-unknown-amdhsa -S -passes=amdgpu-attributor < %s | FileCheck -check-prefix=HSA %s
+; RUN: opt -mtriple=amdgcn-unknown-amdhsa -S -passes=amdgpu-attributor-cgscc < %s | FileCheck -check-prefix=HSA %s
 
 declare void @llvm.memcpy.p1.p4.i32(ptr addrspace(1) nocapture, ptr addrspace(4) nocapture, i32, i1) #0
 
diff --git a/llvm/test/CodeGen/AMDGPU/addrspacecast.ll b/llvm/test/CodeGen/AMDGPU/addrspacecast.ll
index 4df82946343b5..c488cb402a2b8 100644
--- a/llvm/test/CodeGen/AMDGPU/addrspacecast.ll
+++ b/llvm/test/CodeGen/AMDGPU/addrspacecast.ll
@@ -1,5 +1,7 @@
 ; RUN: opt -passes=amdgpu-attributor -mcpu=kaveri -mattr=-promote-alloca < %s | llc | FileCheck -enable-var-scope -check-prefix=HSA -check-prefix=CI %s
+; RUN: opt -passes=amdgpu-attributor-cgscc -mcpu=kaveri -mattr=-promote-alloca < %s | llc | FileCheck -enable-var-scope -check-prefix=HSA -check-prefix=CI %s
 ; RUN: opt -passes=amdgpu-attributor -mcpu=gfx900 -mattr=-promote-alloca < %s | llc | FileCheck -enable-var-scope -check-prefix=HSA -check-prefix=GFX9 %s
+; RUN: opt -passes=amdgpu-attributor-cgscc -mcpu=gfx900 -mattr=-promote-alloca < %s | llc | FileCheck -enable-var-scope -check-prefix=HSA -check-prefix=GFX9 %s
 
 target triple = "amdgcn-amd-amdhsa"
 
diff --git a/llvm/test/CodeGen/AMDGPU/amdgpu-attributor-accesslist-offsetbins-out-of-sync.ll b/llvm/test/CodeGen/AMDGPU/amdgpu-attributor-accesslist-offsetbins-out-of-sync.ll
index 18ec3ab64298b..022414545e1d3 100644
--- a/llvm/test/CodeGen/AMDGPU/amdgpu-attributor-accesslist-offsetbins-out-of-sync.ll
+++ b/llvm/test/CodeGen/AMDGPU/amdgpu-attributor-accesslist-offsetbins-out-of-sync.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
 ; RUN: opt -S -mtriple=amdgcn-amd-amdhsa -passes='amdgpu-attributor' %s -o - | FileCheck %s
+; RUN: opt -S -mtriple=amdgcn-amd-amdhsa -passes='amdgpu-attributor-cgscc' %s -o - | FileCheck %s
 
 %struct.ShaderClosure = type { <3 x float>, i32, float, <3 x float>, [10 x float], [8 x i8] }
 %struct.ShaderData = type { <3 x float>, <3 x float>, <3 x float>, <3 x float>, i32, i32, i32, i32, i32, float, float, i32, i32, float, float, %struct.differential3, %struct.differential3, %struct.differential, %struct.differential, <3 x float>, <3 x float>, <3 x float>, %struct.differential3, i32, i32, i32, float, <3 x float>, <3 x float>, <3 x float>, [64 x %struct.ShaderClosure] }
diff --git a/llvm/test/CodeGen/AMDGPU/amdgpu-attributor-flat-scratch-init-asan.ll b/llvm/test/CodeGen/AMDGPU/amdgpu-attributor-flat-scratch-init-asan.ll
index 0d68762182f96..0538dad159bbf 100644
--- a/llvm/test/CodeGen/AMDGPU/amdgpu-attributor-flat-scratch-init-asan.ll
+++ b/llvm/test/CodeGen/AMDGPU/amdgpu-attributor-flat-scratch-init-asan.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-attributes --check-globals all --version 6
 ; RUN: opt -S -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a -passes='amdgpu-attributor' %s -o - | FileCheck %s
+; RUN: opt -S -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a -passes='amdgpu-attributor-cgscc' %s -o - | FileCheck %s
 
 @lds_1 = internal addrspace(3) global [1 x i8] poison, align 4
 
diff --git a/llvm/test/CodeGen/AMDGPU/amdgpu-attributor-intrinsic-missing-nocallback.ll b/llvm/test/CodeGen/AMDGPU/amdgpu-attributor-intrinsic-missing-nocallback.ll
index d7d623ac89146..f6fa8d81da17a 100644
--- a/llvm/test/CodeGen/AMDGPU/amdgpu-attributor-intrinsic-missing-nocallback.ll
+++ b/llvm/test/CodeGen/AMDGPU/amdgpu-attributor-intrinsic-missing-nocallback.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-globals all --version 5
 ; RUN: opt -S -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx90a -passes=amdgpu-attributor %s | FileCheck %s
+; RUN: opt -S -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx90a -passes=amdgpu-attributor-cgscc %s | FileCheck %s
 
 ; Make sure we do not infer anything about implicit inputs through an
 ; intrinsic call which is not nocallback.
diff --git a/llvm/test/CodeGen/AMDGPU/amdgpu-attributor-min-agpr-alloc.ll b/llvm/test/CodeGen/AMDGPU/amdgpu-attributor-min-agpr-alloc.ll
index 4db668e05cb21..ff0a2e2ef9172 100644
--- a/llvm/test/CodeGen/AMDGPU/amdgpu-attributor-min-agpr-alloc.ll
+++ b/llvm/test/CodeGen/AMDGPU/amdgpu-attributor-min-agpr-alloc.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-attributes --check-globals all --version 4
 ; RUN: opt -S -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx90a -passes=amdgpu-attributor %s | FileCheck %s
+; RUN: opt -S -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx90a -passes=amdgpu-attributor-cgscc %s | FileCheck %s
 
 ; Shrink result attribute list by preventing use of most attributes.
 define internal void @use_most() {
diff --git a/llvm/test/CodeGen/AMDGPU/amdgpu-attributor-nocallback-intrinsics.ll b/llvm/test/CodeGen/AMDGPU/amdgpu-attributor-nocallback-intrinsics.ll
index 71c509afa8e64..2d5eed383429c 100644
--- a/llvm/test/CodeGen/AMDGPU/amdgpu-attributor-nocallback-intrinsics.ll
+++ b/llvm/test/CodeGen/AMDGPU/amdgpu-attributor-nocallback-intrinsics.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-attributes --check-globals all --version 5
 ; RUN: opt -S -mtriple=amdgcn-unknown-amdhsa -passes=amdgpu-attributor -mcpu=gfx90a %s | FileCheck %s
+; RUN: opt -S -mtriple=amdgcn-unknown-amdhsa -passes=amdgpu-attributor-cgscc -mcpu=gfx90a %s | FileCheck %s
 
 ; Make sure we infer no inputs are used through some intrinsics
 
diff --git a/llvm/test/CodeGen/AMDGPU/annotate-existing-abi-attributes.ll b/llvm/test/CodeGen/AMDGPU/annotate-existing-abi-attributes.ll
index 7e0208cd1f45a..44916c42b1132 100644
--- a/llvm/test/CodeGen/AMDGPU/annotate-existing-abi-attributes.ll
+++ b/llvm/test/CodeGen/AMDGPU/annotate-existing-abi-attributes.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --check-globals
 ; RUN: opt -mtriple=amdgcn-unknown-amdhsa -S -passes=amdgpu-attributor %s | FileCheck %s
+; RUN: opt -mtriple=amdgcn-unknown-amdhsa -S -passes=amdgpu-attributor-cgscc %s | FileCheck %s
 
 ; Check handling for pre-existing attributes on function declarations
 
diff --git a/llvm/test/CodeGen/AMDGPU/annotate-kernel-features-hsa-call.ll b/llvm/test/CodeGen/AMDGPU/annotate-kernel-features-hsa-call.ll
index 9283bd5a3c1f9..e0eb4cd787fbc 100644
--- a/llvm/test/CodeGen/AMDGPU/annotate-kernel-features-hsa-call.ll
+++ b/llvm/test/CodeGen/AMDGPU/annotate-kernel-features-hsa-call.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --check-globals
 ; RUN: opt -mtriple=amdgcn-unknown-amdhsa -S -passes=amdgpu-attributor < %s | FileCheck -check-prefixes=ATTRIBUTOR_HSA %s
+; RUN: opt -mtriple=amdgcn-unknown-amdhsa -S -passes=amdgpu-attributor-cgscc < %s | FileCheck -check-prefixes=ATTRIBUTOR_HSA %s
 
 ; TODO: The test contains UB which is refined by the Attributor and should be removed.
 
diff --git a/llvm/test/CodeGen/AMDGPU/annotate-kernel-features-hsa.ll b/llvm/test/CodeGen/AMDGPU/annotate-kernel-features-hsa.ll
index 8554485136f04..09a42670cf396 100644
--- a/llvm/test/CodeGen/AMDGPU/annotate-kernel-features-hsa.ll
+++ b/llvm/test/CodeGen/AMDGPU/annotate-kernel-features-hsa.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --check-globals
 ; RUN: opt -mtriple=amdgcn-unknown-amdhsa -S -passes=amdgpu-attributor < %s | FileCheck -check-prefix=HSA %s
+; RUN: opt -mtriple=amdgcn-unknown-amdhsa -S -passes=amdgpu-attributor-cgscc < %s | FileCheck -check-prefix=HSA %s
 
 declare i32 @llvm.amdgcn.workgroup.id.x() #0
 declare i32 @llvm.amdgcn.workgroup.id.y() #0
diff --git a/llvm/test/CodeGen/AMDGPU/annotate-kernel-features.ll b/llvm/test/CodeGen/AMDGPU/annotate-kernel-features.ll
index e2a2debc8ca18..fbcd6651b2d62 100644
--- a/llvm/test/CodeGen/AMDGPU/annotate-kernel-features.ll
+++ b/llvm/test/CodeGen/AMDGPU/annotate-kernel-features.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --check-globals
 ; RUN: opt -S -mtriple=amdgcn-unknown-unknown -passes=amdgpu-attributor < %s | FileCheck %s
+; RUN: opt -S -mtriple=amdgcn-unknown-unknown -passes=amdgpu-attributor-cgscc < %s | FileCheck %s
 
 declare i32 @llvm.r600.read.tgid.x() #0
 declare i32 @llvm.r600.read.tgid.y() #0
diff --git a/llvm/test/CodeGen/AMDGPU/attr-amdgpu-align.ll b/llvm/test/CodeGen/AMDGPU/attr-amdgpu-align.ll
index 29845e649a6c6..c55a10607783e 100644
--- a/llvm/test/CodeGen/AMDGPU/attr-amdgpu-align.ll
+++ b/llvm/test/CodeGen/AMDGPU/attr-amdgpu-align.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 6
 ; RUN: opt  -S -mtriple=amdgcn-amd-amdhsa -passes=amdgpu-attributor %s -o - | FileCheck ...
[truncated]

``````````

</details>


https://github.com/llvm/llvm-project/pull/179719