[llvm] [AMDGPU][MC] Function scope resource usage struct (PR #188031)

Janek van Oirschot via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 27 08:50:59 PDT 2026


https://github.com/JanekvO updated https://github.com/llvm/llvm-project/pull/188031

>From 5dbc1d50743d7b81f317de4a2d749679cac585a6 Mon Sep 17 00:00:00 2001
From: Janek van Oirschot <janek.vanoirschot at amd.com>
Date: Wed, 18 Mar 2026 14:26:51 +0000
Subject: [PATCH 1/3] [AMDGPU][MC] Function scope resource usage struct and
 callgraph info

---
 llvm/docs/AMDGPUUsage.rst                     |  74 +++++++++++-
 llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp   |  29 +++++
 .../AMDGPU/AsmParser/AMDGPUAsmParser.cpp      | 106 ++++++++++++++++
 .../MCTargetDesc/AMDGPUTargetStreamer.cpp     |  64 ++++++++++
 .../MCTargetDesc/AMDGPUTargetStreamer.h       |  25 ++++
 .../AMDGPU/branch-relaxation-gfx1250.ll       |  13 ++
 llvm/test/CodeGen/AMDGPU/branch-relaxation.ll |  13 ++
 llvm/test/CodeGen/AMDGPU/lds-relocs.ll        |   3 +
 .../CodeGen/AMDGPU/resource-info-section.ll   |  98 +++++++++++++++
 .../MC/AMDGPU/amdgpu-resource-usage-err.s     |  64 ++++++++++
 llvm/test/MC/AMDGPU/amdgpu-resource-usage.s   | 114 ++++++++++++++++++
 11 files changed, 601 insertions(+), 2 deletions(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/resource-info-section.ll
 create mode 100644 llvm/test/MC/AMDGPU/amdgpu-resource-usage-err.s
 create mode 100644 llvm/test/MC/AMDGPU/amdgpu-resource-usage.s
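
For illustration, a consumer of the section described in the AMDGPUUsage.rst
changes below might decode entries and combine them over the callgraph roughly
as follows. This is a minimal sketch and not part of the patch: ResourceEntry,
decodeEntry, CallGraph and propagate are hypothetical names, and the combine
rule (maximum for register and barrier counts, local stack plus the largest
callee stack, OR of flags) is an assumption modelled on the existing resource
usage symbols. Back edges and unknown (e.g. external) callees are simply
ignored here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical mirror of one 24-byte entry: five i32 counts plus an i32 flags
// word (bit 0 uses_vcc ... bit 4 has_indirect_call).
struct ResourceEntry {
  uint32_t NumVGPR, NumAGPR, NumSGPR, NumNamedBarrier, PrivateSegSize, Flags;
};

// Read a little-endian i32 from raw section bytes.
static uint32_t readLE32(const std::vector<uint8_t> &B, size_t Off) {
  return uint32_t(B[Off]) | uint32_t(B[Off + 1]) << 8 |
         uint32_t(B[Off + 2]) << 16 | uint32_t(B[Off + 3]) << 24;
}

// Decode the entry that starts at byte offset Off of the section contents.
ResourceEntry decodeEntry(const std::vector<uint8_t> &Sec, size_t Off) {
  assert(Off + 24 <= Sec.size() && "entry out of bounds");
  return {readLE32(Sec, Off),      readLE32(Sec, Off + 4),
          readLE32(Sec, Off + 8),  readLE32(Sec, Off + 12),
          readLE32(Sec, Off + 16), readLE32(Sec, Off + 20)};
}

// Function-local usage plus callee names, as recovered from the relocations.
struct Node {
  ResourceEntry Local;
  std::vector<std::string> Callees;
};
using CallGraph = std::map<std::string, Node>;

// Assumed combine rule: max of counts, local stack + max callee stack, OR of
// flags. OnPath tracks the current DFS path so recursion terminates.
ResourceEntry propagate(const CallGraph &G, const std::string &Fn,
                        std::set<std::string> &OnPath) {
  auto It = G.find(Fn);
  // Unknown callees and back edges contribute nothing in this sketch.
  if (It == G.end() || !OnPath.insert(Fn).second)
    return ResourceEntry{};
  ResourceEntry Total = It->second.Local;
  uint32_t MaxCalleeStack = 0;
  for (const std::string &Callee : It->second.Callees) {
    ResourceEntry C = propagate(G, Callee, OnPath);
    Total.NumVGPR = std::max(Total.NumVGPR, C.NumVGPR);
    Total.NumAGPR = std::max(Total.NumAGPR, C.NumAGPR);
    Total.NumSGPR = std::max(Total.NumSGPR, C.NumSGPR);
    Total.NumNamedBarrier = std::max(Total.NumNamedBarrier, C.NumNamedBarrier);
    MaxCalleeStack = std::max(MaxCalleeStack, C.PrivateSegSize);
    Total.Flags |= C.Flags;
  }
  Total.PrivateSegSize += MaxCalleeStack;
  OnPath.erase(Fn);
  return Total;
}
```

With the bar/baz/foo values from amdgpu-resource-usage.s, this rule gives foo
65 VGPRs, 25 SGPRs and a 16-byte private segment.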

diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index 1ede5ca2d4cf6..23dfb1786f4ba 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -2354,8 +2354,8 @@ As part of the AMDGPU MC layer, AMDGPU provides the following target-specific
 
      =================== ================= ========================================================
 
-Function Resource Usage
------------------------
+Function Resource Usage Symbols
+-------------------------------
 
 A function's resource usage depends on each of its callees' resource usage. The
 expressions used to denote resource usage reflect this by propagating each
@@ -2403,6 +2403,76 @@ unit's worst case (i.e, maxima) ``num_vgpr``, ``num_agpr``, and
 symbolic expressions. These three symbols are ``amdgcn.max_num_vgpr``,
 ``amdgcn.max_num_agpr``, and ``amdgcn.max_num_sgpr``.
 
+Function Resource Usage Asm Directives
+--------------------------------------
+
+A function's resource usage depends on each of its callees' resource usage.
+Accommodating this are the AMDGPU resource usage assembler directives and ELF
+section. The assembler directives emit a pre- and post-marked sequence of
+assembler directives after every function that state a function's resource
+usage and callees. The resource usage emitted is **only for this function's
+usage** and does not yet consider the callees' resource usage. For the
+propagated resource usage, any user of the section or resource info will have
+to walk the callgraph and compute the total use.
+
+  .. table:: Function Resource Usage Asm Directives:
+     :name: function-usage-directive-table
+
+     ====================================== ========= ======================== ===================================================================================================
+     Directive                              Required? Occurrences Per Function Description
+     ====================================== ========= ======================== ===================================================================================================
+     .amdgpu_resource_usage <function name> yes       1                        Denotes the start of resource usage directives for <function name>
+     .end_amdgpu_resource_usage             yes       1                        Denotes end of resource usage directives
+     .num_vgpr <i32>                        yes       1                        Number of VGPRs used by the function
+     .num_agpr <i32>                        yes       1                        Number of AGPRs used by the function
+     .num_sgpr <i32>                        yes       1                        Number of SGPRs used by the function
+     .named_barrier <i32>                   yes       1                        Number of named barriers used by the function
+     .private_seg_size <i32>                yes       1                        Total stack size required for the function
+     .uses_vcc <i1>                         yes       1                        Boolean denoting whether vcc is used in the function
+     .uses_flat_scratch <i1>                yes       1                        Boolean denoting whether flat scratch is used in the function
+     .has_dyn_sized_stack <i1>              yes       1                        Boolean denoting whether stack in the function is dynamically sized
+     .has_recursion <i1>                    yes       1                        Boolean denoting whether recursion is used in the function
+     .has_indirect_call <i1>                yes       1                        Boolean denoting whether the function has an indirect call
+     .callee <function name>                no        0 or more                Callee functions called by the function, each unique callee getting its own .callee directive
+     ====================================== ========= ======================== ===================================================================================================
+
+Function Resource Usage ELF Section
+-----------------------------------
+
+The resource usage section contains one binary struct per function, each
+describing that function's resource usage. The section is named
+``.AMDGPU.resource_info`` and has an accompanying relocation section (e.g.,
+``.rela.AMDGPU.resource_info``) whose relocations identify which offset of
+``.AMDGPU.resource_info`` belongs to which function, in addition to tracking
+callers and callees.
+
+Each resource usage entry is a binary struct of the required resource
+information. The booleans are packed into a single flags field of type i32.
+The total size of each struct is therefore 24 bytes (i.e., sizeof(num_vgpr) +
+sizeof(num_agpr) + sizeof(num_sgpr) + sizeof(named_barrier) +
+sizeof(private_seg_size) + sizeof(flags)). The flags are packed as follows:
+
+  .. table:: Function Resource Usage Flags:
+     :name: function-usage-flags-table
+
+     ===========================    =======
+     Function usage property        Bit
+     ===========================    =======
+     uses_vcc                       0
+     uses_flat_scratch              1
+     has_dyn_sized_stack            2
+     has_recursion                  3
+     has_indirect_call              4
+     ===========================    =======
+
+The resource usage relocation section contains the offset into the resource
+usage section for each function in a translation unit. In addition to this
+function-to-entry mapping, it encodes the callees of each caller: the first
+relocation at an offset denotes the function the entry is mapped to, and
+any subsequent relocations at that same offset denote callees of the
+function mapped to that entry (similar to how CGProfile specifies the
+caller and callee relation).
+
 .. _amdgpu-elf-code-object:
 
 ELF Code Object
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
index 1f83df8099803..b07f0ad49b4fe 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
@@ -31,6 +31,7 @@
 #include "Utils/AMDGPUBaseInfo.h"
 #include "Utils/AMDKernelCodeTUtils.h"
 #include "Utils/SIDefinesUtils.h"
+#include "llvm/ADT/SmallPtrSet.h"
 #include "llvm/Analysis/OptimizationRemarkEmitter.h"
 #include "llvm/BinaryFormat/ELF.h"
 #include "llvm/CodeGen/MachineFrameInfo.h"
@@ -758,6 +759,34 @@ bool AMDGPUAsmPrinter::runOnMachineFunction(MachineFunction &MF) {
                      OutContext, IsLocal));
   }
 
+  // Emit per-function local resource usage and callee info into
+  // .AMDGPU.resource_info section.
+  {
+    uint32_t Flags = 0;
+    Flags |= (ResourceUsage->UsesVCC ? 1u : 0u) << 0;
+    Flags |= (ResourceUsage->UsesFlatScratch ? 1u : 0u) << 1;
+    Flags |= (ResourceUsage->HasDynamicallySizedStack ? 1u : 0u) << 2;
+    Flags |= (ResourceUsage->HasRecursion ? 1u : 0u) << 3;
+    Flags |= (ResourceUsage->HasIndirectCall ? 1u : 0u) << 4;
+
+    // Collect unique callee symbols.
+    SmallVector<MCSymbol *, 8> CalleeSyms;
+    SmallPtrSet<const Function *, 8> SeenCallees;
+    for (const Function *Callee : ResourceUsage->Callees) {
+      if (SeenCallees.insert(Callee).second)
+        CalleeSyms.push_back(MF.getTarget().getSymbol(Callee));
+    }
+
+    // Emit function local resource usage info. Does not contain any callee
+    // propagated resource info, users of this section info should be able to
+    // gather all resource info and walk the callgraph to combine for any
+    // callee resource info.
+    getTargetStreamer()->emitResourceUsageEntry(
+        CurrentFnSym, ResourceUsage->NumVGPR, ResourceUsage->NumAGPR,
+        ResourceUsage->NumExplicitSGPR, ResourceUsage->NumNamedBarrier,
+        ResourceUsage->PrivateSegmentSize, Flags, CalleeSyms);
+  }
+
   // Emit _dvgpr$ symbol when appropriate.
   emitDVgprSymbol(MF);
 
diff --git a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
index fddb36133afb8..1a350d5fee1c4 100644
--- a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+++ b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
@@ -1422,6 +1422,7 @@ class AMDGPUAsmParser : public MCTargetAsmParser {
   bool ParseDirectivePALMetadataBegin();
   bool ParseDirectivePALMetadata();
   bool ParseDirectiveAMDGPULDS();
+  bool ParseDirectiveAMDGPUResourceUsage();
 
   /// Common code to parse out a block of text (typically YAML) between start and
   /// end directives.
@@ -6741,6 +6742,108 @@ bool AMDGPUAsmParser::ParseDirectiveAMDGPULDS() {
   return false;
 }
 
+/// ParseDirectiveAMDGPUResourceUsage
+///   ::= .amdgpu_resource_usage <symbol>
+///         .num_vgpr <int>
+///         .num_agpr <int>
+///         .num_sgpr <int>
+///         .named_barrier <int>
+///         .private_seg_size <int>
+///         .uses_vcc <int>
+///         .uses_flat_scratch <int>
+///         .has_dyn_sized_stack <int>
+///         .has_recursion <int>
+///         .has_indirect_call <int>
+///         .callee <symbol>  (zero or more)
+///       .end_amdgpu_resource_usage
+bool AMDGPUAsmParser::ParseDirectiveAMDGPUResourceUsage() {
+  StringRef SymName;
+  if (getParser().parseIdentifier(SymName))
+    return TokError("expected symbol name after .amdgpu_resource_usage");
+
+  StringSet<> Seen;
+  MCSymbol *FnSym = getContext().getOrCreateSymbol(SymName);
+
+  uint32_t NumVGPR = 0, NumAGPR = 0, NumSGPR = 0;
+  uint32_t NumNamedBarrier = 0, PrivateSegmentSize = 0;
+  uint32_t UsesVCC = 0, UsesFlatScratch = 0, HasDynSizedStack = 0;
+  uint32_t HasRecursion = 0, HasIndirectCall = 0;
+  SmallVector<MCSymbol *, 4> Callees;
+
+  while (true) {
+    while (trySkipToken(AsmToken::EndOfStatement))
+      ;
+
+    StringRef ID;
+    if (!parseId(ID, "expected field directive or .end_amdgpu_resource_usage"))
+      return true;
+
+    if (ID == ".end_amdgpu_resource_usage")
+      break;
+
+    if (ID == ".callee") {
+      StringRef CalleeName;
+      if (getParser().parseIdentifier(CalleeName))
+        return TokError("expected symbol name after .callee");
+      Callees.push_back(getContext().getOrCreateSymbol(CalleeName));
+      continue;
+    }
+
+    if (!Seen.insert(ID).second)
+      return TokError("resource usage directives already declared");
+
+    int64_t Val;
+    if (getParser().parseAbsoluteExpression(Val))
+      return true;
+    if (Val < 0)
+      return TokError("value must be non-negative");
+
+    if (ID == ".num_vgpr")
+      NumVGPR = Val;
+    else if (ID == ".num_agpr")
+      NumAGPR = Val;
+    else if (ID == ".num_sgpr")
+      NumSGPR = Val;
+    else if (ID == ".named_barrier")
+      NumNamedBarrier = Val;
+    else if (ID == ".private_seg_size")
+      PrivateSegmentSize = Val;
+    else if (ID == ".uses_vcc")
+      UsesVCC = Val;
+    else if (ID == ".uses_flat_scratch")
+      UsesFlatScratch = Val;
+    else if (ID == ".has_dyn_sized_stack")
+      HasDynSizedStack = Val;
+    else if (ID == ".has_recursion")
+      HasRecursion = Val;
+    else if (ID == ".has_indirect_call")
+      HasIndirectCall = Val;
+    else
+      return TokError("unknown field '" + ID + "' in .amdgpu_resource_usage");
+  }
+
+  for (StringRef StrRef :
+       {".num_vgpr", ".num_agpr", ".num_sgpr", ".named_barrier",
+        ".private_seg_size", ".uses_vcc", ".uses_flat_scratch",
+        ".has_dyn_sized_stack", ".has_recursion", ".has_indirect_call"}) {
+    if (!Seen.contains(StrRef))
+      return TokError("requires " + StrRef +
+                      " directive in .amdgpu_resource_usage");
+  }
+
+  uint32_t Flags = 0;
+  Flags |= (UsesVCC ? 1u : 0u) << 0;
+  Flags |= (UsesFlatScratch ? 1u : 0u) << 1;
+  Flags |= (HasDynSizedStack ? 1u : 0u) << 2;
+  Flags |= (HasRecursion ? 1u : 0u) << 3;
+  Flags |= (HasIndirectCall ? 1u : 0u) << 4;
+
+  getTargetStreamer().emitResourceUsageEntry(
+      FnSym, NumVGPR, NumAGPR, NumSGPR, NumNamedBarrier, PrivateSegmentSize,
+      Flags, Callees);
+  return false;
+}
+
 bool AMDGPUAsmParser::ParseDirective(AsmToken DirectiveID) {
   StringRef IDVal = DirectiveID.getString();
 
@@ -6784,6 +6887,9 @@ bool AMDGPUAsmParser::ParseDirective(AsmToken DirectiveID) {
   if (IDVal == PALMD::AssemblerDirective)
     return ParseDirectivePALMetadata();
 
+  if (IDVal == ".amdgpu_resource_usage")
+    return ParseDirectiveAMDGPUResourceUsage();
+
   return true;
 }
 
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
index fae61302ebd90..3631efb41d360 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
@@ -660,6 +660,26 @@ void AMDGPUTargetAsmStreamer::EmitAmdhsaKernelDescriptor(
   OS << "\t.end_amdhsa_kernel\n";
 }
 
+void AMDGPUTargetAsmStreamer::emitResourceUsageEntry(
+    MCSymbol *FnSym, uint32_t NumVGPR, uint32_t NumAGPR, uint32_t NumSGPR,
+    uint32_t NumNamedBarrier, uint32_t PrivateSegmentSize, uint32_t Flags,
+    ArrayRef<MCSymbol *> Callees) {
+  OS << "\t.amdgpu_resource_usage " << FnSym->getName() << '\n';
+  OS << "\t\t.num_vgpr " << NumVGPR << '\n';
+  OS << "\t\t.num_agpr " << NumAGPR << '\n';
+  OS << "\t\t.num_sgpr " << NumSGPR << '\n';
+  OS << "\t\t.named_barrier " << NumNamedBarrier << '\n';
+  OS << "\t\t.private_seg_size " << PrivateSegmentSize << '\n';
+  OS << "\t\t.uses_vcc " << ((Flags >> 0) & 1) << '\n';
+  OS << "\t\t.uses_flat_scratch " << ((Flags >> 1) & 1) << '\n';
+  OS << "\t\t.has_dyn_sized_stack " << ((Flags >> 2) & 1) << '\n';
+  OS << "\t\t.has_recursion " << ((Flags >> 3) & 1) << '\n';
+  OS << "\t\t.has_indirect_call " << ((Flags >> 4) & 1) << '\n';
+  for (MCSymbol *Callee : Callees)
+    OS << "\t\t.callee " << Callee->getName() << '\n';
+  OS << "\t.end_amdgpu_resource_usage\n";
+}
+
 //===----------------------------------------------------------------------===//
 // AMDGPUTargetELFStreamer
 //===----------------------------------------------------------------------===//
@@ -1061,3 +1081,47 @@ void AMDGPUTargetELFStreamer::EmitAmdhsaKernelDescriptor(
   for (uint32_t i = 0; i < sizeof(amdhsa::kernel_descriptor_t::reserved3); ++i)
     Streamer.emitInt8(0u);
 }
+
+void AMDGPUTargetELFStreamer::emitResourceUsageEntry(
+    MCSymbol *FnSym, uint32_t NumVGPR, uint32_t NumAGPR, uint32_t NumSGPR,
+    uint32_t NumNamedBarrier, uint32_t PrivateSegmentSize, uint32_t Flags,
+    ArrayRef<MCSymbol *> Callees) {
+  auto &S = getStreamer();
+  auto &Context = S.getContext();
+  const unsigned ResourceInfoEntrySize = 24;
+
+  // TODO: Custom elf section type for support in linker.
+  MCSection *Sec =
+      Context.getELFSection(".AMDGPU.resource_info", ELF::SHT_PROGBITS,
+                            ELF::SHF_EXCLUDE, ResourceInfoEntrySize);
+  S.pushSection();
+  S.switchSection(Sec);
+
+  // Emit R_AMDGPU_NONE relocation pointing to the function symbol.
+  // Use the current section offset (= ResourceInfoEntryCount * 24).
+  auto *SectionSym = MCSymbolRefExpr::create(Sec->getBeginSymbol(), Context);
+  auto *Offset = MCBinaryExpr::createAdd(
+      SectionSym,
+      MCConstantExpr::create(ResourceInfoEntryCount * ResourceInfoEntrySize,
+                             Context),
+      Context);
+  S.emitRelocDirective(*Offset, "R_AMDGPU_NONE",
+                       MCSymbolRefExpr::create(FnSym, Context));
+
+  // Emit callee relocations at the same offset as the function identity.
+  for (MCSymbol *Callee : Callees)
+    S.emitRelocDirective(*Offset, "R_AMDGPU_NONE",
+                         MCSymbolRefExpr::create(Callee, Context));
+
+  ++ResourceInfoEntryCount;
+
+  // Emit the 24-byte struct.
+  S.emitIntValue(NumVGPR, 4);
+  S.emitIntValue(NumAGPR, 4);
+  S.emitIntValue(NumSGPR, 4);
+  S.emitIntValue(NumNamedBarrier, 4);
+  S.emitIntValue(PrivateSegmentSize, 4);
+  S.emitIntValue(Flags, 4);
+
+  S.popSection();
+}
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.h b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.h
index 3a0d8dcd2d27c..ab15c83d51d5c 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.h
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.h
@@ -11,6 +11,7 @@
 
 #include "Utils/AMDGPUBaseInfo.h"
 #include "Utils/AMDGPUPALMetadata.h"
+#include "llvm/ADT/ArrayRef.h"
 #include "llvm/MC/MCStreamer.h"
 
 namespace llvm {
@@ -104,6 +105,15 @@ class AMDGPUTargetStreamer : public MCTargetStreamer {
                              const MCExpr *ReserveVCC,
                              const MCExpr *ReserveFlatScr) {}
 
+  /// Emit a per-function resource usage entry into the
+  /// .AMDGPU.resource_info section, along with callee relocations.
+  virtual void emitResourceUsageEntry(MCSymbol *FnSym, uint32_t NumVGPR,
+                                      uint32_t NumAGPR, uint32_t NumSGPR,
+                                      uint32_t NumNamedBarrier,
+                                      uint32_t PrivateSegmentSize,
+                                      uint32_t Flags,
+                                      ArrayRef<MCSymbol *> Callees = {}) {}
+
   static StringRef getArchNameFromElfMach(unsigned ElfMach);
   static unsigned getElfMach(StringRef GPU);
 
@@ -168,12 +178,21 @@ class AMDGPUTargetAsmStreamer final : public AMDGPUTargetStreamer {
                              const MCExpr *NextVGPR, const MCExpr *NextSGPR,
                              const MCExpr *ReserveVCC,
                              const MCExpr *ReserveFlatScr) override;
+
+  void emitResourceUsageEntry(MCSymbol *FnSym, uint32_t NumVGPR,
+                              uint32_t NumAGPR, uint32_t NumSGPR,
+                              uint32_t NumNamedBarrier,
+                              uint32_t PrivateSegmentSize, uint32_t Flags,
+                              ArrayRef<MCSymbol *> Callees = {}) override;
 };
 
 class AMDGPUTargetELFStreamer final : public AMDGPUTargetStreamer {
   const MCSubtargetInfo &STI;
   MCStreamer &Streamer;
 
+  // Counter for computing relocation offsets.
+  unsigned ResourceInfoEntryCount = 0;
+
   void EmitNote(StringRef Name, const MCExpr *DescSize, unsigned NoteType,
                 function_ref<void(MCELFStreamer &)> EmitDesc);
 
@@ -221,6 +240,12 @@ class AMDGPUTargetELFStreamer final : public AMDGPUTargetStreamer {
                              const MCExpr *NextVGPR, const MCExpr *NextSGPR,
                              const MCExpr *ReserveVCC,
                              const MCExpr *ReserveFlatScr) override;
+
+  void emitResourceUsageEntry(MCSymbol *FnSym, uint32_t NumVGPR,
+                              uint32_t NumAGPR, uint32_t NumSGPR,
+                              uint32_t NumNamedBarrier,
+                              uint32_t PrivateSegmentSize, uint32_t Flags,
+                              ArrayRef<MCSymbol *> Callees = {}) override;
 };
 }
 #endif
diff --git a/llvm/test/CodeGen/AMDGPU/branch-relaxation-gfx1250.ll b/llvm/test/CodeGen/AMDGPU/branch-relaxation-gfx1250.ll
index 779118bd33027..3b09af7368f50 100644
--- a/llvm/test/CodeGen/AMDGPU/branch-relaxation-gfx1250.ll
+++ b/llvm/test/CodeGen/AMDGPU/branch-relaxation-gfx1250.ll
@@ -9,6 +9,19 @@
 ; RUN: llvm-readobj -r %t.o | FileCheck --check-prefix=OBJ %s
 
 ; OBJ:       Relocations [
+; OBJ-NEXT:   Section (5) .rel.AMDGPU.resource_info {
+; OBJ-NEXT:     0x0 R_AMDGPU_NONE uniform_conditional_max_short_forward_branch
+; OBJ-NEXT:     0x18 R_AMDGPU_NONE uniform_conditional_min_long_forward_branch
+; OBJ-NEXT:     0x30 R_AMDGPU_NONE uniform_conditional_min_long_forward_vcnd_branch
+; OBJ-NEXT:     0x48 R_AMDGPU_NONE min_long_forward_vbranch
+; OBJ-NEXT:     0x60 R_AMDGPU_NONE long_backward_sbranch
+; OBJ-NEXT:     0x78 R_AMDGPU_NONE uniform_unconditional_min_long_forward_branch
+; OBJ-NEXT:     0x90 R_AMDGPU_NONE uniform_unconditional_min_long_backward_branch
+; OBJ-NEXT:     0xA8 R_AMDGPU_NONE expand_requires_expand
+; OBJ-NEXT:     0xC0 R_AMDGPU_NONE uniform_inside_divergent
+; OBJ-NEXT:     0xD8 R_AMDGPU_NONE analyze_mask_branch
+; OBJ-NEXT:     0xF0 R_AMDGPU_NONE long_branch_hang
+; OBJ-NEXT:   }
 ; OBJ-NEXT: ]
 
 ; Restrict maximum branch to between +7 and -8 dwords
diff --git a/llvm/test/CodeGen/AMDGPU/branch-relaxation.ll b/llvm/test/CodeGen/AMDGPU/branch-relaxation.ll
index 155e54baf15a1..daa9994461af1 100644
--- a/llvm/test/CodeGen/AMDGPU/branch-relaxation.ll
+++ b/llvm/test/CodeGen/AMDGPU/branch-relaxation.ll
@@ -10,6 +10,19 @@
 ; RUN: llvm-readobj -r %t.o | FileCheck --check-prefix=OBJ %s
 
 ; OBJ:       Relocations [
+; OBJ-NEXT:   Section (5) .rel.AMDGPU.resource_info {
+; OBJ-NEXT:     0x0 R_AMDGPU_NONE uniform_conditional_max_short_forward_branch
+; OBJ-NEXT:     0x18 R_AMDGPU_NONE uniform_conditional_min_long_forward_branch
+; OBJ-NEXT:     0x30 R_AMDGPU_NONE uniform_conditional_min_long_forward_vcnd_branch
+; OBJ-NEXT:     0x48 R_AMDGPU_NONE min_long_forward_vbranch
+; OBJ-NEXT:     0x60 R_AMDGPU_NONE long_backward_sbranch
+; OBJ-NEXT:     0x78 R_AMDGPU_NONE uniform_unconditional_min_long_forward_branch
+; OBJ-NEXT:     0x90 R_AMDGPU_NONE uniform_unconditional_min_long_backward_branch
+; OBJ-NEXT:     0xA8 R_AMDGPU_NONE expand_requires_expand
+; OBJ-NEXT:     0xC0 R_AMDGPU_NONE uniform_inside_divergent
+; OBJ-NEXT:     0xD8 R_AMDGPU_NONE analyze_mask_branch
+; OBJ-NEXT:     0xF0 R_AMDGPU_NONE long_branch_hang
+; OBJ-NEXT:   }
 ; OBJ-NEXT: ]
 
 ; Restrict maximum branch to between +7 and -8 dwords
diff --git a/llvm/test/CodeGen/AMDGPU/lds-relocs.ll b/llvm/test/CodeGen/AMDGPU/lds-relocs.ll
index 447cb62643384..2d45b28bab888 100644
--- a/llvm/test/CodeGen/AMDGPU/lds-relocs.ll
+++ b/llvm/test/CodeGen/AMDGPU/lds-relocs.ll
@@ -9,6 +9,9 @@
 ; ELF-NEXT:     0x{{[0-9A-F]*}} R_AMDGPU_ABS32_LO lds.external
 ; ELF-NEXT:     0x{{[0-9A-F]*}} R_AMDGPU_ABS32_LO lds.defined
 ; ELF-NEXT:   }
+; ELF-NEXT:   Section (6) .rel.AMDGPU.resource_info {
+; ELF-NEXT:     0x0 R_AMDGPU_NONE test_basic
+; ELF-NEXT:   }
 ; ELF-NEXT: ]
 
 ; ELF:      Symbol {
diff --git a/llvm/test/CodeGen/AMDGPU/resource-info-section.ll b/llvm/test/CodeGen/AMDGPU/resource-info-section.ll
new file mode 100644
index 0000000000000..9aa8f401a9f82
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/resource-info-section.ll
@@ -0,0 +1,98 @@
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -enable-ipra=0 < %s | FileCheck -check-prefix=ASM %s
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -enable-ipra=0 -filetype=obj < %s -o %t
+; RUN: llvm-readelf -r %t | FileCheck -check-prefix=RELOC %s
+
+; ASM-LABEL: {{^}}leaf:
+; ASM: .amdgpu_resource_usage leaf
+; ASM-NEXT: .num_vgpr 0
+; ASM-NEXT: .num_agpr 0
+; ASM-NEXT: .num_sgpr 32
+; ASM-NEXT: .named_barrier 0
+; ASM-NEXT: .private_seg_size 0
+; ASM-NEXT: .uses_vcc 0
+; ASM-NEXT: .uses_flat_scratch 0
+; ASM-NEXT: .has_dyn_sized_stack 0
+; ASM-NEXT: .has_recursion 0
+; ASM-NEXT: .has_indirect_call 0
+; ASM-NEXT: .end_amdgpu_resource_usage
+define void @leaf() {
+  ret void
+}
+
+; ASM-LABEL: {{^}}use_vcc:
+; ASM: .amdgpu_resource_usage use_vcc
+; ASM-NEXT: .num_vgpr 0
+; ASM-NEXT: .num_agpr 0
+; ASM-NEXT: .num_sgpr 32
+; ASM-NEXT: .named_barrier 0
+; ASM-NEXT: .private_seg_size 0
+; ASM-NEXT: .uses_vcc 1
+; ASM-NEXT: .uses_flat_scratch 0
+; ASM-NEXT: .has_dyn_sized_stack 0
+; ASM-NEXT: .has_recursion 0
+; ASM-NEXT: .has_indirect_call 0
+; ASM-NEXT: .end_amdgpu_resource_usage
+define void @use_vcc() {
+  call void asm sideeffect "", "~{vcc}" ()
+  ret void
+}
+
+; ASM-LABEL: {{^}}caller:
+; ASM: .amdgpu_resource_usage caller
+; ASM:      .callee use_vcc
+; ASM-NEXT: .end_amdgpu_resource_usage
+define void @caller() {
+  call void @use_vcc()
+  ret void
+}
+
+; ASM-LABEL: {{^}}kernel:
+; ASM: .amdgpu_resource_usage kernel
+; ASM:      .callee caller
+; ASM-NEXT: .end_amdgpu_resource_usage
+define amdgpu_kernel void @kernel() {
+  call void @caller()
+  ret void
+}
+
+
+; ASM-LABEL: {{^}}rcaller2:
+; ASM: .amdgpu_resource_usage rcaller2
+; ASM:      .callee rcaller1
+; ASM-NEXT: .end_amdgpu_resource_usage
+; ASM-LABEL: {{^}}rcaller1:
+; ASM: .amdgpu_resource_usage rcaller1
+; ASM:      .callee rcaller2
+; ASM-NEXT: .end_amdgpu_resource_usage
+define void @rcaller1() {
+  call void @rcaller2()
+  ret void
+}
+define void @rcaller2() {
+  call void @rcaller1()
+  ret void
+}
+
+; ASM-LABEL: {{^}}kernel_recurse:
+; ASM: .amdgpu_resource_usage kernel_recurse
+; ASM:      .callee rcaller1
+; ASM-NEXT: .end_amdgpu_resource_usage
+define amdgpu_kernel void @kernel_recurse() {
+  call void @rcaller1()
+  ret void
+}
+
+; RELOC:      Relocation section '.rela.AMDGPU.resource_info'
+; RELOC:      0000000000000000 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} leaf + 0
+; RELOC-NEXT: 0000000000000018 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} use_vcc + 0
+; RELOC-NEXT: 0000000000000030 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} caller + 0
+; RELOC-NEXT: 0000000000000030 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} use_vcc + 0
+; RELOC-NEXT: 0000000000000048 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} kernel + 0
+; RELOC-NEXT: 0000000000000048 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} caller + 0
+; RELOC-NEXT: 0000000000000060 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} rcaller2 + 0
+; RELOC-NEXT: 0000000000000060 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} rcaller1 + 0
+; RELOC-NEXT: 0000000000000078 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} rcaller1 + 0
+; RELOC-NEXT: 0000000000000078 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} rcaller2 + 0
+; RELOC-NEXT: 0000000000000090 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} kernel_recurse + 0
+; RELOC-NEXT: 0000000000000090 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} rcaller1 + 0
+
diff --git a/llvm/test/MC/AMDGPU/amdgpu-resource-usage-err.s b/llvm/test/MC/AMDGPU/amdgpu-resource-usage-err.s
new file mode 100644
index 0000000000000..e7db9189adc5b
--- /dev/null
+++ b/llvm/test/MC/AMDGPU/amdgpu-resource-usage-err.s
@@ -0,0 +1,64 @@
+// RUN: not llvm-mc -triple amdgcn-amd-amdhsa -mcpu=gfx900 %s -o - -filetype=null 2>&1 | FileCheck %s
+
+// Missing symbol name after .amdgpu_resource_usage.
+	.amdgpu_resource_usage
+// CHECK: :[[@LINE-1]]:{{[0-9]+}}: error: expected symbol name after .amdgpu_resource_usage
+
+// Missing symbol name after .callee.
+	.amdgpu_resource_usage fn1
+		.num_vgpr 0
+		.num_agpr 0
+		.num_sgpr 0
+		.named_barrier 0
+		.private_seg_size 0
+		.uses_vcc 0
+		.uses_flat_scratch 0
+		.has_dyn_sized_stack 0
+		.has_recursion 0
+		.has_indirect_call 0
+		.callee
+// CHECK: :[[@LINE-1]]:{{[0-9]+}}: error: expected symbol name after .callee
+	.end_amdgpu_resource_usage
+
+// Duplicate field directive.
+	.amdgpu_resource_usage fn2
+		.num_vgpr 0
+		.num_vgpr 1
+// CHECK: :[[@LINE-1]]:{{[0-9]+}}: error: resource usage directives already declared
+	.end_amdgpu_resource_usage
+
+// Negative value.
+	.amdgpu_resource_usage fn3
+		.num_vgpr -1
+// CHECK: :[[@LINE-1]]:{{[0-9]+}}: error: value must be non-negative
+	.end_amdgpu_resource_usage
+
+// Unknown field.
+	.amdgpu_resource_usage fn4
+		.num_vgpr 0
+		.num_agpr 0
+		.num_sgpr 0
+		.named_barrier 0
+		.private_seg_size 0
+		.uses_vcc 0
+		.uses_flat_scratch 0
+		.has_dyn_sized_stack 0
+		.has_recursion 0
+		.has_indirect_call 0
+		.bogus_field 42
+// CHECK: :[[@LINE-1]]:{{[0-9]+}}: error: unknown field '.bogus_field' in .amdgpu_resource_usage
+	.end_amdgpu_resource_usage
+
+// Missing required field (.num_sgpr omitted).
+	.amdgpu_resource_usage fn5
+		.num_vgpr 0
+		.num_agpr 0
+		.named_barrier 0
+		.private_seg_size 0
+		.uses_vcc 0
+		.uses_flat_scratch 0
+		.has_dyn_sized_stack 0
+		.has_recursion 0
+		.has_indirect_call 0
+	.end_amdgpu_resource_usage
+// CHECK: :[[@LINE-1]]:{{[0-9]+}}: error: requires .num_sgpr directive in .amdgpu_resource_usage
diff --git a/llvm/test/MC/AMDGPU/amdgpu-resource-usage.s b/llvm/test/MC/AMDGPU/amdgpu-resource-usage.s
new file mode 100644
index 0000000000000..7d10684361526
--- /dev/null
+++ b/llvm/test/MC/AMDGPU/amdgpu-resource-usage.s
@@ -0,0 +1,114 @@
+// RUN: llvm-mc -triple amdgcn-amd-amdhsa -mcpu=gfx900 < %s | FileCheck --check-prefix=ASM %s
+// RUN: llvm-mc -triple amdgcn-amd-amdhsa -mcpu=gfx900 -filetype=obj < %s > %t
+// RUN: llvm-readelf -S %t | FileCheck --check-prefix=ELF-SEC %s
+// RUN: llvm-readelf -r %t | FileCheck --check-prefix=ELF-RELOC %s
+
+.text
+
+.globl bar
+.type bar,@function
+bar:
+  s_endpgm
+
+.type baz,@function
+baz:
+  s_endpgm
+
+.globl foo
+.type foo,@function
+foo:
+  s_endpgm
+
+.extern external_fn
+
+// ASM: .amdgpu_resource_usage bar
+// ASM-NEXT: .num_vgpr 65
+// ASM-NEXT: .num_agpr 0
+// ASM-NEXT: .num_sgpr 25
+// ASM-NEXT: .named_barrier 0
+// ASM-NEXT: .private_seg_size 16
+// ASM-NEXT: .uses_vcc 1
+// ASM-NEXT: .uses_flat_scratch 0
+// ASM-NEXT: .has_dyn_sized_stack 0
+// ASM-NEXT: .has_recursion 0
+// ASM-NEXT: .has_indirect_call 0
+// ASM-NEXT: .callee baz
+// ASM-NEXT: .end_amdgpu_resource_usage
+	.amdgpu_resource_usage bar
+		.num_vgpr 65
+		.num_agpr 0
+		.num_sgpr 25
+		.named_barrier 0
+		.private_seg_size 16
+		.uses_vcc 1
+		.uses_flat_scratch 0
+		.has_dyn_sized_stack 0
+		.has_recursion 0
+		.has_indirect_call 0
+		.callee baz
+	.end_amdgpu_resource_usage
+
+// ASM: .amdgpu_resource_usage foo
+// ASM-NEXT: .num_vgpr 10
+// ASM-NEXT: .num_agpr 4
+// ASM-NEXT: .num_sgpr 8
+// ASM-NEXT: .named_barrier 2
+// ASM-NEXT: .private_seg_size 0
+// ASM-NEXT: .uses_vcc 0
+// ASM-NEXT: .uses_flat_scratch 1
+// ASM-NEXT: .has_dyn_sized_stack 1
+// ASM-NEXT: .has_recursion 1
+// ASM-NEXT: .has_indirect_call 1
+// ASM-NEXT: .callee bar
+// ASM-NEXT: .callee external_fn
+// ASM-NEXT: .end_amdgpu_resource_usage
+	.amdgpu_resource_usage foo
+		.num_vgpr 10
+		.num_agpr 4
+		.num_sgpr 8
+		.named_barrier 2
+		.private_seg_size 0
+		.uses_vcc 0
+		.uses_flat_scratch 1
+		.has_dyn_sized_stack 1
+		.has_recursion 1
+		.has_indirect_call 1
+		.callee bar
+		.callee external_fn
+	.end_amdgpu_resource_usage
+
+// ASM: .amdgpu_resource_usage baz
+// ASM-NEXT: .num_vgpr 2
+// ASM-NEXT: .num_agpr 0
+// ASM-NEXT: .num_sgpr 4
+// ASM-NEXT: .named_barrier 0
+// ASM-NEXT: .private_seg_size 0
+// ASM-NEXT: .uses_vcc 0
+// ASM-NEXT: .uses_flat_scratch 0
+// ASM-NEXT: .has_dyn_sized_stack 0
+// ASM-NEXT: .has_recursion 0
+// ASM-NEXT: .has_indirect_call 0
+// ASM-NEXT: .end_amdgpu_resource_usage
+	.amdgpu_resource_usage baz
+		.num_vgpr 2
+		.num_agpr 0
+		.num_sgpr 4
+		.named_barrier 0
+		.private_seg_size 0
+		.uses_vcc 0
+		.uses_flat_scratch 0
+		.has_dyn_sized_stack 0
+		.has_recursion 0
+		.has_indirect_call 0
+	.end_amdgpu_resource_usage
+
+// ELF-SEC: .AMDGPU.resource_info PROGBITS {{[0-9a-f]+}} {{[0-9a-f]+}} 000048 18 E 0 0 1
+
+// ELF-RELOC:      Relocation section '.rela.AMDGPU.resource_info'
+// ELF-RELOC:      0000000000000000 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} bar + 0
+// ELF-RELOC-NEXT: 0000000000000000 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} baz + 0
+// ELF-RELOC-NEXT: 0000000000000018 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} foo + 0
+// ELF-RELOC-NEXT: 0000000000000018 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} bar + 0
+// ELF-RELOC-NEXT: 0000000000000018 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} external_fn + 0
+// ELF-RELOC-NEXT: 0000000000000030 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} baz + 0
+

>From ce026c013f0f1762356f4252ccf063a395a31f9c Mon Sep 17 00:00:00 2001
From: Janek van Oirschot <janek.vanoirschot at amd.com>
Date: Wed, 25 Mar 2026 16:25:16 +0000
Subject: [PATCH 2/3] Remove callees, remove flags that can be inferred by
 callgraph, rephrase docs

---
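
Note on the entry layout after this revision: a minimal sketch of how a
consumer might model one 24-byte entry and the three remaining flag bits. The
struct name, the helper names, and the in-memory field order are assumptions
made for illustration (the order simply follows the sizeof() sum in the
documentation); none of these declarations exist in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of one resource usage entry. Six i32 fields give the
// documented 24-byte size; the field order is an assumption.
struct ResourceUsageEntry {
  uint32_t NumVGPR;
  uint32_t NumAGPR;
  uint32_t NumSGPR;
  uint32_t NamedBarrier;
  uint32_t PrivateSegSize;
  uint32_t Flags;
};
static_assert(sizeof(ResourceUsageEntry) == 24, "entry must be 24 bytes");

// Bit positions taken from the flags table in AMDGPUUsage.rst.
enum : uint32_t {
  UsesVCCBit = 0,
  UsesFlatScratchBit = 1,
  HasDynSizedStackBit = 2
};

// Packs the three booleans the same way the AsmPrinter and AsmParser hunks do.
inline uint32_t packFlags(bool UsesVCC, bool UsesFlatScratch,
                          bool HasDynSizedStack) {
  uint32_t Flags = 0;
  Flags |= (UsesVCC ? 1u : 0u) << UsesVCCBit;
  Flags |= (UsesFlatScratch ? 1u : 0u) << UsesFlatScratchBit;
  Flags |= (HasDynSizedStack ? 1u : 0u) << HasDynSizedStackBit;
  return Flags;
}

// Reads a single flag bit back out of a packed flags word.
inline bool testFlag(uint32_t Flags, uint32_t Bit) {
  return ((Flags >> Bit) & 1u) != 0;
}

// Converts a relocation offset into the index of the entry it identifies.
inline uint32_t entryIndexForOffset(uint64_t Offset) {
  assert(Offset % sizeof(ResourceUsageEntry) == 0 &&
         "offset must be a multiple of the entry size");
  return static_cast<uint32_t>(Offset / sizeof(ResourceUsageEntry));
}
```

With this layout, the relocation offsets 0x0, 0x18 and 0x30 checked in the
tests correspond to entry indices 0, 1 and 2.
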
 llvm/docs/AMDGPUUsage.rst                     | 37 +++--------
 llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp   | 19 +-----
 .../AMDGPU/AsmParser/AMDGPUAsmParser.cpp      | 34 +++--------
 .../MCTargetDesc/AMDGPUTargetStreamer.cpp     | 17 +-----
 .../MCTargetDesc/AMDGPUTargetStreamer.h       | 13 ++--
 .../AMDGPU/branch-relaxation-gfx1250.ll       |  2 +-
 llvm/test/CodeGen/AMDGPU/branch-relaxation.ll |  2 +-
 llvm/test/CodeGen/AMDGPU/lds-relocs.ll        |  2 +-
 .../CodeGen/AMDGPU/resource-info-section.ll   | 61 +------------------
 .../MC/AMDGPU/amdgpu-resource-usage-err.s     | 22 +------
 llvm/test/MC/AMDGPU/amdgpu-resource-usage.s   | 25 +-------
 11 files changed, 36 insertions(+), 198 deletions(-)

diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index 23dfb1786f4ba..cb54603e42753 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -2408,12 +2408,10 @@ Function Resource Usage Asm Directives
 
 A function's resource usage depends on each of its callees' resource usage.
 Accomodating this are the AMDGPU resource usage assembler directives and ELF
-section. The assembler directives emit a pre- and post-marked sequence of
-assembler directives after every function that state a function's resource
-usage and callees. The resource usage this emit is **only for this function's
-usage** and does not yet consider the callees' resource usage. For the
-propagated resource usage, any user of the section or resource info will have
-to walk the callgraph and compute the total use.
+section. After every function, the compiler emits a sequence of directives,
+delimited by start and end markers, that states the function's resource usage.
+The resource usage emitted is **only for this function's usage** and does not
+yet consider the callees' resource usage.
 
   .. table:: Function Resource Usage Asm Directives:
      :name: function-usage-directive-table
@@ -2431,25 +2429,20 @@ to walk the callgraph and compute the total use.
      .uses_vcc <i1>                         yes       1                        Boolean denoting whether vcc is used in the function
      .uses_flat_scratch <i1>                yes       1                        Boolean denoting whether flat scratch is used in the function
      .has_dyn_sized_stack <i1>              yes       1                        Boolean denoting whether stack in the function is dynamically sized
-     .has_recursion <i1>                    yes       1                        Boolean denoting whether recursion is used in the function
-     .has_indirect_call <i1>                yes       1                        Boolean denoting whether the function has an indirect call
-     .callee <function name>                no        0 or more                Callee functions called by the function, each unique callee getting its own .callee directive
      ====================================== ========= ======================== ===================================================================================================
 
 Function Resource Usage ELF Section
 -----------------------------------
 
 The resource usage section contains binary structs representing each resource
-usage entry for a function. The resource usage section is named
-.AMDGPU.resource_usage and has an additional relocation section (e.g.,
-.rela.AMDGPU.resource_usage) which holds information on which offset of the
-.AMDGPU.resource_usage section denotes which function in addition to tracking
-callers and callees.
+usage entry for a function. This section is named .AMDGPU.resource_usage and
+has an additional relocation section (e.g., .rela.AMDGPU.resource_usage) that
+maps the offsets within the .AMDGPU.resource_usage section to each function.
 
 Resource usage is a binary struct of its required resource information. The
-booleans are packed into a flag of type i32. The total size of each resource
-usage struct is, therefore, 24-bytes (i.e., sizeof(num_vgpr) + sizeof(num_agpr)
-+ sizeof(num_sgpr) + sizeof(named_barrier) + sizeof(private_seg_size) +
+boolean elements are packed into a flag of type i32. The total size of each
+resource usage struct is 24 bytes (i.e., sizeof(num_vgpr) + sizeof(num_agpr) +
+sizeof(num_sgpr) + sizeof(named_barrier) + sizeof(private_seg_size) +
 sizeof(flags)). The flags are packed as follows:
 
   .. table:: Function Resource Usage Flags:
@@ -2461,18 +2454,8 @@ sizeof(flags)). The flags are packed as follows:
      uses_vcc                       0
      uses_flat_scratch              1
      has_dyn_sized_stack            2
-     has_recursion                  3
-     has_indirect_call              4
      ===========================    =======
 
-Resource usage relocation section contains the the offset into the resource
-usage section for each function in a translation unit. In addition to this
-function to resource usage entry mapping, it embeds the callees of each caller
-by having the first relocation to an offset denote the function the entry is
-mapped to and any subsequent relocations for that same offset denote a callee
-of the function mapped to that entry (similar to how CGProfile specifies the
-caller and callee relation.
-
 .. _amdgpu-elf-code-object:
 
 ELF Code Object
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
index b07f0ad49b4fe..865b3022e864a 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
@@ -759,32 +759,19 @@ bool AMDGPUAsmPrinter::runOnMachineFunction(MachineFunction &MF) {
                      OutContext, IsLocal));
   }
 
-  // Emit per-function local resource usage and callee info into
-  // .AMDGPU.resource_info section.
+  // Emit per-function local resource usage into .AMDGPU.resource_usage section.
   {
     uint32_t Flags = 0;
     Flags |= (ResourceUsage->UsesVCC ? 1u : 0u) << 0;
     Flags |= (ResourceUsage->UsesFlatScratch ? 1u : 0u) << 1;
     Flags |= (ResourceUsage->HasDynamicallySizedStack ? 1u : 0u) << 2;
-    Flags |= (ResourceUsage->HasRecursion ? 1u : 0u) << 3;
-    Flags |= (ResourceUsage->HasIndirectCall ? 1u : 0u) << 4;
-
-    // Collect unique callee symbols.
-    SmallVector<MCSymbol *, 8> CalleeSyms;
-    SmallPtrSet<const Function *, 8> SeenCallees;
-    for (const Function *Callee : ResourceUsage->Callees) {
-      if (SeenCallees.insert(Callee).second)
-        CalleeSyms.push_back(MF.getTarget().getSymbol(Callee));
-    }
 
     // Emit function local resource usage info. Does not contain any callee
-    // propagated resource info, users of this section info should be able to
-    // gather all resource info and walk the callgraph to combine for any
-    // callee resource info.
+    // propagated resource info.
     getTargetStreamer()->emitResourceUsageEntry(
         CurrentFnSym, ResourceUsage->NumVGPR, ResourceUsage->NumAGPR,
         ResourceUsage->NumExplicitSGPR, ResourceUsage->NumNamedBarrier,
-        ResourceUsage->PrivateSegmentSize, Flags, CalleeSyms);
+        ResourceUsage->PrivateSegmentSize, Flags);
   }
 
   // Emit _dvgpr$ symbol when appropriate.
diff --git a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
index 1a350d5fee1c4..3f703bf5f2ba3 100644
--- a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+++ b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
@@ -6752,9 +6752,6 @@ bool AMDGPUAsmParser::ParseDirectiveAMDGPULDS() {
 ///         .uses_vcc <int>
 ///         .uses_flat_scratch <int>
 ///         .has_dyn_sized_stack <int>
-///         .has_recursion <int>
-///         .has_indirect_call <int>
-///         .callee <symbol>  (zero or more)
 ///       .end_amdgpu_resource_usage
 bool AMDGPUAsmParser::ParseDirectiveAMDGPUResourceUsage() {
   StringRef SymName;
@@ -6767,8 +6764,6 @@ bool AMDGPUAsmParser::ParseDirectiveAMDGPUResourceUsage() {
   uint32_t NumVGPR = 0, NumAGPR = 0, NumSGPR = 0;
   uint32_t NumNamedBarrier = 0, PrivateSegmentSize = 0;
   uint32_t UsesVCC = 0, UsesFlatScratch = 0, HasDynSizedStack = 0;
-  uint32_t HasRecursion = 0, HasIndirectCall = 0;
-  SmallVector<MCSymbol *, 4> Callees;
 
   while (true) {
     while (trySkipToken(AsmToken::EndOfStatement))
@@ -6781,14 +6776,6 @@ bool AMDGPUAsmParser::ParseDirectiveAMDGPUResourceUsage() {
     if (ID == ".end_amdgpu_resource_usage")
       break;
 
-    if (ID == ".callee") {
-      StringRef CalleeName;
-      if (getParser().parseIdentifier(CalleeName))
-        return TokError("expected symbol name after .callee");
-      Callees.push_back(getContext().getOrCreateSymbol(CalleeName));
-      continue;
-    }
-
     if (!Seen.insert(ID).second)
       return TokError("resource usage directives already declared");
 
@@ -6814,20 +6801,15 @@ bool AMDGPUAsmParser::ParseDirectiveAMDGPUResourceUsage() {
       UsesFlatScratch = Val;
     else if (ID == ".has_dyn_sized_stack")
       HasDynSizedStack = Val;
-    else if (ID == ".has_recursion")
-      HasRecursion = Val;
-    else if (ID == ".has_indirect_call")
-      HasIndirectCall = Val;
     else
       return TokError("unknown field '" + ID + "' in .amdgpu_resource_usage");
   }
 
-  for (StringRef StrRef :
-       {".num_vgpr", ".num_agpr", ".num_sgpr", ".named_barrier",
-        ".private_seg_size", ".uses_vcc", ".uses_flat_scratch",
-        ".has_dyn_sized_stack", ".has_recursion", ".has_indirect_call"}) {
+  for (StringRef StrRef : {".num_vgpr", ".num_agpr", ".num_sgpr",
+                           ".named_barrier", ".private_seg_size", ".uses_vcc",
+                           ".uses_flat_scratch", ".has_dyn_sized_stack"}) {
     if (!Seen.contains(StrRef))
-      return TokError("requires " + StrRef +
+      return TokError("missing required " + StrRef +
                       " directive in .amdgpu_resource_usage");
   }
 
@@ -6835,12 +6817,10 @@ bool AMDGPUAsmParser::ParseDirectiveAMDGPUResourceUsage() {
   Flags |= (UsesVCC ? 1u : 0u) << 0;
   Flags |= (UsesFlatScratch ? 1u : 0u) << 1;
   Flags |= (HasDynSizedStack ? 1u : 0u) << 2;
-  Flags |= (HasRecursion ? 1u : 0u) << 3;
-  Flags |= (HasIndirectCall ? 1u : 0u) << 4;
 
-  getTargetStreamer().emitResourceUsageEntry(
-      FnSym, NumVGPR, NumAGPR, NumSGPR, NumNamedBarrier, PrivateSegmentSize,
-      Flags, Callees);
+  getTargetStreamer().emitResourceUsageEntry(FnSym, NumVGPR, NumAGPR, NumSGPR,
+                                             NumNamedBarrier,
+                                             PrivateSegmentSize, Flags);
   return false;
 }
 
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
index 3631efb41d360..a9348edc12642 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
@@ -662,8 +662,7 @@ void AMDGPUTargetAsmStreamer::EmitAmdhsaKernelDescriptor(
 
 void AMDGPUTargetAsmStreamer::emitResourceUsageEntry(
     MCSymbol *FnSym, uint32_t NumVGPR, uint32_t NumAGPR, uint32_t NumSGPR,
-    uint32_t NumNamedBarrier, uint32_t PrivateSegmentSize, uint32_t Flags,
-    ArrayRef<MCSymbol *> Callees) {
+    uint32_t NumNamedBarrier, uint32_t PrivateSegmentSize, uint32_t Flags) {
   OS << "\t.amdgpu_resource_usage " << FnSym->getName() << '\n';
   OS << "\t\t.num_vgpr " << NumVGPR << '\n';
   OS << "\t\t.num_agpr " << NumAGPR << '\n';
@@ -673,10 +672,6 @@ void AMDGPUTargetAsmStreamer::emitResourceUsageEntry(
   OS << "\t\t.uses_vcc " << ((Flags >> 0) & 1) << '\n';
   OS << "\t\t.uses_flat_scratch " << ((Flags >> 1) & 1) << '\n';
   OS << "\t\t.has_dyn_sized_stack " << ((Flags >> 2) & 1) << '\n';
-  OS << "\t\t.has_recursion " << ((Flags >> 3) & 1) << '\n';
-  OS << "\t\t.has_indirect_call " << ((Flags >> 4) & 1) << '\n';
-  for (MCSymbol *Callee : Callees)
-    OS << "\t\t.callee " << Callee->getName() << '\n';
   OS << "\t.end_amdgpu_resource_usage\n";
 }
 
@@ -1084,15 +1079,14 @@ void AMDGPUTargetELFStreamer::EmitAmdhsaKernelDescriptor(
 
 void AMDGPUTargetELFStreamer::emitResourceUsageEntry(
     MCSymbol *FnSym, uint32_t NumVGPR, uint32_t NumAGPR, uint32_t NumSGPR,
-    uint32_t NumNamedBarrier, uint32_t PrivateSegmentSize, uint32_t Flags,
-    ArrayRef<MCSymbol *> Callees) {
+    uint32_t NumNamedBarrier, uint32_t PrivateSegmentSize, uint32_t Flags) {
   auto &S = getStreamer();
   auto &Context = S.getContext();
   const unsigned ResourceInfoEntrySize = 24;
 
   // TODO: Custom elf section type for support in linker.
   MCSection *Sec =
-      Context.getELFSection(".AMDGPU.resource_info", ELF::SHT_PROGBITS,
+      Context.getELFSection(".AMDGPU.resource_usage", ELF::SHT_PROGBITS,
                             ELF::SHF_EXCLUDE, ResourceInfoEntrySize);
   S.pushSection();
   S.switchSection(Sec);
@@ -1108,11 +1102,6 @@ void AMDGPUTargetELFStreamer::emitResourceUsageEntry(
   S.emitRelocDirective(*Offset, "R_AMDGPU_NONE",
                        MCSymbolRefExpr::create(FnSym, Context));
 
-  // Emit callee relocations at the same offset as the function identity.
-  for (MCSymbol *Callee : Callees)
-    S.emitRelocDirective(*Offset, "R_AMDGPU_NONE",
-                         MCSymbolRefExpr::create(Callee, Context));
-
   ++ResourceInfoEntryCount;
 
   // Emit the 24-byte struct.
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.h b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.h
index ab15c83d51d5c..d9cc6470fa50a 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.h
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.h
@@ -106,13 +106,12 @@ class AMDGPUTargetStreamer : public MCTargetStreamer {
                              const MCExpr *ReserveFlatScr) {}
 
   /// Emit a per-function resource usage entry into the
-  /// .AMDGPU.resource_info section, along with callee relocations.
+  /// .AMDGPU.resource_usage section.
   virtual void emitResourceUsageEntry(MCSymbol *FnSym, uint32_t NumVGPR,
                                       uint32_t NumAGPR, uint32_t NumSGPR,
                                       uint32_t NumNamedBarrier,
                                       uint32_t PrivateSegmentSize,
-                                      uint32_t Flags,
-                                      ArrayRef<MCSymbol *> Callees = {}) {}
+                                      uint32_t Flags) {}
 
   static StringRef getArchNameFromElfMach(unsigned ElfMach);
   static unsigned getElfMach(StringRef GPU);
@@ -182,8 +181,8 @@ class AMDGPUTargetAsmStreamer final : public AMDGPUTargetStreamer {
   void emitResourceUsageEntry(MCSymbol *FnSym, uint32_t NumVGPR,
                               uint32_t NumAGPR, uint32_t NumSGPR,
                               uint32_t NumNamedBarrier,
-                              uint32_t PrivateSegmentSize, uint32_t Flags,
-                              ArrayRef<MCSymbol *> Callees = {}) override;
+                              uint32_t PrivateSegmentSize,
+                              uint32_t Flags) override;
 };
 
 class AMDGPUTargetELFStreamer final : public AMDGPUTargetStreamer {
@@ -244,8 +243,8 @@ class AMDGPUTargetELFStreamer final : public AMDGPUTargetStreamer {
   void emitResourceUsageEntry(MCSymbol *FnSym, uint32_t NumVGPR,
                               uint32_t NumAGPR, uint32_t NumSGPR,
                               uint32_t NumNamedBarrier,
-                              uint32_t PrivateSegmentSize, uint32_t Flags,
-                              ArrayRef<MCSymbol *> Callees = {}) override;
+                              uint32_t PrivateSegmentSize,
+                              uint32_t Flags) override;
 };
 }
 #endif
diff --git a/llvm/test/CodeGen/AMDGPU/branch-relaxation-gfx1250.ll b/llvm/test/CodeGen/AMDGPU/branch-relaxation-gfx1250.ll
index 3b09af7368f50..b61c27e80b018 100644
--- a/llvm/test/CodeGen/AMDGPU/branch-relaxation-gfx1250.ll
+++ b/llvm/test/CodeGen/AMDGPU/branch-relaxation-gfx1250.ll
@@ -9,7 +9,7 @@
 ; RUN: llvm-readobj -r %t.o | FileCheck --check-prefix=OBJ %s
 
 ; OBJ:       Relocations [
-; OBJ-NEXT:   Section (5) .rel.AMDGPU.resource_info {
+; OBJ-NEXT:   Section (5) .rel.AMDGPU.resource_usage {
 ; OBJ-NEXT:     0x0 R_AMDGPU_NONE uniform_conditional_max_short_forward_branch
 ; OBJ-NEXT:     0x18 R_AMDGPU_NONE uniform_conditional_min_long_forward_branch
 ; OBJ-NEXT:     0x30 R_AMDGPU_NONE uniform_conditional_min_long_forward_vcnd_branch
diff --git a/llvm/test/CodeGen/AMDGPU/branch-relaxation.ll b/llvm/test/CodeGen/AMDGPU/branch-relaxation.ll
index daa9994461af1..ac72b632d1eca 100644
--- a/llvm/test/CodeGen/AMDGPU/branch-relaxation.ll
+++ b/llvm/test/CodeGen/AMDGPU/branch-relaxation.ll
@@ -10,7 +10,7 @@
 ; RUN: llvm-readobj -r %t.o | FileCheck --check-prefix=OBJ %s
 
 ; OBJ:       Relocations [
-; OBJ-NEXT:   Section (5) .rel.AMDGPU.resource_info {
+; OBJ-NEXT:   Section (5) .rel.AMDGPU.resource_usage {
 ; OBJ-NEXT:     0x0 R_AMDGPU_NONE uniform_conditional_max_short_forward_branch
 ; OBJ-NEXT:     0x18 R_AMDGPU_NONE uniform_conditional_min_long_forward_branch
 ; OBJ-NEXT:     0x30 R_AMDGPU_NONE uniform_conditional_min_long_forward_vcnd_branch
diff --git a/llvm/test/CodeGen/AMDGPU/lds-relocs.ll b/llvm/test/CodeGen/AMDGPU/lds-relocs.ll
index 2d45b28bab888..6ef8fe71ee2e8 100644
--- a/llvm/test/CodeGen/AMDGPU/lds-relocs.ll
+++ b/llvm/test/CodeGen/AMDGPU/lds-relocs.ll
@@ -9,7 +9,7 @@
 ; ELF-NEXT:     0x{{[0-9A-F]*}} R_AMDGPU_ABS32_LO lds.external
 ; ELF-NEXT:     0x{{[0-9A-F]*}} R_AMDGPU_ABS32_LO lds.defined
 ; ELF-NEXT:   }
-; ELF-NEXT:   Section (6) .rel.AMDGPU.resource_info {
+; ELF-NEXT:   Section (6) .rel.AMDGPU.resource_usage {
 ; ELF-NEXT:     0x0 R_AMDGPU_NONE test_basic
 ; ELF-NEXT:   }
 ; ELF-NEXT: ]
diff --git a/llvm/test/CodeGen/AMDGPU/resource-info-section.ll b/llvm/test/CodeGen/AMDGPU/resource-info-section.ll
index 9aa8f401a9f82..e745dffaa4fab 100644
--- a/llvm/test/CodeGen/AMDGPU/resource-info-section.ll
+++ b/llvm/test/CodeGen/AMDGPU/resource-info-section.ll
@@ -12,8 +12,6 @@
 ; ASM-NEXT: .uses_vcc 0
 ; ASM-NEXT: .uses_flat_scratch 0
 ; ASM-NEXT: .has_dyn_sized_stack 0
-; ASM-NEXT: .has_recursion 0
-; ASM-NEXT: .has_indirect_call 0
 ; ASM-NEXT: .end_amdgpu_resource_usage
 define void @leaf() {
   ret void
@@ -29,70 +27,13 @@ define void @leaf() {
 ; ASM-NEXT: .uses_vcc 1
 ; ASM-NEXT: .uses_flat_scratch 0
 ; ASM-NEXT: .has_dyn_sized_stack 0
-; ASM-NEXT: .has_recursion 0
-; ASM-NEXT: .has_indirect_call 0
 ; ASM-NEXT: .end_amdgpu_resource_usage
 define void @use_vcc() {
   call void asm sideeffect "", "~{vcc}" ()
   ret void
 }
 
-; ASM-LABEL: {{^}}caller:
-; ASM: .amdgpu_resource_usage caller
-; ASM:      .callee use_vcc
-; ASM-NEXT: .end_amdgpu_resource_usage
-define void @caller() {
-  call void @use_vcc()
-  ret void
-}
-
-; ASM-LABEL: {{^}}kernel:
-; ASM: .amdgpu_resource_usage kernel
-; ASM:      .callee caller
-; ASM-NEXT: .end_amdgpu_resource_usage
-define amdgpu_kernel void @kernel() {
-  call void @caller()
-  ret void
-}
-
-
-; ASM-LABEL: {{^}}rcaller2:
-; ASM: .amdgpu_resource_usage rcaller2
-; ASM:      .callee rcaller1
-; ASM-NEXT: .end_amdgpu_resource_usage
-; ASM-LABEL: {{^}}rcaller1:
-; ASM: .amdgpu_resource_usage rcaller1
-; ASM:      .callee rcaller2
-; ASM-NEXT: .end_amdgpu_resource_usage
-define void @rcaller1() {
-  call void @rcaller2()
-  ret void
-}
-define void @rcaller2() {
-  call void @rcaller1()
-  ret void
-}
-
-; ASM-LABEL: {{^}}kernel_recurse:
-; ASM: .amdgpu_resource_usage kernel
-; ASM:      .callee rcaller1
-; ASM-NEXT: .end_amdgpu_resource_usage
-define amdgpu_kernel void @kernel_recurse() {
-  call void @rcaller1()
-  ret void
-}
-
-; RELOC:      Relocation section '.rela.AMDGPU.resource_info'
+; RELOC:      Relocation section '.rela.AMDGPU.resource_usage'
 ; RELOC:      0000000000000000 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} leaf + 0
 ; RELOC-NEXT: 0000000000000018 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} use_vcc + 0
-; RELOC-NEXT: 0000000000000030 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} caller + 0
-; RELOC-NEXT: 0000000000000030 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} use_vcc + 0
-; RELOC-NEXT: 0000000000000048 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} kernel + 0
-; RELOC-NEXT: 0000000000000048 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} caller + 0
-; RELOC-NEXT: 0000000000000060 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} rcaller2 + 0
-; RELOC-NEXT: 0000000000000060 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} rcaller1 + 0
-; RELOC-NEXT: 0000000000000078 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} rcaller1 + 0
-; RELOC-NEXT: 0000000000000078 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} rcaller2 + 0
-; RELOC-NEXT: 0000000000000090 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} kernel_recurse + 0
-; RELOC-NEXT: 0000000000000090 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} rcaller1 + 0
 
diff --git a/llvm/test/MC/AMDGPU/amdgpu-resource-usage-err.s b/llvm/test/MC/AMDGPU/amdgpu-resource-usage-err.s
index e7db9189adc5b..77e1f30597ebd 100644
--- a/llvm/test/MC/AMDGPU/amdgpu-resource-usage-err.s
+++ b/llvm/test/MC/AMDGPU/amdgpu-resource-usage-err.s
@@ -4,22 +4,6 @@
 	.amdgpu_resource_usage
 // CHECK: :[[@LINE-1]]:{{[0-9]+}}: error: expected symbol name after .amdgpu_resource_usage
 
-// Missing symbol name after .callee.
-	.amdgpu_resource_usage fn1
-		.num_vgpr 0
-		.num_agpr 0
-		.num_sgpr 0
-		.named_barrier 0
-		.private_seg_size 0
-		.uses_vcc 0
-		.uses_flat_scratch 0
-		.has_dyn_sized_stack 0
-		.has_recursion 0
-		.has_indirect_call 0
-		.callee
-// CHECK: :[[@LINE-1]]:{{[0-9]+}}: error: expected symbol name after .callee
-	.end_amdgpu_resource_usage
-
 // Duplicate field directive.
 	.amdgpu_resource_usage fn2
 		.num_vgpr 0
@@ -43,8 +27,6 @@
 		.uses_vcc 0
 		.uses_flat_scratch 0
 		.has_dyn_sized_stack 0
-		.has_recursion 0
-		.has_indirect_call 0
 		.bogus_field 42
 // CHECK: :[[@LINE-1]]:{{[0-9]+}}: error: unknown field '.bogus_field' in .amdgpu_resource_usage
 	.end_amdgpu_resource_usage
@@ -58,7 +40,5 @@
 		.uses_vcc 0
 		.uses_flat_scratch 0
 		.has_dyn_sized_stack 0
-		.has_recursion 0
-		.has_indirect_call 0
 	.end_amdgpu_resource_usage
-// CHECK: :[[@LINE-1]]:{{[0-9]+}}: error: requires .num_sgpr directive in .amdgpu_resource_usage
+// CHECK: :[[@LINE-1]]:{{[0-9]+}}: error: missing required .num_sgpr directive in .amdgpu_resource_usage
diff --git a/llvm/test/MC/AMDGPU/amdgpu-resource-usage.s b/llvm/test/MC/AMDGPU/amdgpu-resource-usage.s
index 7d10684361526..c0483ec63a50a 100644
--- a/llvm/test/MC/AMDGPU/amdgpu-resource-usage.s
+++ b/llvm/test/MC/AMDGPU/amdgpu-resource-usage.s
@@ -30,9 +30,6 @@ foo:
 // ASM-NEXT: .uses_vcc 1
 // ASM-NEXT: .uses_flat_scratch 0
 // ASM-NEXT: .has_dyn_sized_stack 0
-// ASM-NEXT: .has_recursion 0
-// ASM-NEXT: .has_indirect_call 0
-// ASM-NEXT: .callee baz
 // ASM-NEXT: .end_amdgpu_resource_usage
 	.amdgpu_resource_usage bar
 		.num_vgpr 65
@@ -43,9 +40,6 @@ foo:
 		.uses_vcc 1
 		.uses_flat_scratch 0
 		.has_dyn_sized_stack 0
-		.has_recursion 0
-		.has_indirect_call 0
-		.callee baz
 	.end_amdgpu_resource_usage
 
 // ASM: .amdgpu_resource_usage foo
@@ -57,10 +51,6 @@ foo:
 // ASM-NEXT: .uses_vcc 0
 // ASM-NEXT: .uses_flat_scratch 1
 // ASM-NEXT: .has_dyn_sized_stack 1
-// ASM-NEXT: .has_recursion 1
-// ASM-NEXT: .has_indirect_call 1
-// ASM-NEXT: .callee bar
-// ASM-NEXT: .callee external_fn
 // ASM-NEXT: .end_amdgpu_resource_usage
 	.amdgpu_resource_usage foo
 		.num_vgpr 10
@@ -71,10 +61,6 @@ foo:
 		.uses_vcc 0
 		.uses_flat_scratch 1
 		.has_dyn_sized_stack 1
-		.has_recursion 1
-		.has_indirect_call 1
-		.callee bar
-		.callee external_fn
 	.end_amdgpu_resource_usage
 
 // ASM: .amdgpu_resource_usage baz
@@ -86,8 +72,6 @@ foo:
 // ASM-NEXT: .uses_vcc 0
 // ASM-NEXT: .uses_flat_scratch 0
 // ASM-NEXT: .has_dyn_sized_stack 0
-// ASM-NEXT: .has_recursion 0
-// ASM-NEXT: .has_indirect_call 0
 // ASM-NEXT: .end_amdgpu_resource_usage
 	.amdgpu_resource_usage baz
 		.num_vgpr 2
@@ -98,17 +82,12 @@ foo:
 		.uses_vcc 0
 		.uses_flat_scratch 0
 		.has_dyn_sized_stack 0
-		.has_recursion 0
-		.has_indirect_call 0
 	.end_amdgpu_resource_usage
 
-// ELF-SEC: .AMDGPU.resource_info PROGBITS {{[0-9a-f]+}} {{[0-9a-f]+}} 000048 18 E 0 0 1
+// ELF-SEC: .AMDGPU.resource_usage PROGBITS {{[0-9a-f]+}} {{[0-9a-f]+}} 000048 18 E 0 0 1
 
-// ELF-RELOC:      Relocation section '.rela.AMDGPU.resource_info'
+// ELF-RELOC:      Relocation section '.rela.AMDGPU.resource_usage'
 // ELF-RELOC:      0000000000000000 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} bar + 0
-// ELF-RELOC-NEXT: 0000000000000000 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} baz + 0
 // ELF-RELOC-NEXT: 0000000000000018 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} foo + 0
-// ELF-RELOC-NEXT: 0000000000000018 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} bar + 0
-// ELF-RELOC-NEXT: 0000000000000018 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} external_fn + 0
 // ELF-RELOC-NEXT: 0000000000000030 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} baz + 0
 

>From 108fccf884c1a45f3f298b7e194313e280ebeafa Mon Sep 17 00:00:00 2001
From: Janek van Oirschot <janek.vanoirschot at amd.com>
Date: Fri, 27 Mar 2026 15:50:36 +0000
Subject: [PATCH 3/3] Use conservative defaults

---
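
Note on the conservative defaults: a small sketch of the arithmetic behind the
default private segment size chosen in finish(). The Gen enum and the function
name are hypothetical stand-ins for the isGFX12Plus()/isGFX11() subtarget
checks; only the three formulas and the 0x7 flags value come from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical generation tag replacing the MCSubtargetInfo queries.
enum class Gen { PreGFX11, GFX11, GFX12Plus };

// All three boolean flags set: uses_vcc | uses_flat_scratch |
// has_dyn_sized_stack.
constexpr uint32_t DefaultFlags = 0x7;

// Maximum wave scratch size per generation, using the same expressions as the
// finish() hunk in this patch.
inline uint32_t defaultPrivateSegmentSize(Gen G) {
  switch (G) {
  case Gen::GFX12Plus:
    return (64u * 4u) * ((1u << 18) - 1u);
  case Gen::GFX11:
    return (64u * 4u) * ((1u << 15) - 1u);
  case Gen::PreGFX11:
    return (256u * 4u) * ((1u << 13) - 1u);
  }
  assert(false && "unknown generation");
  return 0;
}
```

Every result fits in the i32 private_seg_size field; the largest value
(GFX12+) is 67108608 bytes.
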
 llvm/docs/AMDGPUUsage.rst                     | 20 +++++-----
 .../AMDGPU/AsmParser/AMDGPUAsmParser.cpp      |  8 ----
 .../MCTargetDesc/AMDGPUTargetStreamer.cpp     | 38 ++++++++++++++++++-
 .../MCTargetDesc/AMDGPUTargetStreamer.h       |  4 ++
 .../MC/AMDGPU/amdgpu-resource-usage-err.s     | 11 ------
 llvm/test/MC/AMDGPU/amdgpu-resource-usage.s   |  9 ++++-
 6 files changed, 59 insertions(+), 31 deletions(-)

diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index cb54603e42753..584d7c30e796d 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -2419,16 +2419,16 @@ callees' resource usage.
      ====================================== ========= ======================== ===================================================================================================
      Directive                              Required? Occurrences Per Function Description
      ====================================== ========= ======================== ===================================================================================================
-     .amdgpu_resource_usage <function name> yes       1                        Denotes the start of resource usage directives for <function name>
-     .end_amdgpu_resource_usage             yes       1                        Denotes end of resource usage directives
-     .num_vgpr <i32>                        yes       1                        Number of VGPRs used by the function
-     .num_agpr <i32>                        yes       1                        Number of AGPRs used by the function
-     .num_sgpr <i32>                        yes       1                        Number of SGPRs used by the function
-     .named_barrier <i32>                   yes       1                        Number of named barriers used by the function
-     .private_seg_size <i32>                yes       1                        Total stack size required for the function
-     .uses_vcc <i1>                         yes       1                        Boolean denoting whether vcc is used in the function
-     .uses_flat_scratch <i1>                yes       1                        Boolean denoting whether flat scratch is used in the function
-     .has_dyn_sized_stack <i1>              yes       1                        Boolean denoting whether stack in the function is dynamically sized
+     .amdgpu_resource_usage <function name> no        0 or 1                   Denotes the start of resource usage directives for <function name>
+     .end_amdgpu_resource_usage             no        0 or 1                   Denotes end of resource usage directives
+     .num_vgpr <i32>                        no        0 or 1                   Number of VGPRs used by the function (default: max addressable VGPRs)
+     .num_agpr <i32>                        no        0 or 1                   Number of AGPRs used by the function (default: max addressable AGPRs, or 0 if unsupported)
+     .num_sgpr <i32>                        no        0 or 1                   Number of SGPRs used by the function (default: max addressable SGPRs)
+     .named_barrier <i32>                   no        0 or 1                   Number of named barriers used by the function (default: 16)
+     .private_seg_size <i32>                no        0 or 1                   Total stack size required for the function (default: max wave scratch size)
+     .uses_vcc <i1>                         no        0 or 1                   Boolean denoting whether vcc is used in the function (default: 1)
+     .uses_flat_scratch <i1>                no        0 or 1                   Boolean denoting whether flat scratch is used in the function (default: 1)
+     .has_dyn_sized_stack <i1>              no        0 or 1                   Boolean denoting whether stack in the function is dynamically sized (default: 1)
      ====================================== ========= ======================== ===================================================================================================
 
 Function Resource Usage ELF Section
diff --git a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
index 3f703bf5f2ba3..b305d3450534f 100644
--- a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+++ b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
@@ -6805,14 +6805,6 @@ bool AMDGPUAsmParser::ParseDirectiveAMDGPUResourceUsage() {
       return TokError("unknown field '" + ID + "' in .amdgpu_resource_usage");
   }
 
-  for (StringRef StrRef : {".num_vgpr", ".num_agpr", ".num_sgpr",
-                           ".named_barrier", ".private_seg_size", ".uses_vcc",
-                           ".uses_flat_scratch", ".has_dyn_sized_stack"}) {
-    if (!Seen.contains(StrRef))
-      return TokError("missing required " + StrRef +
-                      " directive in .amdgpu_resource_usage");
-  }
-
   uint32_t Flags = 0;
   Flags |= (UsesVCC ? 1u : 0u) << 0;
   Flags |= (UsesFlatScratch ? 1u : 0u) << 1;
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
index a9348edc12642..b829a4c4f349f 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
@@ -15,6 +15,7 @@
 #include "AMDGPUMCKernelDescriptor.h"
 #include "AMDGPUMCTargetDesc.h"
 #include "AMDGPUPTNote.h"
+#include "SIDefines.h"
 #include "Utils/AMDGPUBaseInfo.h"
 #include "Utils/AMDKernelCodeTUtils.h"
 #include "llvm/BinaryFormat/AMDGPUMetadataVerifier.h"
@@ -25,6 +26,7 @@
 #include "llvm/MC/MCELFObjectWriter.h"
 #include "llvm/MC/MCELFStreamer.h"
 #include "llvm/MC/MCSubtargetInfo.h"
+#include "llvm/MC/MCSymbolELF.h"
 #include "llvm/Support/AMDGPUMetadata.h"
 #include "llvm/Support/AMDHSAKernelDescriptor.h"
 #include "llvm/Support/CommandLine.h"
@@ -689,8 +691,40 @@ MCELFStreamer &AMDGPUTargetELFStreamer::getStreamer() {
 
 // A hook for emitting stuff at the end.
 // We use it for emitting the accumulated PAL metadata as a .note record.
-// The PAL metadata is reset after it is emitted.
+// The PAL metadata is reset after it is emitted. This hook also emits default
+// function resource usage values for functions whose usage is undetermined.
 void AMDGPUTargetELFStreamer::finish() {
+  if (STI.getTargetTriple().getOS() == Triple::AMDHSA) {
+    // Use conservative defaults.
+    uint32_t NumVGPR = IsaInfo::getAddressableNumVGPRs(&STI, 0);
+    uint32_t NumAGPR =
+        hasMAIInsts(STI) ? IsaInfo::getAddressableNumArchVGPRs(&STI) : 0;
+    uint32_t NumSGPR = IsaInfo::getAddressableNumSGPRs(&STI);
+    uint32_t NumNamedBarrier = AMDGPU::Barrier::NAMED_BARRIER_LAST;
+    // Computed similarly to GCNSubtarget::getMaxWaveScratchSize.
+    uint32_t PrivateSegmentSize;
+    if (isGFX12Plus(STI))
+      PrivateSegmentSize = (64 * 4) * ((1 << 18) - 1);
+    else if (isGFX11(STI))
+      PrivateSegmentSize = (64 * 4) * ((1 << 15) - 1);
+    else
+      PrivateSegmentSize = (256 * 4) * ((1 << 13) - 1);
+    // Assumes all boolean flags set: uses_vcc | uses_flat_scratch |
+    // has_dyn_sized_stack.
+    uint32_t Flags = 0x7;
+
+    for (const MCSymbol &Sym : getStreamer().getAssembler().symbols()) {
+      // Only emit conservative defaults for functions with no resource usage
+      // info entries yet.
+      auto &SymELF = static_cast<const MCSymbolELF &>(Sym);
+      if (SymELF.getType() == ELF::STT_FUNC && SymELF.isDefined() &&
+          !FunctionsWithResourceUsage.contains(&Sym))
+        emitResourceUsageEntry(const_cast<MCSymbol *>(&Sym), NumVGPR, NumAGPR,
+                               NumSGPR, NumNamedBarrier, PrivateSegmentSize,
+                               Flags);
+    }
+  }
+
   ELFObjectWriter &W = getStreamer().getWriter();
   W.setELFHeaderEFlags(getEFlags());
   W.setOverrideABIVersion(
@@ -1080,6 +1114,8 @@ void AMDGPUTargetELFStreamer::EmitAmdhsaKernelDescriptor(
 void AMDGPUTargetELFStreamer::emitResourceUsageEntry(
     MCSymbol *FnSym, uint32_t NumVGPR, uint32_t NumAGPR, uint32_t NumSGPR,
     uint32_t NumNamedBarrier, uint32_t PrivateSegmentSize, uint32_t Flags) {
+  FunctionsWithResourceUsage.insert(FnSym);
+
   auto &S = getStreamer();
   auto &Context = S.getContext();
   const unsigned ResourceInfoEntrySize = 24;
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.h b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.h
index d9cc6470fa50a..680b35337bcd7 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.h
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.h
@@ -12,6 +12,7 @@
 #include "Utils/AMDGPUBaseInfo.h"
 #include "Utils/AMDGPUPALMetadata.h"
 #include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/SmallPtrSet.h"
 #include "llvm/MC/MCStreamer.h"
 
 namespace llvm {
@@ -192,6 +193,9 @@ class AMDGPUTargetELFStreamer final : public AMDGPUTargetStreamer {
   // Counter for computing relocation offsets.
   unsigned ResourceInfoEntryCount = 0;
 
+  // Track functions with explicit resource usage entries.
+  SmallPtrSet<const MCSymbol *, 8> FunctionsWithResourceUsage;
+
   void EmitNote(StringRef Name, const MCExpr *DescSize, unsigned NoteType,
                 function_ref<void(MCELFStreamer &)> EmitDesc);
 
diff --git a/llvm/test/MC/AMDGPU/amdgpu-resource-usage-err.s b/llvm/test/MC/AMDGPU/amdgpu-resource-usage-err.s
index 77e1f30597ebd..7a27a73f857a8 100644
--- a/llvm/test/MC/AMDGPU/amdgpu-resource-usage-err.s
+++ b/llvm/test/MC/AMDGPU/amdgpu-resource-usage-err.s
@@ -31,14 +31,3 @@
 // CHECK: :[[@LINE-1]]:{{[0-9]+}}: error: unknown field '.bogus_field' in .amdgpu_resource_usage
 	.end_amdgpu_resource_usage
 
-// Missing required field (.num_sgpr omitted).
-	.amdgpu_resource_usage fn5
-		.num_vgpr 0
-		.num_agpr 0
-		.named_barrier 0
-		.private_seg_size 0
-		.uses_vcc 0
-		.uses_flat_scratch 0
-		.has_dyn_sized_stack 0
-	.end_amdgpu_resource_usage
-// CHECK: :[[@LINE-1]]:{{[0-9]+}}: error: missing required .num_sgpr directive in .amdgpu_resource_usage
diff --git a/llvm/test/MC/AMDGPU/amdgpu-resource-usage.s b/llvm/test/MC/AMDGPU/amdgpu-resource-usage.s
index c0483ec63a50a..68e41f0288f3d 100644
--- a/llvm/test/MC/AMDGPU/amdgpu-resource-usage.s
+++ b/llvm/test/MC/AMDGPU/amdgpu-resource-usage.s
@@ -19,6 +19,10 @@ baz:
 foo:
   s_endpgm
 
+.type quux,@function
+quux:
+  s_endpgm
+
 .extern external_fn
 
 // ASM: .amdgpu_resource_usage bar
@@ -84,10 +88,13 @@ foo:
 		.has_dyn_sized_stack 0
 	.end_amdgpu_resource_usage
 
-// ELF-SEC: .AMDGPU.resource_usage PROGBITS {{[0-9a-f]+}} {{[0-9a-f]+}} 000048 18 E 0 0 1
+// ASM-NOT: .amdgpu_resource_usage quux
+
+// ELF-SEC: .AMDGPU.resource_usage PROGBITS {{[0-9a-f]+}} {{[0-9a-f]+}} 000060 18 E 0 0 1
 
 // ELF-RELOC:      Relocation section '.rela.AMDGPU.resource_usage'
 // ELF-RELOC:      0000000000000000 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} bar + 0
 // ELF-RELOC-NEXT: 0000000000000018 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} foo + 0
 // ELF-RELOC-NEXT: 0000000000000030 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} baz + 0
+// ELF-RELOC-NEXT: 0000000000000048 {{[0-9a-f]+}} R_AMDGPU_NONE {{[0-9a-f]+}} quux + 0
 
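The updated section size and relocation offsets in the test follow from the 24-byte entry size: adding the default entry for quux gives four entries, so the section grows from 0x48 to 0x60 and the new entry sits at offset 0x48. A minimal sketch of that arithmetic, together with the conservative private segment size defaults from finish(), is below. The names kEntrySize, entryOffset, Gen, defaultPrivateSegmentSize and packFlags are hypothetical and exist only for illustration. Placing has_dyn_sized_stack in bit 2 is an assumption inferred from the 0x7 default; the hunks shown only confirm bits 0 and 1.

```cpp
#include <cstdint>

// Entry size in bytes, matching ResourceInfoEntrySize in the patch.
constexpr uint32_t kEntrySize = 24;

// Byte offset of the entry with the given 0-based index in the section.
constexpr uint32_t entryOffset(uint32_t Index) { return Index * kEntrySize; }

// Hypothetical stand-in for the isGFX12Plus / isGFX11 / else branches.
enum class Gen { PreGFX11, GFX11, GFX12Plus };

// Conservative private segment size, same expressions as in finish().
constexpr uint32_t defaultPrivateSegmentSize(Gen G) {
  switch (G) {
  case Gen::GFX12Plus:
    return (64 * 4) * ((1u << 18) - 1);
  case Gen::GFX11:
    return (64 * 4) * ((1u << 15) - 1);
  case Gen::PreGFX11:
    return (256 * 4) * ((1u << 13) - 1);
  }
  return 0;
}

// Flag packing: bits 0 and 1 as in the parser; bit 2 is an assumption.
constexpr uint32_t packFlags(bool UsesVCC, bool UsesFlatScratch,
                             bool HasDynSizedStack) {
  return (UsesVCC ? 1u : 0u) << 0 | (UsesFlatScratch ? 1u : 0u) << 1 |
         (HasDynSizedStack ? 1u : 0u) << 2;
}
```

With four entries (bar, foo, baz, quux), entryOffset(3) is 0x48 and 4 * kEntrySize is 0x60, which matches the updated ELF-SEC and ELF-RELOC check lines.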


