[clang] [llvm] Computing, storing, and restoring conservative call graphs with LLVM (PR #80104)

via cfe-commits cfe-commits at lists.llvm.org
Tue Jan 30 22:01:41 PST 2024


https://github.com/Prabhuk created https://github.com/llvm/llvm-project/pull/80104

Reviving the original patchset: https://reviews.llvm.org/D105916



>From 6f50d48c44f36b7e883d0b498322c498c3d8897a Mon Sep 17 00:00:00 2001
From: Necip Fazil Yildiran <necip at google.com>
Date: Tue, 7 Nov 2023 10:49:21 -0800
Subject: [PATCH 1/9] [Bitcode][Asm] Parse metadata value from operand bundles

Support parsing operand bundles with metadata value from LLVM IR.

This is almost an NFC since there is no existing producers/consumers for
metadata operand bundle values, except the following call graph section
patches in revision:

Producer: https://reviews.llvm.org/D105909
Consumer: https://reviews.llvm.org/D105915

The test in this revision only ensures that the parser does not
complain. The above-mentioned patches implicity test if passed metadata
value is correct.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D107038
---
 llvm/lib/AsmParser/LLParser.cpp                  | 16 ++++++++++++++--
 .../2021-07-22-Parse-Metadata-Operand-Bundles.ll |  9 +++++++++
 2 files changed, 23 insertions(+), 2 deletions(-)
 create mode 100644 llvm/test/Bitcode/2021-07-22-Parse-Metadata-Operand-Bundles.ll

diff --git a/llvm/lib/AsmParser/LLParser.cpp b/llvm/lib/AsmParser/LLParser.cpp
index d6c5993797de1..6029a2c671827 100644
--- a/llvm/lib/AsmParser/LLParser.cpp
+++ b/llvm/lib/AsmParser/LLParser.cpp
@@ -3002,9 +3002,21 @@ bool LLParser::parseOptionalOperandBundles(
         return true;
 
       Type *Ty = nullptr;
-      Value *Input = nullptr;
-      if (parseType(Ty) || parseValue(Ty, Input, PFS))
+      if (parseType(Ty))
         return true;
+
+      Value *Input = nullptr;
+      // FIXME: Metadata operand bundle value is garbage when LLVM IR is
+      // compiled to bitcode, then disassembled back to LLVM IR. See D107039
+      // for the reproducers, and https://bugs.llvm.org/show_bug.cgi?id=51264
+      // for the bug report.
+      if (Ty->isMetadataTy()) {
+        if (parseMetadataAsValue(Input, PFS))
+          return true;
+      } else {
+        if (parseValue(Ty, Input, PFS))
+          return true;
+      }
       Inputs.push_back(Input);
     }
 
diff --git a/llvm/test/Bitcode/2021-07-22-Parse-Metadata-Operand-Bundles.ll b/llvm/test/Bitcode/2021-07-22-Parse-Metadata-Operand-Bundles.ll
new file mode 100644
index 0000000000000..35c6131bbad60
--- /dev/null
+++ b/llvm/test/Bitcode/2021-07-22-Parse-Metadata-Operand-Bundles.ll
@@ -0,0 +1,9 @@
+; This test ensures that we parse metadata operand bundle values.
+; RUN: llvm-as < %s
+
+declare void @callee()
+
+define void @call_with_operand_bundle() {
+  call void @callee() [ "op_type"(metadata !"metadata_string") ]
+  ret void
+}

>From 9b4201d9eef0daf1f4fb8abdb1158bae3a809fb2 Mon Sep 17 00:00:00 2001
From: Necip Fazil Yildiran <necip at google.com>
Date: Tue, 7 Nov 2023 10:49:23 -0800
Subject: [PATCH 2/9] [CallGraphSection] Add call graph section options and
 documentation

This is the first of the patch series that adds the support for
computing, storing, and restoring call graphs with LLVM. This adds the
options and the design documentation for computing and storing the call
graphs.

Inferring indirect call targets from a binary is challenging without
source-level information. Hence, the reconstruction of a fine-grained
call graph from the binary is unfeasible for indirect/virtual calls.
To address this, designed solution is to collect the necessary information
to construct the call graph while the source information is present, and
store it in a non-code section of the binary. To enable, use
-fcall-graph-section for Clang, or --call-graph-section for LLVM.

Original RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-June/151044.html
Updated RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D105907
---
 clang/docs/CallGraphSection.rst              | 251 +++++++++++++++++++
 clang/include/clang/Basic/CodeGenOptions.def |   2 +
 clang/include/clang/Driver/Options.td        |   4 +
 clang/lib/CodeGen/BackendUtil.cpp            |   1 +
 clang/lib/Driver/ToolChains/Clang.cpp        |   4 +
 clang/test/Driver/call-graph-section.c       |   5 +
 llvm/include/llvm/CodeGen/CommandFlags.h     |   2 +
 llvm/include/llvm/Target/TargetOptions.h     |   5 +-
 llvm/lib/CodeGen/CommandFlags.cpp            |   7 +
 9 files changed, 280 insertions(+), 1 deletion(-)
 create mode 100644 clang/docs/CallGraphSection.rst
 create mode 100644 clang/test/Driver/call-graph-section.c

diff --git a/clang/docs/CallGraphSection.rst b/clang/docs/CallGraphSection.rst
new file mode 100644
index 0000000000000..d5db507f9ca9a
--- /dev/null
+++ b/clang/docs/CallGraphSection.rst
@@ -0,0 +1,251 @@
+==================
+Call Graph Section
+==================
+
+Introduction
+============
+
+With ``-fcall-graph-section``, the compiler will create a call graph section 
+in the object file. It will include type identifiers for indirect calls and 
+targets. This information can be used to map indirect calls to their receivers 
+with matching types. A complete and high-precision call graph can be 
+reconstructed by complementing this information with disassembly 
+(see ``llvm-objdump --call-graph-info``).
+
+Semantics
+=========
+
+A coarse-grained, type-agnostic call graph may allow indirect calls to target
+any function in the program. This approach ensures completeness since no
+indirect call edge is missing. However, it is generally poor in precision
+due to having unneeded edges.
+
+A call graph section provides type identifiers for indirect calls and targets.
+This information can be used to restrict the receivers of an indirect target to
+indirect calls with matching type. Consequently, the precision for indirect
+call edges are improved while maintaining the completeness.
+
+The ``llvm-objdump`` utility provides a ``--call-graph-info`` option to extract
+full call graph information by parsing the content of the call graph section
+and disassembling the program for complementary information, e.g., direct
+calls.
+
+Section layout
+==============
+
+A call graph section consists of zero or more call graph entries.
+Each entry contains information on a function and its indirect calls.
+
+An entry of a call graph section has the following layout in the binary:
+
++---------------------+-----------------------------------------------------------------------+
+| Element             | Content                                                               |
++=====================+=======================================================================+
+| FormatVersionNumber | Format version number.                                                |
++---------------------+-----------------------------------------------------------------------+
+| FunctionEntryPc     | Function entry address.                                               |
++---------------------+-----------------------------------+-----------------------------------+
+|                     | A flag whether the function is an | - 0: not an indirect target       |
+| FunctionKind        | indirect target, and if so,       | - 1: indirect target, unknown id  |
+|                     | whether its type id is known.     | - 2: indirect target, known id    |
++---------------------+-----------------------------------+-----------------------------------+
+| FunctionTypeId      | Type id for the indirect target. Present only when FunctionKind is 2. |
++---------------------+-----------------------------------------------------------------------+
+| CallSiteCount       | Number of type id to indirect call site mappings that follow.         |
++---------------------+-----------------------------------------------------------------------+
+| CallSiteList        | List of type id and indirect call site pc pairs.                      |
++---------------------+-----------------------------------------------------------------------+
+
+Each element in an entry (including each element of the contained lists and
+pairs) occupies 64-bit space.
+
+The format version number is repeated per entry to support concatenation of
+call graph sections with different format versions by the linker.
+
+As of now, the only supported format version is described above and has version
+number 0.
+
+Type identifiers
+================
+
+The type for an indirect call or target is the function signature.
+The mapping from a type to an identifier is an ABI detail.
+In the current experimental implementation, an identifier of type T is
+computed as follows:
+
+  -  Obtain the generalized mangled name for “typeinfo name for T”.
+  -  Compute MD5 hash of the name as a string.
+  -  Reinterpret the first 8 bytes of the hash as a little-endian 64-bit integer.
+
+To avoid mismatched pointer types, generalizations are applied.
+Pointers in return and argument types are treated as equivalent as long as the
+qualifiers for the type they point to match.
+For example, ``char*``, ``char**``, and ``int*`` are considered equivalent
+types. However, ``char*`` and ``const char*`` are considered separate types.
+
+Missing type identifiers
+========================
+
+For functions, two cases need to be considered. First, if the compiler cannot
+deduce a type id for an indirect target, it will be listed as an indirect target
+without a type id. Second, if an object without a call graph section gets
+linked, the final call graph section will lack information on functions from
+the object. For completeness, these functions need to be taken as receiver to
+any indirect call regardless of their type id.
+``llvm-objdump --call-graph-info`` lists these functions as indirect targets
+with `UNKNOWN` type id.
+
+For indirect calls, current implementation guarantees a type id for each
+compiled call. However, if an object without a call graph section gets linked,
+no type id will be present for its indirect calls. For completeness, these calls
+need to be taken to target any indirect target regardless of their type id. For
+indirect calls, ``llvm-objdump --call-graph-info`` prints 1) a complete list of
+indirect calls, 2) type id to indirect call mappings. The difference of these
+lists allow to deduce the indirect calls with missing type ids.
+
+TODO: measure and report the ratio of missed type ids
+
+Performance
+===========
+
+A call graph section does not affect the executable code and does not occupy
+memory during process execution. Therefore, there is no performance overhead.
+
+The scheme has not yet been optimized for binary size.
+
+TODO: measure and report the increase in the binary size
+
+Example
+=======
+
+For example, consider the following C++ code:
+
+.. code-block:: cpp
+
+    namespace {
+      // Not an indirect target
+      void foo() {}
+    }
+
+    // Indirect target 1
+    void bar() {}
+
+    // Indirect target 2
+    int baz(char a, float *b) {
+      return 0;
+    }
+
+    // Indirect target 3
+    int main() {
+      char a;
+      float b;
+      void (*fp_bar)() = bar;
+      int (*fp_baz1)(char, float*) = baz;
+      int (*fp_baz2)(char, float*) = baz;
+
+      // Indirect call site 1
+      fp_bar();
+
+      // Indirect call site 2
+      fp_baz1(a, &b);
+
+      // Indirect call site 3: shares the type id with indirect call site 2
+      fp_baz2(a, &b);
+
+      // Direct call sites
+      foo();
+      bar();
+      baz(a, &b);
+
+      return 0;
+    }
+
+Following will compile it with a call graph section created in the binary:
+
+.. code-block:: bash
+
+  $ clang -fcall-graph-section example.cpp
+
+During the construction of the call graph section, the type identifiers are 
+computed as follows:
+
++---------------+-----------------------+----------------------------+----------------------------+
+| Function name | Generalized signature | Mangled name (itanium ABI) | Numeric type id (md5 hash) |
++===============+=======================+============================+============================+
+|  bar          | void ()               | _ZTSFvvE.generalized       | f85c699bb8ef20a2           |
++---------------+-----------------------+----------------------------+----------------------------+
+|  baz          | int (char, void*)     | _ZTSFicPvE.generalized     | e3804d2a7f2b03fe           |
++---------------+-----------------------+----------------------------+----------------------------+
+|  main         | int ()                | _ZTSFivE.generalized       | a9494def81a01dc            |
++---------------+-----------------------+----------------------------+----------------------------+
+
+The call graph section will have the following content:
+
++---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
+| FormatVersion | FunctionEntryPc | FunctionKind | FunctionTypeId | CallSiteCount | CallSiteList                         |
++===============+=================+==============+================+===============+======================================+
+| 0             | EntryPc(foo)    | 0            | (empty)        | 0             | (empty)                              |
++---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
+| 0             | EntryPc(bar)    | 2            | TypeId(bar)    | 0             | (empty)                              |
++---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
+| 0             | EntryPc(baz)    | 2            | TypeId(baz)    | 0             | (empty)                              |
++---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
+| 0             | EntryPc(main)   | 2            | TypeId(main)   | 3             | * TypeId(bar), CallSitePc(fp_bar())  |
+|               |                 |              |                |               | * TypeId(baz), CallSitePc(fp_baz1()) |
+|               |                 |              |                |               | * TypeId(baz), CallSitePc(fp_baz2()) |
++---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
+
+
+The ``llvm-objdump`` utility can parse the call graph section and disassemble
+the program to provide complete call graph information. This includes any
+additional call sites from the binary:
+
+.. code-block:: bash
+
+    $ llvm-objdump --call-graph-info a.out
+
+    # Comments are not a part of the llvm-objdump's output but inserted for clarifications.
+
+    a.out:  file format elf64-x86-64
+    # These warnings are due to the functions and the indirect calls coming from linked objects.
+    llvm-objdump: warning: 'a.out': callgraph section does not have type ids for 3 indirect calls
+    llvm-objdump: warning: 'a.out': callgraph section does not have information for 10 functions
+
+    # Unknown targets are the 10 functions the warnings mention.
+    INDIRECT TARGET TYPES (TYPEID [FUNC_ADDR,])
+    UNKNOWN 401000 401100 401234 401050 401090 4010d0 4011d0 401020 401060 401230
+    a9494def81a01dc 401150            # main()
+    f85c699bb8ef20a2 401120           # bar()
+    e3804d2a7f2b03fe 401130           # baz()
+
+    # Notice that the call sites share the same type id as target functions
+    INDIRECT CALL TYPES (TYPEID [CALL_SITE_ADDR,])
+    f85c699bb8ef20a2 401181           # Indirect call site 1 (fp_bar())
+    e3804d2a7f2b03fe 401191 4011a1    # Indirect call site 2 and 3 (fp_baz1() and fp_baz2())
+
+    INDIRECT CALL SITES (CALLER_ADDR [CALL_SITE_ADDR,])
+    401000 401012                     # _init
+    401150 401181 401191 4011a1       # main calls fp_bar(), fp_baz1(), fp_baz2()
+    4011d0 401215                     # __libc_csu_init
+    401020 40104a                     # _start
+
+    DIRECT CALL SITES (CALLER_ADDR [(CALL_SITE_ADDR, TARGET_ADDR),])
+    4010d0 4010e2 401060              # __do_global_dtors_aux
+    401150 4011a6 401110 4011ab 401120 4011ba 401130   # main calls foo(), bar(), baz()
+    4011d0 4011fd 401000              # __libc_csu_init
+
+    FUNCTIONS (FUNC_ENTRY_ADDR, SYM_NAME)
+    401000 _init
+    401100 frame_dummy
+    401234 _fini
+    401050 _dl_relocate_static_pie
+    401090 register_tm_clones
+    4010d0 __do_global_dtors_aux
+    401110 _ZN12_GLOBAL__N_13fooEv    # (anonymous namespace)::foo()
+    401150 main                       # main
+    4011d0 __libc_csu_init
+    401020 _start
+    401060 deregister_tm_clones
+    401120 _Z3barv                    # bar()
+    401130 _Z3bazcPf                  # baz(char, float*)
+    401230 __libc_csu_fini
diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index 7c0bfe3284961..b3fdff323e505 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -75,6 +75,8 @@ CODEGENOPT(EnableNoundefAttrs, 1, 0) ///< Enable emitting `noundef` attributes o
 CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new
                                    ///< pass manager.
 CODEGENOPT(DisableRedZone    , 1, 0) ///< Set when -mno-red-zone is enabled.
+CODEGENOPT(CallGraphSection, 1, 0) ///< Emit a call graph section into the
+                                   ///< object file.
 CODEGENOPT(EmitCallSiteInfo, 1, 0) ///< Emit call site info only in the case of
                                    ///< '-g' + 'O>0' level.
 CODEGENOPT(IndirectTlsSegRefs, 1, 0) ///< Set when -mno-tls-direct-seg-refs
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index d940665969339..d8ede62aa48eb 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3991,6 +3991,10 @@ defm data_sections : BoolFOption<"data-sections",
   PosFlag<SetTrue, [], [ClangOption, CC1Option],
           "Place each data in its own section">,
   NegFlag<SetFalse>>;
+defm call_graph_section : BoolFOption<"call-graph-section",
+  CodeGenOpts<"CallGraphSection">, DefaultFalse,
+  PosFlag<SetTrue, [], [CC1Option], "Emit a call graph section">,
+  NegFlag<SetFalse>>;
 defm stack_size_section : BoolFOption<"stack-size-section",
   CodeGenOpts<"StackSizeSection">, DefaultFalse,
   PosFlag<SetTrue, [], [ClangOption, CC1Option],
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 7877e20d77f77..0962d8c65103b 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -408,6 +408,7 @@ static bool initTargetOptions(DiagnosticsEngine &Diags,
   Options.StackUsageOutput = CodeGenOpts.StackUsageOutput;
   Options.EmitAddrsig = CodeGenOpts.Addrsig;
   Options.ForceDwarfFrameSection = CodeGenOpts.ForceDwarfFrameSection;
+  Options.EmitCallGraphSection = CodeGenOpts.CallGraphSection;
   Options.EmitCallSiteInfo = CodeGenOpts.EmitCallSiteInfo;
   Options.EnableAIXExtendedAltivecABI = LangOpts.EnableAIXExtendedAltivecABI;
   Options.XRayFunctionIndex = CodeGenOpts.XRayFunctionIndex;
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 8d8965fdf76fb..e7ff581b50a12 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -6245,6 +6245,10 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
     CmdArgs.push_back(A->getValue());
   }
 
+  if (Args.hasFlag(options::OPT_fcall_graph_section,
+                   options::OPT_fno_call_graph_section, false))
+    CmdArgs.push_back("-fcall-graph-section");
+
   Args.addOptInFlag(CmdArgs, options::OPT_fstack_size_section,
                     options::OPT_fno_stack_size_section);
 
diff --git a/clang/test/Driver/call-graph-section.c b/clang/test/Driver/call-graph-section.c
new file mode 100644
index 0000000000000..5832aa6754137
--- /dev/null
+++ b/clang/test/Driver/call-graph-section.c
@@ -0,0 +1,5 @@
+// RUN: %clang -### -S -fcall-graph-section %s 2>&1 | FileCheck --check-prefix=CALL-GRAPH-SECTION %s
+// RUN: %clang -### -S -fcall-graph-section -fno-call-graph-section %s 2>&1 | FileCheck --check-prefix=NO-CALL-GRAPH-SECTION %s
+
+// CALL-GRAPH-SECTION: "-fcall-graph-section"
+// NO-CALL-GRAPH-SECTION-NOT: "-fcall-graph-section"
diff --git a/llvm/include/llvm/CodeGen/CommandFlags.h b/llvm/include/llvm/CodeGen/CommandFlags.h
index bf166a63edb07..0b887f8659c86 100644
--- a/llvm/include/llvm/CodeGen/CommandFlags.h
+++ b/llvm/include/llvm/CodeGen/CommandFlags.h
@@ -132,6 +132,8 @@ bool getEnableStackSizeSection();
 
 bool getEnableAddrsig();
 
+bool getEnableCallGraphSection();
+
 bool getEmitCallSiteInfo();
 
 bool getEnableMachineFunctionSplitter();
diff --git a/llvm/include/llvm/Target/TargetOptions.h b/llvm/include/llvm/Target/TargetOptions.h
index d02d1699813c9..dfa421f5d7b19 100644
--- a/llvm/include/llvm/Target/TargetOptions.h
+++ b/llvm/include/llvm/Target/TargetOptions.h
@@ -148,7 +148,7 @@ namespace llvm {
           EmulatedTLS(false), EnableTLSDESC(false), EnableIPRA(false),
           EmitStackSizeSection(false), EnableMachineOutliner(false),
           EnableMachineFunctionSplitter(false), SupportsDefaultOutlining(false),
-          EmitAddrsig(false), EmitCallSiteInfo(false),
+          EmitAddrsig(false), EmitCallGraphSection(false), EmitCallSiteInfo(false),
           SupportsDebugEntryValues(false), EnableDebugEntryValues(false),
           ValueTrackingVariableLocations(false), ForceDwarfFrameSection(false),
           XRayFunctionIndex(true), DebugStrictDwarf(false), Hotpatch(false),
@@ -323,6 +323,9 @@ namespace llvm {
     /// to selectively generate basic block sections.
     std::shared_ptr<MemoryBuffer> BBSectionsFuncListBuf;
 
+    /// Emit section containing call graph metadata.
+    unsigned EmitCallGraphSection : 1;
+
     /// The flag enables call site info production. It is used only for debug
     /// info, and it is restricted only to optimized code. This can be used for
     /// something else, so that should be controlled in the frontend.
diff --git a/llvm/lib/CodeGen/CommandFlags.cpp b/llvm/lib/CodeGen/CommandFlags.cpp
index 51406fb287e66..bcf65fb5b528d 100644
--- a/llvm/lib/CodeGen/CommandFlags.cpp
+++ b/llvm/lib/CodeGen/CommandFlags.cpp
@@ -100,6 +100,7 @@ CGOPT(EABI, EABIVersion)
 CGOPT(DebuggerKind, DebuggerTuningOpt)
 CGOPT(bool, EnableStackSizeSection)
 CGOPT(bool, EnableAddrsig)
+CGOPT(bool, EnableCallGraphSection)
 CGOPT(bool, EmitCallSiteInfo)
 CGOPT(bool, EnableMachineFunctionSplitter)
 CGOPT(bool, EnableDebugEntryValues)
@@ -452,6 +453,11 @@ codegen::RegisterCodeGenFlags::RegisterCodeGenFlags() {
       cl::init(false));
   CGBINDOPT(EnableAddrsig);
 
+  static cl::opt<bool> EnableCallGraphSection(
+      "call-graph-section", cl::desc("Emit a call graph section"),
+      cl::init(false));
+  CGBINDOPT(EnableCallGraphSection);
+
   static cl::opt<bool> EmitCallSiteInfo(
       "emit-call-site-info",
       cl::desc(
@@ -580,6 +586,7 @@ codegen::InitTargetOptionsFromCodeGenFlags(const Triple &TheTriple) {
   Options.EmitStackSizeSection = getEnableStackSizeSection();
   Options.EnableMachineFunctionSplitter = getEnableMachineFunctionSplitter();
   Options.EmitAddrsig = getEnableAddrsig();
+  Options.EmitCallGraphSection = getEnableCallGraphSection();
   Options.EmitCallSiteInfo = getEmitCallSiteInfo();
   Options.EnableDebugEntryValues = getEnableDebugEntryValues();
   Options.ForceDwarfFrameSection = getForceDwarfFrameSection();

>From afc0d2fe35fc3f4441c851ff33bc4c5a0f76429e Mon Sep 17 00:00:00 2001
From: Necip Fazil Yildiran <necip at google.com>
Date: Tue, 7 Nov 2023 10:49:27 -0800
Subject: [PATCH 3/9] [clang][CallGraphSection] Add type id metadata to
 indirect call and targets

Create and add generalized type identifier metadata to indirect calls,
and to functions that may be target to indirect calls.

Type identifiers will be used by the back-end to construct the call
graph section to precisely represent the possible targets for indirect calls.
The type information is deliberately pulled from the front-end to have extra
precision since some type information is lost at IR, and to ensure
consistent type identifiers between object files compiled at different
times (as C/C++ standards require language-level types to match).

Original RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-June/151044.html
Updated RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D105909
---
 clang/lib/CodeGen/CGCall.cpp                  |  22 ++++
 clang/lib/CodeGen/CodeGenModule.cpp           |  19 ++-
 clang/test/CodeGen/call-graph-section-1.cpp   | 110 ++++++++++++++++++
 clang/test/CodeGen/call-graph-section-2.cpp   |  95 +++++++++++++++
 clang/test/CodeGen/call-graph-section-3.cpp   |  52 +++++++++
 clang/test/CodeGen/call-graph-section.c       |  85 ++++++++++++++
 llvm/include/llvm/IR/LLVMContext.h            |   1 +
 .../SelectionDAG/SelectionDAGBuilder.cpp      |  10 +-
 llvm/lib/IR/LLVMContext.cpp                   |   5 +
 llvm/lib/IR/Verifier.cpp                      |   7 +-
 llvm/test/Verifier/operand-bundles.ll         |  13 +++
 11 files changed, 410 insertions(+), 9 deletions(-)
 create mode 100644 clang/test/CodeGen/call-graph-section-1.cpp
 create mode 100644 clang/test/CodeGen/call-graph-section-2.cpp
 create mode 100644 clang/test/CodeGen/call-graph-section-3.cpp
 create mode 100644 clang/test/CodeGen/call-graph-section.c

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 28c211aa631e4..ae8c9a507b1ea 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -5614,6 +5614,28 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
   AllocAlignAttrEmitter AllocAlignAttrEmitter(*this, TargetDecl, CallArgs);
   Attrs = AllocAlignAttrEmitter.TryEmitAsCallSiteAttribute(Attrs);
 
+  if (CGM.getCodeGenOpts().CallGraphSection) {
+    // FIXME: create operand bundle only for indirect calls, not for all
+
+    assert((TargetDecl && TargetDecl->getFunctionType() ||
+            Callee.getAbstractInfo().getCalleeFunctionProtoType()) &&
+           "cannot find callsite type");
+
+    QualType CST;
+    if (TargetDecl && TargetDecl->getFunctionType())
+      CST = QualType(TargetDecl->getFunctionType(), 0);
+    else if (const auto *FPT =
+                 Callee.getAbstractInfo().getCalleeFunctionProtoType())
+      CST = QualType(FPT, 0);
+
+    if (!CST.isNull()) {
+      auto *TypeIdMD = CGM.CreateMetadataIdentifierGeneralized(CST);
+      auto *TypeIdMDVal =
+          llvm::MetadataAsValue::get(getLLVMContext(), TypeIdMD);
+      BundleList.emplace_back("type", TypeIdMDVal);
+    }
+  }
+
   // Emit the actual call/invoke instruction.
   llvm::CallBase *CI;
   if (!InvokeDest) {
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 6ec54cc01c923..15429be933cc1 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2714,7 +2714,17 @@ static void setLinkageForGV(llvm::GlobalValue *GV, const NamedDecl *ND) {
 
 void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
                                                        llvm::Function *F) {
-  // Only if we are checking indirect calls.
+  bool EmittedMDIdGeneralized = false;
+  if (CodeGenOpts.CallGraphSection &&
+      (!F->hasLocalLinkage() ||
+       F->getFunction().hasAddressTaken(nullptr, /* IgnoreCallbackUses */ true,
+                                        /* IgnoreAssumeLikeCalls */ true,
+                                        /* IgnoreLLVMUsed */ false))) {
+    F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
+    EmittedMDIdGeneralized = true;
+  }
+
+  // Add additional metadata only if we are checking indirect calls with CFI.
   if (!LangOpts.Sanitize.has(SanitizerKind::CFIICall))
     return;
 
@@ -2725,7 +2735,9 @@ void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
 
   llvm::Metadata *MD = CreateMetadataIdentifierForType(FD->getType());
   F->addTypeMetadata(0, MD);
-  F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
+  // Add the generalized identifier if not added already.
+  if (!EmittedMDIdGeneralized)
+    F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
 
   // Emit a hash-based bit set entry for cross-DSO calls.
   if (CodeGenOpts.SanitizeCfiCrossDso)
@@ -2860,7 +2872,8 @@ void CodeGenModule::SetFunctionAttributes(GlobalDecl GD, llvm::Function *F,
   // are non-canonical then we need type metadata in order to produce the local
   // jump table.
   if (!CodeGenOpts.SanitizeCfiCrossDso ||
-      !CodeGenOpts.SanitizeCfiCanonicalJumpTables)
+      !CodeGenOpts.SanitizeCfiCanonicalJumpTables ||
+      CodeGenOpts.CallGraphSection)
     CreateFunctionTypeMetadataForIcall(FD, F);
 
   if (LangOpts.Sanitize.has(SanitizerKind::KCFI))
diff --git a/clang/test/CodeGen/call-graph-section-1.cpp b/clang/test/CodeGen/call-graph-section-1.cpp
new file mode 100644
index 0000000000000..a7ef25d7fc06a
--- /dev/null
+++ b/clang/test/CodeGen/call-graph-section-1.cpp
@@ -0,0 +1,110 @@
+// Tests that we assign appropriate identifiers to indirect calls and targets
+// specifically for C++ class and instance methods.
+
+// RUN: %clang_cc1 -triple x86_64-unknown-linux -fcall-graph-section -S \
+// RUN: -emit-llvm -o %t %s
+// RUN: FileCheck --check-prefix=FT %s < %t
+// RUN: FileCheck --check-prefix=CST %s < %t
+
+////////////////////////////////////////////////////////////////////////////////
+// Class definitions (check for indirect target metadata)
+
+class Cls1 {
+public:
+  // FT-DAG: define {{.*}} i32* @_ZN4Cls18receiverEPcPf({{.*}} !type [[F_TCLS1RECEIVER:![0-9]+]]
+  static int *receiver(char *a, float *b) { return 0; }
+};
+
+class Cls2 {
+public:
+  int *(*fp)(char *, float *);
+
+  // FT-DAG: define {{.*}} i32 @_ZN4Cls22f1Ecfd({{.*}} !type [[F_TCLS2F1:![0-9]+]]
+  int f1(char a, float b, double c) { return 0; }
+
+  // FT-DAG: define {{.*}} i32* @_ZN4Cls22f2EPcPfPd({{.*}} !type [[F_TCLS2F2:![0-9]+]]
+  int *f2(char *a, float *b, double *c) { return 0; }
+
+  // FT-DAG: define {{.*}} void @_ZN4Cls22f3E4Cls1({{.*}} !type [[F_TCLS2F3F4:![0-9]+]]
+  void f3(Cls1 a) {}
+
+  // FT-DAG: define {{.*}} void @_ZN4Cls22f4E4Cls1({{.*}} !type [[F_TCLS2F3F4]]
+  void f4(const Cls1 a) {}
+
+  // FT-DAG: define {{.*}} void @_ZN4Cls22f5EP4Cls1({{.*}} !type [[F_TCLS2F5:![0-9]+]]
+  void f5(Cls1 *a) {}
+
+  // FT-DAG: define {{.*}} void @_ZN4Cls22f6EPK4Cls1({{.*}} !type [[F_TCLS2F6:![0-9]+]]
+  void f6(const Cls1 *a) {}
+
+  // FT-DAG: define {{.*}} void @_ZN4Cls22f7ER4Cls1({{.*}} !type [[F_TCLS2F7:![0-9]+]]
+  void f7(Cls1 &a) {}
+
+  // FT-DAG: define {{.*}} void @_ZN4Cls22f8ERK4Cls1({{.*}} !type [[F_TCLS2F8:![0-9]+]]
+  void f8(const Cls1 &a) {}
+
+  // FT-DAG: define {{.*}} void @_ZNK4Cls22f9Ev({{.*}} !type [[F_TCLS2F9:![0-9]+]]
+  void f9() const {}
+};
+
+// FT-DAG: [[F_TCLS1RECEIVER]] = !{i64 0, !"_ZTSFPvS_S_E.generalized"}
+// FT-DAG: [[F_TCLS2F2]]   = !{i64 0, !"_ZTSFPvS_S_S_E.generalized"}
+// FT-DAG: [[F_TCLS2F1]]   = !{i64 0, !"_ZTSFicfdE.generalized"}
+// FT-DAG: [[F_TCLS2F3F4]] = !{i64 0, !"_ZTSFv4Cls1E.generalized"}
+// FT-DAG: [[F_TCLS2F5]]   = !{i64 0, !"_ZTSFvPvE.generalized"}
+// FT-DAG: [[F_TCLS2F6]]   = !{i64 0, !"_ZTSFvPKvE.generalized"}
+// FT-DAG: [[F_TCLS2F7]]   = !{i64 0, !"_ZTSFvR4Cls1E.generalized"}
+// FT-DAG: [[F_TCLS2F8]]   = !{i64 0, !"_ZTSFvRK4Cls1E.generalized"}
+// FT-DAG: [[F_TCLS2F9]]   = !{i64 0, !"_ZTSKFvvE.generalized"}
+
+////////////////////////////////////////////////////////////////////////////////
+// Callsites (check for indirect callsite operand bundles)
+
+// CST-LABEL: define {{.*}} @_Z3foov
+void foo() {
+  Cls2 ObjCls2;
+  ObjCls2.fp = &Cls1::receiver;
+
+  // CST: call i32* {{.*}} [ "type"(metadata !"_ZTSFPvS_S_E.generalized") ]
+  ObjCls2.fp(0, 0);
+
+  auto fp_f1 = &Cls2::f1;
+  auto fp_f2 = &Cls2::f2;
+  auto fp_f3 = &Cls2::f3;
+  auto fp_f4 = &Cls2::f4;
+  auto fp_f5 = &Cls2::f5;
+  auto fp_f6 = &Cls2::f6;
+  auto fp_f7 = &Cls2::f7;
+  auto fp_f8 = &Cls2::f8;
+  auto fp_f9 = &Cls2::f9;
+
+  Cls2 *ObjCls2Ptr = &ObjCls2;
+  Cls1 Cls1Param;
+
+  // CST: call i32 %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFicfdE.generalized") ]
+  (ObjCls2Ptr->*fp_f1)(0, 0, 0);
+
+  // CST: call i32* %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFPvS_S_S_E.generalized") ]
+  (ObjCls2Ptr->*fp_f2)(0, 0, 0);
+
+  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFv4Cls1E.generalized") ]
+  (ObjCls2Ptr->*fp_f3)(Cls1Param);
+
+  // CST: call void  %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFv4Cls1E.generalized") ]
+  (ObjCls2Ptr->*fp_f4)(Cls1Param);
+
+  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFvPvE.generalized") ]
+  (ObjCls2Ptr->*fp_f5)(&Cls1Param);
+
+  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFvPKvE.generalized") ]
+  (ObjCls2Ptr->*fp_f6)(&Cls1Param);
+
+  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFvR4Cls1E.generalized") ]
+  (ObjCls2Ptr->*fp_f7)(Cls1Param);
+
+  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFvRK4Cls1E.generalized") ]
+  (ObjCls2Ptr->*fp_f8)(Cls1Param);
+
+  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSKFvvE.generalized") ]
+  (ObjCls2Ptr->*fp_f9)();
+}
diff --git a/clang/test/CodeGen/call-graph-section-2.cpp b/clang/test/CodeGen/call-graph-section-2.cpp
new file mode 100644
index 0000000000000..c98448bebb3b5
--- /dev/null
+++ b/clang/test/CodeGen/call-graph-section-2.cpp
@@ -0,0 +1,95 @@
+// Tests that we assign appropriate identifiers to indirect calls and targets
+// specifically for C++ templates.
+
+// RUN: %clang_cc1 -triple x86_64-unknown-linux -fcall-graph-section -S \
+// RUN: -emit-llvm -o %t %s
+// RUN: FileCheck --check-prefix=FT    %s < %t
+// RUN: FileCheck --check-prefix=CST   %s < %t
+// RUN: FileCheck --check-prefix=CHECK %s < %t
+
+////////////////////////////////////////////////////////////////////////////////
+// Class definitions and template classes (check for indirect target metadata)
+
+class Cls1 {};
+
+// Cls2 is instantiated with T=Cls1 in foo(). Following checks are for this
+// instantiation.
+template <class T>
+class Cls2 {
+public:
+  // FT: define {{.*}} void @_ZN4Cls2I4Cls1E2f1Ev({{.*}} !type [[F_TCLS2F1:![0-9]+]]
+  void f1() {}
+
+  // FT: define {{.*}} void @_ZN4Cls2I4Cls1E2f2ES0_({{.*}} !type [[F_TCLS2F2:![0-9]+]]
+  void f2(T a) {}
+
+  // FT: define {{.*}} void @_ZN4Cls2I4Cls1E2f3EPS0_({{.*}} !type [[F_TCLS2F3:![0-9]+]]
+  void f3(T *a) {}
+
+  // FT: define {{.*}} void @_ZN4Cls2I4Cls1E2f4EPKS0_({{.*}} !type [[F_TCLS2F4:![0-9]+]]
+  void f4(const T *a) {}
+
+  // FT: define {{.*}} void @_ZN4Cls2I4Cls1E2f5ERS0_({{.*}} !type [[F_TCLS2F5:![0-9]+]]
+  void f5(T &a) {}
+
+  // FT: define {{.*}} void @_ZN4Cls2I4Cls1E2f6ERKS0_({{.*}} !type [[F_TCLS2F6:![0-9]+]]
+  void f6(const T &a) {}
+
+  // Mixed type function pointer member
+  T *(*fp)(T a, T *b, const T *c, T &d, const T &e);
+};
+
+// FT-DAG: [[F_TCLS2F1]] = !{i64 0, !"_ZTSFvvE.generalized"}
+// FT-DAG: [[F_TCLS2F2]] = !{i64 0, !"_ZTSFv4Cls1E.generalized"}
+// FT-DAG: [[F_TCLS2F3]] = !{i64 0, !"_ZTSFvPvE.generalized"}
+// FT-DAG: [[F_TCLS2F4]] = !{i64 0, !"_ZTSFvPKvE.generalized"}
+// FT-DAG: [[F_TCLS2F5]] = !{i64 0, !"_ZTSFvR4Cls1E.generalized"}
+// FT-DAG: [[F_TCLS2F6]] = !{i64 0, !"_ZTSFvRK4Cls1E.generalized"}
+
+////////////////////////////////////////////////////////////////////////////////
+// Callsites (check for indirect callsite operand bundles)
+
+template <class T>
+T *T_func(T a, T *b, const T *c, T &d, const T &e) { return b; }
+
+// CST-LABEL: define {{.*}} @_Z3foov
+void foo() {
+  // Methods for Cls2<Cls1> is checked above within the template description.
+  Cls2<Cls1> Obj;
+
+  // CHECK-DAG: define {{.*}} @_Z6T_funcI4Cls1EPT_S1_S2_PKS1_RS1_RS3_({{.*}} !type [[F_TFUNC_CLS1:![0-9]+]]
+  // CHECK-DAG: [[F_TFUNC_CLS1]] = !{i64 0, !"_ZTSFPv4Cls1S_PKvRS0_RKS0_E.generalized"}
+  Obj.fp = T_func<Cls1>;
+  Cls1 Cls1Obj;
+
+  // CST: call %class.Cls1* {{.*}} [ "type"(metadata !"_ZTSFPv4Cls1S_PKvRS0_RKS0_E.generalized") ]
+  Obj.fp(Cls1Obj, &Cls1Obj, &Cls1Obj, Cls1Obj, Cls1Obj);
+
+  // Make indirect calls to Cls2's member methods
+  auto fp_f1 = &Cls2<Cls1>::f1;
+  auto fp_f2 = &Cls2<Cls1>::f2;
+  auto fp_f3 = &Cls2<Cls1>::f3;
+  auto fp_f4 = &Cls2<Cls1>::f4;
+  auto fp_f5 = &Cls2<Cls1>::f5;
+  auto fp_f6 = &Cls2<Cls1>::f6;
+
+  auto *Obj2Ptr = &Obj;
+
+  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFvvE.generalized") ]
+  (Obj2Ptr->*fp_f1)();
+
+  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFv4Cls1E.generalized") ]
+  (Obj2Ptr->*fp_f2)(Cls1Obj);
+
+  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFvPvE.generalized") ]
+  (Obj2Ptr->*fp_f3)(&Cls1Obj);
+
+  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFvPKvE.generalized") ]
+  (Obj2Ptr->*fp_f4)(&Cls1Obj);
+
+  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFvR4Cls1E.generalized") ]
+  (Obj2Ptr->*fp_f5)(Cls1Obj);
+
+  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFvRK4Cls1E.generalized") ]
+  (Obj2Ptr->*fp_f6)(Cls1Obj);
+}
diff --git a/clang/test/CodeGen/call-graph-section-3.cpp b/clang/test/CodeGen/call-graph-section-3.cpp
new file mode 100644
index 0000000000000..d4a0d669d94a1
--- /dev/null
+++ b/clang/test/CodeGen/call-graph-section-3.cpp
@@ -0,0 +1,52 @@
+// Tests that we assign appropriate identifiers to indirect calls and targets
+// specifically for virtual methods.
+
+// RUN: %clang_cc1 -triple x86_64-unknown-linux -fcall-graph-section -S \
+// RUN: -emit-llvm -o %t %s
+// RUN: FileCheck --check-prefix=FT %s < %t
+// RUN: FileCheck --check-prefix=CST %s < %t
+
+////////////////////////////////////////////////////////////////////////////////
+// Class definitions (check for indirect target metadata)
+
+class Base {
+public:
+  // FT-DAG: define {{.*}} @_ZN4Base2vfEPc({{.*}} !type [[F_TVF:![0-9]+]]
+  virtual int vf(char *a) { return 0; };
+};
+
+class Derived : public Base {
+public:
+  // FT-DAG: define {{.*}} @_ZN7Derived2vfEPc({{.*}} !type [[F_TVF]]
+  int vf(char *a) override { return 1; };
+};
+
+// FT-DAG: [[F_TVF]] = !{i64 0, !"_ZTSFiPvE.generalized"}
+
+////////////////////////////////////////////////////////////////////////////////
+// Callsites (check for indirect callsite operand bundles)
+
+// CST-LABEL: define {{.*}} @_Z3foov
+void foo() {
+  auto B = Base();
+  auto D = Derived();
+
+  Base *Bptr = &B;
+  Base *BptrToD = &D;
+  Derived *Dptr = &D;
+
+  auto FpBaseVf = &Base::vf;
+  auto FpDerivedVf = &Derived::vf;
+
+  // CST: call i32 %{{.*}}(%class.Base* {{.*}} [ "type"(metadata !"_ZTSFiPvE.generalized") ]
+  (Bptr->*FpBaseVf)(0);
+
+  // CST: call i32 %{{.*}}(%class.Base* {{.*}} [ "type"(metadata !"_ZTSFiPvE.generalized") ]
+  (BptrToD->*FpBaseVf)(0);
+
+  // CST: call i32 %{{.*}}(%class.Base* {{.*}} [ "type"(metadata !"_ZTSFiPvE.generalized") ]
+  (Dptr->*FpBaseVf)(0);
+
+  // CST: call i32 %{{.*}}(%class.Derived* {{.*}} [ "type"(metadata !"_ZTSFiPvE.generalized") ]
+  (Dptr->*FpDerivedVf)(0);
+}
diff --git a/clang/test/CodeGen/call-graph-section.c b/clang/test/CodeGen/call-graph-section.c
new file mode 100644
index 0000000000000..9b2ef81c39574
--- /dev/null
+++ b/clang/test/CodeGen/call-graph-section.c
@@ -0,0 +1,85 @@
+// Tests that we assign appropriate identifiers to indirect calls and targets.
+
+// RUN: %clang_cc1 -triple x86_64-unknown-linux -fcall-graph-section -S \
+// RUN: -emit-llvm -o - %s | FileCheck --check-prefixes=CHECK,ITANIUM %s
+
+// RUN: %clang_cc1 -triple x86_64-pc-windows-msvc -fcall-graph-section -S \
+// RUN: -emit-llvm -o - %s | FileCheck --check-prefixes=CHECK,MS %s
+
+// CHECK-DAG: define {{(dso_local)?}} void @foo({{.*}} !type [[F_TVOID:![0-9]+]]
+void foo() {
+}
+
+// CHECK-DAG: define {{(dso_local)?}} void @bar({{.*}} !type [[F_TVOID]]
+void bar() {
+  void (*fp)() = foo;
+  // ITANIUM: call {{.*}} [ "type"(metadata !"_ZTSFvE.generalized") ]
+  // MS:      call {{.*}} [ "type"(metadata !"?6AX at Z.generalized") ]
+  fp();
+}
+
+// CHECK-DAG: define {{(dso_local)?}} i32 @baz({{.*}} !type [[F_TPRIMITIVE:![0-9]+]]
+int baz(char a, float b, double c) {
+  return 1;
+}
+
+// CHECK-DAG: define {{(dso_local)?}} i32* @qux({{.*}} !type [[F_TPTR:![0-9]+]]
+int *qux(char *a, float *b, double *c) {
+  return 0;
+}
+
+// CHECK-DAG: define {{(dso_local)?}} void @corge({{.*}} !type [[F_TVOID]]
+void corge() {
+  int (*fp_baz)(char, float, double) = baz;
+  // ITANIUM: call i32 {{.*}} [ "type"(metadata !"_ZTSFicfdE.generalized") ]
+  // MS:      call i32 {{.*}} [ "type"(metadata !"?6AHDMN at Z.generalized") ]
+  fp_baz('a', .0f, .0);
+
+  int *(*fp_qux)(char *, float *, double *) = qux;
+  // ITANIUM: call i32* {{.*}} [ "type"(metadata !"_ZTSFPvS_S_S_E.generalized") ]
+  // MS:      call i32* {{.*}} [ "type"(metadata !"?6APEAXPEAX00 at Z.generalized") ]
+  fp_qux(0, 0, 0);
+}
+
+struct st1 {
+  int *(*fp)(char *, float *, double *);
+};
+
+struct st2 {
+  struct st1 m;
+};
+
+// CHECK-DAG: define {{(dso_local)?}} void @stparam({{.*}} !type [[F_TSTRUCT:![0-9]+]]
+void stparam(struct st2 a, struct st2 *b) {}
+
+// CHECK-DAG: define {{(dso_local)?}} void @stf({{.*}} !type [[F_TVOID]]
+void stf() {
+  struct st1 St1;
+  St1.fp = qux;
+  // ITANIUM: call i32* {{.*}} [ "type"(metadata !"_ZTSFPvS_S_S_E.generalized") ]
+  // MS:      call i32* {{.*}} [ "type"(metadata !"?6APEAXPEAX00 at Z.generalized") ]
+  St1.fp(0, 0, 0);
+
+  struct st2 St2;
+  St2.m.fp = qux;
+  // ITANIUM: call i32* {{.*}} [ "type"(metadata !"_ZTSFPvS_S_S_E.generalized") ]
+  // MS:      call i32* {{.*}} [ "type"(metadata !"?6APEAXPEAX00 at Z.generalized") ]
+  St2.m.fp(0, 0, 0);
+
+  // ITANIUM: call void {{.*}} [ "type"(metadata !"_ZTSFv3st2PvE.generalized") ]
+  // MS:      call void {{.*}} [ "type"(metadata !"?6AXUst2@@PEAX at Z.generalized") ]
+  void (*fp_stparam)(struct st2, struct st2 *) = stparam;
+  fp_stparam(St2, &St2);
+}
+
+// ITANIUM-DAG: [[F_TVOID]] = !{i64 0, !"_ZTSFvE.generalized"}
+// MS-DAG:      [[F_TVOID]] = !{i64 0, !"?6AX at Z.generalized"}
+
+// ITANIUM-DAG: [[F_TPRIMITIVE]] = !{i64 0, !"_ZTSFicfdE.generalized"}
+// MS-DAG:      [[F_TPRIMITIVE]] = !{i64 0, !"?6AHDMN at Z.generalized"}
+
+// ITANIUM-DAG: [[F_TPTR]] = !{i64 0, !"_ZTSFPvS_S_S_E.generalized"}
+// MS-DAG:      [[F_TPTR]] = !{i64 0, !"?6APEAXPEAX00 at Z.generalized"}
+
+// ITANIUM-DAG: [[F_TSTRUCT]] = !{i64 0, !"_ZTSFv3st2PvE.generalized"}
+// MS-DAG:      [[F_TSTRUCT]] = !{i64 0, !"?6AXUst2@@PEAX at Z.generalized"}
diff --git a/llvm/include/llvm/IR/LLVMContext.h b/llvm/include/llvm/IR/LLVMContext.h
index e5786afd72214..20a0bf70edecd 100644
--- a/llvm/include/llvm/IR/LLVMContext.h
+++ b/llvm/include/llvm/IR/LLVMContext.h
@@ -96,6 +96,7 @@ class LLVMContext {
     OB_ptrauth = 7,                // "ptrauth"
     OB_kcfi = 8,                   // "kcfi"
     OB_convergencectrl = 9,        // "convergencectrl"
+    OB_type = 10,                  // "type"
   };
 
   /// getMDKindID - Return a unique non-zero ID for the specified metadata kind.
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 5ce1013f30fd1..50a4855ed96de 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -3138,7 +3138,7 @@ void SelectionDAGBuilder::visitInvoke(const InvokeInst &I) {
              {LLVMContext::OB_deopt, LLVMContext::OB_gc_transition,
               LLVMContext::OB_gc_live, LLVMContext::OB_funclet,
               LLVMContext::OB_cfguardtarget,
-              LLVMContext::OB_clang_arc_attachedcall}) &&
+              LLVMContext::OB_clang_arc_attachedcall, LLVMContext::OB_type}) &&
          "Cannot lower invokes with arbitrary operand bundles yet!");
 
   const Value *Callee(I.getCalledOperand());
@@ -3225,8 +3225,9 @@ void SelectionDAGBuilder::visitCallBr(const CallBrInst &I) {
 
   // Deopt bundles are lowered in LowerCallSiteWithDeoptBundle, and we don't
   // have to do anything here to lower funclet bundles.
-  assert(!I.hasOperandBundlesOtherThan(
-             {LLVMContext::OB_deopt, LLVMContext::OB_funclet}) &&
+  assert(!I.hasOperandBundlesOtherThan({LLVMContext::OB_deopt,
+                                        LLVMContext::OB_funclet,
+                                        LLVMContext::OB_type}) &&
          "Cannot lower callbrs with arbitrary operand bundles yet!");
 
   assert(I.isInlineAsm() && "Only know how to handle inlineasm callbr");
@@ -8948,7 +8949,8 @@ void SelectionDAGBuilder::visitCall(const CallInst &I) {
   assert(!I.hasOperandBundlesOtherThan(
              {LLVMContext::OB_deopt, LLVMContext::OB_funclet,
               LLVMContext::OB_cfguardtarget, LLVMContext::OB_preallocated,
-              LLVMContext::OB_clang_arc_attachedcall, LLVMContext::OB_kcfi}) &&
+              LLVMContext::OB_clang_arc_attachedcall, LLVMContext::OB_kcfi,
+              LLVMContext::OB_type}) &&
          "Cannot lower calls with arbitrary operand bundles!");
 
   SDValue Callee = getValue(I.getCalledOperand());
diff --git a/llvm/lib/IR/LLVMContext.cpp b/llvm/lib/IR/LLVMContext.cpp
index 57077e786efc2..d987acf309ea3 100644
--- a/llvm/lib/IR/LLVMContext.cpp
+++ b/llvm/lib/IR/LLVMContext.cpp
@@ -97,6 +97,11 @@ LLVMContext::LLVMContext() : pImpl(new LLVMContextImpl(*this)) {
          "convergencectrl operand bundle id drifted!");
   (void)ConvergenceCtrlEntry;
 
+  auto *TypeEntry = pImpl->getOrInsertBundleTag("type");
+  assert(TypeEntry->second == LLVMContext::OB_type &&
+         "type operand bundle id drifted!");
+  (void)TypeEntry;
+
   SyncScope::ID SingleThreadSSID =
       pImpl->getOrInsertSyncScopeID("singlethread");
   assert(SingleThreadSSID == SyncScope::SingleThread &&
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 91cf91fbc788b..929e2d3a1473b 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -3545,13 +3545,13 @@ void Verifier::visitCallBase(CallBase &Call) {
       visitIntrinsicCall(ID, Call);
 
   // Verify that a callsite has at most one "deopt", at most one "funclet", at
-  // most one "gc-transition", at most one "cfguardtarget", at most one
+  // most one "gc-transition", at most one "cfguardtarget", at most one "type", at most one
   // "preallocated" operand bundle, and at most one "ptrauth" operand bundle.
   bool FoundDeoptBundle = false, FoundFuncletBundle = false,
        FoundGCTransitionBundle = false, FoundCFGuardTargetBundle = false,
        FoundPreallocatedBundle = false, FoundGCLiveBundle = false,
        FoundPtrauthBundle = false, FoundKCFIBundle = false,
-       FoundAttachedCallBundle = false;
+       FoundAttachedCallBundle = false, FoundTypeBundle = false;
   for (unsigned i = 0, e = Call.getNumOperandBundles(); i < e; ++i) {
     OperandBundleUse BU = Call.getOperandBundleAt(i);
     uint32_t Tag = BU.getTagID();
@@ -3614,6 +3614,9 @@ void Verifier::visitCallBase(CallBase &Call) {
             "Multiple \"clang.arc.attachedcall\" operand bundles", Call);
       FoundAttachedCallBundle = true;
       verifyAttachedCallBundle(Call, BU);
+    } else if (Tag == LLVMContext::OB_type) {
+      Check(!FoundTypeBundle, "Multiple \"type\" operand bundles", Call);
+      FoundTypeBundle = true;
     }
   }
 
diff --git a/llvm/test/Verifier/operand-bundles.ll b/llvm/test/Verifier/operand-bundles.ll
index db85b6ae6ef5f..788006494edce 100644
--- a/llvm/test/Verifier/operand-bundles.ll
+++ b/llvm/test/Verifier/operand-bundles.ll
@@ -105,4 +105,17 @@ declare ptr @objc_retainAutoreleasedReturnValue(ptr)
 declare ptr @objc_unsafeClaimAutoreleasedReturnValue(ptr)
 declare void @llvm.assume(i1)
 
+define void @f_type(i32* %ptr) {
+; CHECK: Multiple "type" operand bundles
+; CHECK-NEXT: call void @g() [ "type"(metadata !"_ZTSFvE.generalized"), "type"(metadata !"_ZTSFvE.generalized") ]
+; CHECK-NOT: call void @g() [ "type"(metadata !"_ZTSFvE.generalized") ]
+
+ entry:
+  %l = load i32, i32* %ptr
+  call void @g() [ "type"(metadata !"_ZTSFvE.generalized"), "type"(metadata !"_ZTSFvE.generalized") ]
+  call void @g() [ "type"(metadata !"_ZTSFvE.generalized") ]
+  %x = add i32 42, 1
+  ret void
+}
+
 attributes #0 = { noreturn }

>From ec2d71d45848c9142dbcb474a72f7ba8482a0e3f Mon Sep 17 00:00:00 2001
From: Necip Fazil Yildiran <necip at google.com>
Date: Tue, 7 Nov 2023 10:49:30 -0800
Subject: [PATCH 4/9] [MIPS][CallSiteInfo][NFC] Fill CallSiteInfo only when
 needed

Argument-register pairs in CallSiteInfo is only needed when EmitCallSiteInfo
is on. Currently, the pairs are always pushed to the vector but only used
when EmitCallSiteInfo is on.

Don't fill the CallSiteInfo vector unless used.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D107108
---
 llvm/lib/Target/Mips/MipsISelLowering.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/llvm/lib/Target/Mips/MipsISelLowering.cpp b/llvm/lib/Target/Mips/MipsISelLowering.cpp
index b2812f87914df..db29b4b0c6f1a 100644
--- a/llvm/lib/Target/Mips/MipsISelLowering.cpp
+++ b/llvm/lib/Target/Mips/MipsISelLowering.cpp
@@ -3369,7 +3369,7 @@ MipsTargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
 
       // Collect CSInfo about which register passes which parameter.
       const TargetOptions &Options = DAG.getTarget().Options;
-      if (Options.SupportsDebugEntryValues)
+      if (Options.EmitCallSiteInfo && Options.SupportsDebugEntryValues)
         CSInfo.emplace_back(VA.getLocReg(), i);
 
       continue;

>From b12178f0d8880032f11333f0c9f82be133ea81bd Mon Sep 17 00:00:00 2001
From: Necip Fazil Yildiran <necip at google.com>
Date: Tue, 7 Nov 2023 10:49:33 -0800
Subject: [PATCH 5/9] [CallSiteInfo][NFC] CallSiteInfo ->
 CallSiteInfo.ArgRegPairs

CallSiteInfo is originally used only for argument - register pairs. Make
it struct, in which we can store additional data for call sites.

Also, the variables/methods used for CallSiteInfo are named for its
original use case, e.g., CallFwdRegsInfo. Refactor these for the upcoming
use, e.g. addCallArgsForwardingRegs() -> addCallSiteInfo().

An upcoming patch will add type ids for indirect calls to propogate them from
middle-end to the back-end. The type ids will be then used to emit the call
graph section.

Original RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-June/151044.html
Updated RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D107109
---
 llvm/include/llvm/CodeGen/MachineFunction.h          | 11 ++++++-----
 llvm/include/llvm/CodeGen/SelectionDAG.h             |  3 +--
 llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp           |  6 +++---
 llvm/lib/CodeGen/MIRParser/MIRParser.cpp             |  4 ++--
 llvm/lib/CodeGen/MIRPrinter.cpp                      |  2 +-
 llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp |  6 ++++--
 llvm/lib/Target/AArch64/AArch64ISelLowering.cpp      |  9 +++++----
 llvm/lib/Target/ARM/ARMISelLowering.cpp              |  2 +-
 llvm/lib/Target/Mips/MipsISelLowering.cpp            |  2 +-
 llvm/lib/Target/X86/X86ISelLoweringCall.cpp          |  2 +-
 10 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/MachineFunction.h b/llvm/include/llvm/CodeGen/MachineFunction.h
index 05c9b14a423cd..0d78a10423b27 100644
--- a/llvm/include/llvm/CodeGen/MachineFunction.h
+++ b/llvm/include/llvm/CodeGen/MachineFunction.h
@@ -481,9 +481,11 @@ class LLVM_EXTERNAL_VISIBILITY MachineFunction {
       assert(Arg < (1 << 16) && "Arg out of range");
     }
   };
-  /// Vector of call argument and its forwarding register.
-  using CallSiteInfo = SmallVector<ArgRegPair, 1>;
-  using CallSiteInfoImpl = SmallVectorImpl<ArgRegPair>;
+
+  struct CallSiteInfo {
+    /// Vector of call argument and its forwarding register.
+    SmallVector<ArgRegPair, 1> ArgRegPairs;
+  };
 
 private:
   Delegate *TheDelegate = nullptr;
@@ -1323,8 +1325,7 @@ class LLVM_EXTERNAL_VISIBILITY MachineFunction {
   }
 
   /// Start tracking the arguments passed to the call \p CallI.
-  void addCallArgsForwardingRegs(const MachineInstr *CallI,
-                                 CallSiteInfoImpl &&CallInfo) {
+  void addCallSiteInfo(const MachineInstr *CallI, CallSiteInfo &&CallInfo) {
     assert(CallI->isCandidateForCallSiteEntry());
     bool Inserted =
         CallSitesInfo.try_emplace(CallI, std::move(CallInfo)).second;
diff --git a/llvm/include/llvm/CodeGen/SelectionDAG.h b/llvm/include/llvm/CodeGen/SelectionDAG.h
index b9ec30754f0c3..490c8b8d417d3 100644
--- a/llvm/include/llvm/CodeGen/SelectionDAG.h
+++ b/llvm/include/llvm/CodeGen/SelectionDAG.h
@@ -279,7 +279,6 @@ class SelectionDAG {
   SDDbgInfo *DbgInfo;
 
   using CallSiteInfo = MachineFunction::CallSiteInfo;
-  using CallSiteInfoImpl = MachineFunction::CallSiteInfoImpl;
 
   struct NodeExtraInfo {
     CallSiteInfo CSInfo;
@@ -2305,7 +2304,7 @@ class SelectionDAG {
   }
 
   /// Set CallSiteInfo to be associated with Node.
-  void addCallSiteInfo(const SDNode *Node, CallSiteInfoImpl &&CallInfo) {
+  void addCallSiteInfo(const SDNode *Node, CallSiteInfo &&CallInfo) {
     SDEI[Node].CSInfo = std::move(CallInfo);
   }
   /// Return CallSiteInfo associated with Node, or a default if none exists.
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index 6b5ad62e083e3..a85a8e04666aa 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -798,10 +798,10 @@ static void collectCallSiteParameters(const MachineInstr *CallMI,
                                       ParamSet &Params) {
   const MachineFunction *MF = CallMI->getMF();
   const auto &CalleesMap = MF->getCallSitesInfo();
-  auto CallFwdRegsInfo = CalleesMap.find(CallMI);
+  auto CSInfo = CalleesMap.find(CallMI);
 
   // There is no information for the call instruction.
-  if (CallFwdRegsInfo == CalleesMap.end())
+  if (CSInfo == CalleesMap.end())
     return;
 
   const MachineBasicBlock *MBB = CallMI->getParent();
@@ -815,7 +815,7 @@ static void collectCallSiteParameters(const MachineInstr *CallMI,
       DIExpression::get(MF->getFunction().getContext(), {});
 
   // Add all the forwarding registers into the ForwardedRegWorklist.
-  for (const auto &ArgReg : CallFwdRegsInfo->second) {
+  for (const auto &ArgReg : CSInfo->second.ArgRegPairs) {
     bool InsertedReg =
         ForwardedRegWorklist.insert({ArgReg.Reg, {{ArgReg.Reg, EmptyExpr}}})
             .second;
diff --git a/llvm/lib/CodeGen/MIRParser/MIRParser.cpp b/llvm/lib/CodeGen/MIRParser/MIRParser.cpp
index 78d7e62797ce5..baa69522a974f 100644
--- a/llvm/lib/CodeGen/MIRParser/MIRParser.cpp
+++ b/llvm/lib/CodeGen/MIRParser/MIRParser.cpp
@@ -425,11 +425,11 @@ bool MIRParserImpl::initializeCallSiteInfo(
       Register Reg;
       if (parseNamedRegisterReference(PFS, Reg, ArgRegPair.Reg.Value, Error))
         return error(Error, ArgRegPair.Reg.SourceRange);
-      CSInfo.emplace_back(Reg, ArgRegPair.ArgNo);
+      CSInfo.ArgRegPairs.emplace_back(Reg, ArgRegPair.ArgNo);
     }
 
     if (TM.Options.EmitCallSiteInfo)
-      MF.addCallArgsForwardingRegs(&*CallI, std::move(CSInfo));
+      MF.addCallSiteInfo(&*CallI, std::move(CSInfo));
   }
 
   if (YamlMF.CallSitesInfo.size() && !TM.Options.EmitCallSiteInfo)
diff --git a/llvm/lib/CodeGen/MIRPrinter.cpp b/llvm/lib/CodeGen/MIRPrinter.cpp
index b19a377f07448..e9879533563e8 100644
--- a/llvm/lib/CodeGen/MIRPrinter.cpp
+++ b/llvm/lib/CodeGen/MIRPrinter.cpp
@@ -540,7 +540,7 @@ void MIRPrinter::convertCallSiteObjects(yaml::MachineFunction &YMF,
         std::distance(CallI->getParent()->instr_begin(), CallI);
     YmlCS.CallLocation = CallLocation;
     // Construct call arguments and theirs forwarding register info.
-    for (auto ArgReg : CSInfo.second) {
+    for (auto ArgReg : CSInfo.second.ArgRegPairs) {
       yaml::CallSiteInfo::ArgRegPair YmlArgReg;
       YmlArgReg.ArgNo = ArgReg.ArgNo;
       printRegMIR(ArgReg.Reg, YmlArgReg.Reg, TRI);
diff --git a/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp b/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
index c9e2745f00c95..c3e2ecbb0eef0 100644
--- a/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
@@ -888,8 +888,10 @@ EmitSchedule(MachineBasicBlock::iterator &InsertPos) {
     }
 
     if (MI->isCandidateForCallSiteEntry() &&
-        DAG->getTarget().Options.EmitCallSiteInfo)
-      MF.addCallArgsForwardingRegs(MI, DAG->getCallSiteInfo(Node));
+        DAG->getTarget().Options.EmitCallSiteInfo) {
+      //MF.addCallArgsForwardingRegs(MI, DAG->getCallSiteInfo(Node));
+      MF.addCallSiteInfo(MI, DAG->getCallSiteInfo(Node));
+    }
 
     if (DAG->getNoMergeSiteInfo(Node)) {
       MI->setFlag(MachineInstr::MIFlag::NoMerge);
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 6cbec6c686bf3..ad05dcfc377bf 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -7911,15 +7911,16 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
         // Call site info is used for function's parameter entry value
         // tracking. For now we track only simple cases when parameter
         // is transferred through whole register.
-        llvm::erase_if(CSInfo, [&VA](MachineFunction::ArgRegPair ArgReg) {
-          return ArgReg.Reg == VA.getLocReg();
-        });
+        llvm::erase_if(CSInfo.ArgRegPairs,
+                       [&VA](MachineFunction::ArgRegPair ArgReg) {
+                         return ArgReg.Reg == VA.getLocReg();
+                       });
       } else {
         RegsToPass.emplace_back(VA.getLocReg(), Arg);
         RegsUsed.insert(VA.getLocReg());
         const TargetOptions &Options = DAG.getTarget().Options;
         if (Options.EmitCallSiteInfo)
-          CSInfo.emplace_back(VA.getLocReg(), i);
+          CSInfo.ArgRegPairs.emplace_back(VA.getLocReg(), i);
       }
     } else {
       assert(VA.isMemLoc());
diff --git a/llvm/lib/Target/ARM/ARMISelLowering.cpp b/llvm/lib/Target/ARM/ARMISelLowering.cpp
index bf8c877a547cd..0b78e4e4c7470 100644
--- a/llvm/lib/Target/ARM/ARMISelLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMISelLowering.cpp
@@ -2571,7 +2571,7 @@ ARMTargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
       }
       const TargetOptions &Options = DAG.getTarget().Options;
       if (Options.EmitCallSiteInfo)
-        CSInfo.emplace_back(VA.getLocReg(), i);
+        CSInfo.ArgRegPairs.emplace_back(VA.getLocReg(), i);
       RegsToPass.push_back(std::make_pair(VA.getLocReg(), Arg));
     } else if (isByVal) {
       assert(VA.isMemLoc());
diff --git a/llvm/lib/Target/Mips/MipsISelLowering.cpp b/llvm/lib/Target/Mips/MipsISelLowering.cpp
index db29b4b0c6f1a..1ee35cb5df8b0 100644
--- a/llvm/lib/Target/Mips/MipsISelLowering.cpp
+++ b/llvm/lib/Target/Mips/MipsISelLowering.cpp
@@ -3370,7 +3370,7 @@ MipsTargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
       // Collect CSInfo about which register passes which parameter.
       const TargetOptions &Options = DAG.getTarget().Options;
       if (Options.EmitCallSiteInfo && Options.SupportsDebugEntryValues)
-        CSInfo.emplace_back(VA.getLocReg(), i);
+        CSInfo.ArgRegPairs.emplace_back(VA.getLocReg(), i);
 
       continue;
     }
diff --git a/llvm/lib/Target/X86/X86ISelLoweringCall.cpp b/llvm/lib/Target/X86/X86ISelLoweringCall.cpp
index d75bd4171fde9..e9b857750d3ca 100644
--- a/llvm/lib/Target/X86/X86ISelLoweringCall.cpp
+++ b/llvm/lib/Target/X86/X86ISelLoweringCall.cpp
@@ -2212,7 +2212,7 @@ X86TargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
       RegsToPass.push_back(std::make_pair(VA.getLocReg(), Arg));
       const TargetOptions &Options = DAG.getTarget().Options;
       if (Options.EmitCallSiteInfo)
-        CSInfo.emplace_back(VA.getLocReg(), I);
+        CSInfo.ArgRegPairs.emplace_back(VA.getLocReg(), I);
       if (isVarArg && IsWin64) {
         // Win64 ABI requires argument XMM reg to be copied to the corresponding
         // shadow reg if callee is a varargs function.

>From 636158d2d1168c030c16d5040c2afff288b493e6 Mon Sep 17 00:00:00 2001
From: Necip Fazil Yildiran <necip at google.com>
Date: Tue, 7 Nov 2023 10:49:36 -0800
Subject: [PATCH 6/9] [CallSiteInfo][CallGraphSection] Extend CallSiteInfo with
 TypeId

Add TypeId field to CallSiteInfo for tracking callee type id. Read the type
id in and out by the MIR parser/printer.

With this patch, TypeId is only set from callSites.typeId fields from MIR
inputs. An upcoming patch will pass type ids from the clang front-end,
and they will be read and set while lowering in the LLVM middle-end.

Original RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-June/151044.html
Updated RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D107110
---
 llvm/include/llvm/CodeGen/MIRYamlMapping.h    |  5 ++
 llvm/include/llvm/CodeGen/MachineFunction.h   |  7 +-
 llvm/lib/CodeGen/MIRParser/MIRParser.cpp      | 10 ++-
 llvm/lib/CodeGen/MIRPrinter.cpp               |  3 +
 llvm/lib/CodeGen/MachineFunction.cpp          |  2 +-
 .../Bitcode/operand-bundles-bc-analyzer.ll    |  1 +
 .../CodeGen/MIR/X86/call-site-info-typeid.mir | 68 +++++++++++++++++++
 7 files changed, 91 insertions(+), 5 deletions(-)
 create mode 100644 llvm/test/CodeGen/MIR/X86/call-site-info-typeid.mir

diff --git a/llvm/include/llvm/CodeGen/MIRYamlMapping.h b/llvm/include/llvm/CodeGen/MIRYamlMapping.h
index bb8dbb0478ff5..d71cdc63414dd 100644
--- a/llvm/include/llvm/CodeGen/MIRYamlMapping.h
+++ b/llvm/include/llvm/CodeGen/MIRYamlMapping.h
@@ -480,6 +480,10 @@ struct CallSiteInfo {
   MachineInstrLoc CallLocation;
   std::vector<ArgRegPair> ArgForwardingRegs;
 
+  /// Numeric callee type identifier used for call graph section.
+  using TypeIdTy = std::optional<uint64_t>;
+  TypeIdTy TypeId;
+
   bool operator==(const CallSiteInfo &Other) const {
     return CallLocation.BlockNum == Other.CallLocation.BlockNum &&
            CallLocation.Offset == Other.CallLocation.Offset;
@@ -508,6 +512,7 @@ template <> struct MappingTraits<CallSiteInfo> {
     YamlIO.mapRequired("offset", CSInfo.CallLocation.Offset);
     YamlIO.mapOptional("fwdArgRegs", CSInfo.ArgForwardingRegs,
                        std::vector<CallSiteInfo::ArgRegPair>());
+    YamlIO.mapOptional("typeId", CSInfo.TypeId);
   }
 
   static const bool flow = true;
diff --git a/llvm/include/llvm/CodeGen/MachineFunction.h b/llvm/include/llvm/CodeGen/MachineFunction.h
index 0d78a10423b27..04fdaeaf2ed06 100644
--- a/llvm/include/llvm/CodeGen/MachineFunction.h
+++ b/llvm/include/llvm/CodeGen/MachineFunction.h
@@ -485,6 +485,9 @@ class LLVM_EXTERNAL_VISIBILITY MachineFunction {
   struct CallSiteInfo {
     /// Vector of call argument and its forwarding register.
     SmallVector<ArgRegPair, 1> ArgRegPairs;
+
+    /// Callee type id.
+    ConstantInt *TypeId = nullptr;
   };
 
 private:
@@ -492,7 +495,7 @@ class LLVM_EXTERNAL_VISIBILITY MachineFunction {
   GISelChangeObserver *Observer = nullptr;
 
   using CallSiteInfoMap = DenseMap<const MachineInstr *, CallSiteInfo>;
-  /// Map a call instruction to call site arguments forwarding info.
+  /// Map a call instruction to call site arguments forwarding and type id.
   CallSiteInfoMap CallSitesInfo;
 
   /// A helper function that returns call site info for a give call
@@ -1324,7 +1327,7 @@ class LLVM_EXTERNAL_VISIBILITY MachineFunction {
     });
   }
 
-  /// Start tracking the arguments passed to the call \p CallI.
+  /// Start tracking the arguments passed to the call \p CallI and call type.
   void addCallSiteInfo(const MachineInstr *CallI, CallSiteInfo &&CallInfo) {
     assert(CallI->isCandidateForCallSiteEntry());
     bool Inserted =
diff --git a/llvm/lib/CodeGen/MIRParser/MIRParser.cpp b/llvm/lib/CodeGen/MIRParser/MIRParser.cpp
index baa69522a974f..4e64471649a5c 100644
--- a/llvm/lib/CodeGen/MIRParser/MIRParser.cpp
+++ b/llvm/lib/CodeGen/MIRParser/MIRParser.cpp
@@ -427,12 +427,18 @@ bool MIRParserImpl::initializeCallSiteInfo(
         return error(Error, ArgRegPair.Reg.SourceRange);
       CSInfo.ArgRegPairs.emplace_back(Reg, ArgRegPair.ArgNo);
     }
+    if (YamlCSInfo.TypeId.has_value()) {
+      IntegerType *Int64Ty = Type::getInt64Ty(Context);
+      CSInfo.TypeId = ConstantInt::get(Int64Ty, YamlCSInfo.TypeId.value(),
+                                       /*isSigned=*/false);
+    }
 
-    if (TM.Options.EmitCallSiteInfo)
+    if (TM.Options.EmitCallSiteInfo || TM.Options.EmitCallGraphSection)
       MF.addCallSiteInfo(&*CallI, std::move(CSInfo));
   }
 
-  if (YamlMF.CallSitesInfo.size() && !TM.Options.EmitCallSiteInfo)
+  if (YamlMF.CallSitesInfo.size() &&
+      !(TM.Options.EmitCallSiteInfo || TM.Options.EmitCallGraphSection))
     return error(Twine("Call site info provided but not used"));
   return false;
 }
diff --git a/llvm/lib/CodeGen/MIRPrinter.cpp b/llvm/lib/CodeGen/MIRPrinter.cpp
index e9879533563e8..9cd939aaa186e 100644
--- a/llvm/lib/CodeGen/MIRPrinter.cpp
+++ b/llvm/lib/CodeGen/MIRPrinter.cpp
@@ -546,6 +546,9 @@ void MIRPrinter::convertCallSiteObjects(yaml::MachineFunction &YMF,
       printRegMIR(ArgReg.Reg, YmlArgReg.Reg, TRI);
       YmlCS.ArgForwardingRegs.emplace_back(YmlArgReg);
     }
+    // Get type id.
+    if (CSInfo.second.TypeId)
+      YmlCS.TypeId = CSInfo.second.TypeId->getZExtValue();
     YMF.CallSitesInfo.push_back(YmlCS);
   }
 
diff --git a/llvm/lib/CodeGen/MachineFunction.cpp b/llvm/lib/CodeGen/MachineFunction.cpp
index 57af571ed9bfd..005d1f4ef24d0 100644
--- a/llvm/lib/CodeGen/MachineFunction.cpp
+++ b/llvm/lib/CodeGen/MachineFunction.cpp
@@ -867,7 +867,7 @@ MachineFunction::getCallSiteInfo(const MachineInstr *MI) {
   assert(MI->isCandidateForCallSiteEntry() &&
          "Call site info refers only to call (MI) candidates");
 
-  if (!Target.Options.EmitCallSiteInfo)
+  if (!Target.Options.EmitCallSiteInfo && !Target.Options.EmitCallGraphSection)
     return CallSitesInfo.end();
   return CallSitesInfo.find(MI);
 }
diff --git a/llvm/test/Bitcode/operand-bundles-bc-analyzer.ll b/llvm/test/Bitcode/operand-bundles-bc-analyzer.ll
index d860104b9cb3d..5628e17b4936e 100644
--- a/llvm/test/Bitcode/operand-bundles-bc-analyzer.ll
+++ b/llvm/test/Bitcode/operand-bundles-bc-analyzer.ll
@@ -13,6 +13,7 @@
 ; CHECK-NEXT:    <OPERAND_BUNDLE_TAG
 ; CHECK-NEXT:    <OPERAND_BUNDLE_TAG
 ; CHECK-NEXT:    <OPERAND_BUNDLE_TAG
+; CHECK-NEXT:    <OPERAND_BUNDLE_TAG
 ; CHECK-NEXT:  </OPERAND_BUNDLE_TAGS_BLOCK
 
 ; CHECK:   <FUNCTION_BLOCK
diff --git a/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.mir b/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.mir
new file mode 100644
index 0000000000000..5ab797bfcc18f
--- /dev/null
+++ b/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.mir
@@ -0,0 +1,68 @@
+# Test MIR printer and parser for type id field in callSites. It is used
+# for propogating call site type identifiers to emit in the call graph section.
+
+# RUN: llc --call-graph-section %s -run-pass=none -o - | FileCheck %s
+# CHECK: name: main
+# CHECK: callSites:
+# CHECK-NEXT: - { bb: {{.*}}, offset: {{.*}}, fwdArgRegs: [], typeId:
+# CHECK-NEXT: 123456789 }
+
+--- |
+  ; ModuleID = 'test.ll'
+  source_filename = "test.ll"
+  target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+  target triple = "x86_64-unknown-linux-gnu"
+  
+  define dso_local void @foo(i8 signext %a) {
+  entry:
+    ret void
+  }
+  
+  define dso_local i32 @main() {
+  entry:
+    %retval = alloca i32, align 4
+    %fp = alloca void (i8)*, align 8
+    store i32 0, i32* %retval, align 4
+    store void (i8)* @foo, void (i8)** %fp, align 8
+    %0 = load void (i8)*, void (i8)** %fp, align 8
+    call void %0(i8 signext 97)
+    ret i32 0
+  }
+
+...
+---
+name:            foo
+tracksRegLiveness: true
+body:             |
+  bb.0.entry:
+    RET 0
+
+...
+---
+name:            main
+tracksRegLiveness: true
+stack:
+  - { id: 0, name: retval, type: default, offset: 0, size: 4, alignment: 4, 
+      stack-id: default, callee-saved-register: '', callee-saved-restored: true, 
+      debug-info-variable: '', debug-info-expression: '', debug-info-location: '' }
+  - { id: 1, name: fp, type: default, offset: 0, size: 8, alignment: 8, 
+      stack-id: default, callee-saved-register: '', callee-saved-restored: true, 
+      debug-info-variable: '', debug-info-expression: '', debug-info-location: '' }
+callSites:
+  - { bb: 0, offset: 6, fwdArgRegs: [], typeId: 
+    123456789 }
+body:             |
+  bb.0.entry:
+    MOV32mi %stack.0.retval, 1, $noreg, 0, $noreg, 0 :: (store (s32) into %ir.retval)
+    MOV64mi32 %stack.1.fp, 1, $noreg, 0, $noreg, @foo :: (store (s64) into %ir.fp)
+    %0:gr64 = MOV32ri64 @foo
+    ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
+    %1:gr32 = MOV32ri 97
+    $edi = COPY %1
+    CALL64r killed %0, csr_64, implicit $rsp, implicit $ssp, implicit $edi, implicit-def $rsp, implicit-def $ssp
+    ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
+    %2:gr32 = MOV32r0 implicit-def dead $eflags
+    $eax = COPY %2
+    RET 0, $eax
+
+...

>From a273aa897a3bfdba90bdbb9f728745e3e7b1b9da Mon Sep 17 00:00:00 2001
From: Necip Fazil Yildiran <necip at google.com>
Date: Tue, 7 Nov 2023 10:49:39 -0800
Subject: [PATCH 7/9] [CallSiteInfo][CallGraphSection] Extract and propagate
 indirect call type ids

Extract numeric type ids for indirect calls, and carry them to the back-end
with CallSiteInfo. The numeric type ids will be used at the back-end to emit
the call graph section.

Original RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-June/151044.html
Updated RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D107111
---
 llvm/include/llvm/CodeGen/MachineFunction.h   |  34 ++++++
 .../SelectionDAG/ScheduleDAGSDNodes.cpp       |   4 +-
 .../Target/AArch64/AArch64ISelLowering.cpp    |   5 +
 llvm/lib/Target/ARM/ARMISelLowering.cpp       |   5 +
 llvm/lib/Target/Mips/MipsISelLowering.cpp     |   6 +-
 llvm/lib/Target/X86/X86FastISel.cpp           |   8 ++
 llvm/lib/Target/X86/X86ISelLoweringCall.cpp   |   4 +
 .../CodeGen/AArch64/call-site-info-typeid.ll  |  39 ++++++
 .../test/CodeGen/ARM/call-site-info-typeid.ll |  39 ++++++
 .../CodeGen/MIR/X86/call-site-info-typeid.ll  | 112 ++++++++++++++++++
 .../CodeGen/Mips/call-site-info-typeid.ll     |  39 ++++++
 .../test/CodeGen/X86/call-site-info-typeid.ll |  39 ++++++
 12 files changed, 331 insertions(+), 3 deletions(-)
 create mode 100644 llvm/test/CodeGen/AArch64/call-site-info-typeid.ll
 create mode 100644 llvm/test/CodeGen/ARM/call-site-info-typeid.ll
 create mode 100644 llvm/test/CodeGen/MIR/X86/call-site-info-typeid.ll
 create mode 100644 llvm/test/CodeGen/Mips/call-site-info-typeid.ll
 create mode 100644 llvm/test/CodeGen/X86/call-site-info-typeid.ll

diff --git a/llvm/include/llvm/CodeGen/MachineFunction.h b/llvm/include/llvm/CodeGen/MachineFunction.h
index 04fdaeaf2ed06..8f38d48e68c38 100644
--- a/llvm/include/llvm/CodeGen/MachineFunction.h
+++ b/llvm/include/llvm/CodeGen/MachineFunction.h
@@ -27,7 +27,9 @@
 #include "llvm/CodeGen/MachineBasicBlock.h"
 #include "llvm/CodeGen/MachineInstr.h"
 #include "llvm/CodeGen/MachineMemOperand.h"
+#include "llvm/IR/Constants.h"
 #include "llvm/IR/EHPersonalities.h"
+#include "llvm/IR/Instructions.h"
 #include "llvm/Support/Allocator.h"
 #include "llvm/Support/ArrayRecycler.h"
 #include "llvm/Support/AtomicOrdering.h"
@@ -488,6 +490,38 @@ class LLVM_EXTERNAL_VISIBILITY MachineFunction {
 
     /// Callee type id.
     ConstantInt *TypeId = nullptr;
+
+    CallSiteInfo() {}
+
+    /// Extracts the numeric type id from the CallBase's type operand bundle,
+    /// and sets TypeId. This is used as type id for the indirect call in the
+    /// call graph section.
+    CallSiteInfo(const CallBase &CB) {
+      // Call graph section needs numeric type id only for indirect calls.
+      if (!CB.isIndirectCall())
+        return;
+
+      auto Opt = CB.getOperandBundle(LLVMContext::OB_type);
+      if (!Opt.has_value()) {
+        errs() << "warning: cannot find indirect call type operand bundle for  "
+                  "call graph section\n";
+        return;
+      }
+
+      // Get generalized type id string
+      auto OB = Opt.value();
+      assert(OB.Inputs.size() == 1 && "invalid input size");
+      auto *OBVal = OB.Inputs.front().get();
+      auto *TypeIdMD = cast<MetadataAsValue>(OBVal)->getMetadata();
+      auto *TypeIdStr = cast<MDString>(TypeIdMD);
+      assert(TypeIdStr->getString().endswith(".generalized") &&
+             "invalid type identifier");
+
+      // Compute numeric type id from generalized type id string
+      uint64_t TypeIdVal = llvm::MD5Hash(TypeIdStr->getString());
+      IntegerType *Int64Ty = Type::getInt64Ty(CB.getContext());
+      TypeId = llvm::ConstantInt::get(Int64Ty, TypeIdVal, /*IsSigned=*/false);
+    }
   };
 
 private:
diff --git a/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp b/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
index c3e2ecbb0eef0..4bd90e450491e 100644
--- a/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
@@ -888,8 +888,8 @@ EmitSchedule(MachineBasicBlock::iterator &InsertPos) {
     }
 
     if (MI->isCandidateForCallSiteEntry() &&
-        DAG->getTarget().Options.EmitCallSiteInfo) {
-      //MF.addCallArgsForwardingRegs(MI, DAG->getCallSiteInfo(Node));
+        (DAG->getTarget().Options.EmitCallSiteInfo ||
+         DAG->getTarget().Options.EmitCallGraphSection)) {      
       MF.addCallSiteInfo(MI, DAG->getCallSiteInfo(Node));
     }
 
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index ad05dcfc377bf..0c02de9b058f6 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -7544,6 +7544,7 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
   bool &IsTailCall = CLI.IsTailCall;
   CallingConv::ID &CallConv = CLI.CallConv;
   bool IsVarArg = CLI.IsVarArg;
+  const auto *CB = CLI.CB;
 
   MachineFunction &MF = DAG.getMachineFunction();
   MachineFunction::CallSiteInfo CSInfo;
@@ -7583,6 +7584,10 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
                     *DAG.getContext());
   RetCCInfo.AnalyzeCallResult(Ins, RetCC);
 
+  // Set type id for call site info.
+  if (MF.getTarget().Options.EmitCallGraphSection && CB && CB->isIndirectCall())
+    CSInfo = MachineFunction::CallSiteInfo(*CB);
+
   // Check callee args/returns for SVE registers and set calling convention
   // accordingly.
   if (CallConv == CallingConv::C || CallConv == CallingConv::Fast) {
diff --git a/llvm/lib/Target/ARM/ARMISelLowering.cpp b/llvm/lib/Target/ARM/ARMISelLowering.cpp
index 0b78e4e4c7470..796d7a586f2f8 100644
--- a/llvm/lib/Target/ARM/ARMISelLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMISelLowering.cpp
@@ -2359,6 +2359,7 @@ ARMTargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
   CallingConv::ID CallConv              = CLI.CallConv;
   bool doesNotRet                       = CLI.DoesNotReturn;
   bool isVarArg                         = CLI.IsVarArg;
+  const auto *CB = CLI.CB;
 
   MachineFunction &MF = DAG.getMachineFunction();
   ARMFunctionInfo *AFI = MF.getInfo<ARMFunctionInfo>();
@@ -2375,6 +2376,10 @@ ARMTargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
       !Subtarget->noBTIAtReturnTwice())
     GuardWithBTI = AFI->branchTargetEnforcement();
 
+  // Set type id for call site info.
+  if (MF.getTarget().Options.EmitCallGraphSection && CB && CB->isIndirectCall())
+    CSInfo = MachineFunction::CallSiteInfo(*CB);
+
   // Determine whether this is a non-secure function call.
   if (CLI.CB && CLI.CB->getAttributes().hasFnAttr("cmse_nonsecure_call"))
     isCmseNSCall = true;
diff --git a/llvm/lib/Target/Mips/MipsISelLowering.cpp b/llvm/lib/Target/Mips/MipsISelLowering.cpp
index 1ee35cb5df8b0..13e68d2948133 100644
--- a/llvm/lib/Target/Mips/MipsISelLowering.cpp
+++ b/llvm/lib/Target/Mips/MipsISelLowering.cpp
@@ -3173,6 +3173,7 @@ MipsTargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
   bool &IsTailCall                      = CLI.IsTailCall;
   CallingConv::ID CallConv              = CLI.CallConv;
   bool IsVarArg                         = CLI.IsVarArg;
+  const auto *CB = CLI.CB;
 
   MachineFunction &MF = DAG.getMachineFunction();
   MachineFrameInfo &MFI = MF.getFrameInfo();
@@ -3230,8 +3231,11 @@ MipsTargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
   // Get a count of how many bytes are to be pushed on the stack.
   unsigned StackSize = CCInfo.getStackSize();
 
-  // Call site info for function parameters tracking.
+  // Call site info for function parameters tracking and call base type info.
   MachineFunction::CallSiteInfo CSInfo;
+  // Set type id for call site info.
+  if (MF.getTarget().Options.EmitCallGraphSection && CB && CB->isIndirectCall())
+    CSInfo = MachineFunction::CallSiteInfo(*CB);
 
   // Check if it's really possible to do a tail call. Restrict it to functions
   // that are part of this compilation unit.
diff --git a/llvm/lib/Target/X86/X86FastISel.cpp b/llvm/lib/Target/X86/X86FastISel.cpp
index 1ce1e6f6a5635..9a763b0a78488 100644
--- a/llvm/lib/Target/X86/X86FastISel.cpp
+++ b/llvm/lib/Target/X86/X86FastISel.cpp
@@ -3630,6 +3630,12 @@ bool X86FastISel::fastLowerCall(CallLoweringInfo &CLI) {
   CLI.NumResultRegs = RVLocs.size();
   CLI.Call = MIB;
 
+  // Add call site info for call graph section.
+  if (TM.Options.EmitCallGraphSection && CB && CB->isIndirectCall()) {
+    MachineFunction::CallSiteInfo CSInfo(*CB);
+    MF->addCallSiteInfo(CLI.Call, std::move(CSInfo));
+  }
+
   return true;
 }
 
@@ -4025,6 +4031,8 @@ bool X86FastISel::tryToFoldLoadIntoMI(MachineInstr *MI, unsigned OpNo,
     MO.setReg(IndexReg);
   }
 
+  if (MI->isCall())
+    FuncInfo.MF->moveCallSiteInfo(MI, Result);
   Result->addMemOperand(*FuncInfo.MF, createMachineMemOperandFor(LI));
   Result->cloneInstrSymbols(*FuncInfo.MF, *MI);
   MachineBasicBlock::iterator I(MI);
diff --git a/llvm/lib/Target/X86/X86ISelLoweringCall.cpp b/llvm/lib/Target/X86/X86ISelLoweringCall.cpp
index e9b857750d3ca..dace636925357 100644
--- a/llvm/lib/Target/X86/X86ISelLoweringCall.cpp
+++ b/llvm/lib/Target/X86/X86ISelLoweringCall.cpp
@@ -2009,6 +2009,10 @@ X86TargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
   if (CallConv == CallingConv::X86_INTR)
     report_fatal_error("X86 interrupts may not be called directly");
 
+  // Set type id for call site info.
+  if (MF.getTarget().Options.EmitCallGraphSection && CB && CB->isIndirectCall())
+    CSInfo = MachineFunction::CallSiteInfo(*CB);
+
   bool IsMustTail = CLI.CB && CLI.CB->isMustTailCall();
   if (Subtarget.isPICStyleGOT() && !IsGuaranteeTCO && !IsMustTail) {
     // If we are using a GOT, disable tail calls to external symbols with
diff --git a/llvm/test/CodeGen/AArch64/call-site-info-typeid.ll b/llvm/test/CodeGen/AArch64/call-site-info-typeid.ll
new file mode 100644
index 0000000000000..f0a6b44755c5c
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/call-site-info-typeid.ll
@@ -0,0 +1,39 @@
+; Tests that call site type ids can be extracted and set from type operand
+; bundles.
+
+; Verify the exact typeId value to ensure it is not garbage but the value
+; computed as the type id from the type operand bundle.
+; RUN: llc --call-graph-section -mtriple aarch64-linux-gnu %s -stop-before=finalize-isel -o - | FileCheck %s
+
+; ModuleID = 'test.c'
+source_filename = "test.c"
+target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
+target triple = "aarch64-unknown-linux-gnu"
+
+define dso_local void @foo(i8 signext %a) !type !3 {
+entry:
+  ret void
+}
+
+; CHECK: name: main
+define dso_local i32 @main() !type !4 {
+entry:
+  %retval = alloca i32, align 4
+  %fp = alloca void (i8)*, align 8
+  store i32 0, i32* %retval, align 4
+  store void (i8)* @foo, void (i8)** %fp, align 8
+  %0 = load void (i8)*, void (i8)** %fp, align 8
+  ; CHECK: callSites:
+  ; CHECK-NEXT: - { bb: {{.*}}, offset: {{.*}}, fwdArgRegs: [], typeId:
+  ; CHECK-NEXT: 7854600665770582568 }
+  call void %0(i8 signext 97) [ "type"(metadata !"_ZTSFvcE.generalized") ]
+  ret i32 0
+}
+
+!llvm.module.flags = !{!0, !1, !2}
+
+!0 = !{i32 1, !"wchar_size", i32 4}
+!1 = !{i32 7, !"uwtable", i32 1}
+!2 = !{i32 7, !"frame-pointer", i32 2}
+!3 = !{i64 0, !"_ZTSFvcE.generalized"}
+!4 = !{i64 0, !"_ZTSFiE.generalized"}
diff --git a/llvm/test/CodeGen/ARM/call-site-info-typeid.ll b/llvm/test/CodeGen/ARM/call-site-info-typeid.ll
new file mode 100644
index 0000000000000..ec7f8a425051b
--- /dev/null
+++ b/llvm/test/CodeGen/ARM/call-site-info-typeid.ll
@@ -0,0 +1,39 @@
+; Tests that call site type ids can be extracted and set from type operand
+; bundles.
+
+; Verify the exact typeId value to ensure it is not garbage but the value
+; computed as the type id from the type operand bundle.
+; RUN: llc --call-graph-section -mtriple arm-linux-gnu %s -stop-before=finalize-isel -o - | FileCheck %s
+
+; ModuleID = 'test.c'
+source_filename = "test.c"
+target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
+target triple = "armv4t-unknown-linux-gnu"
+
+define dso_local void @foo(i8 signext %a) !type !3 {
+entry:
+  ret void
+}
+
+; CHECK: name: main
+define dso_local i32 @main() !type !4 {
+entry:
+  %retval = alloca i32, align 4
+  %fp = alloca void (i8)*, align 8
+  store i32 0, i32* %retval, align 4
+  store void (i8)* @foo, void (i8)** %fp, align 8
+  %0 = load void (i8)*, void (i8)** %fp, align 8
+  ; CHECK: callSites:
+  ; CHECK-NEXT: - { bb: {{.*}}, offset: {{.*}}, fwdArgRegs: [], typeId:
+  ; CHECK-NEXT: 7854600665770582568 }
+  call void %0(i8 signext 97) [ "type"(metadata !"_ZTSFvcE.generalized") ]
+  ret i32 0
+}
+
+!llvm.module.flags = !{!0, !1, !2}
+
+!0 = !{i32 1, !"wchar_size", i32 4}
+!1 = !{i32 7, !"uwtable", i32 1}
+!2 = !{i32 7, !"frame-pointer", i32 2}
+!3 = !{i64 0, !"_ZTSFvcE.generalized"}
+!4 = !{i64 0, !"_ZTSFiE.generalized"}
diff --git a/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.ll b/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.ll
new file mode 100644
index 0000000000000..b769a721cac06
--- /dev/null
+++ b/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.ll
@@ -0,0 +1,112 @@
+; Test MIR printer and parser for type id field in call site info. Test that
+; it works well with/without --emit-call-site-info.
+
+; Multiplex --call-graph-section and -emit-call-site-info as both utilize
+; CallSiteInfo and callSites.
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Test printer and parser with --call-graph-section only.
+
+; Test printer.
+; Verify that fwdArgRegs is not set, typeId is set.
+; Verify the exact typeId value to ensure it is not garbage but the value
+; computed as the type id from the type operand bundle.
+; RUN: llc --call-graph-section %s -stop-before=finalize-isel -o %t1.mir
+; RUN: cat %t1.mir | FileCheck %s --check-prefix=PRINTER_CGS
+; PRINTER_CGS: name: main
+; PRINTER_CGS: callSites:
+; PRINTER_CGS-NEXT: - { bb: {{.*}}, offset: {{.*}}, fwdArgRegs: [], typeId:
+; PRINTER_CGS-NEXT: 7854600665770582568 }
+
+
+; Test parser.
+; Verify that we get the same result.
+; RUN: llc --call-graph-section %t1.mir -run-pass=finalize-isel -o - \
+; RUN: | FileCheck %s --check-prefix=PARSER_CGS
+; PARSER_CGS: name: main
+; PARSER_CGS: callSites:
+; PARSER_CGS-NEXT: - { bb: {{.*}}, offset: {{.*}}, fwdArgRegs: [], typeId:
+; PARSER_CGS-NEXT: 7854600665770582568 }
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Test printer and parser with -emit-call-site-info only.
+
+; Test printer.
+; Verify that fwdArgRegs is set, typeId is not set.
+; RUN: llc -emit-call-site-info %s -stop-before=finalize-isel -o %t2.mir
+; RUN: cat %t2.mir | FileCheck %s --check-prefix=PRINTER_CSI
+; PRINTER_CSI: name: main
+; PRINTER_CSI: callSites:
+; PRINTER_CSI-NEXT: - { bb: {{.*}}, offset: {{.*}}, fwdArgRegs:
+; PRINTER_CSI-NEXT: { arg: 0, reg: '$edi' }
+; PRINTER_CSI-NOT: typeId:
+
+
+; Test parser.
+; Verify that we get the same result.
+; RUN: llc -emit-call-site-info %t2.mir -run-pass=finalize-isel -o - \
+; RUN: | FileCheck %s --check-prefix=PARSER_CSI
+; PARSER_CSI: name: main
+; PARSER_CSI: callSites:
+; PARSER_CSI-NEXT: - { bb: {{.*}}, offset: {{.*}}, fwdArgRegs:
+; PARSER_CSI-NEXT: { arg: 0, reg: '$edi' }
+; PARSER_CSI-NOT: typeId:
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Test printer and parser with both -emit-call-site-info and --call-graph-section.
+
+; Test printer.
+; Verify both fwdArgRegs and typeId are set.
+; Verify the exact typeId value to ensure it is not garbage but the value
+; computed as the type id from the type operand bundle.
+; RUN: llc --call-graph-section -emit-call-site-info %s -stop-before=finalize-isel -o %t2.mir
+; RUN: cat %t2.mir | FileCheck %s --check-prefix=PRINTER_CGS_CSI
+; PRINTER_CGS_CSI: name: main
+; PRINTER_CGS_CSI: callSites:
+; PRINTER_CGS_CSI-NEXT: - { bb: {{.*}}, offset: {{.*}}, fwdArgRegs:
+; PRINTER_CGS_CSI-NEXT: { arg: 0, reg: '$edi' }, typeId:
+; PRINTER_CGS_CSI-NEXT:   7854600665770582568 }
+
+
+; Test parser.
+; Verify that we get the same result.
+; RUN: llc --call-graph-section -emit-call-site-info %t2.mir -run-pass=finalize-isel -o - \
+; RUN: | FileCheck %s --check-prefix=PARSER_CGS_CSI
+; PARSER_CGS_CSI: name: main
+; PARSER_CGS_CSI: callSites:
+; PARSER_CGS_CSI-NEXT: - { bb: {{.*}}, offset: {{.*}}, fwdArgRegs:
+; PARSER_CGS_CSI-NEXT: { arg: 0, reg: '$edi' }, typeId:
+; PARSER_CGS_CSI-NEXT:   7854600665770582568 }
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+; ModuleID = 'test.c'
+source_filename = "test.c"
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+; Function Attrs: noinline nounwind optnone uwtable
+define dso_local void @foo(i8 signext %a) !type !3 {
+entry:
+  ret void
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define dso_local i32 @main() !type !4 {
+entry:
+  %retval = alloca i32, align 4
+  %fp = alloca void (i8)*, align 8
+  store i32 0, i32* %retval, align 4
+  store void (i8)* @foo, void (i8)** %fp, align 8
+  %0 = load void (i8)*, void (i8)** %fp, align 8
+  call void %0(i8 signext 97) [ "type"(metadata !"_ZTSFvcE.generalized") ]
+  ret i32 0
+}
+
+!llvm.module.flags = !{!0, !1, !2}
+
+!0 = !{i32 1, !"wchar_size", i32 4}
+!1 = !{i32 7, !"uwtable", i32 1}
+!2 = !{i32 7, !"frame-pointer", i32 2}
+!3 = !{i64 0, !"_ZTSFvcE.generalized"}
+!4 = !{i64 0, !"_ZTSFiE.generalized"}
diff --git a/llvm/test/CodeGen/Mips/call-site-info-typeid.ll b/llvm/test/CodeGen/Mips/call-site-info-typeid.ll
new file mode 100644
index 0000000000000..8596ebb5aa094
--- /dev/null
+++ b/llvm/test/CodeGen/Mips/call-site-info-typeid.ll
@@ -0,0 +1,39 @@
+; Tests that call site type ids can be extracted and set from type operand
+; bundles.
+
+; Verify the exact typeId value to ensure it is not garbage but the value
+; computed as the type id from the type operand bundle.
+; RUN: llc --call-graph-section -mtriple=mips-linux-gnu %s -stop-before=finalize-isel -o - | FileCheck %s
+
+; ModuleID = 'test.c'
+source_filename = "test.c"
+target datalayout = "E-m:m-p:32:32-i8:8:32-i16:16:32-i64:64-n32-S64"
+target triple = "mips-unknown-linux-gnu"
+
+define dso_local void @foo(i8 signext %a) !type !3 {
+entry:
+  ret void
+}
+
+; CHECK: name: main
+define dso_local i32 @main() !type !4 {
+entry:
+  %retval = alloca i32, align 4
+  %fp = alloca void (i8)*, align 8
+  store i32 0, i32* %retval, align 4
+  store void (i8)* @foo, void (i8)** %fp, align 8
+  %0 = load void (i8)*, void (i8)** %fp, align 8
+  ; CHECK: callSites:
+  ; CHECK-NEXT: - { bb: {{.*}}, offset: {{.*}}, fwdArgRegs: [], typeId:
+  ; CHECK-NEXT: 7854600665770582568 }
+  call void %0(i8 signext 97) [ "type"(metadata !"_ZTSFvcE.generalized") ]
+  ret i32 0
+}
+
+!llvm.module.flags = !{!0, !1, !2}
+
+!0 = !{i32 1, !"wchar_size", i32 4}
+!1 = !{i32 7, !"uwtable", i32 1}
+!2 = !{i32 7, !"frame-pointer", i32 2}
+!3 = !{i64 0, !"_ZTSFvcE.generalized"}
+!4 = !{i64 0, !"_ZTSFiE.generalized"}
diff --git a/llvm/test/CodeGen/X86/call-site-info-typeid.ll b/llvm/test/CodeGen/X86/call-site-info-typeid.ll
new file mode 100644
index 0000000000000..61777b770155d
--- /dev/null
+++ b/llvm/test/CodeGen/X86/call-site-info-typeid.ll
@@ -0,0 +1,39 @@
+; Tests that call site type ids can be extracted and set from type operand
+; bundles.
+
+; Verify the exact typeId value to ensure it is not garbage but the value
+; computed as the type id from the type operand bundle.
+; RUN: llc --call-graph-section -mtriple=x86_64-unknown-linux %s -stop-before=finalize-isel -o - | FileCheck %s
+
+; ModuleID = 'test.c'
+source_filename = "test.c"
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+define dso_local void @foo(i8 signext %a) !type !3 {
+entry:
+  ret void
+}
+
+; CHECK: name: main
+define dso_local i32 @main() !type !4 {
+entry:
+  %retval = alloca i32, align 4
+  %fp = alloca void (i8)*, align 8
+  store i32 0, i32* %retval, align 4
+  store void (i8)* @foo, void (i8)** %fp, align 8
+  %0 = load void (i8)*, void (i8)** %fp, align 8
+  ; CHECK: callSites:
+  ; CHECK-NEXT: - { bb: {{.*}}, offset: {{.*}}, fwdArgRegs: [], typeId:
+  ; CHECK-NEXT: 7854600665770582568 }
+  call void %0(i8 signext 97) [ "type"(metadata !"_ZTSFvcE.generalized") ]
+  ret i32 0
+}
+
+!llvm.module.flags = !{!0, !1, !2}
+
+!0 = !{i32 1, !"wchar_size", i32 4}
+!1 = !{i32 7, !"uwtable", i32 1}
+!2 = !{i32 7, !"frame-pointer", i32 2}
+!3 = !{i64 0, !"_ZTSFvcE.generalized"}
+!4 = !{i64 0, !"_ZTSFiE.generalized"}

>From 41db61906eca7bd9805b53ab4be5da73d959c95b Mon Sep 17 00:00:00 2001
From: Necip Fazil Yildiran <necip at google.com>
Date: Tue, 7 Nov 2023 10:49:56 -0800
Subject: [PATCH 8/9] [AsmPrinter][CallGraphSection] Emit call graph section

Collect the necessary information for constructing the call graph section, and
emit to .callgraph section of the binary.

Numeric type identifiers for indirect calls and targets are computed from type
identifiers passed from clang front-end.

CGSectionFuncComdatCreator pass is used to create comdats for functions whose
symbols will be referenced from the call graph section. A call graph section
is created per function group, and is linked to the relevant function. This
enables dead-stripping of call graph symbols if linked function gets removed.

Original RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-June/151044.html
Updated RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D105916
---
 llvm/include/llvm/CodeGen/AsmPrinter.h     |  29 +++++
 llvm/include/llvm/MC/MCObjectFileInfo.h    |   5 +
 llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp | 127 ++++++++++++++++++++-
 llvm/lib/MC/MCObjectFileInfo.cpp           |  20 ++++
 llvm/test/CodeGen/call-graph-section.ll    |  73 ++++++++++++
 5 files changed, 253 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/CodeGen/call-graph-section.ll

diff --git a/llvm/include/llvm/CodeGen/AsmPrinter.h b/llvm/include/llvm/CodeGen/AsmPrinter.h
index fbd198a75a243..f5660071309c2 100644
--- a/llvm/include/llvm/CodeGen/AsmPrinter.h
+++ b/llvm/include/llvm/CodeGen/AsmPrinter.h
@@ -15,6 +15,7 @@
 #ifndef LLVM_CODEGEN_ASMPRINTER_H
 #define LLVM_CODEGEN_ASMPRINTER_H
 
+#include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/MapVector.h"
 #include "llvm/ADT/SmallVector.h"
@@ -188,6 +189,32 @@ class AsmPrinter : public MachineFunctionPass {
   /// Emit comments in assembly output if this is true.
   bool VerboseAsm;
 
+  /// Store symbols and type identifiers used to create call graph section
+  /// entries related to a function.
+  struct FunctionInfo {
+    /// Numeric type identifier used in call graph section for indirect calls
+    /// and targets.
+    using CGTypeId = uint64_t;
+
+    /// Enumeration of function kinds, and their mapping to function kind values
+    /// stored in call graph section entries.
+    /// Must match the enum in llvm/tools/llvm-objdump/llvm-objdump.cpp.
+    enum FunctionKind {
+      /// Function cannot be target to indirect calls.
+      NOT_INDIRECT_TARGET = 0,
+
+      /// Function may be target to indirect calls but its type id is unknown.
+      INDIRECT_TARGET_UNKNOWN_TID = 1,
+
+      /// Function may be target to indirect calls and its type id is known.
+      INDIRECT_TARGET_KNOWN_TID = 2,
+    };
+
+    /// Map type identifiers to callsite labels. Labels are only for indirect
+    /// calls and inclusive of all indirect calls of the function.
+    SmallVector<std::pair<CGTypeId, MCSymbol *>> CallSiteLabels;
+  };
+
   /// Output stream for the stack usage file (i.e., .su file).
   std::unique_ptr<raw_fd_ostream> StackUsageStream;
 
@@ -422,6 +449,8 @@ class AsmPrinter : public MachineFunctionPass {
   void emitKCFITrapEntry(const MachineFunction &MF, const MCSymbol *Symbol);
   virtual void emitKCFITypeId(const MachineFunction &MF);
 
+  void emitCallGraphSection(const MachineFunction &MF, FunctionInfo &FuncInfo);
+
   void emitPseudoProbe(const MachineInstr &MI);
 
   void emitRemarksSection(remarks::RemarkStreamer &RS);
diff --git a/llvm/include/llvm/MC/MCObjectFileInfo.h b/llvm/include/llvm/MC/MCObjectFileInfo.h
index 2b2adf5012def..e1f0c7eb1dfb8 100644
--- a/llvm/include/llvm/MC/MCObjectFileInfo.h
+++ b/llvm/include/llvm/MC/MCObjectFileInfo.h
@@ -68,6 +68,9 @@ class MCObjectFileInfo {
   /// Language Specific Data Area information is emitted to.
   MCSection *LSDASection = nullptr;
 
+  /// Section containing metadata on call graph.
+  MCSection *CallGraphSection = nullptr;
+
   /// If exception handling is supported by the target and the target can
   /// support a compact representation of the CIE and FDE, this is the section
   /// to emit them into.
@@ -354,6 +357,8 @@ class MCObjectFileInfo {
   MCSection *getFaultMapSection() const { return FaultMapSection; }
   MCSection *getRemarksSection() const { return RemarksSection; }
 
+  MCSection *getCallGraphSection(const MCSection &TextSec) const;
+
   MCSection *getStackSizesSection(const MCSection &TextSec) const;
 
   MCSection *getBBAddrMapSection(const MCSection &TextSec) const;
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index 0d573562de969..81d2717005d68 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -1557,6 +1557,105 @@ void AsmPrinter::emitStackUsage(const MachineFunction &MF) {
     *StackUsageStream << "static\n";
 }
 
+/// Extracts a generalized numeric type identifier of a Function's type from
+/// type metadata. Returns null if metadata cannot be found.
+static ConstantInt *extractNumericCGTypeId(const Function &F) {
+  SmallVector<MDNode *, 2> Types;
+  F.getMetadata(LLVMContext::MD_type, Types);
+  MDString *MDGeneralizedTypeId = nullptr;
+  for (const auto &Type : Types) {
+    if (Type->getNumOperands() == 2 && isa<MDString>(Type->getOperand(1))) {
+      auto *TMDS = cast<MDString>(Type->getOperand(1));
+      if (TMDS->getString().endswith("generalized")) {
+        MDGeneralizedTypeId = TMDS;
+        break;
+      }
+    }
+  }
+
+  if (!MDGeneralizedTypeId) {
+    errs() << "warning: can't find indirect target type id metadata "
+           << "for " << F.getName() << "\n";
+    return nullptr;
+  }
+
+  uint64_t TypeIdVal = llvm::MD5Hash(MDGeneralizedTypeId->getString());
+  Type *Int64Ty = Type::getInt64Ty(F.getContext());
+  return cast<ConstantInt>(ConstantInt::get(Int64Ty, TypeIdVal));
+}
+
+/// Emits call graph section.
+void AsmPrinter::emitCallGraphSection(const MachineFunction &MF,
+                                      FunctionInfo &FuncInfo) {
+  if (!MF.getTarget().Options.EmitCallGraphSection)
+    return;
+
+  // Switch to the call graph section for the function
+  MCSection *FuncCGSection =
+      getObjFileLowering().getCallGraphSection(*getCurrentSection());
+  assert(FuncCGSection && "null call graph section");
+  OutStreamer->pushSection();
+  OutStreamer->switchSection(FuncCGSection);
+
+  // Emit format version number.
+  OutStreamer->emitInt64(0);
+
+  // Emit function's self information, which is composed of:
+  //  1) FunctionEntryPc
+  //  2) FunctionKind: Whether the function is indirect target, and if so,
+  //     whether its type id is known.
+  //  3) FunctionTypeId: Emit only when the function is an indirect target
+  //     and its type id is known.
+
+  // Emit function entry pc.
+  const MCSymbol *FunctionSymbol = getFunctionBegin();
+  OutStreamer->emitSymbolValue(FunctionSymbol, TM.getProgramPointerSize());
+
+  // If this function has external linkage or has its address taken and
+  // it is not a callback, then anything could call it.
+  const Function &F = MF.getFunction();
+  bool IsIndirectTarget =
+      !F.hasLocalLinkage() || F.hasAddressTaken(nullptr,
+                                                /*IgnoreCallbackUses=*/true,
+                                                /*IgnoreAssumeLikeCalls=*/true,
+                                                /*IgnoreLLVMUsed=*/false);
+
+  // FIXME: FunctionKind takes a few values but emitted as a 64-bit value.
+  // Can be optimized to occupy 2 bits instead.
+  // Emit function kind, and type id if available.
+  if (!IsIndirectTarget) {
+    OutStreamer->emitInt64(FunctionInfo::FunctionKind::NOT_INDIRECT_TARGET);
+  } else {
+    const auto *TypeId = extractNumericCGTypeId(F);
+    if (TypeId) {
+      OutStreamer->emitInt64(
+          FunctionInfo::FunctionKind::INDIRECT_TARGET_KNOWN_TID);
+      OutStreamer->emitInt64(TypeId->getZExtValue());
+    } else {
+      OutStreamer->emitInt64(
+          FunctionInfo::FunctionKind::INDIRECT_TARGET_UNKNOWN_TID);
+    }
+  }
+
+  // Emit callsite labels, where each element is a pair of type id and
+  // indirect callsite pc.
+  const auto &CallSiteLabels = FuncInfo.CallSiteLabels;
+
+  // Emit the count of pairs.
+  OutStreamer->emitInt64(CallSiteLabels.size());
+
+  // Emit the type id and call site label pairs.
+  for (const std::pair<uint64_t, MCSymbol *> &El : CallSiteLabels) {
+    auto TypeId = El.first;
+    const auto &Label = El.second;
+    OutStreamer->emitInt64(TypeId);
+    OutStreamer->emitSymbolValue(Label, TM.getProgramPointerSize());
+  }
+  FuncInfo.CallSiteLabels.clear();
+
+  OutStreamer->popSection();
+}
+
 void AsmPrinter::emitPCSectionsLabel(const MachineFunction &MF,
                                      const MDNode &MD) {
   MCSymbol *S = MF.getContext().createTempSymbol("pcsection");
@@ -1706,6 +1805,8 @@ void AsmPrinter::emitFunctionBody() {
   bool IsEHa = MMI->getModule()->getModuleFlag("eh-asynch");
 
   bool CanDoExtraAnalysis = ORE->allowExtraAnalysis(DEBUG_TYPE);
+  FunctionInfo FuncInfo;
+  const auto &CallSitesInfoMap = MF->getCallSitesInfo();
   for (auto &MBB : *MF) {
     // Print a label for the basic block.
     emitBasicBlockStart(MBB);
@@ -1819,6 +1920,26 @@ void AsmPrinter::emitFunctionBody() {
         break;
       }
 
+      // FIXME: Some indirect calls can get lowered to jump instructions,
+      // resulting in emitting labels for them. The extra information can
+      // be neglected while disassembling but still takes space in the binary.
+      if (TM.Options.EmitCallGraphSection && MI.isCall()) {
+        // Only indirect calls have type identifiers set.
+        const auto &CallSiteInfo = CallSitesInfoMap.find(&MI);
+        if (CallSiteInfo != CallSitesInfoMap.end()) {
+          if (auto *TypeId = CallSiteInfo->second.TypeId) {
+            // Emit label.
+            MCSymbol *S = MF->getContext().createTempSymbol();
+            OutStreamer->emitLabel(S);
+
+            // Get type id value.
+            uint64_t TypeIdVal = TypeId->getZExtValue();
+
+            // Add to function's callsite labels.
+            FuncInfo.CallSiteLabels.emplace_back(TypeIdVal, S);
+          }
+        }
+      }
       // If there is a post-instruction symbol, emit a label for it here.
       if (MCSymbol *S = MI.getPostInstrSymbol())
         OutStreamer->emitLabel(S);
@@ -2000,6 +2121,9 @@ void AsmPrinter::emitFunctionBody() {
   // Emit section containing stack size metadata.
   emitStackSizeSection(*MF);
 
+  // Emit section containing call graph metadata.
+  emitCallGraphSection(*MF, FuncInfo);
+
   // Emit .su file containing function stack size information.
   emitStackUsage(*MF);
 
@@ -2582,7 +2706,8 @@ void AsmPrinter::SetupMachineFunction(MachineFunction &MF) {
       F.hasFnAttribute("function-instrument") ||
       F.hasFnAttribute("xray-instruction-threshold") ||
       needFuncLabels(MF) || NeedsLocalForSize ||
-      MF.getTarget().Options.EmitStackSizeSection || MF.hasBBLabels()) {
+      MF.getTarget().Options.EmitStackSizeSection ||
+      MF.getTarget().Options.EmitCallGraphSection || MF.hasBBLabels()) {
     CurrentFnBegin = createTempSymbol("func_begin");
     if (NeedsLocalForSize)
       CurrentFnSymForSize = CurrentFnBegin;
diff --git a/llvm/lib/MC/MCObjectFileInfo.cpp b/llvm/lib/MC/MCObjectFileInfo.cpp
index 7b382c131ef91..2734ed17cac47 100644
--- a/llvm/lib/MC/MCObjectFileInfo.cpp
+++ b/llvm/lib/MC/MCObjectFileInfo.cpp
@@ -534,6 +534,8 @@ void MCObjectFileInfo::initELFMCObjectFileInfo(const Triple &T, bool Large) {
   EHFrameSection =
       Ctx->getELFSection(".eh_frame", EHSectionType, EHSectionFlags);
 
+  CallGraphSection = Ctx->getELFSection(".callgraph", ELF::SHT_PROGBITS, 0);
+
   StackSizesSection = Ctx->getELFSection(".stack_sizes", ELF::SHT_PROGBITS, 0);
 
   PseudoProbeSection = Ctx->getELFSection(".pseudo_probe", DebugSecType, 0);
@@ -1127,6 +1129,24 @@ MCSection *MCObjectFileInfo::getDwarfComdatSection(const char *Name,
   llvm_unreachable("Unknown ObjectFormatType");
 }
 
+MCSection *
+MCObjectFileInfo::getCallGraphSection(const MCSection &TextSec) const {
+  if (Ctx->getObjectFileType() != MCContext::IsELF)
+    return CallGraphSection;
+
+  const MCSectionELF &ElfSec = static_cast<const MCSectionELF &>(TextSec);
+  unsigned Flags = ELF::SHF_LINK_ORDER;
+  StringRef GroupName;
+  if (const MCSymbol *Group = ElfSec.getGroup()) {
+    GroupName = Group->getName();
+    Flags |= ELF::SHF_GROUP;
+  }
+
+  return Ctx->getELFSection(".callgraph", ELF::SHT_PROGBITS, Flags, 0,
+                            GroupName, true, ElfSec.getUniqueID(),
+                            cast<MCSymbolELF>(TextSec.getBeginSymbol()));
+}
+
 MCSection *
 MCObjectFileInfo::getStackSizesSection(const MCSection &TextSec) const {
   if ((Ctx->getObjectFileType() != MCContext::IsELF) ||
diff --git a/llvm/test/CodeGen/call-graph-section.ll b/llvm/test/CodeGen/call-graph-section.ll
new file mode 100644
index 0000000000000..eb10161a62e96
--- /dev/null
+++ b/llvm/test/CodeGen/call-graph-section.ll
@@ -0,0 +1,73 @@
+; Tests that we store the type identifiers in .callgraph section of the binary.
+
+; RUN: llc --call-graph-section -filetype=obj -o - < %s | \
+; RUN: llvm-readelf -x .callgraph - | FileCheck %s
+
+target triple = "x86_64-unknown-linux-gnu"
+
+define dso_local void @foo() #0 !type !4 {
+entry:
+  ret void
+}
+
+define dso_local i32 @bar(i8 signext %a) #0 !type !5 {
+entry:
+  %a.addr = alloca i8, align 1
+  store i8 %a, i8* %a.addr, align 1
+  ret i32 0
+}
+
+define dso_local i32* @baz(i8* %a) #0 !type !6 {
+entry:
+  %a.addr = alloca i8*, align 8
+  store i8* %a, i8** %a.addr, align 8
+  ret i32* null
+}
+
+define dso_local i32 @main() #0 !type !7 {
+entry:
+  %retval = alloca i32, align 4
+  %fp_foo = alloca void (...)*, align 8
+  %a = alloca i8, align 1
+  %fp_bar = alloca i32 (i8)*, align 8
+  %fp_baz = alloca i32* (i8*)*, align 8
+  store i32 0, i32* %retval, align 4
+  store void (...)* bitcast (void ()* @foo to void (...)*), void (...)** %fp_foo, align 8
+  %0 = load void (...)*, void (...)** %fp_foo, align 8
+  call void (...) %0() [ "type"(metadata !"_ZTSFvE.generalized") ]
+  store i32 (i8)* @bar, i32 (i8)** %fp_bar, align 8
+  %1 = load i32 (i8)*, i32 (i8)** %fp_bar, align 8
+  %2 = load i8, i8* %a, align 1
+  %call = call i32 %1(i8 signext %2) [ "type"(metadata !"_ZTSFicE.generalized") ]
+  store i32* (i8*)* @baz, i32* (i8*)** %fp_baz, align 8
+  %3 = load i32* (i8*)*, i32* (i8*)** %fp_baz, align 8
+  %call1 = call i32* %3(i8* %a) [ "type"(metadata !"_ZTSFPvS_E.generalized") ]
+  call void @foo() [ "type"(metadata !"_ZTSFvE.generalized") ]
+  %4 = load i8, i8* %a, align 1
+  %call2 = call i32 @bar(i8 signext %4) [ "type"(metadata !"_ZTSFicE.generalized") ]
+  %call3 = call i32* @baz(i8* %a) [ "type"(metadata !"_ZTSFPvS_E.generalized") ]
+  ret i32 0
+}
+
+attributes #0 = { noinline nounwind optnone uwtable "frame-pointer"="all" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
+
+!llvm.module.flags = !{!0, !1, !2}
+!llvm.ident = !{!3}
+
+; Check that the numeric type id (md5 hash) for the below type ids are emitted
+; to the callgraph section.
+
+; CHECK: Hex dump of section '.callgraph':
+
+!0 = !{i32 1, !"wchar_size", i32 4}
+!1 = !{i32 7, !"uwtable", i32 1}
+!2 = !{i32 7, !"frame-pointer", i32 2}
+!3 = !{!"clang version 13.0.0 (git at github.com:llvm/llvm-project.git 6d35f403b91c2f2c604e23763f699d580370ca96)"}
+; CHECK-DAG: 2444f731 f5eecb3e
+!4 = !{i64 0, !"_ZTSFvE.generalized"}
+; CHECK-DAG: 5486bc59 814b8e30
+!5 = !{i64 0, !"_ZTSFicE.generalized"}
+; CHECK-DAG: 7ade6814 f897fd77
+!6 = !{i64 0, !"_ZTSFPvS_E.generalized"}
+; CHECK-DAG: caaf769a 600968fa
+!7 = !{i64 0, !"_ZTSFiE.generalized"}

>From 3c1b217eb819525e7d338b129e0fae6847608e61 Mon Sep 17 00:00:00 2001
From: prabhukr <prabhukr at google.com>
Date: Tue, 30 Jan 2024 21:41:26 -0800
Subject: [PATCH 9/9] Fixes to codegen test -- need to be amended to D105909

Need to go to add type id metadata patch
---
 clang/test/CodeGen/call-graph-section-1.cpp | 24 ++++++++++-----------
 clang/test/CodeGen/call-graph-section-2.cpp | 14 ++++++------
 clang/test/CodeGen/call-graph-section-3.cpp |  8 +++----
 clang/test/CodeGen/call-graph-section.c     | 14 ++++++------
 4 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/clang/test/CodeGen/call-graph-section-1.cpp b/clang/test/CodeGen/call-graph-section-1.cpp
index a7ef25d7fc06a..25397e94422b7 100644
--- a/clang/test/CodeGen/call-graph-section-1.cpp
+++ b/clang/test/CodeGen/call-graph-section-1.cpp
@@ -11,7 +11,7 @@
 
 class Cls1 {
 public:
-  // FT-DAG: define {{.*}} i32* @_ZN4Cls18receiverEPcPf({{.*}} !type [[F_TCLS1RECEIVER:![0-9]+]]
+  // FT-DAG: define {{.*}} ptr @_ZN4Cls18receiverEPcPf({{.*}} !type [[F_TCLS1RECEIVER:![0-9]+]]
   static int *receiver(char *a, float *b) { return 0; }
 };
 
@@ -22,7 +22,7 @@ class Cls2 {
   // FT-DAG: define {{.*}} i32 @_ZN4Cls22f1Ecfd({{.*}} !type [[F_TCLS2F1:![0-9]+]]
   int f1(char a, float b, double c) { return 0; }
 
-  // FT-DAG: define {{.*}} i32* @_ZN4Cls22f2EPcPfPd({{.*}} !type [[F_TCLS2F2:![0-9]+]]
+  // FT-DAG: define {{.*}} ptr @_ZN4Cls22f2EPcPfPd({{.*}} !type [[F_TCLS2F2:![0-9]+]]
   int *f2(char *a, float *b, double *c) { return 0; }
 
   // FT-DAG: define {{.*}} void @_ZN4Cls22f3E4Cls1({{.*}} !type [[F_TCLS2F3F4:![0-9]+]]
@@ -65,7 +65,7 @@ void foo() {
   Cls2 ObjCls2;
   ObjCls2.fp = &Cls1::receiver;
 
-  // CST: call i32* {{.*}} [ "type"(metadata !"_ZTSFPvS_S_E.generalized") ]
+  // CST: call noundef ptr %{{.*}} [ "type"(metadata !"_ZTSFPvS_S_E.generalized") ]
   ObjCls2.fp(0, 0);
 
   auto fp_f1 = &Cls2::f1;
@@ -81,30 +81,30 @@ void foo() {
   Cls2 *ObjCls2Ptr = &ObjCls2;
   Cls1 Cls1Param;
 
-  // CST: call i32 %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFicfdE.generalized") ]
+  // CST: call noundef i32 %{{.*}} [ "type"(metadata !"_ZTSFicfdE.generalized") ]
   (ObjCls2Ptr->*fp_f1)(0, 0, 0);
 
-  // CST: call i32* %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFPvS_S_S_E.generalized") ]
+  // CST: call noundef ptr %{{.*}} [ "type"(metadata !"_ZTSFPvS_S_S_E.generalized") ]
   (ObjCls2Ptr->*fp_f2)(0, 0, 0);
 
-  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFv4Cls1E.generalized") ]
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFv4Cls1E.generalized") ]
   (ObjCls2Ptr->*fp_f3)(Cls1Param);
 
-  // CST: call void  %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFv4Cls1E.generalized") ]
+  // CST: call void  %{{.*}} [ "type"(metadata !"_ZTSFv4Cls1E.generalized") ]
   (ObjCls2Ptr->*fp_f4)(Cls1Param);
 
-  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFvPvE.generalized") ]
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvPvE.generalized") ]
   (ObjCls2Ptr->*fp_f5)(&Cls1Param);
 
-  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFvPKvE.generalized") ]
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvPKvE.generalized") ]
   (ObjCls2Ptr->*fp_f6)(&Cls1Param);
 
-  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFvR4Cls1E.generalized") ]
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvR4Cls1E.generalized") ]
   (ObjCls2Ptr->*fp_f7)(Cls1Param);
 
-  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFvRK4Cls1E.generalized") ]
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvRK4Cls1E.generalized") ]
   (ObjCls2Ptr->*fp_f8)(Cls1Param);
 
-  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSKFvvE.generalized") ]
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSKFvvE.generalized") ]
   (ObjCls2Ptr->*fp_f9)();
 }
diff --git a/clang/test/CodeGen/call-graph-section-2.cpp b/clang/test/CodeGen/call-graph-section-2.cpp
index c98448bebb3b5..4187faea495be 100644
--- a/clang/test/CodeGen/call-graph-section-2.cpp
+++ b/clang/test/CodeGen/call-graph-section-2.cpp
@@ -62,7 +62,7 @@ void foo() {
   Obj.fp = T_func<Cls1>;
   Cls1 Cls1Obj;
 
-  // CST: call %class.Cls1* {{.*}} [ "type"(metadata !"_ZTSFPv4Cls1S_PKvRS0_RKS0_E.generalized") ]
+  // CST: call noundef ptr %{{.*}} [ "type"(metadata !"_ZTSFPv4Cls1S_PKvRS0_RKS0_E.generalized") ]
   Obj.fp(Cls1Obj, &Cls1Obj, &Cls1Obj, Cls1Obj, Cls1Obj);
 
   // Make indirect calls to Cls2's member methods
@@ -75,21 +75,21 @@ void foo() {
 
   auto *Obj2Ptr = &Obj;
 
-  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFvvE.generalized") ]
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvvE.generalized") ]
   (Obj2Ptr->*fp_f1)();
 
-  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFv4Cls1E.generalized") ]
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFv4Cls1E.generalized") ]
   (Obj2Ptr->*fp_f2)(Cls1Obj);
 
-  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFvPvE.generalized") ]
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvPvE.generalized") ]
   (Obj2Ptr->*fp_f3)(&Cls1Obj);
 
-  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFvPKvE.generalized") ]
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvPKvE.generalized") ]
   (Obj2Ptr->*fp_f4)(&Cls1Obj);
 
-  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFvR4Cls1E.generalized") ]
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvR4Cls1E.generalized") ]
   (Obj2Ptr->*fp_f5)(Cls1Obj);
 
-  // CST: call void %{{.*}}(%class.Cls2*{{.*}} [ "type"(metadata !"_ZTSFvRK4Cls1E.generalized") ]
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvRK4Cls1E.generalized") ]
   (Obj2Ptr->*fp_f6)(Cls1Obj);
 }
diff --git a/clang/test/CodeGen/call-graph-section-3.cpp b/clang/test/CodeGen/call-graph-section-3.cpp
index d4a0d669d94a1..77bb110c664ef 100644
--- a/clang/test/CodeGen/call-graph-section-3.cpp
+++ b/clang/test/CodeGen/call-graph-section-3.cpp
@@ -38,15 +38,15 @@ void foo() {
   auto FpBaseVf = &Base::vf;
   auto FpDerivedVf = &Derived::vf;
 
-  // CST: call i32 %{{.*}}(%class.Base* {{.*}} [ "type"(metadata !"_ZTSFiPvE.generalized") ]
+  // CST: call noundef i32 %{{.*}} [ "type"(metadata !"_ZTSFiPvE.generalized") ]
   (Bptr->*FpBaseVf)(0);
 
-  // CST: call i32 %{{.*}}(%class.Base* {{.*}} [ "type"(metadata !"_ZTSFiPvE.generalized") ]
+  // CST: call noundef i32 %{{.*}} [ "type"(metadata !"_ZTSFiPvE.generalized") ]
   (BptrToD->*FpBaseVf)(0);
 
-  // CST: call i32 %{{.*}}(%class.Base* {{.*}} [ "type"(metadata !"_ZTSFiPvE.generalized") ]
+  // CST: call noundef i32 %{{.*}} [ "type"(metadata !"_ZTSFiPvE.generalized") ]
   (Dptr->*FpBaseVf)(0);
 
-  // CST: call i32 %{{.*}}(%class.Derived* {{.*}} [ "type"(metadata !"_ZTSFiPvE.generalized") ]
+  // CST: call noundef i32 %{{.*}} [ "type"(metadata !"_ZTSFiPvE.generalized") ]
   (Dptr->*FpDerivedVf)(0);
 }
diff --git a/clang/test/CodeGen/call-graph-section.c b/clang/test/CodeGen/call-graph-section.c
index 9b2ef81c39574..ed1bdb876152a 100644
--- a/clang/test/CodeGen/call-graph-section.c
+++ b/clang/test/CodeGen/call-graph-section.c
@@ -23,7 +23,7 @@ int baz(char a, float b, double c) {
   return 1;
 }
 
-// CHECK-DAG: define {{(dso_local)?}} i32* @qux({{.*}} !type [[F_TPTR:![0-9]+]]
+// CHECK-DAG: define {{(dso_local)?}} ptr @qux({{.*}} !type [[F_TPTR:![0-9]+]]
 int *qux(char *a, float *b, double *c) {
   return 0;
 }
@@ -36,8 +36,8 @@ void corge() {
   fp_baz('a', .0f, .0);
 
   int *(*fp_qux)(char *, float *, double *) = qux;
-  // ITANIUM: call i32* {{.*}} [ "type"(metadata !"_ZTSFPvS_S_S_E.generalized") ]
-  // MS:      call i32* {{.*}} [ "type"(metadata !"?6APEAXPEAX00 at Z.generalized") ]
+  // ITANIUM: call ptr {{.*}} [ "type"(metadata !"_ZTSFPvS_S_S_E.generalized") ]
+  // MS:      call ptr {{.*}} [ "type"(metadata !"?6APEAXPEAX00 at Z.generalized") ]
   fp_qux(0, 0, 0);
 }
 
@@ -56,14 +56,14 @@ void stparam(struct st2 a, struct st2 *b) {}
 void stf() {
   struct st1 St1;
   St1.fp = qux;
-  // ITANIUM: call i32* {{.*}} [ "type"(metadata !"_ZTSFPvS_S_S_E.generalized") ]
-  // MS:      call i32* {{.*}} [ "type"(metadata !"?6APEAXPEAX00 at Z.generalized") ]
+  // ITANIUM: call ptr {{.*}} [ "type"(metadata !"_ZTSFPvS_S_S_E.generalized") ]
+  // MS:      call ptr {{.*}} [ "type"(metadata !"?6APEAXPEAX00 at Z.generalized") ]
   St1.fp(0, 0, 0);
 
   struct st2 St2;
   St2.m.fp = qux;
-  // ITANIUM: call i32* {{.*}} [ "type"(metadata !"_ZTSFPvS_S_S_E.generalized") ]
-  // MS:      call i32* {{.*}} [ "type"(metadata !"?6APEAXPEAX00 at Z.generalized") ]
+  // ITANIUM: call ptr {{.*}} [ "type"(metadata !"_ZTSFPvS_S_S_E.generalized") ]
+  // MS:      call ptr {{.*}} [ "type"(metadata !"?6APEAXPEAX00 at Z.generalized") ]
   St2.m.fp(0, 0, 0);
 
   // ITANIUM: call void {{.*}} [ "type"(metadata !"_ZTSFv3st2PvE.generalized") ]



More information about the cfe-commits mailing list