[llvm] [clang] [PGO][GlobalValue][LTO] In GlobalValue::getGlobalIdentifier, use semicolon as delimiter for local-linkage variables. (PR #74008)

Mingming Liu via cfe-commits cfe-commits at lists.llvm.org
Thu Nov 30 23:08:45 PST 2023


https://github.com/minglotus-6 updated https://github.com/llvm/llvm-project/pull/74008

From 4cb5b087485124a7f2375fdc018b42a0401e6409 Mon Sep 17 00:00:00 2001
From: mingmingl <mingmingl at google.com>
Date: Thu, 30 Nov 2023 15:41:37 -0800
Subject: [PATCH 1/2] [PGO][GlobalValue][LTO] In
 GlobalValue::getGlobalIdentifier, use semicolon as delimiter for
 local-linkage variables.

Since commit fe05193 (Phabricator D156569), IRPGO names use the format
'[<filepath>;]<linkage-name>', while the prior format is
'[<filepath>:]<linkage-name>'. The format change would break the use case
demonstrated in the (updated) test
llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll

This patch changes GlobalValue::getGlobalIdentifier to use the semicolon
as the delimiter.

To elaborate on how things break without this PR:
1. IRPGO raw profiles store (compressed) IRPGO names of functions in
   one section, and per-function profile data in another section. One
   field in the per-function profile data is the MD5 hash of the IRPGO
   name.
2. When raw profiles are converted to indexed format profiles, each
   profiled address is mapped to the MD5 hash of the callee's IRPGO name.
3. In the ThinLTO prelink pipeline, MD5 hashes of IRPGO names are
   annotated as value profiles, and used to import
   indirect-call-promotion candidates. If the annotated MD5 hash is
   computed from the new format while import uses the prior format, the
   callee cannot be imported.
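
The mismatch in step 3 can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the LLVM implementation: `makeLocalName` and `keysAgree` are hypothetical helpers, the file name `lib.cc` and symbol `_ZL7callee0v` are borrowed from the updated test input, and the strings are compared directly instead of being hashed with MD5 (different strings hash differently except in the case of a collision).

```cpp
#include <string>

// Hypothetical stand-in for the local-linkage prefixing logic:
// "<filepath><delim><linkage-name>", with "<unknown>" for an empty file name.
std::string makeLocalName(const std::string &FileName,
                          const std::string &LinkageName, char Delim) {
  const std::string Prefix =
      FileName.empty() ? std::string("<unknown>") : FileName;
  return Prefix + Delim + LinkageName;
}

// Returns true when the annotation side and the import side build the same
// name, and therefore would compute the same hash for the callee.
bool keysAgree(char AnnotateDelim, char ImportDelim) {
  return makeLocalName("lib.cc", "_ZL7callee0v", AnnotateDelim) ==
         makeLocalName("lib.cc", "_ZL7callee0v", ImportDelim);
}
```

With these helpers, `keysAgree(';', ':')` is false (the situation before this patch), while `keysAgree(';', ';')` is true once both sides use the semicolon.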

The updated test case Transforms/PGOProfile/thinlto_indirect_call_promotion.ll exercises the following path:
- Annotate raw profiles and generate import summaries. Using the
  import summaries, it tests that functions are correctly imported and
  that the ICP transformations happen.
---
 llvm/lib/IR/Globals.cpp                       |   4 +-
 llvm/lib/ProfileData/InstrProf.cpp            |  28 +++++--
 .../thinlto-function-summary-originalnames.ll |   6 +-
 llvm/test/ThinLTO/X86/memprof-basic.ll        |  26 +++----
 .../X86/memprof-duplicate-context-ids.ll      |  12 +--
 .../ThinLTO/X86/memprof-funcassigncloning.ll  |   6 +-
 llvm/test/ThinLTO/X86/memprof-indirectcall.ll |  32 ++++----
 llvm/test/ThinLTO/X86/memprof-inlined.ll      |  14 ++--
 .../Inputs/thinlto_icall_prom.profdata        | Bin 0 -> 976 bytes
 .../Inputs/thinlto_indirect_call_promotion.ll |  32 ++++++--
 .../Inputs/update_icall_promotion_inputs.sh   |  70 ++++++++++++++++++
 .../thinlto_indirect_call_promotion.ll        |  52 ++++++-------
 12 files changed, 194 insertions(+), 88 deletions(-)
 create mode 100644 llvm/test/Transforms/PGOProfile/Inputs/thinlto_icall_prom.profdata
 create mode 100644 llvm/test/Transforms/PGOProfile/Inputs/update_icall_promotion_inputs.sh

diff --git a/llvm/lib/IR/Globals.cpp b/llvm/lib/IR/Globals.cpp
index 7bd4503a689e4ae..e821de3b198f1b6 100644
--- a/llvm/lib/IR/Globals.cpp
+++ b/llvm/lib/IR/Globals.cpp
@@ -158,9 +158,9 @@ std::string GlobalValue::getGlobalIdentifier(StringRef Name,
     // that it will stay the same, e.g., if the files are checked out from
     // version control in different locations.
     if (FileName.empty())
-      NewName = NewName.insert(0, "<unknown>:");
+      NewName = NewName.insert(0, "<unknown>;");
     else
-      NewName = NewName.insert(0, FileName.str() + ":");
+      NewName = NewName.insert(0, FileName.str() + ";");
   }
   return NewName;
 }
diff --git a/llvm/lib/ProfileData/InstrProf.cpp b/llvm/lib/ProfileData/InstrProf.cpp
index 236b083a1e2155b..d9ad5c8b6f6838d 100644
--- a/llvm/lib/ProfileData/InstrProf.cpp
+++ b/llvm/lib/ProfileData/InstrProf.cpp
@@ -246,11 +246,27 @@ std::string InstrProfError::message() const {
 
 char InstrProfError::ID = 0;
 
-std::string getPGOFuncName(StringRef RawFuncName,
-                           GlobalValue::LinkageTypes Linkage,
+std::string getPGOFuncName(StringRef Name, GlobalValue::LinkageTypes Linkage,
                            StringRef FileName,
                            uint64_t Version LLVM_ATTRIBUTE_UNUSED) {
-  return GlobalValue::getGlobalIdentifier(RawFuncName, Linkage, FileName);
+  // Value names may be prefixed with a binary '1' to indicate
+  // that the backend should not modify the symbols due to any platform
+  // naming convention. Do not include that '1' in the PGO profile name.
+  if (Name[0] == '\1')
+    Name = Name.substr(1);
+
+  std::string NewName = std::string(Name);
+  if (llvm::GlobalValue::isLocalLinkage(Linkage)) {
+    // For local symbols, prepend the main file name to distinguish them.
+    // Do not include the full path in the file name since there's no guarantee
+    // that it will stay the same, e.g., if the files are checked out from
+    // version control in different locations.
+    if (FileName.empty())
+      NewName = NewName.insert(0, "<unknown>:");
+    else
+      NewName = NewName.insert(0, FileName.str() + ":");
+  }
+  return NewName;
 }
 
 // Strip NumPrefix level of directory name from PathNameStr. If the number of
@@ -300,12 +316,8 @@ getIRPGONameForGlobalObject(const GlobalObject &GO,
                             GlobalValue::LinkageTypes Linkage,
                             StringRef FileName) {
   SmallString<64> Name;
-  if (llvm::GlobalValue::isLocalLinkage(Linkage)) {
-    Name.append(FileName.empty() ? "<unknown>" : FileName);
-    Name.append(";");
-  }
   Mangler().getNameWithPrefix(Name, &GO, /*CannotUsePrivateLabel=*/true);
-  return Name.str().str();
+  return GlobalValue::getGlobalIdentifier(Name, Linkage, FileName);
 }
 
 static std::optional<std::string> lookupPGONameFromMetadata(MDNode *MD) {
diff --git a/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll b/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll
index 4d840d1f8ec8dda..24bb2a4efff509b 100644
--- a/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll
+++ b/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll
@@ -6,9 +6,9 @@
 ; COMBINED:       <GLOBALVAL_SUMMARY_BLOCK
 ; COMBINED-NEXT:    <VERSION
 ; COMBINED-NEXT:    <FLAGS
-; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=4947176790635855146/>
-; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=-6591587165810580810/>
-; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=-4377693495213223786/>
+; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=686735765308251824/>
+; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=4507502870619175775/>
+; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=-8118561185538785069/>
 ; COMBINED-DAG:    <COMBINED{{ }}
 ; COMBINED-DAG:    <COMBINED_ORIGINAL_NAME op0=6699318081062747564/>
 ; COMBINED-DAG:    <COMBINED_GLOBALVAR_INIT_REFS
diff --git a/llvm/test/ThinLTO/X86/memprof-basic.ll b/llvm/test/ThinLTO/X86/memprof-basic.ll
index 0d466830ba57d62..54e01e5fcdf9555 100644
--- a/llvm/test/ThinLTO/X86/memprof-basic.ll
+++ b/llvm/test/ThinLTO/X86/memprof-basic.ll
@@ -148,7 +148,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BAR]] to Caller: [[BAZ:0x[a-z0-9]+]] AllocTypes: NotColdCold ContextIds: 1 2
 
 ; DUMP: Node [[BAZ]]
-; DUMP: 	Callee: 9832687305761716512 (_Z3barv) Clones: 0 StackIds: 2	(clone 0)
+; DUMP: 	Callee: 11481133863268513686 (_Z3barv) Clones: 0 StackIds: 2	(clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 1 2
 ; DUMP: 	CalleeEdges:
@@ -157,7 +157,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BAZ]] to Caller: [[FOO:0x[a-z0-9]+]] AllocTypes: NotColdCold ContextIds: 1 2
 
 ; DUMP: Node [[FOO]]
-; DUMP: 	Callee: 5878270615442837395 (_Z3bazv) Clones: 0 StackIds: 3	(clone 0)
+; DUMP: 	Callee: 1807954217441101578 (_Z3bazv) Clones: 0 StackIds: 3	(clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 1 2
 ; DUMP: 	CalleeEdges:
@@ -167,7 +167,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[FOO]] to Caller: [[MAIN2:0x[a-z0-9]+]] AllocTypes: Cold ContextIds: 2
 
 ; DUMP: Node [[MAIN1]]
-; DUMP: 	Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 0	(clone 0)
+; DUMP: 	Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 0	(clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -175,7 +175,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP: 	Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 1	(clone 0)
+; DUMP: 	Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 1	(clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -197,7 +197,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:		Clones: [[BAR2:0x[a-z0-9]+]]
 
 ; DUMP: Node [[BAZ]]
-; DUMP: 	Callee: 9832687305761716512 (_Z3barv) Clones: 0 StackIds: 2    (clone 0)
+; DUMP: 	Callee: 11481133863268513686 (_Z3barv) Clones: 0 StackIds: 2    (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -207,7 +207,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:		Clones: [[BAZ2:0x[a-z0-9]+]]
 
 ; DUMP: Node [[FOO]]
-; DUMP: 	Callee: 5878270615442837395 (_Z3bazv) Clones: 0 StackIds: 3    (clone 0)
+; DUMP: 	Callee: 1807954217441101578 (_Z3bazv) Clones: 0 StackIds: 3    (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -217,7 +217,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:		Clones: [[FOO2:0x[a-z0-9]+]]
 
 ; DUMP: Node [[MAIN1]]
-; DUMP: 	Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 0     (clone 0)
+; DUMP: 	Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 0     (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -225,7 +225,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP: 	Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 1     (clone 0)
+; DUMP: 	Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 1     (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -233,7 +233,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[FOO2]]
-; DUMP: 	Callee: 5878270615442837395 (_Z3bazv) Clones: 0 StackIds: 3    (clone 0)
+; DUMP: 	Callee: 1807954217441101578 (_Z3bazv) Clones: 0 StackIds: 3    (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -243,7 +243,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:		Clone of [[FOO]]
 
 ; DUMP: Node [[BAZ2]]
-; DUMP: 	Callee: 9832687305761716512 (_Z3barv) Clones: 0 StackIds: 2    (clone 0)
+; DUMP: 	Callee: 11481133863268513686 (_Z3barv) Clones: 0 StackIds: 2    (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -344,7 +344,7 @@ attributes #0 = { noinline optnone }
 ; DOTCLONED: }
 
 
-; DISTRIB: ^[[BAZ:[0-9]+]] = gv: (guid: 5878270615442837395, {{.*}} callsites: ((callee: ^[[BAR:[0-9]+]], clones: (0, 1)
-; DISTRIB: ^[[FOO:[0-9]+]] = gv: (guid: 6731117468105397038, {{.*}} callsites: ((callee: ^[[BAZ]], clones: (0, 1)
-; DISTRIB: ^[[BAR]] = gv: (guid: 9832687305761716512, {{.*}} allocs: ((versions: (notcold, cold)
+; DISTRIB: ^[[BAZ:[0-9]+]] = gv: (guid: 1807954217441101578, {{.*}} callsites: ((callee: ^[[BAR:[0-9]+]], clones: (0, 1)
+; DISTRIB: ^[[FOO:[0-9]+]] = gv: (guid: 8107868197919466657, {{.*}} callsites: ((callee: ^[[BAZ]], clones: (0, 1)
+; DISTRIB: ^[[BAR]] = gv: (guid: 11481133863268513686, {{.*}} allocs: ((versions: (notcold, cold)
 ; DISTRIB: ^[[MAIN:[0-9]+]] = gv: (guid: 15822663052811949562, {{.*}} callsites: ((callee: ^[[FOO]], clones: (0), {{.*}} (callee: ^[[FOO]], clones: (1)
diff --git a/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll b/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll
index f7ba0d27dca78a7..7a0b4a36dbad4dd 100644
--- a/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll
+++ b/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll
@@ -260,8 +260,10 @@ attributes #0 = { noinline optnone}
 ; STATS-BE: 1 memprof-context-disambiguation - Number of original (not cloned) allocations with memprof profiles during ThinLTO backend
 
 
-; DISTRIB: ^[[C:[0-9]+]] = gv: (guid: 1643923691937891493, {{.*}} callsites: ((callee: ^[[D:[0-9]+]], clones: (1)
-; DISTRIB: ^[[D]] = gv: (guid: 4881081444663423788, {{.*}} allocs: ((versions: (notcold, cold)
-; DISTRIB: ^[[B:[0-9]+]] = gv: (guid: 14590037969532473829, {{.*}} callsites: ((callee: ^[[D]], clones: (1)
-; DISTRIB: ^[[F:[0-9]+]] = gv: (guid: 17035303613541779335, {{.*}} callsites: ((callee: ^[[D]], clones: (0)
-; DISTRIB: ^[[E:[0-9]+]] = gv: (guid: 17820708772846654376, {{.*}} callsites: ((callee: ^[[D]], clones: (1)
+; DISTRIB: ^[[E:[0-9]+]] = gv: (guid: 331966645857188136, {{.*}} callsites: ((callee: ^[[D:[0-9]+]], clones: (1)
+; DISTRIB: ^[[D]] = gv: (guid: 11079124245221721799, {{.*}} allocs: ((versions: (notcold, cold)
+; DISTRIB: ^[[F:[0-9]+]] = gv: (guid: 11254287701717398916, {{.*}} callsites: ((callee: ^[[D]], clones: (0)
+; DISTRIB: ^[[B:[0-9]+]] = gv: (guid: 13579056193435805313, {{.*}} callsites: ((callee: ^[[D]], clones: (1)
+; DISTRIB: ^[[C:[0-9]+]] = gv: (guid: 15101436305866936160, {{.*}} callsites: ((callee: ^[[D:[0-9]+]], clones: (1)
+
+
diff --git a/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll b/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll
index 9a72ae43b2f1e48..f1a494d077fefca 100644
--- a/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll
+++ b/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll
@@ -176,7 +176,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	Clones: [[ENEW1CLONE:0x[a-z0-9]+]]
 
 ; DUMP: Node [[D:0x[a-z0-9]+]]
-; DUMP: 	Callee: 10758063066234039248 (_Z1EPPcS0_) Clones: 0 StackIds: 0 (clone 0)
+; DUMP: 	Callee: 16147627620923572899 (_Z1EPPcS0_) Clones: 0 StackIds: 0 (clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 1 6
 ; DUMP: 	CalleeEdges:
@@ -185,7 +185,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[C]]
-; DUMP: 	Callee: 10758063066234039248 (_Z1EPPcS0_) Clones: 0 StackIds: 1 (clone 0)
+; DUMP: 	Callee: 16147627620923572899 (_Z1EPPcS0_) Clones: 0 StackIds: 1 (clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 2 5
 ; DUMP: 	CalleeEdges:
@@ -194,7 +194,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[B]]
-; DUMP: 	Callee: 10758063066234039248 (_Z1EPPcS0_) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: 	Callee: 16147627620923572899 (_Z1EPPcS0_) Clones: 0 StackIds: 2 (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 3 4
 ; DUMP: 	CalleeEdges:
diff --git a/llvm/test/ThinLTO/X86/memprof-indirectcall.ll b/llvm/test/ThinLTO/X86/memprof-indirectcall.ll
index 76273959f4f4ac8..07a52f441ca2783 100644
--- a/llvm/test/ThinLTO/X86/memprof-indirectcall.ll
+++ b/llvm/test/ThinLTO/X86/memprof-indirectcall.ll
@@ -202,7 +202,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[FOO]] to Caller: [[MAIN2:0x[a-z0-9]+]] AllocTypes: Cold ContextIds: 6
 
 ; DUMP: Node [[AX]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 6	(clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 6	(clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 1 2
 ; DUMP: 	CalleeEdges:
@@ -225,7 +225,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BAR]] to Caller: [[MAIN6:0x[a-z0-9]+]] AllocTypes: NotCold ContextIds: 5
 
 ; DUMP: Node [[MAIN3]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 4	(clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 4	(clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -233,7 +233,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN4]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 5	(clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 5	(clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -241,7 +241,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN1]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 0	(clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 0	(clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 3
 ; DUMP: 	CalleeEdges:
@@ -249,7 +249,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[BX]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 7	(clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 7	(clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 4 5
 ; DUMP: 	CalleeEdges:
@@ -258,7 +258,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BX]] to Caller: [[BAR]] AllocTypes: NotColdCold ContextIds: 4 5
 
 ; DUMP: Node [[MAIN5]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 2	(clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 2	(clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 4
 ; DUMP: 	CalleeEdges:
@@ -266,7 +266,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN6]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 3	(clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 3	(clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 5
 ; DUMP: 	CalleeEdges:
@@ -274,7 +274,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 1	(clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 1	(clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 6
 ; DUMP: 	CalleeEdges:
@@ -302,7 +302,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:		Clones: [[FOO2:0x[a-z0-9]+]]
 
 ; DUMP: Node [[AX]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 6    (clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 6    (clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 1 2
 ; DUMP: 	CalleeEdges:
@@ -324,7 +324,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BAR]] to Caller: [[MAIN6]] AllocTypes: NotCold ContextIds: 5
 
 ; DUMP: Node [[MAIN3]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 4   (clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 4   (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -332,7 +332,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN4]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 5   (clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 5   (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -340,7 +340,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN1]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 0    (clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 0    (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 3
 ; DUMP: 	CalleeEdges:
@@ -348,7 +348,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[BX]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 7    (clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 7    (clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 4 5
 ; DUMP: 	CalleeEdges:
@@ -357,7 +357,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BX]] to Caller: [[BAR]] AllocTypes: NotColdCold ContextIds: 4 5
 
 ; DUMP: Node [[MAIN5]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 2   (clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 2   (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 4
 ; DUMP: 	CalleeEdges:
@@ -365,7 +365,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN6]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 3   (clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 3   (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 5
 ; DUMP: 	CalleeEdges:
@@ -373,7 +373,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 1    (clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 1    (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 6
 ; DUMP: 	CalleeEdges:
diff --git a/llvm/test/ThinLTO/X86/memprof-inlined.ll b/llvm/test/ThinLTO/X86/memprof-inlined.ll
index feb9c94344223c9..89df345b2204239 100644
--- a/llvm/test/ThinLTO/X86/memprof-inlined.ll
+++ b/llvm/test/ThinLTO/X86/memprof-inlined.ll
@@ -170,7 +170,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[FOO2]] to Caller: [[MAIN2:0x[a-z0-9]+]] AllocTypes: Cold ContextIds: 2
 
 ; DUMP: Node [[MAIN1]]
-; DUMP: 	Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 2	(clone 0)
+; DUMP: 	Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 2	(clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1 3
 ; DUMP: 	CalleeEdges:
@@ -179,7 +179,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP: 	Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 3	(clone 0)
+; DUMP: 	Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 3	(clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2 4
 ; DUMP: 	CalleeEdges:
@@ -201,7 +201,7 @@ attributes #0 = { noinline optnone }
 ;; This is the node synthesized for the call to bar in foo that was created
 ;; by inlining baz into foo.
 ; DUMP: Node [[FOO]]
-; DUMP: 	Callee: 16064618363798697104 (_Z3barv) Clones: 0 StackIds: 0, 1	(clone 0)
+; DUMP: 	Callee: 10349908617508457487 (_Z3barv) Clones: 0 StackIds: 0, 1	(clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 3 4
 ; DUMP: 	CalleeEdges:
@@ -234,7 +234,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[FOO2]] to Caller: [[MAIN2]] AllocTypes: Cold ContextIds: 2
 
 ; DUMP: Node [[MAIN1]]
-; DUMP:         Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 2     (clone 0)
+; DUMP:         Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 2     (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1 3
 ; DUMP: 	CalleeEdges:
@@ -243,7 +243,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP:         Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 3     (clone 0)
+; DUMP:         Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 3     (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2 4
 ; DUMP: 	CalleeEdges:
@@ -264,7 +264,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:         Clones: [[BAR2:0x[a-z0-9]+]]
 
 ; DUMP: Node [[FOO]]
-; DUMP:         Callee: 16064618363798697104 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
+; DUMP:         Callee: 10349908617508457487 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 3
 ; DUMP: 	CalleeEdges:
@@ -274,7 +274,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:         Clones: [[FOO3]]
 
 ; DUMP: Node [[FOO3]]
-; DUMP:         Callee: 16064618363798697104 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
+; DUMP:         Callee: 10349908617508457487 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 4
 ; DUMP: 	CalleeEdges:
diff --git a/llvm/test/Transforms/PGOProfile/Inputs/thinlto_icall_prom.profdata b/llvm/test/Transforms/PGOProfile/Inputs/thinlto_icall_prom.profdata
new file mode 100644
index 0000000000000000000000000000000000000000..df563c53000f0dead9134fb7d45a5539a1f0cd0f
GIT binary patch
literal 976
zcmeyLQ&5zjmf6V700xW at 3PDydBiJC;2{b+%R9XN^vp{K995l=V9+*CLC<BdJ&<0Tn
zGY6*6ffwQcbnyq1AvU9nH%LKTh%T<MkR5Cz%sWg_`wysdViiwV$Awj#!4%>Xn0}af
z3wHB@)uW4lsN(~R!~6qtmw_}tR`Ccs?BaJEv5T8IVHdZ at A<oF~>uT?Fvy`c~VKJux
zb_WAPZenKM|NsBrU*_`Vg1Ht(LzOUaKp9L7#xo at E>CU at jDhw5YnUa&4q?ep*9UtXm
zo}8GIlbUK+hHfY<h|mp<o at t!pb5*B?ppga`M#5qN-AG1;tEw7SiHGv;!_sR4R7rf4
zp<#MXeo|sid|GK<a+yb~<=gx}(p)gd!Qu%$1T291VPYT}rXJ>am`h;c1Cxh^94yqK
V&OlI5g>EnnP?>jVe1rxF3jo^+RI300

literal 0
HcmV?d00001

diff --git a/llvm/test/Transforms/PGOProfile/Inputs/thinlto_indirect_call_promotion.ll b/llvm/test/Transforms/PGOProfile/Inputs/thinlto_indirect_call_promotion.ll
index 7412120bb52cf50..4514eeb1451ba66 100644
--- a/llvm/test/Transforms/PGOProfile/Inputs/thinlto_indirect_call_promotion.ll
+++ b/llvm/test/Transforms/PGOProfile/Inputs/thinlto_indirect_call_promotion.ll
@@ -1,16 +1,38 @@
-target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+; ModuleID = 'lib.bc'
+source_filename = "lib.cc"
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-unknown-linux-gnu"
 
-source_filename = "thinlto_indirect_call_promotion.c"
+@calleeAddrs = dso_local local_unnamed_addr global [2 x ptr] [ptr @_ZL7callee0v, ptr @_ZL7callee1v], align 16
 
-define void @a() {
+define internal void @_ZL7callee0v() {
 entry:
   ret void
 }
 
-define internal void @c() !PGOFuncName !1 {
+define internal void @_ZL7callee1v() {
 entry:
   ret void
 }
 
-!1 = !{!"thinlto_indirect_call_promotion.c:c"}
+define dso_local void @_Z11global_funcv() {
+entry:
+  br label %for.cond
+
+for.cond:                                         ; preds = %for.body, %entry
+  %i.0 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
+  %cmp = icmp ult i32 %i.0, 5
+  br i1 %cmp, label %for.body, label %for.cond.cleanup
+
+for.cond.cleanup:                                 ; preds = %for.cond
+  ret void
+
+for.body:                                         ; preds = %for.cond
+  %rem = and i32 %i.0, 1
+  %idxprom = zext nneg i32 %rem to i64
+  %arrayidx = getelementptr inbounds [2 x ptr], ptr @calleeAddrs, i64 0, i64 %idxprom
+  %0 = load ptr, ptr %arrayidx ;, align 8, !tbaa !5
+  call void %0()
+  %inc = add nuw nsw i32 %i.0, 1
+  br label %for.cond ;, !llvm.loop !9
+}
\ No newline at end of file
diff --git a/llvm/test/Transforms/PGOProfile/Inputs/update_icall_promotion_inputs.sh b/llvm/test/Transforms/PGOProfile/Inputs/update_icall_promotion_inputs.sh
new file mode 100644
index 000000000000000..6c4fc1f5c339acc
--- /dev/null
+++ b/llvm/test/Transforms/PGOProfile/Inputs/update_icall_promotion_inputs.sh
@@ -0,0 +1,70 @@
+#!/bin/bash
+
+if [ $# -lt 2 ]; then
+  echo "Path to clang and llvm-profdata required!"
+  echo "Usage: update_icall_promotion_inputs.sh /path/to/updated/clang /path/to/updated/llvm-profdata"
+  exit 1
+else
+  CLANG=$1
+  LLVMPROFDATA=$2
+fi
+
+# Allows the script to be invoked from other directories.
+OUTDIR=$(dirname $(realpath -s $0))
+
+# Creates trivial header file to expose global_func.
+cat > ${OUTDIR}/lib.h << EOF
+void global_func();
+EOF
+
+# Creates lib.cc. global_func might call one of two indirect callees. Both
+# indirect callees have internal linkage.
+cat > ${OUTDIR}/lib.cc << EOF
+#include "lib.h"
+
+static void callee0() {}
+static void callee1() {}
+
+typedef void (*FPT)(); 
+FPT calleeAddrs[] = {callee0, callee1};
+
+void global_func() {
+    FPT fp = nullptr;
+    for (int i = 0; i < 5; i++) {
+      fp = calleeAddrs[i % 2];
+      fp();
+    }
+}
+EOF
+
+# Create main.cc that calls `global_func` in lib.cc
+cat > ${OUTDIR}/main.cc << EOF
+#include "lib.h"
+
+int main() {
+    global_func();
+}
+EOF
+
+COMMON_FLAGS="-fuse-ld=lld -O2"
+
+# cd into OUTDIR
+cd ${OUTDIR}
+
+# Generate instrumented binary
+${CLANG} ${COMMON_FLAGS} -fprofile-generate=. lib.h lib.cc main.cc
+# Create raw profiles
+env LLVM_PROFILE_FILE=icall_prom.profraw ./a.out
+# Create indexed profiles
+${LLVMPROFDATA} merge icall_prom.profraw -o thinlto_icall_prom.profdata
+
+# Clean up intermediate files.
+rm a.out
+rm ${OUTDIR}/icall_prom.profraw
+rm ${OUTDIR}/lib.h.pch
+rm ${OUTDIR}/lib.h
+rm ${OUTDIR}/lib.cc
+rm ${OUTDIR}/main.cc
+
+# Go back to original directory
+cd -
\ No newline at end of file
diff --git a/llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll b/llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll
index 173296f223e56ae..30969fef52da292 100644
--- a/llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll
+++ b/llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll
@@ -1,39 +1,39 @@
-; Do setup work for all below tests: generate bitcode and combined index
-; RUN: opt -module-summary %s -o %t.bc
-; RUN: opt -module-summary %p/Inputs/thinlto_indirect_call_promotion.ll -o %t2.bc
+; The raw profiles and reduced IR inputs are generated from Inputs/update_icall_promotion_inputs.sh
+
+; Do setup work for all below tests: annotate value profiles, generate bitcode and combined index
+; RUN: opt -passes=pgo-instr-use -pgo-test-profile-file=%p/Inputs/thinlto_icall_prom.profdata -module-summary %s -o %t.bc
+
+; Explicitly turn off ICP pass in Inputs/thinlto_indirect_call_promotion.ll. So ICP happens in this file after _Z11global_funcv and two indirect callees are imported here. 
+; RUN: opt -disable-icp -passes=pgo-instr-use -pgo-test-profile-file=%p/Inputs/thinlto_icall_prom.profdata -module-summary %p/Inputs/thinlto_indirect_call_promotion.ll -o %t2.bc
 ; RUN: llvm-lto -thinlto -o %t3 %t.bc %t2.bc
 
+; Tests that callees are correctly imported.
 ; RUN: opt -passes=function-import -summary-file %t3.thinlto.bc %t.bc -o %t4.bc -print-imports 2>&1 | FileCheck %s --check-prefix=IMPORTS
-; IMPORTS-DAG: Import a
-; IMPORTS-DAG: Import c
+; IMPORTS: Import _ZL7callee0v.llvm{{.*}}
+; IMPORTS: Import _ZL7callee1v.llvm{{.*}}
+; IMPORTS: Import _Z11global_funcv
 
-; RUN: opt %t4.bc -icp-lto -passes=pgo-icall-prom -S | FileCheck %s --check-prefix=ICALL-PROM
+; Tests that ICP transformations happen.
+; Both candidates are ICP'ed, check there is no `!VP` in the IR.
+; RUN: opt %t4.bc -icp-lto -passes=pgo-icall-prom -S | FileCheck %s --check-prefix=ICALL-PROM --implicit-check-not="!VP"
 ; RUN: opt %t4.bc -icp-lto -passes=pgo-icall-prom -S -pass-remarks=pgo-icall-prom 2>&1 | FileCheck %s --check-prefix=PASS-REMARK
-; PASS-REMARK: Promote indirect call to a with count 1 out of 1
-; PASS-REMARK: Promote indirect call to c.llvm.0 with count 1 out of 1
 
-target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
-target triple = "x86_64-unknown-linux-gnu"
+; PASS-REMARK: Promote indirect call to _ZL7callee0v.llvm.0 with count 3 out of 5
+; PASS-REMARK: Promote indirect call to _ZL7callee1v.llvm.0 with count 2 out of 2
 
-@foo = external local_unnamed_addr global ptr, align 8
-@bar = external local_unnamed_addr global ptr, align 8
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
 
-define i32 @main() local_unnamed_addr {
+define dso_local noundef i32 @main() {
 entry:
-  %0 = load ptr, ptr @foo, align 8
-; ICALL-PROM:   br i1 %{{[0-9]+}}, label %if.true.direct_targ, label %if.false.orig_indirect, !prof [[BRANCH_WEIGHT:![0-9]+]]
-  tail call void %0(), !prof !1
-  %1 = load ptr, ptr @bar, align 8
-; ICALL-PROM:   br i1 %{{[0-9]+}}, label %if.true.direct_targ1, label %if.false.orig_indirect2, !prof [[BRANCH_WEIGHT:![0-9]+]]
-  tail call void %1(), !prof !2
+  tail call void @_Z11global_funcv()
   ret i32 0
 }
 
-!1 = !{!"VP", i32 0, i64 1, i64 -6289574019528802036, i64 1}
-!2 = !{!"VP", i32 0, i64 1, i64 591260329866125152, i64 1}
+declare void @_Z11global_funcv()
+
+; ICALL-PROM:   br i1 %{{[0-9]+}}, label %if.true.direct_targ, label %if.false.orig_indirect, !prof [[BRANCH_WEIGHT1:![0-9]+]]
+; ICALL-PROM:   br i1 %{{[0-9]+}}, label %if.true.direct_targ1, label %if.false.orig_indirect2, !prof [[BRANCH_WEIGHT2:![0-9]+]]
 
-; Should not have a VP annotation on new indirect call (check before and after
-; branch_weights annotation).
-; ICALL-PROM-NOT: !"VP"
-; ICALL-PROM: [[BRANCH_WEIGHT]] = !{!"branch_weights", i32 1, i32 0}
-; ICALL-PROM-NOT: !"VP"
+; ICALL-PROM: [[BRANCH_WEIGHT1]] = !{!"branch_weights", i32 3, i32 2}
+; ICALL-PROM: [[BRANCH_WEIGHT2]] = !{!"branch_weights", i32 2, i32 0}
\ No newline at end of file

>From 25773fa1752214e0e766b7467f888a904b554fb5 Mon Sep 17 00:00:00 2001
From: mingmingl <mingmingl at google.com>
Date: Thu, 30 Nov 2023 21:50:13 -0800
Subject: [PATCH 2/2] address feedback

---
 clang/lib/CodeGen/CodeGenPGO.cpp              |   4 ++-
 llvm/include/llvm/ProfileData/InstrProf.h     |  18 ++++++------
 llvm/lib/IR/Globals.cpp                       |   1 -
 llvm/lib/ProfileData/InstrProf.cpp            |  27 ++++++++++++------
 .../X86/memprof-duplicate-context-ids.ll      |   2 --
 .../Inputs/thinlto_icall_prom.profdata        | Bin 976 -> 976 bytes
 .../Inputs/thinlto_indirect_call_promotion.ll |  14 ++++-----
 .../Inputs/update_icall_promotion_inputs.sh   |   8 +++---
 .../thinlto_indirect_call_promotion.ll        |  22 ++++++++------
 9 files changed, 56 insertions(+), 40 deletions(-)
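
Note for reviewers: the naming difference this series revolves around can be sketched in plain C++. This is a minimal illustration, not the LLVM implementation. `makeProfileName` and `stripFilePrefix` are hypothetical stand-ins that model only the string layout: the legacy format joins `<filepath>` and the name with ':', the IRPGO format joins them with ';', and only local-linkage symbols get the file prefix. `stripFilePrefix` follows the logic of `getFuncNameWithoutPrefix` in the diff below, which drops the file name plus one delimiter character.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (not an LLVM API): builds "[<filepath><delim>]<name>".
// The file prefix is added only for local-linkage symbols.
std::string makeProfileName(std::string_view FileName, std::string_view Name,
                            bool IsLocalLinkage, char Delimiter) {
  // A leading '\1' asks the backend not to modify the symbol name.
  // It is not part of the profile name, so drop it.
  if (!Name.empty() && Name.front() == '\1')
    Name.remove_prefix(1);

  std::string Result;
  if (IsLocalLinkage) {
    Result.append(FileName);
    Result.push_back(Delimiter);
  }
  Result.append(Name);
  return Result;
}

// Hypothetical helper modeled on getFuncNameWithoutPrefix: removes the file
// name and the single delimiter character that follows it (':' or ';').
std::string_view stripFilePrefix(std::string_view ProfileName,
                                 std::string_view FileName) {
  if (FileName.empty())
    return ProfileName;
  if (ProfileName.size() > FileName.size() &&
      ProfileName.substr(0, FileName.size()) == FileName)
    ProfileName.remove_prefix(FileName.size() + 1);
  return ProfileName;
}
```

With these stand-ins, `makeProfileName("lib.cc", "_ZL7callee0v", true, ';')` yields the same string the updated test expects in the `PGOFuncName` metadata, and an external-linkage callee such as `_Z7callee1v` keeps its bare name. Because the two delimiters produce different strings, their MD5 hashes differ, which is why annotation and import must agree on one format.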

diff --git a/clang/lib/CodeGen/CodeGenPGO.cpp b/clang/lib/CodeGen/CodeGenPGO.cpp
index 81bf8ea696b1647..2d3a4c779b6025d 100644
--- a/clang/lib/CodeGen/CodeGenPGO.cpp
+++ b/clang/lib/CodeGen/CodeGenPGO.cpp
@@ -34,7 +34,9 @@ using namespace CodeGen;
 void CodeGenPGO::setFuncName(StringRef Name,
                              llvm::GlobalValue::LinkageTypes Linkage) {
   llvm::IndexedInstrProfReader *PGOReader = CGM.getPGOReader();
-  FuncName = llvm::getPGOFuncName(
+  // FIXME: Maybe use IRPGOFuncName (not the legacy format) in clang
+  // instrumentation.
+  FuncName = llvm::getLegacyPGOFuncName(
       Name, Linkage, CGM.getCodeGenOpts().MainFileName,
       PGOReader ? PGOReader->getVersion() : llvm::IndexedInstrProf::Version);
 
diff --git a/llvm/include/llvm/ProfileData/InstrProf.h b/llvm/include/llvm/ProfileData/InstrProf.h
index 3bc677d5b6d8670..0ee12abede175cc 100644
--- a/llvm/include/llvm/ProfileData/InstrProf.h
+++ b/llvm/include/llvm/ProfileData/InstrProf.h
@@ -171,6 +171,8 @@ inline StringRef getInstrProfCounterBiasVarName() {
 /// Return the marker used to separate PGO names during serialization.
 inline StringRef getInstrProfNameSeparator() { return "\01"; }
 
+/// DEPRECATED. Use getIRPGOFuncName for new code. See that function for
+/// details.
 /// Return the modified name for function \c F suitable to be
 /// used the key for profile lookup. Variable \c InLTO indicates if this
 /// is called in LTO optimization passes.
@@ -181,10 +183,10 @@ std::string getPGOFuncName(const Function &F, bool InLTO = false,
 /// used the key for profile lookup. The function's original
 /// name is \c RawFuncName and has linkage of type \c Linkage.
 /// The function is defined in module \c FileName.
-std::string getPGOFuncName(StringRef RawFuncName,
-                           GlobalValue::LinkageTypes Linkage,
-                           StringRef FileName,
-                           uint64_t Version = INSTR_PROF_INDEX_VERSION);
+std::string getLegacyPGOFuncName(StringRef RawFuncName,
+                                 GlobalValue::LinkageTypes Linkage,
+                                 StringRef FileName,
+                                 uint64_t Version = INSTR_PROF_INDEX_VERSION);
 
 /// \return the modified name for function \c F suitable to be
 /// used as the key for IRPGO profile lookup. \c InLTO indicates if this is
@@ -197,18 +199,18 @@ std::pair<StringRef, StringRef> getParsedIRPGOFuncName(StringRef IRPGOFuncName);
 
 /// Return the name of the global variable used to store a function
 /// name in PGO instrumentation. \c FuncName is the name of the function
-/// returned by the \c getPGOFuncName call.
+/// returned by the \c getIRPGOFuncName call.
 std::string getPGOFuncNameVarName(StringRef FuncName,
                                   GlobalValue::LinkageTypes Linkage);
 
 /// Create and return the global variable for function name used in PGO
 /// instrumentation. \c FuncName is the name of the function returned
-/// by \c getPGOFuncName call.
+/// by \c getIRPGOFuncName call.
 GlobalVariable *createPGOFuncNameVar(Function &F, StringRef PGOFuncName);
 
 /// Create and return the global variable for function name used in PGO
 /// instrumentation.  /// \c FuncName is the name of the function
-/// returned by \c getPGOFuncName call, \c M is the owning module,
+/// returned by \c getIRPGOFuncName call, \c M is the owning module,
 /// and \c Linkage is the linkage of the instrumented function.
 GlobalVariable *createPGOFuncNameVar(Module &M,
                                      GlobalValue::LinkageTypes Linkage,
@@ -420,7 +422,7 @@ uint64_t ComputeHash(StringRef K);
 /// A symbol table used for function PGO name look-up with keys
 /// (such as pointers, md5hash values) to the function. A function's
 /// PGO name or name's md5hash are used in retrieving the profile
-/// data of the function. See \c getPGOFuncName() method for details
+/// data of the function. See \c getIRPGOFuncName() method for details
 /// on how PGO name is formed.
 class InstrProfSymtab {
 public:
diff --git a/llvm/lib/IR/Globals.cpp b/llvm/lib/IR/Globals.cpp
index e821de3b198f1b6..00b2dc804783185 100644
--- a/llvm/lib/IR/Globals.cpp
+++ b/llvm/lib/IR/Globals.cpp
@@ -144,7 +144,6 @@ void GlobalObject::copyAttributesFrom(const GlobalObject *Src) {
 std::string GlobalValue::getGlobalIdentifier(StringRef Name,
                                              GlobalValue::LinkageTypes Linkage,
                                              StringRef FileName) {
-
   // Value names may be prefixed with a binary '1' to indicate
   // that the backend should not modify the symbols due to any platform
   // naming convention. Do not include that '1' in the PGO profile name.
diff --git a/llvm/lib/ProfileData/InstrProf.cpp b/llvm/lib/ProfileData/InstrProf.cpp
index d9ad5c8b6f6838d..3992cbd14b2b428 100644
--- a/llvm/lib/ProfileData/InstrProf.cpp
+++ b/llvm/lib/ProfileData/InstrProf.cpp
@@ -246,9 +246,10 @@ std::string InstrProfError::message() const {
 
 char InstrProfError::ID = 0;
 
-std::string getPGOFuncName(StringRef Name, GlobalValue::LinkageTypes Linkage,
-                           StringRef FileName,
-                           uint64_t Version LLVM_ATTRIBUTE_UNUSED) {
+std::string getLegacyPGOFuncName(StringRef Name,
+                                 GlobalValue::LinkageTypes Linkage,
+                                 StringRef FileName,
+                                 uint64_t Version LLVM_ATTRIBUTE_UNUSED) {
   // Value names may be prefixed with a binary '1' to indicate
   // that the backend should not modify the symbols due to any platform
   // naming convention. Do not include that '1' in the PGO profile name.
@@ -303,7 +304,7 @@ static StringRef getStrippedSourceFileName(const GlobalObject &GO) {
 // ; is used because it is unlikely to be found in either <filepath> or
 // <linkage-name>.
 //
-// Older compilers used getPGOFuncName() which has the format
+// Older compilers used getLegacyPGOFuncName() which has the format
 // [<filepath>:]<function-name>. <filepath> is used to discriminate between
 // possibly identical function names when linkage is local and <function-name>
 // simply comes from F.getName(). This caused trouble for Objective-C functions
@@ -316,6 +317,12 @@ getIRPGONameForGlobalObject(const GlobalObject &GO,
                             GlobalValue::LinkageTypes Linkage,
                             StringRef FileName) {
   SmallString<64> Name;
+  // Keep mangler handling outside of `getGlobalIdentifier` for two reasons.
+  // First, passing the global object provides information besides its name
+  // (e.g. linkage), and this information might affect the mangled name.
+  // Second, `getGlobalIdentifier` only drops the `\1` prefix, but the mangler
+  // might make more changes. Moving the mangler's handling of `\1` into
+  // `getGlobalIdentifier` might change behavior for existing callers.
   Mangler().getNameWithPrefix(Name, &GO, /*CannotUsePrivateLabel=*/true);
   return GlobalValue::getGlobalIdentifier(Name, Linkage, FileName);
 }
@@ -364,14 +371,17 @@ std::string getIRPGOFuncName(const Function &F, bool InLTO) {
   return getIRPGOObjectName(F, InLTO, getPGOFuncNameMetadata(F));
 }
 
+// DEPRECATED. Use `getIRPGOFuncName` for new code. See that function for
+// details. The implementation is kept for profile matching from older profiles.
+// FIXME: Possibly rename this to `getLegacyPGOFuncName` and update all callers.
 // This is similar to `getIRPGOFuncName` except that this function calls
-// 'getPGOFuncName' to get a name and `getIRPGOFuncName` calls
+// 'getLegacyPGOFuncName' to get a name and `getIRPGOFuncName` calls
 // 'getIRPGONameForGlobalObject'. See the difference between two callees in the
 // comments of `getIRPGONameForGlobalObject`.
 std::string getPGOFuncName(const Function &F, bool InLTO, uint64_t Version) {
   if (!InLTO) {
     auto FileName = getStrippedSourceFileName(F);
-    return getPGOFuncName(F.getName(), F.getLinkage(), FileName, Version);
+    return getLegacyPGOFuncName(F.getName(), F.getLinkage(), FileName, Version);
   }
 
   // In LTO mode (when InLTO is true), first check if there is a meta data.
@@ -381,7 +391,7 @@ std::string getPGOFuncName(const Function &F, bool InLTO, uint64_t Version) {
   // If there is no meta data, the function must be a global before the value
   // profile annotation pass. Its current linkage may be internal if it is
   // internalized in LTO mode.
-  return getPGOFuncName(F.getName(), GlobalValue::ExternalLinkage, "");
+  return getLegacyPGOFuncName(F.getName(), GlobalValue::ExternalLinkage, "");
 }
 
 // See getIRPGOFuncName() for a discription of the format.
@@ -396,7 +406,8 @@ getParsedIRPGOFuncName(StringRef IRPGOFuncName) {
 StringRef getFuncNameWithoutPrefix(StringRef PGOFuncName, StringRef FileName) {
   if (FileName.empty())
     return PGOFuncName;
-  // Drop the file name including ':'. See also getPGOFuncName.
+  // Drop the file name including ':' or ';'. See getIRPGONameForGlobalObject as
+  // well.
   if (PGOFuncName.startswith(FileName))
     PGOFuncName = PGOFuncName.drop_front(FileName.size() + 1);
   return PGOFuncName;
diff --git a/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll b/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll
index 7a0b4a36dbad4dd..65d794e9cba87c6 100644
--- a/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll
+++ b/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll
@@ -265,5 +265,3 @@ attributes #0 = { noinline optnone}
 ; DISTRIB: ^[[F:[0-9]+]] = gv: (guid: 11254287701717398916, {{.*}} callsites: ((callee: ^[[D]], clones: (0)
 ; DISTRIB: ^[[B:[0-9]+]] = gv: (guid: 13579056193435805313, {{.*}} callsites: ((callee: ^[[D]], clones: (1)
 ; DISTRIB: ^[[C:[0-9]+]] = gv: (guid: 15101436305866936160, {{.*}} callsites: ((callee: ^[[D:[0-9]+]], clones: (1)
-
-
diff --git a/llvm/test/Transforms/PGOProfile/Inputs/thinlto_icall_prom.profdata b/llvm/test/Transforms/PGOProfile/Inputs/thinlto_icall_prom.profdata
index df563c53000f0dead9134fb7d45a5539a1f0cd0f..90f4ed31a7a0aa5cdf65aaf3f8316a7c1c2b4b1d 100644
GIT binary patch
delta 102
zcmcb>et~@h6O)wStIZ03H%vak%>V%!P&z(pau$=}BqldT#>umo&T+!jF-<mP5}n+@
YEU3T)QRxe%Cqrqdeufv5H!{lu08<nee*gdg

delta 74
zcmcb>et~@h6VqgWCKoOy2ICo$_jKpoF`Yb@NpZ3=lh)*^OlLV^f=rVOnH45?GV=?%
OF)^Tkcat|V%L4#d>J$tB

diff --git a/llvm/test/Transforms/PGOProfile/Inputs/thinlto_indirect_call_promotion.ll b/llvm/test/Transforms/PGOProfile/Inputs/thinlto_indirect_call_promotion.ll
index 4514eeb1451ba66..bc8dc868b3a8ebd 100644
--- a/llvm/test/Transforms/PGOProfile/Inputs/thinlto_indirect_call_promotion.ll
+++ b/llvm/test/Transforms/PGOProfile/Inputs/thinlto_indirect_call_promotion.ll
@@ -3,19 +3,17 @@ source_filename = "lib.cc"
 target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-unknown-linux-gnu"
 
-@calleeAddrs = dso_local local_unnamed_addr global [2 x ptr] [ptr @_ZL7callee0v, ptr @_ZL7callee1v], align 16
+@calleeAddrs = global [2 x ptr] [ptr @_ZL7callee0v, ptr @_Z7callee1v]
 
 define internal void @_ZL7callee0v() {
-entry:
   ret void
 }
 
-define internal void @_ZL7callee1v() {
-entry:
+define void @_Z7callee1v() {
   ret void
 }
 
-define dso_local void @_Z11global_funcv() {
+define void @_Z11global_funcv() {
 entry:
   br label %for.cond
 
@@ -31,8 +29,8 @@ for.body:                                         ; preds = %for.cond
   %rem = and i32 %i.0, 1
   %idxprom = zext nneg i32 %rem to i64
   %arrayidx = getelementptr inbounds [2 x ptr], ptr @calleeAddrs, i64 0, i64 %idxprom
-  %0 = load ptr, ptr %arrayidx ;, align 8, !tbaa !5
+  %0 = load ptr, ptr %arrayidx
   call void %0()
   %inc = add nuw nsw i32 %i.0, 1
-  br label %for.cond ;, !llvm.loop !9
-}
\ No newline at end of file
+  br label %for.cond
+}
diff --git a/llvm/test/Transforms/PGOProfile/Inputs/update_icall_promotion_inputs.sh b/llvm/test/Transforms/PGOProfile/Inputs/update_icall_promotion_inputs.sh
index 6c4fc1f5c339acc..1e2df0185c82af9 100644
--- a/llvm/test/Transforms/PGOProfile/Inputs/update_icall_promotion_inputs.sh
+++ b/llvm/test/Transforms/PGOProfile/Inputs/update_icall_promotion_inputs.sh
@@ -17,13 +17,13 @@ cat > ${OUTDIR}/lib.h << EOF
 void global_func();
 EOF
 
-# Creates lib.cc. global_func might call one of two indirect callees. Both
-# indirect callees have internal linkage.
+# Creates lib.cc. global_func might call one of two indirect callees. One callee
+# has internal linkage and the other has external linkage.
 cat > ${OUTDIR}/lib.cc << EOF
 #include "lib.h"
 
 static void callee0() {}
-static void callee1() {}
+void callee1() {}
 
 typedef void (*FPT)(); 
 FPT calleeAddrs[] = {callee0, callee1};
@@ -67,4 +67,4 @@ rm ${OUTDIR}/lib.cc
 rm ${OUTDIR}/main.cc
 
 # Go back to original directory
-cd -
\ No newline at end of file
+cd -
diff --git a/llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll b/llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll
index 30969fef52da292..27fea39ed0a4062 100644
--- a/llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll
+++ b/llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll
@@ -1,16 +1,22 @@
 ; The raw profiles and reduced IR inputs are generated from Inputs/update_icall_promotion_inputs.sh
 
-; Do setup work for all below tests: annotate value profiles, generate bitcode and combined index
+; Do setup work for all below tests: annotate value profiles, generate bitcode and combined index.
+; Explicitly turn off ICP pass in Inputs/thinlto_indirect_call_promotion.ll.
+; This way ICP happens in %t.bc after _Z11global_funcv and two indirect callees are imported.
 ; RUN: opt -passes=pgo-instr-use -pgo-test-profile-file=%p/Inputs/thinlto_icall_prom.profdata -module-summary %s -o %t.bc
-
-; Explicitly turn off ICP pass in Inputs/thinlto_indirect_call_promotion.ll. So ICP happens in this file after _Z11global_funcv and two indirect callees are imported here. 
 ; RUN: opt -disable-icp -passes=pgo-instr-use -pgo-test-profile-file=%p/Inputs/thinlto_icall_prom.profdata -module-summary %p/Inputs/thinlto_indirect_call_promotion.ll -o %t2.bc
 ; RUN: llvm-lto -thinlto -o %t3 %t.bc %t2.bc
 
-; Tests that callees are correctly imported.
+; Test that the callee with local linkage has `PGOFuncName` metadata while the callee with external linkage doesn't.
+; RUN: llvm-dis %t2.bc -o - | FileCheck %s --check-prefix=PGOName
+; PGOName: define internal void @_ZL7callee0v() {{.*}} !prof !{{[0-9]+}} !PGOFuncName ![[MD:[0-9]+]] {
+; PGOName: define void @_Z7callee1v() {{.*}} !prof !{{[0-9]+}} {
+; PGOName: ![[MD]] = !{!"lib.cc;_ZL7callee0v"}
+
+; Tests that both external and internal callees are correctly imported.
 ; RUN: opt -passes=function-import -summary-file %t3.thinlto.bc %t.bc -o %t4.bc -print-imports 2>&1 | FileCheck %s --check-prefix=IMPORTS
 ; IMPORTS: Import _ZL7callee0v.llvm{{.*}}
-; IMPORTS: Import _ZL7callee1v.llvm{{.*}}
+; IMPORTS: Import _Z7callee1v
 ; IMPORTS: Import _Z11global_funcv
 
 ; Tests that ICP transformations happen.
@@ -19,12 +25,12 @@
 ; RUN: opt %t4.bc -icp-lto -passes=pgo-icall-prom -S -pass-remarks=pgo-icall-prom 2>&1 | FileCheck %s --check-prefix=PASS-REMARK
 
 ; PASS-REMARK: Promote indirect call to _ZL7callee0v.llvm.0 with count 3 out of 5
-; PASS-REMARK: Promote indirect call to _ZL7callee1v.llvm.0 with count 2 out of 2
+; PASS-REMARK: Promote indirect call to _Z7callee1v with count 2 out of 2
 
 target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-unknown-linux-gnu"
 
-define dso_local noundef i32 @main() {
+define i32 @main() {
 entry:
   tail call void @_Z11global_funcv()
   ret i32 0
@@ -36,4 +42,4 @@ declare void @_Z11global_funcv()
 ; ICALL-PROM:   br i1 %{{[0-9]+}}, label %if.true.direct_targ1, label %if.false.orig_indirect2, !prof [[BRANCH_WEIGHT2:![0-9]+]]
 
 ; ICALL-PROM: [[BRANCH_WEIGHT1]] = !{!"branch_weights", i32 3, i32 2}
-; ICALL-PROM: [[BRANCH_WEIGHT2]] = !{!"branch_weights", i32 2, i32 0}
\ No newline at end of file
+; ICALL-PROM: [[BRANCH_WEIGHT2]] = !{!"branch_weights", i32 2, i32 0}


