[llvm] [PGO][GlobalValue][LTO] In GlobalValue::getGlobalIdentifier, use semicolon as delimiter for local-linkage variables. (PR #74008)
Mingming Liu via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 30 16:11:09 PST 2023
https://github.com/minglotus-6 created https://github.com/llvm/llvm-project/pull/74008
Since commit fe05193 (Phabricator D156569), IRPGO names use the format `[<filepath>;]<linkage-name>`, while the prior format was `[<filepath>:]<linkage-name>`. The format change would break the use case demonstrated in the (updated) test
llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll
This patch changes `GlobalValue::getGlobalIdentifier` to use the semicolon.
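As a rough illustration of the naming rule, the following standalone sketch mirrors the logic visible in the diff. It does not use LLVM itself: `makeGlobalIdentifier`, the `isLocalLinkage` flag, and the `delimiter` parameter are names assumed for this example only.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Standalone sketch of the identifier format; `makeGlobalIdentifier` and its
// parameters are illustrative stand-ins, not LLVM APIs.
std::string makeGlobalIdentifier(std::string_view name, bool isLocalLinkage,
                                 std::string_view fileName,
                                 char delimiter = ';') {
  // This sketch only models the two delimiters discussed in this thread.
  assert(delimiter == ';' || delimiter == ':');

  // A leading '\1' tells the backend not to modify the symbol name; it is
  // not part of the identifier.
  if (!name.empty() && name.front() == '\1')
    name.remove_prefix(1);

  std::string id(name);
  if (isLocalLinkage) {
    // Local symbols get the file name prepended to tell them apart.
    std::string prefix(fileName.empty() ? std::string_view("<unknown>")
                                        : fileName);
    prefix.push_back(delimiter);
    id.insert(0, prefix);
  }
  return id;
}
```

With these assumptions, a local symbol `_ZL7callee0v` from `lib.cc` becomes `lib.cc;_ZL7callee0v` under the new format and `lib.cc:_ZL7callee0v` under the prior one, while a symbol with external linkage keeps its plain name.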
To elaborate on how things break without this PR:
1. IRPGO raw profiles store the (compressed) IRPGO names of functions in one section and per-function profile data in another section. One field in the per-function profile data is the MD5 hash of the IRPGO name.
2. When raw profiles are converted to indexed-format profiles, the profiled address is mapped to the MD5 hash of the callee.
3. In the ThinLTO prelink pipeline, the MD5 hashes of IRPGO names are annotated as value profiles and used to import indirect-call-promotion candidates. If the annotated MD5 hash is computed from the new format while import uses the prior format, the callee cannot be imported.
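The mismatch in step 3 can be sketched in isolation. This is a simplified model, not LLVM code: `standInHash` is FNV-1a standing in for the MD5-based hash, and `localName`, `calleeCanBeImported`, and the `summaryIndex` map are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>
#include <unordered_map>

// FNV-1a, used here only as a stand-in for the MD5-based hash LLVM uses.
std::uint64_t standInHash(std::string_view s) {
  std::uint64_t h = 14695981039346656037ull;
  for (unsigned char c : s) {
    h ^= c;
    h *= 1099511628211ull;
  }
  return h;
}

// Hypothetical helper: builds "<file><delimiter><name>" for a local symbol.
std::string localName(std::string_view file, char delimiter,
                      std::string_view name) {
  std::string id(file);
  id.push_back(delimiter);
  id.append(name);
  return id;
}

// Returns true when the hash annotated from the profile resolves to an entry
// in the (simplified) import index.
bool calleeCanBeImported(char profileDelimiter, char summaryDelimiter) {
  // Summary side: the index is keyed by the hash of the identifier.
  std::unordered_map<std::uint64_t, std::string> summaryIndex;
  summaryIndex.emplace(
      standInHash(localName("lib.cc", summaryDelimiter, "_ZL7callee0v")),
      "_ZL7callee0v");

  // Profile side: the value profile carries only the hash of the name.
  const std::uint64_t annotated =
      standInHash(localName("lib.cc", profileDelimiter, "_ZL7callee0v"));
  return summaryIndex.count(annotated) != 0;
}
```

In this model, `calleeCanBeImported(';', ';')` finds the callee, whereas `calleeCanBeImported(';', ':')` does not, because only the hash is compared and a one-character difference in the name yields an unrelated key.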
The updated test case Transforms/PGOProfile/thinlto_indirect_call_promotion.ll exercises the following path:
- Annotate raw profiles, turn off ICP transformations in the ThinLTO prelink pipeline, and generate import summaries. Using the import summaries, the test checks that functions are correctly imported and that ICP transformations happen.
From 4cb5b087485124a7f2375fdc018b42a0401e6409 Mon Sep 17 00:00:00 2001
From: mingmingl <mingmingl at google.com>
Date: Thu, 30 Nov 2023 15:41:37 -0800
Subject: [PATCH] [PGO][GlobalValue][LTO] In GlobalValue::getGlobalIdentifier,
use semicolon as delimiter for local-linkage variables.
Since commit fe05193 (Phabricator D156569), IRPGO names use the format
'[<filepath>;]<linkage-name>', while the prior format was
'[<filepath>:]<linkage-name>'. The format change would break the use case
demonstrated in the (updated) test
llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll
This patch changes GlobalValue::getGlobalIdentifier to use the
semicolon.
To elaborate on how things break without this PR:
1. IRPGO raw profiles store the (compressed) IRPGO names of functions in
one section and per-function profile data in another section. One
field in the per-function profile data is the MD5 hash of the IRPGO name.
2. When raw profiles are converted to indexed-format profiles, the
profiled address is mapped to the MD5 hash of the callee.
3. In the ThinLTO prelink pipeline, the MD5 hashes of IRPGO names are
annotated as value profiles and used to import indirect-call-promotion
candidates. If the annotated MD5 hash is computed from the new format
while import uses the prior format, the callee cannot be imported.
The updated test case Transforms/PGOProfile/thinlto_indirect_call_promotion.ll exercises the following path:
- Annotate raw profiles and generate import summaries. Using the
import summaries, the test checks that functions are correctly imported
and that ICP transformations happen.
---
llvm/lib/IR/Globals.cpp | 4 +-
llvm/lib/ProfileData/InstrProf.cpp | 28 +++++--
.../thinlto-function-summary-originalnames.ll | 6 +-
llvm/test/ThinLTO/X86/memprof-basic.ll | 26 +++----
.../X86/memprof-duplicate-context-ids.ll | 12 +--
.../ThinLTO/X86/memprof-funcassigncloning.ll | 6 +-
llvm/test/ThinLTO/X86/memprof-indirectcall.ll | 32 ++++----
llvm/test/ThinLTO/X86/memprof-inlined.ll | 14 ++--
.../Inputs/thinlto_icall_prom.profdata | Bin 0 -> 976 bytes
.../Inputs/thinlto_indirect_call_promotion.ll | 32 ++++++--
.../Inputs/update_icall_promotion_inputs.sh | 70 ++++++++++++++++++
.../thinlto_indirect_call_promotion.ll | 52 ++++++-------
12 files changed, 194 insertions(+), 88 deletions(-)
create mode 100644 llvm/test/Transforms/PGOProfile/Inputs/thinlto_icall_prom.profdata
create mode 100644 llvm/test/Transforms/PGOProfile/Inputs/update_icall_promotion_inputs.sh
diff --git a/llvm/lib/IR/Globals.cpp b/llvm/lib/IR/Globals.cpp
index 7bd4503a689e4ae..e821de3b198f1b6 100644
--- a/llvm/lib/IR/Globals.cpp
+++ b/llvm/lib/IR/Globals.cpp
@@ -158,9 +158,9 @@ std::string GlobalValue::getGlobalIdentifier(StringRef Name,
// that it will stay the same, e.g., if the files are checked out from
// version control in different locations.
if (FileName.empty())
- NewName = NewName.insert(0, "<unknown>:");
+ NewName = NewName.insert(0, "<unknown>;");
else
- NewName = NewName.insert(0, FileName.str() + ":");
+ NewName = NewName.insert(0, FileName.str() + ";");
}
return NewName;
}
diff --git a/llvm/lib/ProfileData/InstrProf.cpp b/llvm/lib/ProfileData/InstrProf.cpp
index 236b083a1e2155b..d9ad5c8b6f6838d 100644
--- a/llvm/lib/ProfileData/InstrProf.cpp
+++ b/llvm/lib/ProfileData/InstrProf.cpp
@@ -246,11 +246,27 @@ std::string InstrProfError::message() const {
char InstrProfError::ID = 0;
-std::string getPGOFuncName(StringRef RawFuncName,
- GlobalValue::LinkageTypes Linkage,
+std::string getPGOFuncName(StringRef Name, GlobalValue::LinkageTypes Linkage,
StringRef FileName,
uint64_t Version LLVM_ATTRIBUTE_UNUSED) {
- return GlobalValue::getGlobalIdentifier(RawFuncName, Linkage, FileName);
+ // Value names may be prefixed with a binary '1' to indicate
+ // that the backend should not modify the symbols due to any platform
+ // naming convention. Do not include that '1' in the PGO profile name.
+ if (Name[0] == '\1')
+ Name = Name.substr(1);
+
+ std::string NewName = std::string(Name);
+ if (llvm::GlobalValue::isLocalLinkage(Linkage)) {
+ // For local symbols, prepend the main file name to distinguish them.
+ // Do not include the full path in the file name since there's no guarantee
+ // that it will stay the same, e.g., if the files are checked out from
+ // version control in different locations.
+ if (FileName.empty())
+ NewName = NewName.insert(0, "<unknown>:");
+ else
+ NewName = NewName.insert(0, FileName.str() + ":");
+ }
+ return NewName;
}
// Strip NumPrefix level of directory name from PathNameStr. If the number of
@@ -300,12 +316,8 @@ getIRPGONameForGlobalObject(const GlobalObject &GO,
GlobalValue::LinkageTypes Linkage,
StringRef FileName) {
SmallString<64> Name;
- if (llvm::GlobalValue::isLocalLinkage(Linkage)) {
- Name.append(FileName.empty() ? "<unknown>" : FileName);
- Name.append(";");
- }
Mangler().getNameWithPrefix(Name, &GO, /*CannotUsePrivateLabel=*/true);
- return Name.str().str();
+ return GlobalValue::getGlobalIdentifier(Name, Linkage, FileName);
}
static std::optional<std::string> lookupPGONameFromMetadata(MDNode *MD) {
diff --git a/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll b/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll
index 4d840d1f8ec8dda..24bb2a4efff509b 100644
--- a/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll
+++ b/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll
@@ -6,9 +6,9 @@
; COMBINED: <GLOBALVAL_SUMMARY_BLOCK
; COMBINED-NEXT: <VERSION
; COMBINED-NEXT: <FLAGS
-; COMBINED-NEXT: <VALUE_GUID {{.*}} op1=4947176790635855146/>
-; COMBINED-NEXT: <VALUE_GUID {{.*}} op1=-6591587165810580810/>
-; COMBINED-NEXT: <VALUE_GUID {{.*}} op1=-4377693495213223786/>
+; COMBINED-NEXT: <VALUE_GUID {{.*}} op1=686735765308251824/>
+; COMBINED-NEXT: <VALUE_GUID {{.*}} op1=4507502870619175775/>
+; COMBINED-NEXT: <VALUE_GUID {{.*}} op1=-8118561185538785069/>
; COMBINED-DAG: <COMBINED{{ }}
; COMBINED-DAG: <COMBINED_ORIGINAL_NAME op0=6699318081062747564/>
; COMBINED-DAG: <COMBINED_GLOBALVAR_INIT_REFS
diff --git a/llvm/test/ThinLTO/X86/memprof-basic.ll b/llvm/test/ThinLTO/X86/memprof-basic.ll
index 0d466830ba57d62..54e01e5fcdf9555 100644
--- a/llvm/test/ThinLTO/X86/memprof-basic.ll
+++ b/llvm/test/ThinLTO/X86/memprof-basic.ll
@@ -148,7 +148,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[BAR]] to Caller: [[BAZ:0x[a-z0-9]+]] AllocTypes: NotColdCold ContextIds: 1 2
; DUMP: Node [[BAZ]]
-; DUMP: Callee: 9832687305761716512 (_Z3barv) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: Callee: 11481133863268513686 (_Z3barv) Clones: 0 StackIds: 2 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 1 2
; DUMP: CalleeEdges:
@@ -157,7 +157,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[BAZ]] to Caller: [[FOO:0x[a-z0-9]+]] AllocTypes: NotColdCold ContextIds: 1 2
; DUMP: Node [[FOO]]
-; DUMP: Callee: 5878270615442837395 (_Z3bazv) Clones: 0 StackIds: 3 (clone 0)
+; DUMP: Callee: 1807954217441101578 (_Z3bazv) Clones: 0 StackIds: 3 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 1 2
; DUMP: CalleeEdges:
@@ -167,7 +167,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[FOO]] to Caller: [[MAIN2:0x[a-z0-9]+]] AllocTypes: Cold ContextIds: 2
; DUMP: Node [[MAIN1]]
-; DUMP: Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 0 (clone 0)
+; DUMP: Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 0 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 1
; DUMP: CalleeEdges:
@@ -175,7 +175,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN2]]
-; DUMP: Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 1 (clone 0)
+; DUMP: Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 1 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 2
; DUMP: CalleeEdges:
@@ -197,7 +197,7 @@ attributes #0 = { noinline optnone }
; DUMP: Clones: [[BAR2:0x[a-z0-9]+]]
; DUMP: Node [[BAZ]]
-; DUMP: Callee: 9832687305761716512 (_Z3barv) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: Callee: 11481133863268513686 (_Z3barv) Clones: 0 StackIds: 2 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 1
; DUMP: CalleeEdges:
@@ -207,7 +207,7 @@ attributes #0 = { noinline optnone }
; DUMP: Clones: [[BAZ2:0x[a-z0-9]+]]
; DUMP: Node [[FOO]]
-; DUMP: Callee: 5878270615442837395 (_Z3bazv) Clones: 0 StackIds: 3 (clone 0)
+; DUMP: Callee: 1807954217441101578 (_Z3bazv) Clones: 0 StackIds: 3 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 1
; DUMP: CalleeEdges:
@@ -217,7 +217,7 @@ attributes #0 = { noinline optnone }
; DUMP: Clones: [[FOO2:0x[a-z0-9]+]]
; DUMP: Node [[MAIN1]]
-; DUMP: Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 0 (clone 0)
+; DUMP: Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 0 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 1
; DUMP: CalleeEdges:
@@ -225,7 +225,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN2]]
-; DUMP: Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 1 (clone 0)
+; DUMP: Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 1 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 2
; DUMP: CalleeEdges:
@@ -233,7 +233,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[FOO2]]
-; DUMP: Callee: 5878270615442837395 (_Z3bazv) Clones: 0 StackIds: 3 (clone 0)
+; DUMP: Callee: 1807954217441101578 (_Z3bazv) Clones: 0 StackIds: 3 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 2
; DUMP: CalleeEdges:
@@ -243,7 +243,7 @@ attributes #0 = { noinline optnone }
; DUMP: Clone of [[FOO]]
; DUMP: Node [[BAZ2]]
-; DUMP: Callee: 9832687305761716512 (_Z3barv) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: Callee: 11481133863268513686 (_Z3barv) Clones: 0 StackIds: 2 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 2
; DUMP: CalleeEdges:
@@ -344,7 +344,7 @@ attributes #0 = { noinline optnone }
; DOTCLONED: }
-; DISTRIB: ^[[BAZ:[0-9]+]] = gv: (guid: 5878270615442837395, {{.*}} callsites: ((callee: ^[[BAR:[0-9]+]], clones: (0, 1)
-; DISTRIB: ^[[FOO:[0-9]+]] = gv: (guid: 6731117468105397038, {{.*}} callsites: ((callee: ^[[BAZ]], clones: (0, 1)
-; DISTRIB: ^[[BAR]] = gv: (guid: 9832687305761716512, {{.*}} allocs: ((versions: (notcold, cold)
+; DISTRIB: ^[[BAZ:[0-9]+]] = gv: (guid: 1807954217441101578, {{.*}} callsites: ((callee: ^[[BAR:[0-9]+]], clones: (0, 1)
+; DISTRIB: ^[[FOO:[0-9]+]] = gv: (guid: 8107868197919466657, {{.*}} callsites: ((callee: ^[[BAZ]], clones: (0, 1)
+; DISTRIB: ^[[BAR]] = gv: (guid: 11481133863268513686, {{.*}} allocs: ((versions: (notcold, cold)
; DISTRIB: ^[[MAIN:[0-9]+]] = gv: (guid: 15822663052811949562, {{.*}} callsites: ((callee: ^[[FOO]], clones: (0), {{.*}} (callee: ^[[FOO]], clones: (1)
diff --git a/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll b/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll
index f7ba0d27dca78a7..7a0b4a36dbad4dd 100644
--- a/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll
+++ b/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll
@@ -260,8 +260,10 @@ attributes #0 = { noinline optnone}
; STATS-BE: 1 memprof-context-disambiguation - Number of original (not cloned) allocations with memprof profiles during ThinLTO backend
-; DISTRIB: ^[[C:[0-9]+]] = gv: (guid: 1643923691937891493, {{.*}} callsites: ((callee: ^[[D:[0-9]+]], clones: (1)
-; DISTRIB: ^[[D]] = gv: (guid: 4881081444663423788, {{.*}} allocs: ((versions: (notcold, cold)
-; DISTRIB: ^[[B:[0-9]+]] = gv: (guid: 14590037969532473829, {{.*}} callsites: ((callee: ^[[D]], clones: (1)
-; DISTRIB: ^[[F:[0-9]+]] = gv: (guid: 17035303613541779335, {{.*}} callsites: ((callee: ^[[D]], clones: (0)
-; DISTRIB: ^[[E:[0-9]+]] = gv: (guid: 17820708772846654376, {{.*}} callsites: ((callee: ^[[D]], clones: (1)
+; DISTRIB: ^[[E:[0-9]+]] = gv: (guid: 331966645857188136, {{.*}} callsites: ((callee: ^[[D:[0-9]+]], clones: (1)
+; DISTRIB: ^[[D]] = gv: (guid: 11079124245221721799, {{.*}} allocs: ((versions: (notcold, cold)
+; DISTRIB: ^[[F:[0-9]+]] = gv: (guid: 11254287701717398916, {{.*}} callsites: ((callee: ^[[D]], clones: (0)
+; DISTRIB: ^[[B:[0-9]+]] = gv: (guid: 13579056193435805313, {{.*}} callsites: ((callee: ^[[D]], clones: (1)
+; DISTRIB: ^[[C:[0-9]+]] = gv: (guid: 15101436305866936160, {{.*}} callsites: ((callee: ^[[D:[0-9]+]], clones: (1)
+
+
diff --git a/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll b/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll
index 9a72ae43b2f1e48..f1a494d077fefca 100644
--- a/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll
+++ b/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll
@@ -176,7 +176,7 @@ attributes #0 = { noinline optnone }
; DUMP: Clones: [[ENEW1CLONE:0x[a-z0-9]+]]
; DUMP: Node [[D:0x[a-z0-9]+]]
-; DUMP: Callee: 10758063066234039248 (_Z1EPPcS0_) Clones: 0 StackIds: 0 (clone 0)
+; DUMP: Callee: 16147627620923572899 (_Z1EPPcS0_) Clones: 0 StackIds: 0 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 1 6
; DUMP: CalleeEdges:
@@ -185,7 +185,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[C]]
-; DUMP: Callee: 10758063066234039248 (_Z1EPPcS0_) Clones: 0 StackIds: 1 (clone 0)
+; DUMP: Callee: 16147627620923572899 (_Z1EPPcS0_) Clones: 0 StackIds: 1 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 2 5
; DUMP: CalleeEdges:
@@ -194,7 +194,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[B]]
-; DUMP: Callee: 10758063066234039248 (_Z1EPPcS0_) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: Callee: 16147627620923572899 (_Z1EPPcS0_) Clones: 0 StackIds: 2 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 3 4
; DUMP: CalleeEdges:
diff --git a/llvm/test/ThinLTO/X86/memprof-indirectcall.ll b/llvm/test/ThinLTO/X86/memprof-indirectcall.ll
index 76273959f4f4ac8..07a52f441ca2783 100644
--- a/llvm/test/ThinLTO/X86/memprof-indirectcall.ll
+++ b/llvm/test/ThinLTO/X86/memprof-indirectcall.ll
@@ -202,7 +202,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[FOO]] to Caller: [[MAIN2:0x[a-z0-9]+]] AllocTypes: Cold ContextIds: 6
; DUMP: Node [[AX]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 6 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 6 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 1 2
; DUMP: CalleeEdges:
@@ -225,7 +225,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[BAR]] to Caller: [[MAIN6:0x[a-z0-9]+]] AllocTypes: NotCold ContextIds: 5
; DUMP: Node [[MAIN3]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 4 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 4 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 1
; DUMP: CalleeEdges:
@@ -233,7 +233,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN4]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 5 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 5 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 2
; DUMP: CalleeEdges:
@@ -241,7 +241,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN1]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 0 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 0 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 3
; DUMP: CalleeEdges:
@@ -249,7 +249,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[BX]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 7 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 7 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 4 5
; DUMP: CalleeEdges:
@@ -258,7 +258,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[BX]] to Caller: [[BAR]] AllocTypes: NotColdCold ContextIds: 4 5
; DUMP: Node [[MAIN5]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 2 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 4
; DUMP: CalleeEdges:
@@ -266,7 +266,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN6]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 3 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 3 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 5
; DUMP: CalleeEdges:
@@ -274,7 +274,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN2]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 1 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 1 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 6
; DUMP: CalleeEdges:
@@ -302,7 +302,7 @@ attributes #0 = { noinline optnone }
; DUMP: Clones: [[FOO2:0x[a-z0-9]+]]
; DUMP: Node [[AX]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 6 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 6 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 1 2
; DUMP: CalleeEdges:
@@ -324,7 +324,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[BAR]] to Caller: [[MAIN6]] AllocTypes: NotCold ContextIds: 5
; DUMP: Node [[MAIN3]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 4 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 4 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 1
; DUMP: CalleeEdges:
@@ -332,7 +332,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN4]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 5 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 5 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 2
; DUMP: CalleeEdges:
@@ -340,7 +340,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN1]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 0 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 0 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 3
; DUMP: CalleeEdges:
@@ -348,7 +348,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[BX]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 7 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 7 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 4 5
; DUMP: CalleeEdges:
@@ -357,7 +357,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[BX]] to Caller: [[BAR]] AllocTypes: NotColdCold ContextIds: 4 5
; DUMP: Node [[MAIN5]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 2 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 4
; DUMP: CalleeEdges:
@@ -365,7 +365,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN6]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 3 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 3 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 5
; DUMP: CalleeEdges:
@@ -373,7 +373,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN2]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 1 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 1 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 6
; DUMP: CalleeEdges:
diff --git a/llvm/test/ThinLTO/X86/memprof-inlined.ll b/llvm/test/ThinLTO/X86/memprof-inlined.ll
index feb9c94344223c9..89df345b2204239 100644
--- a/llvm/test/ThinLTO/X86/memprof-inlined.ll
+++ b/llvm/test/ThinLTO/X86/memprof-inlined.ll
@@ -170,7 +170,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[FOO2]] to Caller: [[MAIN2:0x[a-z0-9]+]] AllocTypes: Cold ContextIds: 2
; DUMP: Node [[MAIN1]]
-; DUMP: Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 2 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 1 3
; DUMP: CalleeEdges:
@@ -179,7 +179,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN2]]
-; DUMP: Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 3 (clone 0)
+; DUMP: Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 3 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 2 4
; DUMP: CalleeEdges:
@@ -201,7 +201,7 @@ attributes #0 = { noinline optnone }
;; This is the node synthesized for the call to bar in foo that was created
;; by inlining baz into foo.
; DUMP: Node [[FOO]]
-; DUMP: Callee: 16064618363798697104 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
+; DUMP: Callee: 10349908617508457487 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 3 4
; DUMP: CalleeEdges:
@@ -234,7 +234,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[FOO2]] to Caller: [[MAIN2]] AllocTypes: Cold ContextIds: 2
; DUMP: Node [[MAIN1]]
-; DUMP: Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 2 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 1 3
; DUMP: CalleeEdges:
@@ -243,7 +243,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN2]]
-; DUMP: Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 3 (clone 0)
+; DUMP: Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 3 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 2 4
; DUMP: CalleeEdges:
@@ -264,7 +264,7 @@ attributes #0 = { noinline optnone }
; DUMP: Clones: [[BAR2:0x[a-z0-9]+]]
; DUMP: Node [[FOO]]
-; DUMP: Callee: 16064618363798697104 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
+; DUMP: Callee: 10349908617508457487 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 3
; DUMP: CalleeEdges:
@@ -274,7 +274,7 @@ attributes #0 = { noinline optnone }
; DUMP: Clones: [[FOO3]]
; DUMP: Node [[FOO3]]
-; DUMP: Callee: 16064618363798697104 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
+; DUMP: Callee: 10349908617508457487 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 4
; DUMP: CalleeEdges:
diff --git a/llvm/test/Transforms/PGOProfile/Inputs/thinlto_icall_prom.profdata b/llvm/test/Transforms/PGOProfile/Inputs/thinlto_icall_prom.profdata
new file mode 100644
index 0000000000000000000000000000000000000000..df563c53000f0dead9134fb7d45a5539a1f0cd0f
GIT binary patch
literal 976
zcmeyLQ&5zjmf6V700xW@3PDydBiJC;2{b+%R9XN^vp{K995l=V9+*CLC<BdJ&<0Tn
zGY6*6ffwQcbnyq1AvU9nH%LKTh%T<MkR5Cz%sWg_`wysdViiwV$Awj#!4%>Xn0}af
z3wHB@)uW4lsN(~R!~6qtmw_}tR`Ccs?BaJEv5T8IVHdZ@A<oF~>uT?Fvy`c~VKJux
zb_WAPZenKM|NsBrU*_`Vg1Ht(LzOUaKp9L7#xo@E>CU@jDhw5YnUa&4q?ep*9UtXm
zo}8GIlbUK+hHfY<h|mp<o@t!pb5*B?ppga`M#5qN-AG1;tEw7SiHGv;!_sR4R7rf4
zp<#MXeo|sid|GK<a+yb~<=gx}(p)gd!Qu%$1T291VPYT}rXJ>am`h;c1Cxh^94yqK
V&OlI5g>EnnP?>jVe1rxF3jo^+RI300
literal 0
HcmV?d00001
diff --git a/llvm/test/Transforms/PGOProfile/Inputs/thinlto_indirect_call_promotion.ll b/llvm/test/Transforms/PGOProfile/Inputs/thinlto_indirect_call_promotion.ll
index 7412120bb52cf50..4514eeb1451ba66 100644
--- a/llvm/test/Transforms/PGOProfile/Inputs/thinlto_indirect_call_promotion.ll
+++ b/llvm/test/Transforms/PGOProfile/Inputs/thinlto_indirect_call_promotion.ll
@@ -1,16 +1,38 @@
-target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+; ModuleID = 'lib.bc'
+source_filename = "lib.cc"
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
-source_filename = "thinlto_indirect_call_promotion.c"
+@calleeAddrs = dso_local local_unnamed_addr global [2 x ptr] [ptr @_ZL7callee0v, ptr @_ZL7callee1v], align 16
-define void @a() {
+define internal void @_ZL7callee0v() {
entry:
ret void
}
-define internal void @c() !PGOFuncName !1 {
+define internal void @_ZL7callee1v() {
entry:
ret void
}
-!1 = !{!"thinlto_indirect_call_promotion.c:c"}
+define dso_local void @_Z11global_funcv() {
+entry:
+ br label %for.cond
+
+for.cond: ; preds = %for.body, %entry
+ %i.0 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
+ %cmp = icmp ult i32 %i.0, 5
+ br i1 %cmp, label %for.body, label %for.cond.cleanup
+
+for.cond.cleanup: ; preds = %for.cond
+ ret void
+
+for.body: ; preds = %for.cond
+ %rem = and i32 %i.0, 1
+ %idxprom = zext nneg i32 %rem to i64
+ %arrayidx = getelementptr inbounds [2 x ptr], ptr @calleeAddrs, i64 0, i64 %idxprom
+ %0 = load ptr, ptr %arrayidx ;, align 8, !tbaa !5
+ call void %0()
+ %inc = add nuw nsw i32 %i.0, 1
+ br label %for.cond ;, !llvm.loop !9
+}
\ No newline at end of file
diff --git a/llvm/test/Transforms/PGOProfile/Inputs/update_icall_promotion_inputs.sh b/llvm/test/Transforms/PGOProfile/Inputs/update_icall_promotion_inputs.sh
new file mode 100644
index 000000000000000..6c4fc1f5c339acc
--- /dev/null
+++ b/llvm/test/Transforms/PGOProfile/Inputs/update_icall_promotion_inputs.sh
@@ -0,0 +1,70 @@
+#!/bin/bash
+
+if [ $# -lt 2 ]; then
+ echo "Path to clang and llvm-profdata required!"
+ echo "Usage: update_icall_promotion_inputs.sh /path/to/updated/clang /path/to/updated/llvm-profdata"
+ exit 1
+else
+ CLANG=$1
+ LLVMPROFDATA=$2
+fi
+
+# Allows the script to be invoked from other directories.
+OUTDIR=$(dirname $(realpath -s $0))
+
+# Creates trivial header file to expose global_func.
+cat > ${OUTDIR}/lib.h << EOF
+void global_func();
+EOF
+
+# Creates lib.cc. global_func might call one of two indirect callees. Both
+# indirect callees have internal linkage.
+cat > ${OUTDIR}/lib.cc << EOF
+#include "lib.h"
+
+static void callee0() {}
+static void callee1() {}
+
+typedef void (*FPT)();
+FPT calleeAddrs[] = {callee0, callee1};
+
+void global_func() {
+ FPT fp = nullptr;
+ for (int i = 0; i < 5; i++) {
+ fp = calleeAddrs[i % 2];
+ fp();
+ }
+}
+EOF
+
+# Create main.cc that calls `global_func` in lib.cc
+cat > ${OUTDIR}/main.cc << EOF
+#include "lib.h"
+
+int main() {
+ global_func();
+}
+EOF
+
+COMMON_FLAGS="-fuse-ld=lld -O2"
+
+# cd into OUTDIR
+cd ${OUTDIR}
+
+# Generate instrumented binary
+${CLANG} ${COMMON_FLAGS} -fprofile-generate=. lib.h lib.cc main.cc
+# Create raw profiles
+env LLVM_PROFILE_FILE=icall_prom.profraw ./a.out
+# Create indexed profiles
+${LLVMPROFDATA} merge icall_prom.profraw -o thinlto_icall_prom.profdata
+
+# Clean up intermediate files.
+rm a.out
+rm ${OUTDIR}/icall_prom.profraw
+rm ${OUTDIR}/lib.h.pch
+rm ${OUTDIR}/lib.h
+rm ${OUTDIR}/lib.cc
+rm ${OUTDIR}/main.cc
+
+# Go back to original directory
+cd -
\ No newline at end of file
diff --git a/llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll b/llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll
index 173296f223e56ae..30969fef52da292 100644
--- a/llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll
+++ b/llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll
@@ -1,39 +1,39 @@
-; Do setup work for all below tests: generate bitcode and combined index
-; RUN: opt -module-summary %s -o %t.bc
-; RUN: opt -module-summary %p/Inputs/thinlto_indirect_call_promotion.ll -o %t2.bc
+; The raw profiles and reduced IR inputs are generated from Inputs/update_icall_promotion_inputs.sh
+
+; Do setup work for all below tests: annotate value profiles, generate bitcode and combined index
+; RUN: opt -passes=pgo-instr-use -pgo-test-profile-file=%p/Inputs/thinlto_icall_prom.profdata -module-summary %s -o %t.bc
+
+; Explicitly turn off ICP pass in Inputs/thinlto_indirect_call_promotion.ll. So ICP happens in this file after _Z11global_funcv and two indirect callees are imported here.
+; RUN: opt -disable-icp -passes=pgo-instr-use -pgo-test-profile-file=%p/Inputs/thinlto_icall_prom.profdata -module-summary %p/Inputs/thinlto_indirect_call_promotion.ll -o %t2.bc
; RUN: llvm-lto -thinlto -o %t3 %t.bc %t2.bc
+; Tests that callees are correctly imported.
; RUN: opt -passes=function-import -summary-file %t3.thinlto.bc %t.bc -o %t4.bc -print-imports 2>&1 | FileCheck %s --check-prefix=IMPORTS
-; IMPORTS-DAG: Import a
-; IMPORTS-DAG: Import c
+; IMPORTS: Import _ZL7callee0v.llvm{{.*}}
+; IMPORTS: Import _ZL7callee1v.llvm{{.*}}
+; IMPORTS: Import _Z11global_funcv
-; RUN: opt %t4.bc -icp-lto -passes=pgo-icall-prom -S | FileCheck %s --check-prefix=ICALL-PROM
+; Tests that ICP transformations happen.
+; Both candidates are ICP'ed, check there is no `!VP` in the IR.
+; RUN: opt %t4.bc -icp-lto -passes=pgo-icall-prom -S | FileCheck %s --check-prefix=ICALL-PROM --implicit-check-not="!VP"
; RUN: opt %t4.bc -icp-lto -passes=pgo-icall-prom -S -pass-remarks=pgo-icall-prom 2>&1 | FileCheck %s --check-prefix=PASS-REMARK
-; PASS-REMARK: Promote indirect call to a with count 1 out of 1
-; PASS-REMARK: Promote indirect call to c.llvm.0 with count 1 out of 1
-target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
-target triple = "x86_64-unknown-linux-gnu"
+; PASS-REMARK: Promote indirect call to _ZL7callee0v.llvm.0 with count 3 out of 5
+; PASS-REMARK: Promote indirect call to _ZL7callee1v.llvm.0 with count 2 out of 2
-@foo = external local_unnamed_addr global ptr, align 8
-@bar = external local_unnamed_addr global ptr, align 8
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
-define i32 @main() local_unnamed_addr {
+define dso_local noundef i32 @main() {
entry:
- %0 = load ptr, ptr @foo, align 8
-; ICALL-PROM: br i1 %{{[0-9]+}}, label %if.true.direct_targ, label %if.false.orig_indirect, !prof [[BRANCH_WEIGHT:![0-9]+]]
- tail call void %0(), !prof !1
- %1 = load ptr, ptr @bar, align 8
-; ICALL-PROM: br i1 %{{[0-9]+}}, label %if.true.direct_targ1, label %if.false.orig_indirect2, !prof [[BRANCH_WEIGHT:![0-9]+]]
- tail call void %1(), !prof !2
+ tail call void @_Z11global_funcv()
ret i32 0
}
-!1 = !{!"VP", i32 0, i64 1, i64 -6289574019528802036, i64 1}
-!2 = !{!"VP", i32 0, i64 1, i64 591260329866125152, i64 1}
+declare void @_Z11global_funcv()
+
+; ICALL-PROM: br i1 %{{[0-9]+}}, label %if.true.direct_targ, label %if.false.orig_indirect, !prof [[BRANCH_WEIGHT1:![0-9]+]]
+; ICALL-PROM: br i1 %{{[0-9]+}}, label %if.true.direct_targ1, label %if.false.orig_indirect2, !prof [[BRANCH_WEIGHT2:![0-9]+]]
-; Should not have a VP annotation on new indirect call (check before and after
-; branch_weights annotation).
-; ICALL-PROM-NOT: !"VP"
-; ICALL-PROM: [[BRANCH_WEIGHT]] = !{!"branch_weights", i32 1, i32 0}
-; ICALL-PROM-NOT: !"VP"
+; ICALL-PROM: [[BRANCH_WEIGHT1]] = !{!"branch_weights", i32 3, i32 2}
+; ICALL-PROM: [[BRANCH_WEIGHT2]] = !{!"branch_weights", i32 2, i32 0}
\ No newline at end of file