[PATCH] D36351: [lld][ELF] Add profile guided section layout
Michael Spencer via llvm-commits
llvm-commits at lists.llvm.org
Mon Dec 4 14:30:29 PST 2017
I've attached updated patches.
I just tested these patches on Linux against {llvm, clang, compiler-rt,
lld} at r319686, and everything worked with the following steps:
* Applied the patches.
* Built llvm, clang, compiler-rt and lld in a Release build.
(cgprofile-build)
* Built llvm and lld in a Release build using the clang from
cgprofile-build with `-fprofile-instr-generate`. (instrumented-build)
* Ran `check-lld` on instrumented-build.
* Ran `cgprofile-build/llvm-profdata merge
<PATH-TO-LLVM-SOURCE>/llvm-project/lld/test/ELF/default.profraw -o
default.profdata` in the instrumented-build directory.
* Built llvm and lld in a Release build using the clang from
cgprofile-build with `-fprofile-instr-use=default.profdata`.
(cg-profile-optimized-build)
* Checked that the .o files from cg-profile-optimized-build had call graph
profiles using `cgprofile-build/llvm-readobj -elf-cg-profile`.
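For reference, each entry that the last step dumps is a fixed 16-byte record: two 32-bit symbol table indices followed by a 64-bit count (see `Elf_CGProfile_Impl` and the `write32`/`write32`/`write64` sequence in the attached LLVM patch). Below is a minimal sketch of decoding such a section by hand. It assumes a little-endian target; `CGProfileRecord`, `readLE` and `decodeCGProfile` are hypothetical names for illustration and are not part of the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Mirrors the three fields of Elf_CGProfile_Impl in the attached patch.
struct CGProfileRecord {
  uint32_t From;   // symbol table index of the caller
  uint32_t To;     // symbol table index of the callee
  uint64_t Weight; // call count for this edge
};

// Reads an N-byte unsigned little-endian integer starting at P.
static uint64_t readLE(const uint8_t *P, size_t N) {
  uint64_t V = 0;
  for (size_t I = 0; I < N; ++I)
    V |= static_cast<uint64_t>(P[I]) << (8 * I);
  return V;
}

// Hypothetical helper: decodes the raw bytes of a .note.llvm.cgprofile
// section. Trailing bytes that do not form a full record are ignored.
std::vector<CGProfileRecord>
decodeCGProfile(const std::vector<uint8_t> &Data) {
  constexpr size_t RecordSize = 16; // 4 + 4 + 8 bytes
  std::vector<CGProfileRecord> Out;
  for (size_t Off = 0; Off + RecordSize <= Data.size(); Off += RecordSize) {
    const uint8_t *P = Data.data() + Off;
    Out.push_back({static_cast<uint32_t>(readLE(P, 4)),
                   static_cast<uint32_t>(readLE(P + 4, 4)),
                   readLE(P + 8, 8)});
  }
  return Out;
}
```

The indices refer to the object's static symbol table, which is why the `llvm-readobj` change resolves them through `getStaticSymbolName`.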
- Michael Spencer
On Mon, Nov 27, 2017 at 9:11 PM, Michael Spencer <bigcheesegs at gmail.com>
wrote:
> On Mon, Nov 27, 2017 at 9:07 PM, Davide Italiano <davide at freebsd.org>
> wrote:
>
>> Maybe this happens only on Linux? As far as I can tell the patch has been
>> developed on Windows entirely. Michael, have you tried linux?
>>
>
> It has been tested mostly on Linux.
>
> - Michael Spencer
>
>
>>
>> On Nov 27, 2017 9:05 PM, "Michael Spencer" <bigcheesegs at gmail.com> wrote:
>>
>>> On Mon, Nov 27, 2017 at 5:44 PM, Rui Ueyama <ruiu at google.com> wrote:
>>>
>>>> I rebased the patches to HEAD and tried to use them again. When I ran
>>>> `bin/llvm-profdata merge default.profraw -o default.profdata`, it created
>>>> an almost empty default.profdata and shrank default.profraw. That seems
>>>> pretty odd to me. Michael, do you know what is going on?
>>>>
>>>> Before:
>>>>
>>>> -rw-r----- 1 ruiu eng 29M Nov 27 17:39 default.profraw
>>>>
>>>> After:
>>>>
>>>> -rw-r----- 1 ruiu eng 560 Nov 27 17:39 default.profdata
>>>> -rw-r----- 1 ruiu eng 2.2M Nov 27 17:39 default.profraw
>>>>
>>>> The rebased patch is available at https://reviews.llvm.org/D40534.
>>>>
>>>
>>> I'm not sure what the issue is without more information. My patch
>>> doesn't change how llvm profile information works, it only uses the data.
>>>
>>> - Michael Spencer
>>>
>>>
>>>>
>>>> On Mon, Nov 27, 2017 at 2:01 PM, Davide Italiano <davide at freebsd.org>
>>>> wrote:
>>>>
>>>>> On Mon, Nov 27, 2017 at 1:49 PM, Rui Ueyama <ruiu at google.com> wrote:
>>>>> > I'm so sorry that I didn't respond in a timely manner. I'm reading it
>>>>> > again now.
>>>>> >
>>>>>
>>>>> No worries, definitely not your fault. The whole back and forth has
>>>>> been a little slow, but this has been around for almost 6 months now
>>>>> (and has been in development for much longer) so we should really
>>>>> decide whether we want to ship it or kill it :)
>>>>>
>>>>
>>>>
>>>
>
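
The header comment of ELF/CallGraphSort.cpp in the attached lld patch describes the clustering heuristic in prose. As a rough illustration only, here is a simplified standalone sketch of that idea. It is not the code from the patch: it visits edges once in descending order of their original weight rather than re-merging parallel edges after each contraction, and it assumes every section has a nonzero size. `Edge`, `Cluster` and `layoutSections` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Edge {
  int From;        // caller section id
  int To;          // callee section id
  uint64_t Weight; // call count
};

struct Cluster {
  std::vector<int> Sections; // section ids in layout order
  uint64_t Size = 0;         // total size in bytes
  uint64_t Weight = 0;       // sum of incoming edge weights
};

// Returns section ids in their suggested layout order.
std::vector<int> layoutSections(const std::vector<uint64_t> &Sizes,
                                std::vector<Edge> Edges, uint64_t PageSize) {
  // Each section starts out in its own cluster.
  std::vector<Cluster> Clusters(Sizes.size());
  std::vector<int> Owner(Sizes.size());
  for (size_t I = 0; I < Sizes.size(); ++I) {
    Clusters[I].Sections.push_back(static_cast<int>(I));
    Clusters[I].Size = Sizes[I];
    Owner[I] = static_cast<int>(I);
  }
  for (const Edge &E : Edges)
    Clusters[E.To].Weight += E.Weight;

  // Visit the heaviest edges first.
  std::stable_sort(Edges.begin(), Edges.end(),
                   [](const Edge &A, const Edge &B) {
                     return A.Weight > B.Weight;
                   });

  for (const Edge &E : Edges) {
    int U = Owner[E.From], V = Owner[E.To];
    if (U == V)
      continue; // already in the same cluster
    if (Clusters[U].Size + Clusters[V].Size > PageSize)
      continue; // merged cluster would exceed the page size
    // Append the callee's cluster after the caller's cluster.
    for (int S : Clusters[V].Sections) {
      Owner[S] = U;
      Clusters[U].Sections.push_back(S);
    }
    Clusters[U].Size += Clusters[V].Size;
    Clusters[U].Weight += Clusters[V].Weight;
    Clusters[V] = Cluster();
  }

  // Order the surviving clusters by density (weight / size), densest first.
  std::vector<int> Live;
  for (size_t I = 0; I < Clusters.size(); ++I)
    if (!Clusters[I].Sections.empty())
      Live.push_back(static_cast<int>(I));
  std::stable_sort(Live.begin(), Live.end(), [&](int A, int B) {
    // Cross-multiply instead of dividing to compare the two densities.
    return static_cast<long double>(Clusters[A].Weight) * Clusters[B].Size >
           static_cast<long double>(Clusters[B].Weight) * Clusters[A].Size;
  });

  std::vector<int> Order;
  for (int C : Live)
    Order.insert(Order.end(), Clusters[C].Sections.begin(),
                 Clusters[C].Sections.end());
  return Order;
}
```

For example, with section sizes {10, 10, 10, 100}, edges 0->1 (weight 100), 2->0 (weight 50) and 1->3 (weight 10), and a 64-byte page, sections 2, 0 and 1 end up in one cluster and section 3 stays on its own because adding it would exceed the page size.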
-------------- next part --------------
diff --git a/include/llvm/InitializePasses.h b/include/llvm/InitializePasses.h
index 4935ba1a30d..8cbc84e500f 100644
--- a/include/llvm/InitializePasses.h
+++ b/include/llvm/InitializePasses.h
@@ -84,6 +84,7 @@ void initializeCallSiteSplittingLegacyPassPass(PassRegistry&);
void initializeCFGOnlyPrinterLegacyPassPass(PassRegistry&);
void initializeCFGOnlyViewerLegacyPassPass(PassRegistry&);
void initializeCFGPrinterLegacyPassPass(PassRegistry&);
+void initializeCFGProfilePassPass(PassRegistry&);
void initializeCFGSimplifyPassPass(PassRegistry&);
void initializeCFGViewerLegacyPassPass(PassRegistry&);
void initializeCFLAndersAAWrapperPassPass(PassRegistry&);
diff --git a/include/llvm/LinkAllPasses.h b/include/llvm/LinkAllPasses.h
index 39d1ec6cffb..d1a3e8ac3e9 100644
--- a/include/llvm/LinkAllPasses.h
+++ b/include/llvm/LinkAllPasses.h
@@ -76,6 +76,7 @@ namespace {
(void) llvm::createCallGraphDOTPrinterPass();
(void) llvm::createCallGraphViewerPass();
(void) llvm::createCFGSimplificationPass();
+ (void) llvm::createCFGProfilePass();
(void) llvm::createCFLAndersAAWrapperPass();
(void) llvm::createCFLSteensAAWrapperPass();
(void) llvm::createStructurizeCFGPass();
diff --git a/include/llvm/MC/MCAssembler.h b/include/llvm/MC/MCAssembler.h
index 1ce6b09355d..30912377da6 100644
--- a/include/llvm/MC/MCAssembler.h
+++ b/include/llvm/MC/MCAssembler.h
@@ -393,6 +393,13 @@ public:
const MCLOHContainer &getLOHContainer() const {
return const_cast<MCAssembler *>(this)->getLOHContainer();
}
+
+ struct CGProfileEntry {
+ const MCSymbol *From;
+ const MCSymbol *To;
+ uint64_t Count;
+ };
+ std::vector<CGProfileEntry> CGProfile;
/// @}
/// \name Backend Data Access
/// @{
diff --git a/include/llvm/MC/MCELFStreamer.h b/include/llvm/MC/MCELFStreamer.h
index c5b66a163c8..3402980c13b 100644
--- a/include/llvm/MC/MCELFStreamer.h
+++ b/include/llvm/MC/MCELFStreamer.h
@@ -66,6 +66,9 @@ public:
void EmitValueToAlignment(unsigned, int64_t, unsigned, unsigned) override;
+ void emitCGProfileEntry(const MCSymbol *From, const MCSymbol *To,
+ uint64_t Count) override;
+
void FinishImpl() override;
void EmitBundleAlignMode(unsigned AlignPow2) override;
diff --git a/include/llvm/MC/MCStreamer.h b/include/llvm/MC/MCStreamer.h
index 58003d7d596..ec86dc39741 100644
--- a/include/llvm/MC/MCStreamer.h
+++ b/include/llvm/MC/MCStreamer.h
@@ -848,6 +848,9 @@ public:
SMLoc Loc = SMLoc());
virtual void EmitWinEHHandlerData(SMLoc Loc = SMLoc());
+ virtual void emitCGProfileEntry(const MCSymbol *From, const MCSymbol *To,
+ uint64_t Count);
+
/// Get the .pdata section used for the given section. Typically the given
/// section is either the main .text section or some other COMDAT .text
/// section, but it may be any section containing code.
diff --git a/include/llvm/Object/ELFTypes.h b/include/llvm/Object/ELFTypes.h
index 83b688548fd..905916e910c 100644
--- a/include/llvm/Object/ELFTypes.h
+++ b/include/llvm/Object/ELFTypes.h
@@ -40,6 +40,7 @@ template <class ELFT> struct Elf_Versym_Impl;
template <class ELFT> struct Elf_Hash_Impl;
template <class ELFT> struct Elf_GnuHash_Impl;
template <class ELFT> struct Elf_Chdr_Impl;
+template <class ELFT> struct Elf_CGProfile_Impl;
template <endianness E, bool Is64> struct ELFType {
private:
@@ -66,6 +67,7 @@ public:
using Hash = Elf_Hash_Impl<ELFType<E, Is64>>;
using GnuHash = Elf_GnuHash_Impl<ELFType<E, Is64>>;
using Chdr = Elf_Chdr_Impl<ELFType<E, Is64>>;
+ using CGProfile = Elf_CGProfile_Impl<ELFType<E, Is64>>;
using DynRange = ArrayRef<Dyn>;
using ShdrRange = ArrayRef<Shdr>;
using SymRange = ArrayRef<Sym>;
@@ -590,6 +592,14 @@ struct Elf_Chdr_Impl<ELFType<TargetEndianness, true>> {
Elf_Xword ch_addralign;
};
+template <class ELFT>
+struct Elf_CGProfile_Impl {
+ LLVM_ELF_IMPORT_TYPES_ELFT(ELFT)
+ Elf_Word cgp_from;
+ Elf_Word cgp_to;
+ Elf_Xword cgp_weight;
+};
+
// MIPS .reginfo section
template <class ELFT>
struct Elf_Mips_RegInfo;
diff --git a/include/llvm/Transforms/Instrumentation.h b/include/llvm/Transforms/Instrumentation.h
index 0d76328a2f8..ed6758f3bd3 100644
--- a/include/llvm/Transforms/Instrumentation.h
+++ b/include/llvm/Transforms/Instrumentation.h
@@ -180,6 +180,8 @@ struct SanitizerCoverageOptions {
ModulePass *createSanitizerCoverageModulePass(
const SanitizerCoverageOptions &Options = SanitizerCoverageOptions());
+ModulePass *createCFGProfilePass();
+
/// \brief Calculate what to divide by to scale counts.
///
/// Given the maximum count, calculate a divisor that will scale all the
diff --git a/lib/CodeGen/TargetLoweringObjectFileImpl.cpp b/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
index 910ca4682b9..0b784fa384c 100644
--- a/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
+++ b/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
@@ -98,16 +98,60 @@ void TargetLoweringObjectFileELF::emitModuleMetadata(
StringRef Section;
GetObjCImageInfo(M, Version, Flags, Section);
- if (Section.empty())
- return;
+ if (!Section.empty()) {
+ auto &C = getContext();
+ auto *S = C.getELFSection(Section, ELF::SHT_PROGBITS, ELF::SHF_ALLOC);
+ Streamer.SwitchSection(S);
+ Streamer.EmitLabel(C.getOrCreateSymbol(StringRef("OBJC_IMAGE_INFO")));
+ Streamer.EmitIntValue(Version, 4);
+ Streamer.EmitIntValue(Flags, 4);
+ Streamer.AddBlankLine();
+ }
- auto &C = getContext();
- auto *S = C.getELFSection(Section, ELF::SHT_PROGBITS, ELF::SHF_ALLOC);
- Streamer.SwitchSection(S);
- Streamer.EmitLabel(C.getOrCreateSymbol(StringRef("OBJC_IMAGE_INFO")));
- Streamer.EmitIntValue(Version, 4);
- Streamer.EmitIntValue(Flags, 4);
- Streamer.AddBlankLine();
+ SmallVector<Module::ModuleFlagEntry, 8> ModuleFlags;
+ M.getModuleFlagsMetadata(ModuleFlags);
+
+ MDNode *CFGProfile = nullptr;
+
+ for (const auto &MFE : ModuleFlags) {
+ StringRef Key = MFE.Key->getString();
+ if (Key == "CFG Profile") {
+ CFGProfile = cast<MDNode>(MFE.Val);
+ break;
+ }
+ }
+
+ if (!CFGProfile)
+ return;
+ /*MCSectionELF *Sec =
+ getContext().getELFSection(".note.llvm.callgraph", ELF::SHT_NOTE, 0);
+ Streamer.SwitchSection(Sec);
+ SmallString<256> Out;
+ for (const auto &Edge : CFGProfile->operands()) {
+ raw_svector_ostream O(Out);
+ MDNode *E = cast<MDNode>(Edge);
+ O << cast<MDString>(E->getOperand(0))->getString() << " "
+ << cast<MDString>(E->getOperand(1))->getString() << " "
+ << cast<ConstantAsMetadata>(E->getOperand(2))
+ ->getValue()
+ ->getUniqueInteger()
+ .getZExtValue()
+ << "\n";
+ Streamer.EmitBytes(O.str());
+ Out.clear();
+ }*/
+ for (const auto &Edge : CFGProfile->operands()) {
+ MDNode *E = cast<MDNode>(Edge);
+ const MCSymbol *From = Streamer.getContext().getOrCreateSymbol(
+ cast<MDString>(E->getOperand(0))->getString());
+ const MCSymbol *To = Streamer.getContext().getOrCreateSymbol(
+ cast<MDString>(E->getOperand(1))->getString());
+ uint64_t Count = cast<ConstantAsMetadata>(E->getOperand(2))
+ ->getValue()
+ ->getUniqueInteger()
+ .getZExtValue();
+ Streamer.emitCGProfileEntry(From, To, Count);
+ }
}
MCSymbol *TargetLoweringObjectFileELF::getCFIPersonalitySymbol(
diff --git a/lib/MC/ELFObjectWriter.cpp b/lib/MC/ELFObjectWriter.cpp
index e11eaaa3060..b54fc1693aa 100644
--- a/lib/MC/ELFObjectWriter.cpp
+++ b/lib/MC/ELFObjectWriter.cpp
@@ -1299,6 +1299,13 @@ void ELFObjectWriter::writeObject(MCAssembler &Asm,
}
}
+ MCSectionELF *CGProfileSection = nullptr;
+ if (!Asm.CGProfile.empty()) {
+ CGProfileSection =
+ Ctx.getELFSection(".note.llvm.cgprofile", ELF::SHT_NOTE, 0, 16, "");
+ SectionIndexMap[CGProfileSection] = addToSectionTable(CGProfileSection);
+ }
+
for (MCSectionELF *Group : Groups) {
align(Group->getAlignment());
@@ -1333,6 +1340,17 @@ void ELFObjectWriter::writeObject(MCAssembler &Asm,
SectionOffsets[RelSection] = std::make_pair(SecStart, SecEnd);
}
+ if (CGProfileSection) {
+ uint64_t SecStart = getStream().tell();
+ for (const MCAssembler::CGProfileEntry &CGPE : Asm.CGProfile) {
+ write32(CGPE.From->getIndex());
+ write32(CGPE.To->getIndex());
+ write64(CGPE.Count);
+ }
+ uint64_t SecEnd = getStream().tell();
+ SectionOffsets[CGProfileSection] = std::make_pair(SecStart, SecEnd);
+ }
+
{
uint64_t SecStart = getStream().tell();
const MCSectionELF *Sec = createStringTable(Ctx);
diff --git a/lib/MC/MCAsmStreamer.cpp b/lib/MC/MCAsmStreamer.cpp
index d1b475f2ca0..f0cd1a5ddef 100644
--- a/lib/MC/MCAsmStreamer.cpp
+++ b/lib/MC/MCAsmStreamer.cpp
@@ -291,6 +291,9 @@ public:
SMLoc Loc) override;
void EmitWinEHHandlerData(SMLoc Loc) override;
+ void emitCGProfileEntry(const MCSymbol *From, const MCSymbol *To,
+ uint64_t Count) override;
+
void EmitInstruction(const MCInst &Inst, const MCSubtargetInfo &STI,
bool PrintSchedInfo) override;
@@ -1561,6 +1564,16 @@ void MCAsmStreamer::EmitWinCFIEndProlog(SMLoc Loc) {
EmitEOL();
}
+void MCAsmStreamer::emitCGProfileEntry(const MCSymbol *From, const MCSymbol *To,
+ uint64_t Count) {
+ OS << "\t.cg_profile ";
+ From->print(OS, MAI);
+ OS << ", ";
+ To->print(OS, MAI);
+ OS << ", " << Count;
+ EmitEOL();
+}
+
void MCAsmStreamer::AddEncodingComment(const MCInst &Inst,
const MCSubtargetInfo &STI,
bool PrintSchedInfo) {
diff --git a/lib/MC/MCELFStreamer.cpp b/lib/MC/MCELFStreamer.cpp
index 366125962a5..292e365b058 100644
--- a/lib/MC/MCELFStreamer.cpp
+++ b/lib/MC/MCELFStreamer.cpp
@@ -365,6 +365,11 @@ void MCELFStreamer::EmitValueToAlignment(unsigned ByteAlignment,
ValueSize, MaxBytesToEmit);
}
+void MCELFStreamer::emitCGProfileEntry(const MCSymbol *From, const MCSymbol *To,
+ uint64_t Count) {
+ getAssembler().CGProfile.push_back({From, To, Count});
+}
+
void MCELFStreamer::EmitIdent(StringRef IdentString) {
MCSection *Comment = getAssembler().getContext().getELFSection(
".comment", ELF::SHT_PROGBITS, ELF::SHF_MERGE | ELF::SHF_STRINGS, 1, "");
diff --git a/lib/MC/MCParser/ELFAsmParser.cpp b/lib/MC/MCParser/ELFAsmParser.cpp
index 38720c23ff2..3a62a49968a 100644
--- a/lib/MC/MCParser/ELFAsmParser.cpp
+++ b/lib/MC/MCParser/ELFAsmParser.cpp
@@ -85,6 +85,7 @@ public:
addDirectiveHandler<
&ELFAsmParser::ParseDirectiveSymbolAttribute>(".hidden");
addDirectiveHandler<&ELFAsmParser::ParseDirectiveSubsection>(".subsection");
+ addDirectiveHandler<&ELFAsmParser::ParseDirectiveCGProfile>(".cg_profile");
}
// FIXME: Part of this logic is duplicated in the MCELFStreamer. What is
@@ -149,6 +150,7 @@ public:
bool ParseDirectiveWeakref(StringRef, SMLoc);
bool ParseDirectiveSymbolAttribute(StringRef, SMLoc);
bool ParseDirectiveSubsection(StringRef, SMLoc);
+ bool ParseDirectiveCGProfile(StringRef, SMLoc);
private:
bool ParseSectionName(StringRef &SectionName);
@@ -838,6 +840,40 @@ bool ELFAsmParser::ParseDirectiveSubsection(StringRef, SMLoc) {
return false;
}
+/// ParseDirectiveCGProfile
+/// ::= .cg_profile identifier, identifier, <number>
+bool ELFAsmParser::ParseDirectiveCGProfile(StringRef, SMLoc) {
+ StringRef From;
+ if (getParser().parseIdentifier(From))
+ return TokError("expected identifier in directive");
+
+ if (getLexer().isNot(AsmToken::Comma))
+ return TokError("expected a comma");
+ Lex();
+
+ StringRef To;
+ if (getParser().parseIdentifier(To))
+ return TokError("expected identifier in directive");
+
+ if (getLexer().isNot(AsmToken::Comma))
+ return TokError("expected a comma");
+ Lex();
+
+ int64_t Count;
+ if (getParser().parseIntToken(
+ Count, "expected integer count in '.cg_profile' directive"))
+ return true;
+
+ if (getLexer().isNot(AsmToken::EndOfStatement))
+ return TokError("unexpected token in directive");
+
+ MCSymbol *FromSym = getContext().getOrCreateSymbol(From);
+ MCSymbol *ToSym = getContext().getOrCreateSymbol(To);
+
+ getStreamer().emitCGProfileEntry(FromSym, ToSym, Count);
+ return false;
+}
+
namespace llvm {
MCAsmParserExtension *createELFAsmParser() {
diff --git a/lib/MC/MCStreamer.cpp b/lib/MC/MCStreamer.cpp
index 4067df0eaf5..e6a00fb46e4 100644
--- a/lib/MC/MCStreamer.cpp
+++ b/lib/MC/MCStreamer.cpp
@@ -639,6 +639,10 @@ void MCStreamer::EmitWinEHHandlerData(SMLoc Loc) {
getContext().reportError(Loc, "Chained unwind areas can't have handlers!");
}
+void MCStreamer::emitCGProfileEntry(const MCSymbol *From, const MCSymbol *To,
+ uint64_t Count) {
+}
+
static MCSection *getWinCFISection(MCContext &Context, unsigned *NextWinCFIID,
MCSection *MainCFISec,
const MCSection *TextSec) {
diff --git a/lib/Transforms/IPO/PassManagerBuilder.cpp b/lib/Transforms/IPO/PassManagerBuilder.cpp
index abab7e194ad..9c27fca24ea 100644
--- a/lib/Transforms/IPO/PassManagerBuilder.cpp
+++ b/lib/Transforms/IPO/PassManagerBuilder.cpp
@@ -672,6 +672,8 @@ void PassManagerBuilder::populateModulePassManager(
MPM.add(createConstantMergePass()); // Merge dup global constants
}
+ MPM.add(createCFGProfilePass());
+
if (MergeFunctions)
MPM.add(createMergeFunctionsPass());
diff --git a/lib/Transforms/Instrumentation/CFGProfile.cpp b/lib/Transforms/Instrumentation/CFGProfile.cpp
new file mode 100644
index 00000000000..6aa76d35a24
--- /dev/null
+++ b/lib/Transforms/Instrumentation/CFGProfile.cpp
@@ -0,0 +1,103 @@
+//===-- CFGProfile.cpp ----------------------------------------------------===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/IR/Instructions.h"
+#include "llvm/IR/MDBuilder.h"
+#include "llvm/IR/PassManager.h"
+#include "llvm/Transforms/Instrumentation.h"
+
+#include <array>
+
+using namespace llvm;
+
+class CFGProfilePass : public ModulePass {
+public:
+ static char ID;
+
+ CFGProfilePass() : ModulePass(ID) {
+ initializeCFGProfilePassPass(
+ *PassRegistry::getPassRegistry());
+ }
+
+ StringRef getPassName() const override { return "CFGProfilePass"; }
+
+private:
+ bool runOnModule(Module &M) override;
+
+ void getAnalysisUsage(AnalysisUsage &AU) const override {
+ AU.addRequired<BlockFrequencyInfoWrapperPass>();
+ AU.addRequired<BranchProbabilityInfoWrapperPass>();
+ }
+};
+
+bool CFGProfilePass::runOnModule(Module &M) {
+ if (skipModule(M))
+ return false;
+
+ llvm::DenseMap<std::pair<StringRef, StringRef>, uint64_t> Counts;
+
+ for (auto &F : M) {
+ if (F.isDeclaration())
+ continue;
+ getAnalysis<BranchProbabilityInfoWrapperPass>(F).getBPI();
+ auto &BFI = getAnalysis<BlockFrequencyInfoWrapperPass>(F).getBFI();
+ for (auto &BB : F) {
+ Optional<uint64_t> BBCount = BFI.getBlockProfileCount(&BB);
+ if (!BBCount)
+ continue;
+ for (auto &I : BB) {
+ auto *CI = dyn_cast<CallInst>(&I);
+ if (!CI)
+ continue;
+ Function *CalledF = CI->getCalledFunction();
+ if (!CalledF || CalledF->isIntrinsic())
+ continue;
+
+ uint64_t &Count =
+ Counts[std::make_pair(F.getName(), CalledF->getName())];
+ Count = SaturatingAdd(Count, *BBCount);
+ }
+ }
+ }
+
+ if (Counts.empty())
+ return false;
+
+ LLVMContext &Context = M.getContext();
+ MDBuilder MDB(Context);
+ std::vector<Metadata *> Nodes;
+
+ for (auto E : Counts) {
+ SmallVector<Metadata *, 3> Vals;
+ Vals.push_back(MDB.createString(E.first.first));
+ Vals.push_back(MDB.createString(E.first.second));
+ Vals.push_back(MDB.createConstant(
+ ConstantInt::get(Type::getInt64Ty(Context), E.second)));
+ Nodes.push_back(MDNode::get(Context, Vals));
+ }
+
+ M.addModuleFlag(Module::Append, "CFG Profile", MDNode::get(Context, Nodes));
+
+ return true;
+}
+
+char CFGProfilePass::ID = 0;
+INITIALIZE_PASS_BEGIN(CFGProfilePass, "cfg-profile",
+ "Generate profile information from the CFG.", false, false)
+ INITIALIZE_PASS_DEPENDENCY(BlockFrequencyInfoWrapperPass)
+ INITIALIZE_PASS_DEPENDENCY(BranchProbabilityInfoWrapperPass)
+ INITIALIZE_PASS_END(CFGProfilePass, "cfg-profile",
+ "Generate profile information from the CFG.", false, false)
+
+ModulePass *llvm::createCFGProfilePass() {
+ return new CFGProfilePass();
+}
diff --git a/lib/Transforms/Instrumentation/CMakeLists.txt b/lib/Transforms/Instrumentation/CMakeLists.txt
index f2806e278e6..9b33edf0631 100644
--- a/lib/Transforms/Instrumentation/CMakeLists.txt
+++ b/lib/Transforms/Instrumentation/CMakeLists.txt
@@ -1,6 +1,7 @@
add_llvm_library(LLVMInstrumentation
AddressSanitizer.cpp
BoundsChecking.cpp
+ CFGProfile.cpp
DataFlowSanitizer.cpp
GCOVProfiling.cpp
MemorySanitizer.cpp
diff --git a/lib/Transforms/Instrumentation/Instrumentation.cpp b/lib/Transforms/Instrumentation/Instrumentation.cpp
index ed5e9dba396..0a3b45d18b8 100644
--- a/lib/Transforms/Instrumentation/Instrumentation.cpp
+++ b/lib/Transforms/Instrumentation/Instrumentation.cpp
@@ -60,6 +60,7 @@ void llvm::initializeInstrumentation(PassRegistry &Registry) {
initializeAddressSanitizerModulePass(Registry);
initializeBoundsCheckingLegacyPassPass(Registry);
initializeGCOVProfilerLegacyPassPass(Registry);
+ initializeCFGProfilePassPass(Registry);
initializePGOInstrumentationGenLegacyPassPass(Registry);
initializePGOInstrumentationUseLegacyPassPass(Registry);
initializePGOIndirectCallPromotionLegacyPassPass(Registry);
diff --git a/tools/llvm-readobj/ELFDumper.cpp b/tools/llvm-readobj/ELFDumper.cpp
index 9678667abff..6beb13643b8 100644
--- a/tools/llvm-readobj/ELFDumper.cpp
+++ b/tools/llvm-readobj/ELFDumper.cpp
@@ -97,6 +97,7 @@ using namespace ELF;
using Elf_Vernaux = typename ELFO::Elf_Vernaux; \
using Elf_Verdef = typename ELFO::Elf_Verdef; \
using Elf_Verdaux = typename ELFO::Elf_Verdaux; \
+ using Elf_CGProfile = typename ELFT::CGProfile; \
using uintX_t = typename ELFO::uintX_t;
namespace {
@@ -161,6 +162,8 @@ public:
void printHashHistogram() override;
+ void printCGProfile() override;
+
void printNotes() override;
private:
@@ -205,6 +208,7 @@ private:
const Elf_Hash *HashTable = nullptr;
const Elf_GnuHash *GnuHashTable = nullptr;
const Elf_Shdr *DotSymtabSec = nullptr;
+ const Elf_Shdr *DotCGProfileSec = nullptr;
StringRef DynSymtabName;
ArrayRef<Elf_Word> ShndxTable;
@@ -249,9 +253,11 @@ public:
Elf_Rela_Range dyn_relas() const;
std::string getFullSymbolName(const Elf_Sym *Symbol, StringRef StrTable,
bool IsDynamic) const;
+ StringRef getStaticSymbolName(uint32_t Index) const;
void printSymbolsHelper(bool IsDynamic) const;
const Elf_Shdr *getDotSymtabSec() const { return DotSymtabSec; }
+ const Elf_Shdr *getDotCGProfileSec() const { return DotCGProfileSec; }
ArrayRef<Elf_Word> getShndxTable() const { return ShndxTable; }
StringRef getDynamicStringTable() const { return DynamicStringTable; }
const DynRegionInfo &getDynRelRegion() const { return DynRelRegion; }
@@ -309,6 +315,7 @@ public:
bool IsDynamic) = 0;
virtual void printProgramHeaders(const ELFFile<ELFT> *Obj) = 0;
virtual void printHashHistogram(const ELFFile<ELFT> *Obj) = 0;
+ virtual void printCGProfile(const ELFFile<ELFT> *Obj) = 0;
virtual void printNotes(const ELFFile<ELFT> *Obj) = 0;
const ELFDumper<ELFT> *dumper() const { return Dumper; }
@@ -336,6 +343,7 @@ public:
size_t Offset) override;
void printProgramHeaders(const ELFO *Obj) override;
void printHashHistogram(const ELFFile<ELFT> *Obj) override;
+ void printCGProfile(const ELFFile<ELFT> *Obj) override;
void printNotes(const ELFFile<ELFT> *Obj) override;
private:
@@ -394,6 +402,7 @@ public:
void printDynamicRelocations(const ELFO *Obj) override;
void printProgramHeaders(const ELFO *Obj) override;
void printHashHistogram(const ELFFile<ELFT> *Obj) override;
+ void printCGProfile(const ELFFile<ELFT> *Obj) override;
void printNotes(const ELFFile<ELFT> *Obj) override;
private:
@@ -734,6 +743,16 @@ std::string ELFDumper<ELFT>::getFullSymbolName(const Elf_Sym *Symbol,
return FullSymbolName;
}
+template <typename ELFT>
+StringRef ELFDumper<ELFT>::getStaticSymbolName(uint32_t Index) const {
+ StringRef StrTable = unwrapOrError(Obj->getStringTableForSymtab(*DotSymtabSec));
+ Elf_Sym_Range Syms = unwrapOrError(Obj->symbols(DotSymtabSec));
+ if (Index >= Syms.size())
+ reportError("Invalid symbol index");
+ const Elf_Sym *Sym = &Syms[Index];
+ return unwrapOrError(Sym->getName(StrTable));
+}
+
template <typename ELFT>
static void
getSectionNameIndex(const ELFFile<ELFT> &Obj, const typename ELFT::Sym *Symbol,
@@ -1342,6 +1361,12 @@ ELFDumper<ELFT>::ELFDumper(const ELFFile<ELFT> *Obj, ScopedPrinter &Writer)
reportError("Multiple SHT_GNU_verneed");
dot_gnu_version_r_sec = &Sec;
break;
+ case ELF::SHT_NOTE:
+ if (unwrapOrError(Obj->getSectionName(&Sec)) != ".note.llvm.cgprofile")
+ break;
+ if (DotCGProfileSec != nullptr)
+ reportError("Multiple .note.llvm.cgprofile");
+ DotCGProfileSec = &Sec;
}
}
@@ -1486,6 +1511,10 @@ template <class ELFT> void ELFDumper<ELFT>::printHashHistogram() {
ELFDumperStyle->printHashHistogram(Obj);
}
+template <class ELFT> void ELFDumper<ELFT>::printCGProfile() {
+ ELFDumperStyle->printCGProfile(Obj);
+}
+
template <class ELFT> void ELFDumper<ELFT>::printNotes() {
ELFDumperStyle->printNotes(Obj);
}
@@ -3388,6 +3417,11 @@ void GNUStyle<ELFT>::printHashHistogram(const ELFFile<ELFT> *Obj) {
}
}
+template <class ELFT>
+void GNUStyle<ELFT>::printCGProfile(const ELFFile<ELFT> *Obj) {
+ OS << "GNUStyle::printCGProfile not implemented\n";
+}
+
static std::string getGNUNoteTypeName(const uint32_t NT) {
static const struct {
uint32_t ID;
@@ -3988,6 +4022,22 @@ void LLVMStyle<ELFT>::printHashHistogram(const ELFFile<ELFT> *Obj) {
W.startLine() << "Hash Histogram not implemented!\n";
}
+
+
+template <class ELFT>
+void LLVMStyle<ELFT>::printCGProfile(const ELFFile<ELFT> *Obj) {
+ ListScope L(W, "CGProfile");
+ if (!this->dumper()->getDotCGProfileSec())
+ return;
+ auto CGProfile = unwrapOrError(Obj->template getSectionContentsAsArray<Elf_CGProfile>(this->dumper()->getDotCGProfileSec()));
+ for (const Elf_CGProfile &CGPE : CGProfile) {
+ DictScope D(W, "CGProfileEntry");
+ W.printNumber("From", this->dumper()->getStaticSymbolName(CGPE.cgp_from), CGPE.cgp_from);
+ W.printNumber("To", this->dumper()->getStaticSymbolName(CGPE.cgp_to), CGPE.cgp_to);
+ W.printNumber("Weight", CGPE.cgp_weight);
+ }
+}
+
template <class ELFT>
void LLVMStyle<ELFT>::printNotes(const ELFFile<ELFT> *Obj) {
W.startLine() << "printNotes not implemented!\n";
diff --git a/tools/llvm-readobj/ObjDumper.h b/tools/llvm-readobj/ObjDumper.h
index c5b331d944a..bf55dac0407 100644
--- a/tools/llvm-readobj/ObjDumper.h
+++ b/tools/llvm-readobj/ObjDumper.h
@@ -47,6 +47,7 @@ public:
virtual void printVersionInfo() {}
virtual void printGroupSections() {}
virtual void printHashHistogram() {}
+ virtual void printCGProfile() {}
virtual void printNotes() {}
// Only implemented for ARM ELF at this time.
diff --git a/tools/llvm-readobj/llvm-readobj.cpp b/tools/llvm-readobj/llvm-readobj.cpp
index 61b4c12926c..5c7039e6094 100644
--- a/tools/llvm-readobj/llvm-readobj.cpp
+++ b/tools/llvm-readobj/llvm-readobj.cpp
@@ -284,6 +284,8 @@ namespace opts {
cl::alias HashHistogramShort("I", cl::desc("Alias for -elf-hash-histogram"),
cl::aliasopt(HashHistogram));
+ cl::opt<bool> CGProfile("elf-cg-profile", cl::desc("Display callgraph profile section"));
+
cl::opt<OutputStyleTy>
Output("elf-output-style", cl::desc("Specify ELF dump style"),
cl::values(clEnumVal(LLVM, "LLVM default style"),
@@ -439,6 +441,8 @@ static void dumpObject(const ObjectFile *Obj) {
Dumper->printGroupSections();
if (opts::HashHistogram)
Dumper->printHashHistogram();
+ if (opts::CGProfile)
+ Dumper->printCGProfile();
if (opts::Notes)
Dumper->printNotes();
}
-------------- next part --------------
diff --git a/ELF/CMakeLists.txt b/ELF/CMakeLists.txt
index 511529a4c..44125b849 100644
--- a/ELF/CMakeLists.txt
+++ b/ELF/CMakeLists.txt
@@ -18,6 +18,7 @@ add_lld_library(lldELF
Arch/SPARCV9.cpp
Arch/X86.cpp
Arch/X86_64.cpp
+ CallGraphSort.cpp
Driver.cpp
DriverUtils.cpp
EhFrame.cpp
diff --git a/ELF/CallGraphSort.cpp b/ELF/CallGraphSort.cpp
new file mode 100644
index 000000000..774637973
--- /dev/null
+++ b/ELF/CallGraphSort.cpp
@@ -0,0 +1,318 @@
+//===- CallGraphSort.cpp --------------------------------------------------===//
+//
+// The LLVM Linker
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+///
+/// \file This file implements Call-Chain Clustering from:
+/// Optimizing Function Placement for Large-Scale Data-Center Applications
+/// https://research.fb.com/wp-content/uploads/2017/01/cgo2017-hfsort-final1.pdf
+///
+/// The goal of this algorithm is to improve runtime performance of the final
+/// executable by arranging code sections such that page table and i-cache
+/// misses are minimized.
+///
+/// Definitions:
+/// * Cluster
+/// * An ordered list of input sections which are laid out as a unit. At the
+/// beginning of the algorithm each input section has its own cluster and
+/// the weight of the cluster is the sum of the weight of all incoming
+/// edges.
+/// * Call-Chain Clustering (C³) Heuristic
+/// * Defines when and how clusters are combined. Pick the highest weight edge
+/// from cluster _u_ to _v_ then move the sections in _v_ and append them to
+/// _u_ unless the combined size would be larger than the page size.
+/// * Density
+/// * The weight of the cluster divided by the size of the cluster. This is a
+/// proxy for the amount of execution time spent per byte of the cluster.
+///
+/// It does so given a call graph profile by the following:
+/// * Build a call graph from the profile
+/// * While there are unresolved edges
+/// * Find the edge with the highest weight
+/// * Check if merging the two clusters would create a cluster larger than the
+/// target page size
+/// * If not, contract that edge putting the callee after the caller
+/// * Sort remaining clusters by density
+///
+//===----------------------------------------------------------------------===//
+
+#include "CallGraphSort.h"
+#include "SymbolTable.h"
+#include "Target.h"
+
+#include "llvm/Support/MathExtras.h"
+
+#include <queue>
+#include <unordered_set>
+
+using namespace llvm;
+using namespace lld;
+using namespace lld::elf;
+
+namespace {
+class CallGraphSort {
+ using NodeIndex = std::ptrdiff_t;
+ using EdgeIndex = std::ptrdiff_t;
+
+ struct Node {
+ Node() = default;
+ Node(const InputSectionBase *IS);
+ std::vector<const InputSectionBase *> Sections;
+ std::vector<EdgeIndex> IncidentEdges;
+ int64_t Size = 0;
+ uint64_t Weight = 0;
+ };
+
+ struct Edge {
+ NodeIndex From;
+ NodeIndex To;
+ mutable uint64_t Weight;
+ bool operator==(const Edge Other) const;
+ bool operator<(const Edge Other) const;
+ void kill();
+ bool isDead() const;
+ };
+
+ struct EdgeDenseMapInfo {
+ static Edge getEmptyKey() {
+ return {DenseMapInfo<NodeIndex>::getEmptyKey(),
+ DenseMapInfo<NodeIndex>::getEmptyKey(), 0};
+ }
+ static Edge getTombstoneKey() {
+ return {DenseMapInfo<NodeIndex>::getTombstoneKey(),
+ DenseMapInfo<NodeIndex>::getTombstoneKey(), 0};
+ }
+ static unsigned getHashValue(const Edge &Val) {
+ return hash_combine(DenseMapInfo<NodeIndex>::getHashValue(Val.From),
+ DenseMapInfo<NodeIndex>::getHashValue(Val.To));
+ }
+ static bool isEqual(const Edge &LHS, const Edge &RHS) { return LHS == RHS; }
+ };
+
+ std::vector<Node> Nodes;
+ std::vector<Edge> Edges;
+ struct EdgePriorityCmp {
+ std::vector<Edge> &Edges;
+ bool operator()(EdgeIndex A, EdgeIndex B) const {
+ return Edges[A].Weight < Edges[B].Weight;
+ }
+ };
+ std::priority_queue<EdgeIndex, std::vector<EdgeIndex>, EdgePriorityCmp>
+ WorkQueue{EdgePriorityCmp{Edges}};
+
+ void contractEdge(EdgeIndex CEI);
+ void generateClusters();
+
+public:
+ CallGraphSort(DenseMap<std::pair<const Symbol *, const Symbol *>,
+ uint64_t> &Profile);
+
+ DenseMap<const InputSectionBase *, int> run();
+};
+} // end anonymous namespace
+
+CallGraphSort::Node::Node(const InputSectionBase *IS) {
+ Sections.push_back(IS);
+ Size = IS->getSize();
+}
+
+bool CallGraphSort::Edge::operator==(const Edge Other) const {
+ return From == Other.From && To == Other.To;
+}
+
+bool CallGraphSort::Edge::operator<(const Edge Other) const {
+ if (From != Other.From)
+ return From < Other.From;
+ return To < Other.To;
+}
+
+void CallGraphSort::Edge::kill() {
+ From = 0;
+ To = 0;
+}
+
+bool CallGraphSort::Edge::isDead() const { return From == 0 && To == 0; }
+
+// Take the edge list in Config->CallGraphProfile, resolve symbol names to
+// Symbols, and generate a graph between InputSections with the provided
+// weights.
+CallGraphSort::CallGraphSort(
+ DenseMap<std::pair<const Symbol *, const Symbol *>, uint64_t>
+ &Profile) {
+ DenseMap<const InputSectionBase *, NodeIndex> SecToNode;
+ DenseMap<Edge, EdgeIndex, EdgeDenseMapInfo> EdgeMap;
+
+ auto GetOrCreateNode = [&](const InputSectionBase *IS) -> NodeIndex {
+ auto Res = SecToNode.insert(std::make_pair(IS, Nodes.size()));
+ if (Res.second)
+ Nodes.emplace_back(IS);
+ return Res.first->second;
+ };
+
+ // Create the graph.
+ for (const auto &C : Profile) {
+ const Symbol *FromSym = C.first.first;
+ const Symbol *ToSym = C.first.second;
+ uint64_t Weight = C.second;
+
+ if (Weight == 0)
+ continue;
+
+ // Get the input section for a given symbol.
+ auto *FromDR = dyn_cast_or_null<Defined>(FromSym);
+ auto *ToDR = dyn_cast_or_null<Defined>(ToSym);
+ if (!FromDR || !ToDR)
+ continue;
+
+ auto *FromSB = dyn_cast_or_null<const InputSectionBase>(FromDR->Section);
+ auto *ToSB = dyn_cast_or_null<const InputSectionBase>(ToDR->Section);
+ if (!FromSB || !ToSB || FromSB->getSize() == 0 || ToSB->getSize() == 0)
+ continue;
+
+ NodeIndex From = GetOrCreateNode(FromSB);
+ NodeIndex To = GetOrCreateNode(ToSB);
+ Edge E{From, To, Weight};
+
+ // Add or increment an edge
+ auto Res = EdgeMap.insert(std::make_pair(E, Edges.size()));
+ EdgeIndex EI = Res.first->second;
+ if (Res.second) {
+ Edges.push_back(E);
+ Nodes[From].IncidentEdges.push_back(EI);
+ Nodes[To].IncidentEdges.push_back(EI);
+ } else
+ Edges[EI].Weight = SaturatingAdd(Edges[EI].Weight, Weight);
+
+ Nodes[To].Weight = SaturatingAdd(Nodes[To].Weight, Weight);
+ }
+}
+
+/// Remove edge \p CEI from the graph while simultaneously merging its two
+/// incident vertices u and v. This merges any duplicate edges between u and v
+/// by accumulating their weights.
+void CallGraphSort::contractEdge(EdgeIndex CEI) {
+ // Make a copy of the edge as the original will be marked killed while being
+ // used.
+ Edge CE = Edges[CEI];
+ std::vector<EdgeIndex> &FE = Nodes[CE.From].IncidentEdges;
+
+ // Remove the self edge from From.
+ FE.erase(std::remove(FE.begin(), FE.end(), CEI), FE.end());
+ std::vector<EdgeIndex> &TE = Nodes[CE.To].IncidentEdges;
+
+ // Update all edges incident with To to reference From instead. Then if they
+ // aren't self edges add them to From.
+ for (EdgeIndex EI : TE) {
+ Edge &E = Edges[EI];
+ if (E.From == CE.To)
+ E.From = CE.From;
+ if (E.To == CE.To)
+ E.To = CE.From;
+ if (E.To == E.From) {
+ E.kill();
+ continue;
+ }
+ FE.push_back(EI);
+ }
+
+ // Free memory.
+ std::vector<EdgeIndex>().swap(TE);
+
+ if (FE.empty())
+ return;
+
+ // Sort edges so they can be merged. The stability of this sort doesn't matter
+ // as equal edges will be merged in an order independent manner.
+ std::sort(FE.begin(), FE.end(),
+ [&](EdgeIndex AI, EdgeIndex BI) { return Edges[AI] < Edges[BI]; });
+
+ // std::unique, but also merge equal values.
+ auto First = FE.begin();
+ auto Last = FE.end();
+ auto Result = First;
+ while (++First != Last) {
+ if (Edges[*Result] == Edges[*First]) {
+ Edges[*Result].Weight =
+ SaturatingAdd(Edges[*Result].Weight, Edges[*First].Weight);
+ Edges[*First].kill();
+ // Add the updated edge to the work queue without removing the previous
+ // entry. Edges will never be contracted twice as they are marked as dead.
+ WorkQueue.push(*Result);
+ } else if (++Result != First)
+ *Result = *First;
+ }
+ FE.erase(++Result, FE.end());
+}
+
+// Group InputSections into clusters using the Call-Chain Clustering heuristic
+// then sort the clusters by density.
+void CallGraphSort::generateClusters() {
+ for (size_t I = 0; I < Edges.size(); ++I)
+ WorkQueue.push(I);
+
+ // Collapse the graph.
+ while (!WorkQueue.empty()) {
+ EdgeIndex MaxI = WorkQueue.top();
+ const Edge MaxE = Edges[MaxI];
+ WorkQueue.pop();
+ if (MaxE.isDead())
+ continue;
+ // Merge the Nodes.
+ Node &From = Nodes[MaxE.From];
+ Node &To = Nodes[MaxE.To];
+ if (From.Size + To.Size > Target->PageSize)
+ continue;
+ contractEdge(MaxI);
+ From.Sections.insert(From.Sections.end(), To.Sections.begin(),
+ To.Sections.end());
+ From.Size += To.Size;
+ From.Weight = SaturatingAdd(From.Weight, To.Weight);
+ To.Sections.clear();
+ To.Size = 0;
+ To.Weight = 0;
+ }
+
+ // Remove empty or dead nodes.
+ Nodes.erase(std::remove_if(Nodes.begin(), Nodes.end(),
+ [](const Node &N) {
+ return N.Size == 0 || N.Sections.empty();
+ }),
+ Nodes.end());
+
+ // Sort by density. Invalidates all NodeIndex values.
+ std::sort(Nodes.begin(), Nodes.end(), [](const Node &A, const Node &B) {
+ return (APFloat(APFloat::IEEEdouble(), A.Weight) /
+ APFloat(APFloat::IEEEdouble(), A.Size))
+ .compare(APFloat(APFloat::IEEEdouble(), B.Weight) /
+ APFloat(APFloat::IEEEdouble(), B.Size)) ==
+ APFloat::cmpLessThan;
+ });
+}
+
+DenseMap<const InputSectionBase *, int> CallGraphSort::run() {
+ generateClusters();
+
+ // Generate order.
+ llvm::DenseMap<const InputSectionBase *, int> OrderMap;
+ int CurOrder = 1;
+
+ for (const Node &N : Nodes)
+ for (const InputSectionBase *IS : N.Sections)
+ OrderMap[IS] = CurOrder++;
+
+ return OrderMap;
+}
+
+// Sort sections by the profile data provided by -call-graph-profile-file
+//
+// This first builds a call graph based on the profile data then iteratively
+// merges the hottest call edges as long as it would not create a cluster larger
+// than the page size. All clusters are then sorted by a density metric to
+// further improve locality.
+DenseMap<const InputSectionBase *, int> elf::computeCallGraphProfileOrder() {
+ return CallGraphSort(Config->CallGraphProfile).run();
+}
diff --git a/ELF/CallGraphSort.h b/ELF/CallGraphSort.h
new file mode 100644
index 000000000..46455489c
--- /dev/null
+++ b/ELF/CallGraphSort.h
@@ -0,0 +1,24 @@
+//===- CallGraphSort.h ------------------------------------------*- C++ -*-===//
+//
+// The LLVM Linker
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLD_ELF_CALL_GRAPH_SORT_H
+#define LLD_ELF_CALL_GRAPH_SORT_H
+
+#include "llvm/ADT/DenseMap.h"
+
+namespace lld {
+namespace elf {
+class InputSectionBase;
+
+llvm::DenseMap<const InputSectionBase *, int>
+computeCallGraphProfileOrder();
+}
+}
+
+#endif
diff --git a/ELF/Config.h b/ELF/Config.h
index d3356175e..363bd915b 100644
--- a/ELF/Config.h
+++ b/ELF/Config.h
@@ -10,6 +10,7 @@
#ifndef LLD_ELF_CONFIG_H
#define LLD_ELF_CONFIG_H
+#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/MapVector.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/StringSet.h"
@@ -24,6 +25,7 @@ namespace lld {
namespace elf {
class InputFile;
+class Symbol;
enum ELFKind {
ELFNoneKind,
@@ -91,6 +93,7 @@ struct Configuration {
llvm::StringRef SoName;
llvm::StringRef Sysroot;
llvm::StringRef ThinLTOCacheDir;
+ llvm::StringRef CallGraphProfileFile;
std::string Rpath;
std::vector<VersionDefinition> VersionDefinitions;
std::vector<llvm::StringRef> Argv;
@@ -103,6 +106,8 @@ struct Configuration {
std::vector<SymbolVersion> VersionScriptGlobals;
std::vector<SymbolVersion> VersionScriptLocals;
std::vector<uint8_t> BuildIdVector;
+ llvm::DenseMap<std::pair<const Symbol *, const Symbol *>, uint64_t>
+ CallGraphProfile;
bool AllowMultipleDefinition;
bool AndroidPackDynRelocs = false;
bool ARMHasBlx = false;
@@ -111,6 +116,7 @@ struct Configuration {
bool AsNeeded = false;
bool Bsymbolic;
bool BsymbolicFunctions;
+ bool CallGraphProfileSort = true;
bool CompressDebugSections;
bool DefineCommon;
bool Demangle = true;
diff --git a/ELF/Driver.cpp b/ELF/Driver.cpp
index 12b24d34e..c03ddb9a7 100644
--- a/ELF/Driver.cpp
+++ b/ELF/Driver.cpp
@@ -565,6 +565,42 @@ getBuildId(opt::InputArgList &Args) {
return {BuildIdKind::None, {}};
}
+// This reads a list of call edges with weights one line at a time from a file
+// with the following format for each line:
+//
+// ^[^ ]+ [^ ]+ [^ ]+$
+//
+// It interprets the first value as an unsigned 64 bit weight, the second as
+// the symbol the call is from, and the third as the symbol the call is to.
+//
+// Example:
+//
+// 5000 c a
+// 4000 c b
+// 18446744073709551615 e d
+//
+template <typename ELFT>
+void readCallGraphProfile(MemoryBufferRef MB) {
+ for (StringRef L : args::getLines(MB)) {
+ SmallVector<StringRef, 3> Fields;
+ L.split(Fields, ' ');
+ if (Fields.size() != 3) {
+ error("parse error: " + MB.getBufferIdentifier() + ": " + L);
+ return;
+ }
+ uint64_t Count;
+ if (!to_integer(Fields[0], Count)) {
+ error("parse error: " + MB.getBufferIdentifier() + ": " + L);
+ return;
+ }
+ StringRef From = Fields[1];
+ StringRef To = Fields[2];
+ Config->CallGraphProfile[std::make_pair(Symtab->addUndefined<ELFT>(From),
+ Symtab->addUndefined<ELFT>(To))] =
+ Count;
+ }
+}
+
static bool getCompressDebugSections(opt::InputArgList &Args) {
StringRef S = Args.getLastArgValue(OPT_compress_debug_sections, "none");
if (S == "none")
@@ -590,6 +626,8 @@ void LinkerDriver::readConfigs(opt::InputArgList &Args) {
Config->AuxiliaryList = args::getStrings(Args, OPT_auxiliary);
Config->Bsymbolic = Args.hasArg(OPT_Bsymbolic);
Config->BsymbolicFunctions = Args.hasArg(OPT_Bsymbolic_functions);
+ Config->CallGraphProfileSort = Args.hasFlag(
+ OPT_call_graph_profile_sort, OPT_no_call_graph_profile_sort, true);
Config->Chroot = Args.getLastArgValue(OPT_chroot);
Config->CompressDebugSections = getCompressDebugSections(Args);
Config->DefineCommon = Args.hasFlag(OPT_define_common, OPT_no_define_common,
@@ -743,6 +781,9 @@ void LinkerDriver::readConfigs(opt::InputArgList &Args) {
if (Optional<MemoryBufferRef> Buffer = readFile(Arg->getValue()))
Config->SymbolOrderingFile = args::getLines(*Buffer);
+ if (auto *Arg = Args.getLastArg(OPT_call_graph_profile_file))
+ Config->CallGraphProfileFile = Arg->getValue();
+
// If --retain-symbol-file is used, we'll keep only the symbols listed in
// the file and discard all others.
if (auto *Arg = Args.getLastArg(OPT_retain_symbols_file)) {
@@ -1015,6 +1056,11 @@ template <class ELFT> void LinkerDriver::link(opt::InputArgList &Args) {
Config->HasDynSymTab =
!SharedFiles.empty() || Config->Pic || Config->ExportDynamic;
+ if (!Config->CallGraphProfileFile.empty())
+ if (Optional<MemoryBufferRef> Buffer =
+ readFile(Config->CallGraphProfileFile))
+ readCallGraphProfile<ELFT>(*Buffer);
+
// Some symbols (such as __ehdr_start) are defined lazily only when there
// are undefined symbols for them, so we add these to trigger that logic.
for (StringRef Sym : Script->ReferencedSymbols)
diff --git a/ELF/InputFiles.cpp b/ELF/InputFiles.cpp
index 115de3f8c..6255ed917 100644
--- a/ELF/InputFiles.cpp
+++ b/ELF/InputFiles.cpp
@@ -177,6 +177,15 @@ std::string ObjFile<ELFT>::getLineInfo(InputSectionBase *S, uint64_t Offset) {
return "";
}
+template<class ELFT>
+void lld::elf::ObjFile<ELFT>::parseCGProfile() {
+ for (const Elf_CGProfile &CGPE : CGProfile) {
+ uint64_t &C = Config->CallGraphProfile[std::make_pair(
+ &getSymbol(CGPE.cgp_from), &getSymbol(CGPE.cgp_to))];
+ C = std::max(C, (uint64_t)CGPE.cgp_weight);
+ }
+}
+
// Returns "<internal>", "foo.a(bar.o)" or "baz.o".
std::string lld::toString(const InputFile *F) {
if (!F)
@@ -242,6 +251,7 @@ void ObjFile<ELFT>::parse(DenseSet<CachedHashStringRef> &ComdatGroups) {
// Read section and symbol tables.
initializeSections(ComdatGroups);
initializeSymbols();
+ parseCGProfile();
}
// Sections with SHT_GROUP and comdat bits define comdat section groups.
@@ -588,6 +598,13 @@ InputSectionBase *ObjFile<ELFT>::createInputSection(const Elf_Shdr &Sec) {
if (Name == ".eh_frame" && !Config->Relocatable)
return make<EhInputSection>(this, &Sec, Name);
+ // Profile data.
+ if (Name == ".note.llvm.cgprofile") {
+ CGProfile = check(
+ this->getObj().template getSectionContentsAsArray<Elf_CGProfile>(&Sec));
+ return &InputSection::Discarded;
+ }
+
if (shouldMerge(Sec))
return make<MergeInputSection>(this, &Sec, Name);
return make<InputSection>(this, &Sec, Name);
diff --git a/ELF/InputFiles.h b/ELF/InputFiles.h
index 4fb86f4e1..8b0756b72 100644
--- a/ELF/InputFiles.h
+++ b/ELF/InputFiles.h
@@ -154,6 +154,7 @@ template <class ELFT> class ObjFile : public ELFFileBase<ELFT> {
typedef typename ELFT::Sym Elf_Sym;
typedef typename ELFT::Shdr Elf_Shdr;
typedef typename ELFT::Word Elf_Word;
+ typedef typename ELFT::CGProfile Elf_CGProfile;
StringRef getShtGroupSignature(ArrayRef<Elf_Shdr> Sections,
const Elf_Shdr &Sec);
@@ -201,6 +202,7 @@ private:
initializeSections(llvm::DenseSet<llvm::CachedHashStringRef> &ComdatGroups);
void initializeSymbols();
void initializeDwarf();
+ void parseCGProfile();
InputSectionBase *getRelocTarget(const Elf_Shdr &Sec);
InputSectionBase *createInputSection(const Elf_Shdr &Sec);
StringRef getSectionName(const Elf_Shdr &Sec);
@@ -218,6 +220,8 @@ private:
std::unique_ptr<llvm::DWARFDebugLine> DwarfLine;
llvm::DenseMap<StringRef, std::pair<unsigned, unsigned>> VariableLoc;
llvm::once_flag InitDwarfLine;
+
+ ArrayRef<Elf_CGProfile> CGProfile;
};
// LazyObjFile is analogous to ArchiveFile in the sense that
diff --git a/ELF/Options.td b/ELF/Options.td
index fbb0f2dcb..50b0a8d4a 100644
--- a/ELF/Options.td
+++ b/ELF/Options.td
@@ -51,6 +51,12 @@ def allow_multiple_definition: F<"allow-multiple-definition">,
def as_needed: F<"as-needed">,
HelpText<"Only set DT_NEEDED for shared libraries if used">;
+def call_graph_profile_file: S<"call-graph-profile-file">,
+ HelpText<"Lay out sections to optimize the given call graph">;
+
+def call_graph_profile_sort: F<"call-graph-profile-sort">,
+ HelpText<"Sort sections by call graph profile information">;
+
// -chroot doesn't have a help text because it is an internal option.
def chroot: S<"chroot">;
@@ -163,6 +169,9 @@ def nostdlib: F<"nostdlib">,
def no_as_needed: F<"no-as-needed">,
HelpText<"Always DT_NEEDED for shared libraries">;
+def no_call_graph_profile_sort: F<"no-call-graph-profile-sort">,
+ HelpText<"Don't sort sections by call graph profile information">;
+
def no_color_diagnostics: F<"no-color-diagnostics">,
HelpText<"Do not use colors in diagnostics">;
diff --git a/ELF/Writer.cpp b/ELF/Writer.cpp
index f61be873d..feb5952fd 100644
--- a/ELF/Writer.cpp
+++ b/ELF/Writer.cpp
@@ -8,6 +8,7 @@
//===----------------------------------------------------------------------===//
#include "Writer.h"
+#include "CallGraphSort.h"
#include "Config.h"
#include "Filesystem.h"
#include "LinkerScript.h"
@@ -1019,6 +1020,15 @@ findOrphanPos(std::vector<BaseCommand *>::iterator B,
template <class ELFT> void Writer<ELFT>::sortInputSections() {
assert(!Script->HasSectionsCommand);
+ // Sort sections using the call graph profile (see -call-graph-profile-file).
+ if (Config->CallGraphProfileSort && !Config->CallGraphProfile.empty()) {
+ DenseMap<const InputSectionBase *, int> OrderMap =
+ computeCallGraphProfileOrder();
+
+ if (OutputSection *Sec = findSection(".text"))
+ Sec->sort([&](InputSectionBase *S) { return OrderMap.lookup(S); });
+ }
+
// Sort input sections by priority using the list provided
// by --symbol-ordering-file.
DenseMap<SectionBase *, int> Order = buildSectionOrder();
diff --git a/test/ELF/Inputs/cgprofile.txt b/test/ELF/Inputs/cgprofile.txt
new file mode 100644
index 000000000..6b60397a6
--- /dev/null
+++ b/test/ELF/Inputs/cgprofile.txt
@@ -0,0 +1,7 @@
+5000 c a
+4000 c b
+0 d e
+18446744073709551615 e d
+18446744073709551611 f d
+18446744073709551612 f e
+6000 c h
diff --git a/test/ELF/cgprofile-object.s b/test/ELF/cgprofile-object.s
new file mode 100644
index 000000000..b308d58de
--- /dev/null
+++ b/test/ELF/cgprofile-object.s
@@ -0,0 +1,50 @@
+# REQUIRES: x86
+
+# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t
+# RUN: ld.lld %t -o %t2
+# RUN: llvm-readobj -symbols %t2 | FileCheck %s
+# RUN: ld.lld %t -o %t2 -no-call-graph-profile-sort
+# RUN: llvm-readobj -symbols %t2 | FileCheck %s --check-prefix=NOSORT
+
+ .section .text.hot._Z4fooav,"ax",@progbits
+ .globl _Z4fooav
+_Z4fooav:
+ retq
+
+ .section .text.hot._Z4foobv,"ax",@progbits
+ .globl _Z4foobv
+_Z4foobv:
+ retq
+
+ .section .text.hot._Z3foov,"ax",@progbits
+ .globl _Z3foov
+_Z3foov:
+ retq
+
+ .section .text.hot._start,"ax",@progbits
+ .globl _start
+_start:
+ retq
+
+
+ .cg_profile _start, _Z3foov, 1
+ .cg_profile _Z4fooav, _Z4foobv, 1
+ .cg_profile _Z3foov, _Z4fooav, 1
+
+# CHECK: Name: _Z3foov
+# CHECK-NEXT: Value: 0x201001
+# CHECK: Name: _Z4fooav
+# CHECK-NEXT: Value: 0x201002
+# CHECK: Name: _Z4foobv
+# CHECK-NEXT: Value: 0x201003
+# CHECK: Name: _start
+# CHECK-NEXT: Value: 0x201000
+
+# NOSORT: Name: _Z3foov
+# NOSORT-NEXT: Value: 0x201002
+# NOSORT: Name: _Z4fooav
+# NOSORT-NEXT: Value: 0x201000
+# NOSORT: Name: _Z4foobv
+# NOSORT-NEXT: Value: 0x201001
+# NOSORT: Name: _start
+# NOSORT-NEXT: Value: 0x201003
diff --git a/test/ELF/cgprofile.s b/test/ELF/cgprofile.s
new file mode 100644
index 000000000..ce0e0a51b
--- /dev/null
+++ b/test/ELF/cgprofile.s
@@ -0,0 +1,128 @@
+# REQUIRES: x86
+#
+# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t1
+# RUN: ld.lld %t1 -e a -o %t -call-graph-profile-file %p/Inputs/cgprofile.txt
+# RUN: llvm-readobj -symbols %t | FileCheck %s
+
+ .section .text.a,"ax",@progbits
+ .global a
+a:
+ .zero 20
+
+ .section .text.b,"ax",@progbits
+ .global b
+b:
+ .zero 1
+
+ .section .text.c,"ax",@progbits
+ .global c
+c:
+ .zero 4095
+
+ .section .text.d,"ax",@progbits
+ .global d
+d:
+ .zero 51
+
+ .section .text.e,"ax",@progbits
+ .global e
+e:
+ .zero 42
+
+ .section .text.f,"ax",@progbits
+ .global f
+f:
+ .zero 42
+
+ .section .text.g,"ax",@progbits
+ .global g
+g:
+ .zero 34
+
+ .section .text.h,"ax",@progbits
+ .global h
+h:
+
+# CHECK: Symbols [
+# CHECK-NEXT: Symbol {
+# CHECK-NEXT: Name: (0)
+# CHECK-NEXT: Value: 0x0
+# CHECK-NEXT: Size: 0
+# CHECK-NEXT: Binding: Local (0x0)
+# CHECK-NEXT: Type: None (0x0)
+# CHECK-NEXT: Other: 0
+# CHECK-NEXT: Section: Undefined (0x0)
+# CHECK-NEXT: }
+# CHECK-NEXT: Symbol {
+# CHECK-NEXT: Name: a
+# CHECK-NEXT: Value: 0x202022
+# CHECK-NEXT: Size: 0
+# CHECK-NEXT: Binding: Global (0x1)
+# CHECK-NEXT: Type: None (0x0)
+# CHECK-NEXT: Other: 0
+# CHECK-NEXT: Section: .text
+# CHECK-NEXT: }
+# CHECK-NEXT: Symbol {
+# CHECK-NEXT: Name: b
+# CHECK-NEXT: Value: 0x202021
+# CHECK-NEXT: Size: 0
+# CHECK-NEXT: Binding: Global (0x1)
+# CHECK-NEXT: Type: None (0x0)
+# CHECK-NEXT: Other: 0
+# CHECK-NEXT: Section: .text
+# CHECK-NEXT: }
+# CHECK-NEXT: Symbol {
+# CHECK-NEXT: Name: c
+# CHECK-NEXT: Value: 0x201022
+# CHECK-NEXT: Size: 0
+# CHECK-NEXT: Binding: Global (0x1)
+# CHECK-NEXT: Type: None (0x0)
+# CHECK-NEXT: Other: 0
+# CHECK-NEXT: Section: .text
+# CHECK-NEXT: }
+# CHECK-NEXT: Symbol {
+# CHECK-NEXT: Name: d
+# CHECK-NEXT: Value: 0x20208A
+# CHECK-NEXT: Size: 0
+# CHECK-NEXT: Binding: Global (0x1)
+# CHECK-NEXT: Type: None (0x0)
+# CHECK-NEXT: Other: 0
+# CHECK-NEXT: Section: .text
+# CHECK-NEXT: }
+# CHECK-NEXT: Symbol {
+# CHECK-NEXT: Name: e
+# CHECK-NEXT: Value: 0x202060
+# CHECK-NEXT: Size: 0
+# CHECK-NEXT: Binding: Global (0x1)
+# CHECK-NEXT: Type: None (0x0)
+# CHECK-NEXT: Other: 0
+# CHECK-NEXT: Section: .text
+# CHECK-NEXT: }
+# CHECK-NEXT: Symbol {
+# CHECK-NEXT: Name: f
+# CHECK-NEXT: Value: 0x202036
+# CHECK-NEXT: Size: 0
+# CHECK-NEXT: Binding: Global (0x1)
+# CHECK-NEXT: Type: None (0x0)
+# CHECK-NEXT: Other: 0
+# CHECK-NEXT: Section: .text
+# CHECK-NEXT: }
+# CHECK-NEXT: Symbol {
+# CHECK-NEXT: Name: g
+# CHECK-NEXT: Value: 0x201000
+# CHECK-NEXT: Size: 0
+# CHECK-NEXT: Binding: Global (0x1)
+# CHECK-NEXT: Type: None (0x0)
+# CHECK-NEXT: Other: 0
+# CHECK-NEXT: Section: .text
+# CHECK-NEXT: }
+# CHECK-NEXT: Symbol {
+# CHECK-NEXT: Name: h
+# CHECK-NEXT: Value: 0x201022
+# CHECK-NEXT: Size: 0
+# CHECK-NEXT: Binding: Global (0x1)
+# CHECK-NEXT: Type: None (0x0)
+# CHECK-NEXT: Other: 0
+# CHECK-NEXT: Section: .text
+# CHECK-NEXT: }
+# CHECK-NEXT:]
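
For anyone who wants to follow the clustering in CallGraphSort.cpp without
building lld, here is a minimal standalone sketch of the same idea. It is not
the code from the patch: Section, Call, Cluster, satAdd and sketchOrder are
hypothetical names, it uses plain std containers instead of DenseMap, and it
simplifies the algorithm by visiting each edge once with its original weight
rather than re-accumulating parallel edges after a contraction. Like the
comparator in generateClusters(), it places the lowest density cluster first.
Sections that no surviving edge mentions are left out of the result.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the patch's InputSection, Edge and Node.
struct Section { std::string Name; uint64_t Size; };
struct Call { size_t From, To; uint64_t Weight; }; // indexes into the sections
struct Cluster {
  std::vector<size_t> Members; // section indexes in layout order
  uint64_t Size = 0;           // total bytes
  uint64_t Weight = 0;         // saturating sum of incoming call counts
};

static uint64_t satAdd(uint64_t A, uint64_t B) {
  uint64_t S = A + B;
  return S < A ? UINT64_MAX : S;
}

std::vector<std::string> sketchOrder(const std::vector<Section> &Secs,
                                     std::vector<Call> Calls,
                                     uint64_t PageSize) {
  assert(PageSize > 0);
  // Ignore zero-weight edges and edges that touch an empty section.
  Calls.erase(std::remove_if(Calls.begin(), Calls.end(),
                             [&](const Call &C) {
                               return C.Weight == 0 ||
                                      Secs[C.From].Size == 0 ||
                                      Secs[C.To].Size == 0;
                             }),
              Calls.end());

  // One cluster per section that appears in an edge; callees collect weight.
  std::vector<Cluster> Clusters(Secs.size());
  for (const Call &C : Calls) {
    for (size_t I : {C.From, C.To}) {
      Clusters[I].Members = {I};
      Clusters[I].Size = Secs[I].Size;
    }
    Clusters[C.To].Weight = satAdd(Clusters[C.To].Weight, C.Weight);
  }

  // Leader[I] points towards the cluster that currently holds section I.
  std::vector<size_t> Leader(Secs.size());
  std::iota(Leader.begin(), Leader.end(), size_t(0));
  auto Find = [&](size_t I) {
    while (Leader[I] != I)
      I = Leader[I];
    return I;
  };

  // Hottest edge first; merge the callee's cluster into the caller's cluster
  // unless the result would be larger than a page.
  std::stable_sort(Calls.begin(), Calls.end(),
                   [](const Call &A, const Call &B) {
                     return A.Weight > B.Weight;
                   });
  for (const Call &C : Calls) {
    size_t F = Find(C.From), T = Find(C.To);
    if (F == T || Clusters[F].Size + Clusters[T].Size > PageSize)
      continue;
    Cluster &Into = Clusters[F], &From = Clusters[T];
    Into.Members.insert(Into.Members.end(), From.Members.begin(),
                        From.Members.end());
    Into.Size += From.Size;
    Into.Weight = satAdd(Into.Weight, From.Weight);
    From = Cluster();
    Leader[T] = F;
  }

  // Keep the surviving clusters and sort them by density (weight per byte),
  // lowest first.
  std::vector<Cluster> Live;
  for (Cluster &C : Clusters)
    if (!C.Members.empty())
      Live.push_back(std::move(C));
  std::stable_sort(Live.begin(), Live.end(),
                   [](const Cluster &A, const Cluster &B) {
                     return double(A.Weight) / double(A.Size) <
                            double(B.Weight) / double(B.Size);
                   });

  std::vector<std::string> Order;
  for (const Cluster &C : Live)
    for (size_t I : C.Members)
      Order.push_back(Secs[I].Name);
  return Order;
}
```

Feeding it the section sizes from test/ELF/cgprofile.s and the edges from
test/ELF/Inputs/cgprofile.txt with a 4096 byte page gives the relative order
c, b, a, f, e, d, which agrees with the symbol addresses the test checks. g has
no profile data and h is empty, so neither appears in the sketch's output.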