[llvm] r194566 - SampleProfileLoader pass. Initial setup.
Alexey Samsonov
samsonov at google.com
Wed Nov 13 05:15:28 PST 2013
On Wed, Nov 13, 2013 at 4:22 PM, Diego Novillo <dnovillo at google.com> wrote:
> Author: dnovillo
> Date: Wed Nov 13 06:22:21 2013
> New Revision: 194566
>
> URL: http://llvm.org/viewvc/llvm-project?rev=194566&view=rev
> Log:
> SampleProfileLoader pass. Initial setup.
>
> This adds a new scalar pass that reads a file with samples generated
> by 'perf' during runtime. The samples read from the profile are
> incorporated and emitted as IR metadata reflecting that profile.
>
> The profile file is assumed to have been generated by an external
> profile source. The profile information is converted into IR metadata,
> which is later used by the analysis routines to estimate block
> frequencies, edge weights and other related data.
>
> External profile information files have no fixed format; each profiler
> is free to define its own. This includes both the on-disk representation
> of the profile and the kind of profile information stored in the file.
> A common kind of profile is based on sampling (e.g., perf), which
> essentially counts how many times each line of the program has been
> executed during the run.
>
> The SampleProfileLoader pass is organized as a scalar transformation.
> On startup, it reads the file given in -sample-profile-file to
> determine what kind of profile it contains. This file is assumed to
> contain profile information for the whole application. The profile
> data in the file is read and incorporated into the internal state of
> the corresponding profiler.
>
> To facilitate testing, I've organized the profilers to support two file
> formats: text and native. The native format is whatever on-disk
> representation the profiler wants to support; I think this will mostly
> be bitcode files, but it could be anything the profiler wants to
> support. To do this, every profiler must implement the
> SampleProfile::loadNative() function.
>
> The text format is mostly meant for debugging. Records are separated by
> newlines, but each profiler is free to interpret records as it sees fit.
> Profilers must implement the SampleProfile::loadText() function.
>
> Finally, the pass will call SampleProfile::emitAnnotations() for each
> function in the current translation unit. This function needs to
> translate the loaded profile into IR metadata, which the analyzer will
> later be able to use.
>
> This patch implements the first steps towards the above design. I've
> implemented a sample-based flat profiler. The format of the profile is
> fairly simplistic. Each sampled function contains a list of relative
> line locations (from the start of the function) together with a count
> representing how many samples were collected at that line during
> execution. I generate this profile using perf and a separate converter
> tool.
>
> Currently, I have only implemented a text format for these profiles. I
> am interested in initial feedback to the whole approach before I send
> the other parts of the implementation for review.
>
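For illustration, the text layout described above (and documented in the loadText() comment further down in the patch) can be sketched outside LLVM with the standard library only. This is a minimal sketch, not the code in the patch: FunctionSamples, ProfileMap and parseTextProfile are hypothetical names, exceptions stand in for report_fatal_error, and blank lines are skipped, which the patch does not do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the per-function record kept by the patch.
struct FunctionSamples {
  unsigned TotalSamples = 0;
  unsigned TotalHeadSamples = 0;
  std::map<uint32_t, uint32_t> BodySamples; // line offset -> sample count
};

using ProfileMap = std::map<std::string, FunctionSamples>;

// Parse the text form: "symbol table", a symbol count, one symbol per line,
// then records "name:total:head:N" followed by N "offset: count" lines.
// Samples for a function that appears more than once are accumulated.
ProfileMap parseTextProfile(const std::string &Text) {
  std::istringstream In(Text);
  std::string Line;
  ProfileMap Profiles;

  if (!std::getline(In, Line) || Line != "symbol table")
    throw std::runtime_error("expected 'symbol table'");
  if (!std::getline(In, Line))
    throw std::runtime_error("expected the number of symbols");
  const int NumSymbols = std::stoi(Line);
  if (NumSymbols < 0)
    throw std::runtime_error("negative symbol count");
  for (int I = 0; I < NumSymbols; ++I) {
    if (!std::getline(In, Line))
      throw std::runtime_error("truncated symbol table");
    Profiles[Line]; // Create an empty record for this symbol.
  }
  // Duplicate symbols collapse into one record, so the map cannot be larger.
  assert(Profiles.size() <= static_cast<std::size_t>(NumSymbols));

  while (std::getline(In, Line)) {
    if (Line.empty())
      continue; // Assumption: tolerate blank lines between records.
    std::istringstream Head(Line);
    std::string Name, Field;
    unsigned long Nums[3];
    if (!std::getline(Head, Name, ':') || Name.empty())
      throw std::runtime_error("expected 'mangled_name:NUM:NUM:NUM'");
    for (unsigned long &N : Nums) {
      if (!std::getline(Head, Field, ':'))
        throw std::runtime_error("expected 'mangled_name:NUM:NUM:NUM'");
      N = std::stoul(Field);
    }
    FunctionSamples &FS = Profiles[Name];
    FS.TotalSamples += static_cast<unsigned>(Nums[0]);
    FS.TotalHeadSamples += static_cast<unsigned>(Nums[1]);
    for (unsigned long I = 0; I < Nums[2]; ++I) {
      if (!std::getline(In, Line))
        throw std::runtime_error("unexpected end of file");
      const std::string::size_type Colon = Line.find(':');
      if (Colon == std::string::npos)
        throw std::runtime_error("expected 'NUM: NUM'");
      const uint32_t Offset =
          static_cast<uint32_t>(std::stoul(Line.substr(0, Colon)));
      const uint32_t Count =
          static_cast<uint32_t>(std::stoul(Line.substr(Colon + 1)));
      FS.BodySamples[Offset] += Count;
    }
  }
  return Profiles;
}
```

Fed the contents of Inputs/branch.prof from this commit, the sketch should yield a single record for main with 15680 total samples and seven sampled line offsets.
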
> This patch implements:
>
> - The SampleProfileLoader pass.
> - The base ExternalProfile class with the core interface.
> - A SampleProfile sub-class using the above interface. The profiler
> generates branch weight metadata on every branch instruction that
> matches the profile.
> - A text loader class to assist the implementation of
> SampleProfile::loadText().
> - Basic unit tests for the pass.
>
> Additionally, the patch uses profile information to compute branch
> weights based on instruction samples.
>
> This patch converts instruction samples into branch weights. It
> does a fairly simplistic conversion:
>
> Given a multi-way branch instruction, it calculates the weight of
> each branch based on the maximum sample count gathered from each
> target basic block.
>
> Note that this assignment of branch weights is somewhat lossy and can be
> misleading. If a basic block has more than one incoming branch, all the
> incoming branches will get the same weight. In reality, it may be that
> only one of them is the most heavily taken branch.
>
> I will adjust this assignment in subsequent patches.
>
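To make the conversion above concrete, here is a rough sketch of the same idea with plain containers instead of LLVM IR. Everything in it is an assumption made for illustration: Block, SampleMap, blockWeight and branchWeights are hypothetical names, and a block is reduced to the source lines of its instructions plus the indices of its successors. The offset arithmetic (line - FirstLine + 1) mirrors getInstWeight() in the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical, simplified basic block.
struct Block {
  std::vector<unsigned> Lines;    // absolute source line of each instruction
  std::vector<std::size_t> Succs; // indices of successor blocks in the CFG
};

using SampleMap = std::map<uint32_t, uint32_t>; // line offset -> samples

// Weight of a block: the maximum sample count over its instructions.
uint32_t blockWeight(const Block &B, unsigned FirstLine,
                     const SampleMap &Samples) {
  uint32_t Weight = 0;
  for (unsigned Line : B.Lines) {
    // Offsets are relative to the first line of the function.
    SampleMap::const_iterator It = Samples.find(Line - FirstLine + 1);
    if (It != Samples.end())
      Weight = std::max(Weight, It->second);
  }
  return Weight;
}

// Weight of each edge out of CFG[Branch]: the weight of the target block.
std::vector<uint32_t> branchWeights(const std::vector<Block> &CFG,
                                    std::size_t Branch, unsigned FirstLine,
                                    const SampleMap &Samples) {
  assert(Branch < CFG.size() && "branch index out of range");
  std::vector<uint32_t> Weights;
  for (std::size_t Succ : CFG[Branch].Succs)
    Weights.push_back(blockWeight(CFG[Succ], FirstLine, Samples));
  return Weights;
}
```

Because an edge simply inherits the weight of its target, two different branches that jump to the same block get identical weights, which is the lossy behaviour the log message warns about.
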
> Added:
> llvm/trunk/lib/Transforms/Scalar/SampleProfile.cpp
> llvm/trunk/test/Transforms/SampleProfile/
> llvm/trunk/test/Transforms/SampleProfile/Inputs/
> llvm/trunk/test/Transforms/SampleProfile/Inputs/branch.prof
> llvm/trunk/test/Transforms/SampleProfile/branch.ll
> Modified:
> llvm/trunk/include/llvm/InitializePasses.h
> llvm/trunk/include/llvm/Transforms/Scalar.h
> llvm/trunk/lib/Transforms/Scalar/CMakeLists.txt
> llvm/trunk/lib/Transforms/Scalar/Scalar.cpp
>
> Modified: llvm/trunk/include/llvm/InitializePasses.h
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/InitializePasses.h?rev=194566&r1=194565&r2=194566&view=diff
>
> ==============================================================================
> --- llvm/trunk/include/llvm/InitializePasses.h (original)
> +++ llvm/trunk/include/llvm/InitializePasses.h Wed Nov 13 06:22:21 2013
> @@ -70,6 +70,7 @@ void initializeAliasDebuggerPass(PassReg
> void initializeAliasSetPrinterPass(PassRegistry&);
> void initializeAlwaysInlinerPass(PassRegistry&);
> void initializeArgPromotionPass(PassRegistry&);
> +void initializeSampleProfileLoaderPass(PassRegistry&);
> void initializeBarrierNoopPass(PassRegistry&);
> void initializeBasicAliasAnalysisPass(PassRegistry&);
> void initializeCallGraphPass(PassRegistry&);
>
> Modified: llvm/trunk/include/llvm/Transforms/Scalar.h
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/Scalar.h?rev=194566&r1=194565&r2=194566&view=diff
>
> ==============================================================================
> --- llvm/trunk/include/llvm/Transforms/Scalar.h (original)
> +++ llvm/trunk/include/llvm/Transforms/Scalar.h Wed Nov 13 06:22:21 2013
> @@ -15,6 +15,8 @@
> #ifndef LLVM_TRANSFORMS_SCALAR_H
> #define LLVM_TRANSFORMS_SCALAR_H
>
> +#include "llvm/ADT/StringRef.h"
> +
> namespace llvm {
>
> class FunctionPass;
> @@ -355,6 +357,13 @@ FunctionPass *createLowerExpectIntrinsic
> //
> FunctionPass *createPartiallyInlineLibCallsPass();
>
>
> +//===----------------------------------------------------------------------===//
> +//
> +// SampleProfilePass - Loads sample profile data from disk and generates
> +// IR metadata to reflect the profile.
> +FunctionPass *createSampleProfileLoaderPass();
> +FunctionPass *createSampleProfileLoaderPass(StringRef Name);
> +
> } // End llvm namespace
>
> #endif
>
> Modified: llvm/trunk/lib/Transforms/Scalar/CMakeLists.txt
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/CMakeLists.txt?rev=194566&r1=194565&r2=194566&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Transforms/Scalar/CMakeLists.txt (original)
> +++ llvm/trunk/lib/Transforms/Scalar/CMakeLists.txt Wed Nov 13 06:22:21
> 2013
> @@ -23,6 +23,7 @@ add_llvm_library(LLVMScalarOpts
> PartiallyInlineLibCalls.cpp
> Reassociate.cpp
> Reg2Mem.cpp
> + SampleProfile.cpp
> SCCP.cpp
> SROA.cpp
> Scalar.cpp
>
> Added: llvm/trunk/lib/Transforms/Scalar/SampleProfile.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/SampleProfile.cpp?rev=194566&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Transforms/Scalar/SampleProfile.cpp (added)
> +++ llvm/trunk/lib/Transforms/Scalar/SampleProfile.cpp Wed Nov 13 06:22:21
> 2013
> @@ -0,0 +1,479 @@
> +//===- SampleProfile.cpp - Incorporate sample profiles into the IR --------===//
> +//
> +// The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
>
> +//===----------------------------------------------------------------------===//
> +//
> +// This file implements the SampleProfileLoader transformation. This pass
> +// reads a profile file generated by a sampling profiler (e.g. Linux Perf -
> +// http://perf.wiki.kernel.org/) and generates IR metadata to reflect the
> +// profile information in the given profile.
> +//
> +// This pass generates branch weight annotations on the IR:
> +//
> +// - prof: Represents branch weights. This annotation is added to branches
> +// to indicate the weights of each edge coming out of the branch.
> +// The weight of each edge is the weight of the target block for
> +// that edge. The weight of a block B is computed as the maximum
> +// number of samples found in B.
> +//
>
> +//===----------------------------------------------------------------------===//
> +
> +#define DEBUG_TYPE "sample-profile"
> +
> +#include "llvm/ADT/DenseMap.h"
> +#include "llvm/ADT/OwningPtr.h"
> +#include "llvm/ADT/StringMap.h"
> +#include "llvm/ADT/StringRef.h"
> +#include "llvm/DebugInfo/DIContext.h"
> +#include "llvm/IR/Constants.h"
> +#include "llvm/IR/Function.h"
> +#include "llvm/IR/Instructions.h"
> +#include "llvm/IR/LLVMContext.h"
> +#include "llvm/IR/Metadata.h"
> +#include "llvm/IR/MDBuilder.h"
> +#include "llvm/IR/Module.h"
> +#include "llvm/Pass.h"
> +#include "llvm/Support/CommandLine.h"
> +#include "llvm/Support/Debug.h"
> +#include "llvm/Support/InstIterator.h"
> +#include "llvm/Support/MemoryBuffer.h"
> +#include "llvm/Support/Regex.h"
> +#include "llvm/Support/raw_ostream.h"
> +#include "llvm/Transforms/Scalar.h"
> +
> +using namespace llvm;
> +
> +// Command line option to specify the file to read samples from. This is
> +// mainly used for debugging.
> +static cl::opt<std::string> SampleProfileFile(
> + "sample-profile-file", cl::init(""), cl::value_desc("filename"),
> + cl::desc("Profile file loaded by -sample-profile"), cl::Hidden);
> +
> +namespace {
> +/// \brief Sample-based profile reader.
> +///
> +/// Each profile contains sample counts for all the functions
> +/// executed. Inside each function, statements are annotated with the
> +/// collected samples on all the instructions associated with that
> +/// statement.
> +///
> +/// For this to produce meaningful data, the program needs to be
> +/// compiled with some debug information (at minimum, line numbers:
> +/// -gline-tables-only). Otherwise, it will be impossible to match IR
> +/// instructions to the line numbers collected by the profiler.
> +///
> +/// From the profile file, we are interested in collecting the
> +/// following information:
> +///
> +/// * A list of functions included in the profile (mangled names).
> +///
> +/// * For each function F:
> +/// 1. The total number of samples collected in F.
> +///
> +/// 2. The samples collected at each line in F. To provide some
> +/// protection against source code shuffling, line numbers should
> +/// be relative to the start of the function.
> +class SampleProfile {
> +public:
> + SampleProfile(StringRef F) : Profiles(0), Filename(F) {}
> +
> + virtual void dump();
> + virtual void loadText();
> + virtual void loadNative() { llvm_unreachable("not implemented"); }
> + virtual bool emitAnnotations(Function &F);
>
I've removed "virtual" from these function declarations in r194568;
otherwise Clang produced:
llvm/lib/Transforms/Scalar/SampleProfile.cpp:80:7: error:
'<anonymous>::SampleProfile' has virtual functions but non-virtual
destructor [-Werror,-Wnon-virtual-dtor]
class SampleProfile {
^
In file included from llvm/lib/Transforms/Scalar/SampleProfile.cpp:28:
llvm/include/llvm/ADT/OwningPtr.h:45:5: error: delete called on
'<anonymous>::SampleProfile' that has virtual functions but non-virtual
destructor [-Werror,-Wdelete-non-virtual-dtor]
delete Ptr;
^
llvm/lib/Transforms/Scalar/SampleProfile.cpp:212:3: note: in instantiation
of member function
'llvm::OwningPtr<<anonymous>::SampleProfile>::~OwningPtr' requested here
SampleProfileLoader(StringRef Name = SampleProfileFile)
^
In file included from llvm/lib/Transforms/Scalar/SampleProfile.cpp:28:
llvm/include/llvm/ADT/OwningPtr.h:55:5: error: delete called on
'<anonymous>::SampleProfile' that has virtual functions but non-virtual
destructor [-Werror,-Wdelete-non-virtual-dtor]
delete Tmp;
^
llvm/lib/Transforms/Scalar/SampleProfile.cpp:468:12: note: in instantiation
of member function 'llvm::OwningPtr<<anonymous>::SampleProfile>::reset'
requested here
Profiler.reset(new SampleProfile(Filename));
^
3 errors generated.
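
For reference, a small standalone sketch of the two usual ways to avoid these diagnostics: drop the virtual functions (what r194568 does), or keep them and declare a virtual destructor. The class names below are hypothetical and std::unique_ptr merely stands in for OwningPtr; this is not code from the tree.

```cpp
#include <cassert>
#include <memory>

// Counts destructor calls so the behaviour can be observed.
static int Destroyed = 0;

// Option A: no virtual functions, so deleting through the static type is
// always well defined and neither warning fires.
class PlainProfile {
public:
  ~PlainProfile() { ++Destroyed; }
  bool emitAnnotations() { return true; }
};

// Option B: keep the virtual interface and make the destructor virtual, so
// deleting a derived object through a base pointer runs both destructors.
class VirtualProfile {
public:
  virtual ~VirtualProfile() { ++Destroyed; }
  virtual bool emitAnnotations() { return false; }
};

class DerivedProfile : public VirtualProfile {
public:
  ~DerivedProfile() override { ++Destroyed; }
  bool emitAnnotations() override { return true; }
};

// Destroys one object of each kind and returns how many destructors ran.
int destroyBoth() {
  Destroyed = 0;
  {
    std::unique_ptr<PlainProfile> P(new PlainProfile());
    assert(P->emitAnnotations());
  }
  {
    std::unique_ptr<VirtualProfile> V(new DerivedProfile());
    assert(V->emitAnnotations()); // Dispatches to DerivedProfile.
  }
  return Destroyed;
}
```

Option A is enough while SampleProfile has no subclasses; option B becomes necessary once other profilers derive from it and are deleted through a base pointer.
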
> + void printFunctionProfile(raw_ostream &OS, StringRef FName);
> + void dumpFunctionProfile(StringRef FName);
> +
> +protected:
> + typedef DenseMap<uint32_t, uint32_t> BodySampleMap;
> + typedef DenseMap<BasicBlock *, uint32_t> BlockWeightMap;
> +
> + /// \brief Representation of the runtime profile for a function.
> + ///
> + /// This data structure contains the runtime profile for a given
> + /// function. It contains the total number of samples collected
> + /// in the function and a map of samples collected in every statement.
> + struct FunctionProfile {
> + /// \brief Total number of samples collected inside this function.
> + ///
> + /// Samples are cumulative, they include all the samples collected
> + /// inside this function and all its inlined callees.
> + unsigned TotalSamples;
> +
> + // \brief Total number of samples collected at the head of the function.
> + unsigned TotalHeadSamples;
> +
> + /// \brief Map line offsets to collected samples.
> + ///
> + /// Each entry in this map contains the number of samples
> + /// collected at the corresponding line offset. All line locations
> + /// are an offset from the start of the function.
> + BodySampleMap BodySamples;
> +
> + /// \brief Map basic blocks to their computed weights.
> + ///
> + /// The weight of a basic block is defined to be the maximum
> + /// of all the instruction weights in that block.
> + BlockWeightMap BlockWeights;
> + };
> +
> + uint32_t getInstWeight(Instruction &I, unsigned FirstLineno,
> + BodySampleMap &BodySamples);
> + uint32_t computeBlockWeight(BasicBlock *B, unsigned FirstLineno,
> + BodySampleMap &BodySamples);
> +
> + /// \brief Map every function to its associated profile.
> + ///
> + /// The profile of every function executed at runtime is collected
> + /// in the structure FunctionProfile. This maps function objects
> + /// to their corresponding profiles.
> + StringMap<FunctionProfile> Profiles;
> +
> + /// \brief Path name to the file holding the profile data.
> + ///
> + /// The format of this file is defined by each profiler
> + /// independently. If possible, the profiler should have a text
> + /// version of the profile format to be used in constructing test
> + /// cases and debugging.
> + StringRef Filename;
> +};
> +
> +/// \brief Loader class for text-based profiles.
> +///
> +/// This class defines a simple interface to read text files containing
> +/// profiles. It keeps track of line number information and location of
> +/// the file pointer. Users of this class are responsible for actually
> +/// parsing the lines returned by the readLine function.
> +///
> +/// TODO - This does not really belong here. It is a generic text file
> +/// reader. It should be moved to the Support library and made more general.
> +class ExternalProfileTextLoader {
> +public:
> + ExternalProfileTextLoader(StringRef F) : Filename(F) {
> + error_code EC;
> + EC = MemoryBuffer::getFile(Filename, Buffer);
> + if (EC)
> + report_fatal_error("Could not open profile file " + Filename + ": " +
> + EC.message());
> + FP = Buffer->getBufferStart();
> + Lineno = 0;
> + }
> +
> + /// \brief Read a line from the mapped file.
> + StringRef readLine() {
> + size_t Length = 0;
> + const char *start = FP;
> + while (FP != Buffer->getBufferEnd() && *FP != '\n') {
> + Length++;
> + FP++;
> + }
> + if (FP != Buffer->getBufferEnd())
> + FP++;
> + Lineno++;
> + return StringRef(start, Length);
> + }
> +
> + /// \brief Return true, if we've reached EOF.
> + bool atEOF() const { return FP == Buffer->getBufferEnd(); }
> +
> + /// \brief Report a parse error message and stop compilation.
> + void reportParseError(Twine Msg) const {
> + report_fatal_error(Filename + ":" + Twine(Lineno) + ": " + Msg + "\n");
> + }
> +
> +private:
> + /// \brief Memory buffer holding the text file.
> + OwningPtr<MemoryBuffer> Buffer;
> +
> + /// \brief Current position into the memory buffer.
> + const char *FP;
> +
> + /// \brief Current line number.
> + int64_t Lineno;
> +
> + /// \brief Path name of the profile file.
> + StringRef Filename;
> +};
> +
> +/// \brief Sample profile pass.
> +///
> +/// This pass reads profile data from the file specified by
> +/// -sample-profile-file and annotates every affected function with the
> +/// profile information found in that file.
> +class SampleProfileLoader : public FunctionPass {
> +public:
> + // Class identification, replacement for typeinfo
> + static char ID;
> +
> + SampleProfileLoader(StringRef Name = SampleProfileFile)
> + : FunctionPass(ID), Profiler(0), Filename(Name) {
> + initializeSampleProfileLoaderPass(*PassRegistry::getPassRegistry());
> + }
> +
> + virtual bool doInitialization(Module &M);
> +
> + void dump() { Profiler->dump(); }
> +
> + virtual const char *getPassName() const { return "Sample profile pass"; }
> +
> + virtual bool runOnFunction(Function &F);
> +
> + virtual void getAnalysisUsage(AnalysisUsage &AU) const {
> + AU.setPreservesCFG();
> + }
> +
> +protected:
> + /// \brief Profile reader object.
> + OwningPtr<SampleProfile> Profiler;
> +
> + /// \brief Name of the profile file to load.
> + StringRef Filename;
> +};
> +}
> +
> +/// \brief Print the function profile for \p FName on stream \p OS.
> +///
> +/// \param OS Stream to emit the output to.
> +/// \param FName Name of the function to print.
> +void SampleProfile::printFunctionProfile(raw_ostream &OS, StringRef FName) {
> + FunctionProfile FProfile = Profiles[FName];
> + OS << "Function: " << FName << ", " << FProfile.TotalSamples << ", "
> + << FProfile.TotalHeadSamples << ", " << FProfile.BodySamples.size()
> + << " sampled lines\n";
> + for (BodySampleMap::const_iterator SI = FProfile.BodySamples.begin(),
> + SE = FProfile.BodySamples.end();
> + SI != SE; ++SI)
> + OS << "\tline offset: " << SI->first
> + << ", number of samples: " << SI->second << "\n";
> + OS << "\n";
> +}
> +
> +/// \brief Dump the function profile for \p FName.
> +///
> +/// \param FName Name of the function to print.
> +void SampleProfile::dumpFunctionProfile(StringRef FName) {
> + printFunctionProfile(dbgs(), FName);
> +}
> +
> +/// \brief Dump all the function profiles found.
> +void SampleProfile::dump() {
> + for (StringMap<FunctionProfile>::const_iterator I = Profiles.begin(),
> + E = Profiles.end();
> + I != E; ++I)
> + dumpFunctionProfile(I->getKey());
> +}
> +
> +/// \brief Load samples from a text file.
> +///
> +/// The file is divided in two segments:
> +///
> +/// Symbol table (represented with the string "symbol table")
> +/// Number of symbols in the table
> +/// symbol 1
> +/// symbol 2
> +/// ...
> +/// symbol N
> +///
> +/// Function body profiles
> +/// function1:total_samples:total_head_samples:number_of_locations
> +/// location_offset_1: number_of_samples
> +/// location_offset_2: number_of_samples
> +/// ...
> +/// location_offset_N: number_of_samples
> +///
> +/// Function names must be mangled in order for the profile loader to
> +/// match them in the current translation unit.
> +///
> +/// Since this is a flat profile, a function that shows up more than
> +/// once gets all its samples aggregated across all its instances.
> +/// TODO - flat profiles are too imprecise to provide good optimization
> +/// opportunities. Convert them to context-sensitive profile.
> +///
> +/// This textual representation is useful to generate unit tests and
> +/// for debugging purposes, but it should not be used to generate
> +/// profiles for large programs, as the representation is extremely
> +/// inefficient.
> +void SampleProfile::loadText() {
> + ExternalProfileTextLoader Loader(Filename);
> +
> + // Read the symbol table.
> + StringRef Line = Loader.readLine();
> + if (Line != "symbol table")
> + Loader.reportParseError("Expected 'symbol table', found " + Line);
> + int NumSymbols;
> + Line = Loader.readLine();
> + if (Line.getAsInteger(10, NumSymbols))
> + Loader.reportParseError("Expected a number, found " + Line);
> + for (int I = 0; I < NumSymbols; I++) {
> + StringRef FName = Loader.readLine();
> + FunctionProfile &FProfile = Profiles[FName];
> + FProfile.BodySamples.clear();
> + FProfile.TotalSamples = 0;
> + FProfile.TotalHeadSamples = 0;
> + }
> +
> + // Read the profile of each function. Since each function may be
> + // mentioned more than once, and we are collecting flat profiles,
> + // accumulate samples as we parse them.
> + Regex HeadRE("^([^:]+):([0-9]+):([0-9]+):([0-9]+)$");
> + Regex LineSample("^([0-9]+): ([0-9]+)$");
> + while (!Loader.atEOF()) {
> + SmallVector<StringRef, 4> Matches;
> + Line = Loader.readLine();
> + if (!HeadRE.match(Line, &Matches))
> + Loader.reportParseError("Expected 'mangled_name:NUM:NUM:NUM', found " +
> + Line);
> + assert(Matches.size() == 5);
> + StringRef FName = Matches[1];
> + unsigned NumSamples, NumHeadSamples, NumSampledLines;
> + Matches[2].getAsInteger(10, NumSamples);
> + Matches[3].getAsInteger(10, NumHeadSamples);
> + Matches[4].getAsInteger(10, NumSampledLines);
> + FunctionProfile &FProfile = Profiles[FName];
> + FProfile.TotalSamples += NumSamples;
> + FProfile.TotalHeadSamples += NumHeadSamples;
> + BodySampleMap &SampleMap = FProfile.BodySamples;
> + unsigned I;
> + for (I = 0; I < NumSampledLines && !Loader.atEOF(); I++) {
> + Line = Loader.readLine();
> + if (!LineSample.match(Line, &Matches))
> + Loader.reportParseError("Expected 'NUM: NUM', found " + Line);
> + assert(Matches.size() == 3);
> + unsigned LineOffset, NumSamples;
> + Matches[1].getAsInteger(10, LineOffset);
> + Matches[2].getAsInteger(10, NumSamples);
> + SampleMap[LineOffset] += NumSamples;
> + }
> +
> + if (I < NumSampledLines)
> + Loader.reportParseError("Unexpected end of file");
> + }
> +}
> +
> +/// \brief Get the weight for an instruction.
> +///
> +/// The "weight" of an instruction \p Inst is the number of samples
> +/// collected on that instruction at runtime. To retrieve it, we
> +/// need to compute the line number of \p Inst relative to the start of its
> +/// function. We use \p FirstLineno to compute the offset. We then
> +/// look up the samples collected for \p Inst using \p BodySamples.
> +///
> +/// \param Inst Instruction to query.
> +/// \param FirstLineno Line number of the first instruction in the function.
> +/// \param BodySamples Map of relative source line locations to samples.
> +///
> +/// \returns The profiled weight of \p Inst.
> +uint32_t SampleProfile::getInstWeight(Instruction &Inst, unsigned FirstLineno,
> + BodySampleMap &BodySamples) {
> + unsigned LOffset = Inst.getDebugLoc().getLine() - FirstLineno + 1;
> + return BodySamples.lookup(LOffset);
> +}
> +
> +/// \brief Compute the weight of a basic block.
> +///
> +/// The weight of basic block \p B is the maximum weight of all the
> +/// instructions in B.
> +///
> +/// \param B The basic block to query.
> +/// \param FirstLineno The line number for the first line in the
> +/// function holding B.
> +/// \param BodySamples The map containing all the samples collected in that
> +/// function.
> +///
> +/// \returns The computed weight of B.
> +uint32_t SampleProfile::computeBlockWeight(BasicBlock *B, unsigned FirstLineno,
> + BodySampleMap &BodySamples) {
> + // If we've computed B's weight before, return it.
> + Function *F = B->getParent();
> + FunctionProfile &FProfile = Profiles[F->getName()];
> + std::pair<BlockWeightMap::iterator, bool> Entry =
> + FProfile.BlockWeights.insert(std::make_pair(B, 0));
> + if (!Entry.second)
> + return Entry.first->second;
> +
> + // Otherwise, compute and cache B's weight.
> + uint32_t Weight = 0;
> + for (BasicBlock::iterator I = B->begin(), E = B->end(); I != E; ++I) {
> + uint32_t InstWeight = getInstWeight(*I, FirstLineno, BodySamples);
> + if (InstWeight > Weight)
> + Weight = InstWeight;
> + }
> + Entry.first->second = Weight;
> + return Weight;
> +}
> +
> +/// \brief Generate branch weight metadata for all branches in \p F.
> +///
> +/// For every branch instruction B in \p F, we compute the weight of the
> +/// target block for each of the edges out of B. This is the weight
> +/// that we associate with that branch.
> +///
> +/// TODO - This weight assignment will most likely be wrong if the
> +/// target branch has more than two predecessors. This needs to be done
> +/// using some form of flow propagation.
> +///
> +/// Once all the branch weights are computed, we emit the MD_prof
> +/// metadata on B using the computed values.
> +///
> +/// \param F The function to query.
> +bool SampleProfile::emitAnnotations(Function &F) {
> + bool Changed = false;
> + FunctionProfile &FProfile = Profiles[F.getName()];
> + unsigned FirstLineno = inst_begin(F)->getDebugLoc().getLine();
> + MDBuilder MDB(F.getContext());
> +
> + // Clear the block weights cache.
> + FProfile.BlockWeights.clear();
> +
> + // When we find a branch instruction: For each edge E out of the branch,
> + // the weight of E is the weight of the target block.
> + for (Function::iterator I = F.begin(), E = F.end(); I != E; ++I) {
> + BasicBlock *B = I;
> + TerminatorInst *TI = B->getTerminator();
> + if (TI->getNumSuccessors() == 1)
> + continue;
> + if (!isa<BranchInst>(TI) && !isa<SwitchInst>(TI))
> + continue;
> +
> + SmallVector<uint32_t, 4> Weights;
> + unsigned NSuccs = TI->getNumSuccessors();
> + for (unsigned I = 0; I < NSuccs; ++I) {
> + BasicBlock *Succ = TI->getSuccessor(I);
> + uint32_t Weight =
> + computeBlockWeight(Succ, FirstLineno, FProfile.BodySamples);
> + Weights.push_back(Weight);
> + }
> +
> + TI->setMetadata(llvm::LLVMContext::MD_prof,
> + MDB.createBranchWeights(Weights));
> + Changed = true;
> + }
> +
> + return Changed;
> +}
> +
> +char SampleProfileLoader::ID = 0;
> +INITIALIZE_PASS(SampleProfileLoader, "sample-profile", "Sample Profile loader",
> + false, false)
> +
> +bool SampleProfileLoader::runOnFunction(Function &F) {
> + return Profiler->emitAnnotations(F);
> +}
> +
> +bool SampleProfileLoader::doInitialization(Module &M) {
> + Profiler.reset(new SampleProfile(Filename));
> + Profiler->loadText();
> + return true;
> +}
> +
> +FunctionPass *llvm::createSampleProfileLoaderPass() {
> + return new SampleProfileLoader(SampleProfileFile);
> +}
> +
> +FunctionPass *llvm::createSampleProfileLoaderPass(StringRef Name) {
> + return new SampleProfileLoader(Name);
> +}
>
> Modified: llvm/trunk/lib/Transforms/Scalar/Scalar.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/Scalar.cpp?rev=194566&r1=194565&r2=194566&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Transforms/Scalar/Scalar.cpp (original)
> +++ llvm/trunk/lib/Transforms/Scalar/Scalar.cpp Wed Nov 13 06:22:21 2013
> @@ -28,6 +28,7 @@ using namespace llvm;
> /// ScalarOpts library.
> void llvm::initializeScalarOpts(PassRegistry &Registry) {
> initializeADCEPass(Registry);
> + initializeSampleProfileLoaderPass(Registry);
> initializeCodeGenPreparePass(Registry);
> initializeConstantPropagationPass(Registry);
> initializeCorrelatedValuePropagationPass(Registry);
>
> Added: llvm/trunk/test/Transforms/SampleProfile/Inputs/branch.prof
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SampleProfile/Inputs/branch.prof?rev=194566&view=auto
>
> ==============================================================================
> --- llvm/trunk/test/Transforms/SampleProfile/Inputs/branch.prof (added)
> +++ llvm/trunk/test/Transforms/SampleProfile/Inputs/branch.prof Wed Nov 13
> 06:22:21 2013
> @@ -0,0 +1,11 @@
> +symbol table
> +1
> +main
> +main:15680:0:7
> +0: 0
> +4: 0
> +7: 0
> +9: 10226
> +10: 2243
> +16: 0
> +18: 0
>
> Added: llvm/trunk/test/Transforms/SampleProfile/branch.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SampleProfile/branch.ll?rev=194566&view=auto
>
> ==============================================================================
> --- llvm/trunk/test/Transforms/SampleProfile/branch.ll (added)
> +++ llvm/trunk/test/Transforms/SampleProfile/branch.ll Wed Nov 13 06:22:21
> 2013
> @@ -0,0 +1,142 @@
> +; RUN: opt < %s -sample-profile -sample-profile-file=%S/Inputs/branch.prof | opt -analyze -branch-prob | FileCheck %s
> +
> +; Original C++ code for this test case:
> +;
> +; #include <stdio.h>
> +; #include <stdlib.h>
> +;
> +; int main(int argc, char *argv[]) {
> +; if (argc < 2)
> +; return 1;
> +; double result;
> +; int limit = atoi(argv[1]);
> +; if (limit > 100) {
> +; double s = 23.041968;
> +; for (int u = 0; u < limit; u++) {
> +; double x = s;
> +; s = x + 3.049 + (double)u;
> +; s -= s + 3.94 / x * 0.32;
> +; }
> +; result = s;
> +; } else {
> +; result = 0;
> +; }
> +; printf("result is %lf\n", result);
> +; return 0;
> +; }
> +
> +@.str = private unnamed_addr constant [15 x i8] c"result is %lf\0A\00", align 1
> +
> +; Function Attrs: nounwind uwtable
> +define i32 @main(i32 %argc, i8** nocapture readonly %argv) #0 {
> +; CHECK: Printing analysis 'Branch Probability Analysis' for function
> 'main':
> +
> +entry:
> + tail call void @llvm.dbg.value(metadata !{i32 %argc}, i64 0, metadata
> !13), !dbg !27
> + tail call void @llvm.dbg.value(metadata !{i8** %argv}, i64 0, metadata
> !14), !dbg !27
> + %cmp = icmp slt i32 %argc, 2, !dbg !28
> + br i1 %cmp, label %return, label %if.end, !dbg !28
> +; CHECK: edge entry -> return probability is 1 / 2 = 50%
> +; CHECK: edge entry -> if.end probability is 1 / 2 = 50%
> +
> +if.end: ; preds = %entry
> + %arrayidx = getelementptr inbounds i8** %argv, i64 1, !dbg !30
> + %0 = load i8** %arrayidx, align 8, !dbg !30, !tbaa !31
> + %call = tail call i32 @atoi(i8* %0) #4, !dbg !30
> + tail call void @llvm.dbg.value(metadata !{i32 %call}, i64 0, metadata
> !17), !dbg !30
> + %cmp1 = icmp sgt i32 %call, 100, !dbg !35
> + br i1 %cmp1, label %for.body, label %if.end6, !dbg !35
> +; CHECK: edge if.end -> for.body probability is 2243 / 2244 = 99.9554%
> [HOT edge]
> +; CHECK: edge if.end -> if.end6 probability is 1 / 2244 = 0.0445633%
> +
> +for.body: ; preds = %if.end,
> %for.body
> + %u.016 = phi i32 [ %inc, %for.body ], [ 0, %if.end ]
> + %s.015 = phi double [ %sub, %for.body ], [ 0x40370ABE6A337A81, %if.end ]
> + %add = fadd double %s.015, 3.049000e+00, !dbg !36
> + %conv = sitofp i32 %u.016 to double, !dbg !36
> + %add4 = fadd double %add, %conv, !dbg !36
> + tail call void @llvm.dbg.value(metadata !{double %add4}, i64 0,
> metadata !18), !dbg !36
> + %div = fdiv double 3.940000e+00, %s.015, !dbg !37
> + %mul = fmul double %div, 3.200000e-01, !dbg !37
> + %add5 = fadd double %add4, %mul, !dbg !37
> + %sub = fsub double %add4, %add5, !dbg !37
> + tail call void @llvm.dbg.value(metadata !{double %sub}, i64 0, metadata
> !18), !dbg !37
> + %inc = add nsw i32 %u.016, 1, !dbg !38
> + tail call void @llvm.dbg.value(metadata !{i32 %inc}, i64 0, metadata
> !21), !dbg !38
> + %exitcond = icmp eq i32 %inc, %call, !dbg !38
> + br i1 %exitcond, label %if.end6, label %for.body, !dbg !38
> +; CHECK: edge for.body -> if.end6 probability is 1 / 2244 = 0.0445633%
> +; CHECK: edge for.body -> for.body probability is 2243 / 2244 = 99.9554%
> [HOT edge]
> +
> +if.end6: ; preds = %for.body,
> %if.end
> + %result.0 = phi double [ 0.000000e+00, %if.end ], [ %sub, %for.body ]
> + %call7 = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds
> ([15 x i8]* @.str, i64 0, i64 0), double %result.0), !dbg !39
> + br label %return, !dbg !40
> +; CHECK: edge if.end6 -> return probability is 16 / 16 = 100% [HOT edge]
> +
> +return: ; preds = %entry,
> %if.end6
> + %retval.0 = phi i32 [ 0, %if.end6 ], [ 1, %entry ]
> + ret i32 %retval.0, !dbg !41
> +}
> +
> +; Function Attrs: nounwind readonly
> +declare i32 @atoi(i8* nocapture) #1
> +
> +; Function Attrs: nounwind
> +declare i32 @printf(i8* nocapture readonly, ...) #2
> +
> +; Function Attrs: nounwind readnone
> +declare void @llvm.dbg.value(metadata, i64, metadata) #3
> +
> +attributes #0 = { nounwind uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
> +attributes #1 = { nounwind readonly "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
> +attributes #2 = { nounwind "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
> +attributes #3 = { nounwind readnone }
> +attributes #4 = { nounwind readonly }
> +
> +!llvm.dbg.cu = !{!0}
> +!llvm.module.flags = !{!25}
> +!llvm.ident = !{!26}
> +
> +!0 = metadata !{i32 786449, metadata !1, i32 4, metadata !"clang version 3.4 (trunk 192896) (llvm/trunk 192895)", i1 true, metadata !"", i32 0, metadata !2, metadata !2, metadata !3, metadata !2, metadata !2, metadata !""} ; [ DW_TAG_compile_unit ] [./branch.cc] [DW_LANG_C_plus_plus]
> +!1 = metadata !{metadata !"branch.cc", metadata !"."}
> +!2 = metadata !{i32 0}
> +!3 = metadata !{metadata !4}
> +!4 = metadata !{i32 786478, metadata !1, metadata !5, metadata !"main", metadata !"main", metadata !"", i32 4, metadata !6, i1 false, i1 true, i32 0, i32 0, null, i32 256, i1 true, i32 (i32, i8**)* @main, null, null, metadata !12, i32 4} ; [ DW_TAG_subprogram ] [line 4] [def] [main]
> +!5 = metadata !{i32 786473, metadata !1} ; [ DW_TAG_file_type ] [./branch.cc]
> +!6 = metadata !{i32 786453, i32 0, null, metadata !"", i32 0, i64 0, i64 0, i64 0, i32 0, null, metadata !7, i32 0, null, null, null} ; [ DW_TAG_subroutine_type ] [line 0, size 0, align 0, offset 0] [from ]
> +!7 = metadata !{metadata !8, metadata !8, metadata !9}
> +!8 = metadata !{i32 786468, null, null, metadata !"int", i32 0, i64 32, i64 32, i64 0, i32 0, i32 5} ; [ DW_TAG_base_type ] [int] [line 0, size 32, align 32, offset 0, enc DW_ATE_signed]
> +!9 = metadata !{i32 786447, null, null, metadata !"", i32 0, i64 64, i64 64, i64 0, i32 0, metadata !10} ; [ DW_TAG_pointer_type ] [line 0, size 64, align 64, offset 0] [from ]
> +!10 = metadata !{i32 786447, null, null, metadata !"", i32 0, i64 64, i64 64, i64 0, i32 0, metadata !11} ; [ DW_TAG_pointer_type ] [line 0, size 64, align 64, offset 0] [from char]
> +!11 = metadata !{i32 786468, null, null, metadata !"char", i32 0, i64 8, i64 8, i64 0, i32 0, i32 6} ; [ DW_TAG_base_type ] [char] [line 0, size 8, align 8, offset 0, enc DW_ATE_signed_char]
> +!12 = metadata !{metadata !13, metadata !14, metadata !15, metadata !17, metadata !18, metadata !21, metadata !23}
> +!13 = metadata !{i32 786689, metadata !4, metadata !"argc", metadata !5, i32 16777220, metadata !8, i32 0, i32 0} ; [ DW_TAG_arg_variable ] [argc] [line 4]
> +!14 = metadata !{i32 786689, metadata !4, metadata !"argv", metadata !5, i32 33554436, metadata !9, i32 0, i32 0} ; [ DW_TAG_arg_variable ] [argv] [line 4]
> +!15 = metadata !{i32 786688, metadata !4, metadata !"result", metadata !5, i32 7, metadata !16, i32 0, i32 0} ; [ DW_TAG_auto_variable ] [result] [line 7]
> +!16 = metadata !{i32 786468, null, null, metadata !"double", i32 0, i64 64, i64 64, i64 0, i32 0, i32 4} ; [ DW_TAG_base_type ] [double] [line 0, size 64, align 64, offset 0, enc DW_ATE_float]
> +!17 = metadata !{i32 786688, metadata !4, metadata !"limit", metadata !5, i32 8, metadata !8, i32 0, i32 0} ; [ DW_TAG_auto_variable ] [limit] [line 8]
> +!18 = metadata !{i32 786688, metadata !19, metadata !"s", metadata !5, i32 10, metadata !16, i32 0, i32 0} ; [ DW_TAG_auto_variable ] [s] [line 10]
> +!19 = metadata !{i32 786443, metadata !1, metadata !20, i32 9, i32 0, i32 2} ; [ DW_TAG_lexical_block ] [./branch.cc]
> +!20 = metadata !{i32 786443, metadata !1, metadata !4, i32 9, i32 0, i32 1} ; [ DW_TAG_lexical_block ] [./branch.cc]
> +!21 = metadata !{i32 786688, metadata !22, metadata !"u", metadata !5, i32 11, metadata !8, i32 0, i32 0} ; [ DW_TAG_auto_variable ] [u] [line 11]
> +!22 = metadata !{i32 786443, metadata !1, metadata !19, i32 11, i32 0, i32 3} ; [ DW_TAG_lexical_block ] [./branch.cc]
> +!23 = metadata !{i32 786688, metadata !24, metadata !"x", metadata !5, i32 12, metadata !16, i32 0, i32 0} ; [ DW_TAG_auto_variable ] [x] [line 12]
> +!24 = metadata !{i32 786443, metadata !1, metadata !22, i32 11, i32 0, i32 4} ; [ DW_TAG_lexical_block ] [./branch.cc]
> +!25 = metadata !{i32 2, metadata !"Dwarf Version", i32 4}
> +!26 = metadata !{metadata !"clang version 3.4 (trunk 192896) (llvm/trunk 192895)"}
> +!27 = metadata !{i32 4, i32 0, metadata !4, null}
> +!28 = metadata !{i32 5, i32 0, metadata !29, null}
> +!29 = metadata !{i32 786443, metadata !1, metadata !4, i32 5, i32 0, i32 0} ; [ DW_TAG_lexical_block ] [./branch.cc]
> +!30 = metadata !{i32 8, i32 0, metadata !4, null} ; [ DW_TAG_imported_declaration ]
> +!31 = metadata !{metadata !32, metadata !32, i64 0}
> +!32 = metadata !{metadata !"any pointer", metadata !33, i64 0}
> +!33 = metadata !{metadata !"omnipotent char", metadata !34, i64 0}
> +!34 = metadata !{metadata !"Simple C/C++ TBAA"}
> +!35 = metadata !{i32 9, i32 0, metadata !20, null}
> +!36 = metadata !{i32 13, i32 0, metadata !24, null}
> +!37 = metadata !{i32 14, i32 0, metadata !24, null}
> +!38 = metadata !{i32 11, i32 0, metadata !22, null}
> +!39 = metadata !{i32 20, i32 0, metadata !4, null}
> +!40 = metadata !{i32 21, i32 0, metadata !4, null}
> +!41 = metadata !{i32 22, i32 0, metadata !4, null}
>
>
--
Alexey Samsonov, MSK