[llvm] r194566 - SampleProfileLoader pass. Initial setup.

Alexey Samsonov samsonov at google.com
Wed Nov 13 05:15:28 PST 2013


On Wed, Nov 13, 2013 at 4:22 PM, Diego Novillo <dnovillo at google.com> wrote:

> Author: dnovillo
> Date: Wed Nov 13 06:22:21 2013
> New Revision: 194566
>
> URL: http://llvm.org/viewvc/llvm-project?rev=194566&view=rev
> Log:
> SampleProfileLoader pass. Initial setup.
>
> This adds a new scalar pass that reads a file with samples generated
> by 'perf' during runtime. The samples read from the profile are
> incorporated and emitted as IR metadata reflecting that profile.
>
> The profile file is assumed to have been generated by an external
> profile source. The profile information is converted into IR metadata,
> which is later used by the analysis routines to estimate block
> frequencies, edge weights and other related data.
>
> External profile information files have no fixed format; each profiler
> is free to define its own. This includes both the on-disk representation
> of the profile and the kind of profile information stored in the file.
> A common kind of profile is based on sampling (e.g., perf), which
> essentially counts how many times each line of the program has been
> executed during the run.
>
> The SampleProfileLoader pass is organized as a scalar transformation.
> On startup, it reads the file given in -sample-profile-file to
> determine what kind of profile it contains.  This file is assumed to
> contain profile information for the whole application. The profile
> data in the file is read and incorporated into the internal state of
> the corresponding profiler.
>
> To facilitate testing, I've organized the profilers to support two file
> formats: text and native. The native format is whatever on-disk
> representation the profiler wants to support. I think this will mostly
> be bitcode files, but it could be anything the profiler wants to
> support. To do this, every profiler must implement the
> SampleProfile::loadNative() function.
>
> The text format is mostly meant for debugging. Records are separated by
> newlines, but each profiler is free to interpret records as it sees fit.
> Profilers must implement the SampleProfile::loadText() function.
>
> Finally, the pass will call SampleProfile::emitAnnotations() for each
> function in the current translation unit. This function needs to
> translate the loaded profile into IR metadata, which the analyzer will
> later be able to use.
>
> This patch implements the first steps towards the above design. I've
> implemented a sample-based flat profiler. The format of the profile is
> fairly simplistic. Each sampled function contains a list of relative
> line locations (from the start of the function) together with a count
> representing how many samples were collected at that line during
> execution. I generate this profile using perf and a separate converter
> tool.
>
> Currently, I have only implemented a text format for these profiles. I
> am interested in initial feedback to the whole approach before I send
> the other parts of the implementation for review.
>
> This patch implements:
>
> - The SampleProfileLoader pass.
> - The base ExternalProfile class with the core interface.
> - A SampleProfile sub-class using the above interface. The profiler
>   generates branch weight metadata on every branch instruction that
>   matches the profile.
> - A text loader class to assist the implementation of
>   SampleProfile::loadText().
> - Basic unit tests for the pass.
>
> Additionally, the patch uses profile information to compute branch
> weights based on instruction samples.
>
> This patch converts instruction samples into branch weights. It
> does a fairly simplistic conversion:
>
> Given a multi-way branch instruction, it calculates the weight of
> each branch based on the maximum sample count gathered from each
> target basic block.
>
> Note that this assignment of branch weights is somewhat lossy and can be
> misleading. If a basic block has more than one incoming branch, all the
> incoming branches will get the same weight. In reality, it may be that
> only one of them is the most heavily taken branch.
>
> I will adjust this assignment in subsequent patches.
>
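
The conversion described above can be sketched without any LLVM types. This is an illustration only: `Block`, `blockWeight` and `edgeWeights` are hypothetical names that do not appear in the patch, and the offset arithmetic simply mirrors getInstWeight() from the diff.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-ins for the LLVM types used by the patch.
using BodySampleMap = std::map<uint32_t, uint32_t>; // line offset -> samples

struct Block {
  std::vector<uint32_t> InstLines;  // absolute source line of each instruction
  std::vector<const Block *> Succs; // successor blocks
};

// Weight of a block: the maximum sample count over its instructions.
uint32_t blockWeight(const Block &B, uint32_t FirstLineno,
                     const BodySampleMap &Samples) {
  uint32_t Weight = 0;
  for (uint32_t Line : B.InstLines) {
    // Same offset convention as getInstWeight() in the patch.
    auto It = Samples.find(Line - FirstLineno + 1);
    if (It != Samples.end())
      Weight = std::max(Weight, It->second);
  }
  return Weight;
}

// Weight of each outgoing edge: the weight of the block it targets.
std::vector<uint32_t> edgeWeights(const Block &B, uint32_t FirstLineno,
                                  const BodySampleMap &Samples) {
  std::vector<uint32_t> Weights;
  for (const Block *Succ : B.Succs)
    Weights.push_back(blockWeight(*Succ, FirstLineno, Samples));
  return Weights;
}
```

Because an edge weight depends only on its target, two branches that jump to the same block receive identical weights, which is the lossy behaviour the log message warns about.
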
> Added:
>     llvm/trunk/lib/Transforms/Scalar/SampleProfile.cpp
>     llvm/trunk/test/Transforms/SampleProfile/
>     llvm/trunk/test/Transforms/SampleProfile/Inputs/
>     llvm/trunk/test/Transforms/SampleProfile/Inputs/branch.prof
>     llvm/trunk/test/Transforms/SampleProfile/branch.ll
> Modified:
>     llvm/trunk/include/llvm/InitializePasses.h
>     llvm/trunk/include/llvm/Transforms/Scalar.h
>     llvm/trunk/lib/Transforms/Scalar/CMakeLists.txt
>     llvm/trunk/lib/Transforms/Scalar/Scalar.cpp
>
> Modified: llvm/trunk/include/llvm/InitializePasses.h
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/InitializePasses.h?rev=194566&r1=194565&r2=194566&view=diff
>
> ==============================================================================
> --- llvm/trunk/include/llvm/InitializePasses.h (original)
> +++ llvm/trunk/include/llvm/InitializePasses.h Wed Nov 13 06:22:21 2013
> @@ -70,6 +70,7 @@ void initializeAliasDebuggerPass(PassReg
>  void initializeAliasSetPrinterPass(PassRegistry&);
>  void initializeAlwaysInlinerPass(PassRegistry&);
>  void initializeArgPromotionPass(PassRegistry&);
> +void initializeSampleProfileLoaderPass(PassRegistry&);
>  void initializeBarrierNoopPass(PassRegistry&);
>  void initializeBasicAliasAnalysisPass(PassRegistry&);
>  void initializeCallGraphPass(PassRegistry&);
>
> Modified: llvm/trunk/include/llvm/Transforms/Scalar.h
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/Scalar.h?rev=194566&r1=194565&r2=194566&view=diff
>
> ==============================================================================
> --- llvm/trunk/include/llvm/Transforms/Scalar.h (original)
> +++ llvm/trunk/include/llvm/Transforms/Scalar.h Wed Nov 13 06:22:21 2013
> @@ -15,6 +15,8 @@
>  #ifndef LLVM_TRANSFORMS_SCALAR_H
>  #define LLVM_TRANSFORMS_SCALAR_H
>
> +#include "llvm/ADT/StringRef.h"
> +
>  namespace llvm {
>
>  class FunctionPass;
> @@ -355,6 +357,13 @@ FunctionPass *createLowerExpectIntrinsic
>  //
>  FunctionPass *createPartiallyInlineLibCallsPass();
>
>
> +//===----------------------------------------------------------------------===//
> +//
> +// SampleProfilePass - Loads sample profile data from disk and generates
> +// IR metadata to reflect the profile.
> +FunctionPass *createSampleProfileLoaderPass();
> +FunctionPass *createSampleProfileLoaderPass(StringRef Name);
> +
>  } // End llvm namespace
>
>  #endif
>
> Modified: llvm/trunk/lib/Transforms/Scalar/CMakeLists.txt
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/CMakeLists.txt?rev=194566&r1=194565&r2=194566&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Transforms/Scalar/CMakeLists.txt (original)
> +++ llvm/trunk/lib/Transforms/Scalar/CMakeLists.txt Wed Nov 13 06:22:21 2013
> @@ -23,6 +23,7 @@ add_llvm_library(LLVMScalarOpts
>    PartiallyInlineLibCalls.cpp
>    Reassociate.cpp
>    Reg2Mem.cpp
> +  SampleProfile.cpp
>    SCCP.cpp
>    SROA.cpp
>    Scalar.cpp
>
> Added: llvm/trunk/lib/Transforms/Scalar/SampleProfile.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/SampleProfile.cpp?rev=194566&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Transforms/Scalar/SampleProfile.cpp (added)
> +++ llvm/trunk/lib/Transforms/Scalar/SampleProfile.cpp Wed Nov 13 06:22:21 2013
> @@ -0,0 +1,479 @@
> +//===- SampleProfile.cpp - Incorporate sample profiles into the IR --------===//
> +//
> +//                      The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
>
> +//===----------------------------------------------------------------------===//
> +//
> +// This file implements the SampleProfileLoader transformation. This pass
> +// reads a profile file generated by a sampling profiler (e.g. Linux Perf -
> +// http://perf.wiki.kernel.org/) and generates IR metadata to reflect the
> +// profile information in the given profile.
> +//
> +// This pass generates branch weight annotations on the IR:
> +//
> +// - prof: Represents branch weights. This annotation is added to branches
> +//      to indicate the weights of each edge coming out of the branch.
> +//      The weight of each edge is the weight of the target block for
> +//      that edge. The weight of a block B is computed as the maximum
> +//      number of samples found in B.
> +//
>
> +//===----------------------------------------------------------------------===//
> +
> +#define DEBUG_TYPE "sample-profile"
> +
> +#include "llvm/ADT/DenseMap.h"
> +#include "llvm/ADT/OwningPtr.h"
> +#include "llvm/ADT/StringMap.h"
> +#include "llvm/ADT/StringRef.h"
> +#include "llvm/DebugInfo/DIContext.h"
> +#include "llvm/IR/Constants.h"
> +#include "llvm/IR/Function.h"
> +#include "llvm/IR/Instructions.h"
> +#include "llvm/IR/LLVMContext.h"
> +#include "llvm/IR/Metadata.h"
> +#include "llvm/IR/MDBuilder.h"
> +#include "llvm/IR/Module.h"
> +#include "llvm/Pass.h"
> +#include "llvm/Support/CommandLine.h"
> +#include "llvm/Support/Debug.h"
> +#include "llvm/Support/InstIterator.h"
> +#include "llvm/Support/MemoryBuffer.h"
> +#include "llvm/Support/Regex.h"
> +#include "llvm/Support/raw_ostream.h"
> +#include "llvm/Transforms/Scalar.h"
> +
> +using namespace llvm;
> +
> +// Command line option to specify the file to read samples from. This is
> +// mainly used for debugging.
> +static cl::opt<std::string> SampleProfileFile(
> +    "sample-profile-file", cl::init(""), cl::value_desc("filename"),
> +    cl::desc("Profile file loaded by -sample-profile"), cl::Hidden);
> +
> +namespace {
> +/// \brief Sample-based profile reader.
> +///
> +/// Each profile contains sample counts for all the functions
> +/// executed. Inside each function, statements are annotated with the
> +/// collected samples on all the instructions associated with that
> +/// statement.
> +///
> +/// For this to produce meaningful data, the program needs to be
> +/// compiled with some debug information (at minimum, line numbers:
> +/// -gline-tables-only). Otherwise, it will be impossible to match IR
> +/// instructions to the line numbers collected by the profiler.
> +///
> +/// From the profile file, we are interested in collecting the
> +/// following information:
> +///
> +/// * A list of functions included in the profile (mangled names).
> +///
> +/// * For each function F:
> +///   1. The total number of samples collected in F.
> +///
> +///   2. The samples collected at each line in F. To provide some
> +///      protection against source code shuffling, line numbers should
> +///      be relative to the start of the function.
> +class SampleProfile {
> +public:
> +  SampleProfile(StringRef F) : Profiles(0), Filename(F) {}
> +
> +  virtual void dump();
> +  virtual void loadText();
> +  virtual void loadNative() { llvm_unreachable("not implemented"); }
> +  virtual bool emitAnnotations(Function &F);
>

I've removed "virtual" from the function declarations in r194568; otherwise
Clang produced:

llvm/lib/Transforms/Scalar/SampleProfile.cpp:80:7: error:
'<anonymous>::SampleProfile' has virtual functions but non-virtual
destructor [-Werror,-Wnon-virtual-dtor]
class SampleProfile {
      ^
In file included from llvm/lib/Transforms/Scalar/SampleProfile.cpp:28:
llvm/include/llvm/ADT/OwningPtr.h:45:5: error: delete called on
'<anonymous>::SampleProfile' that has virtual functions but non-virtual
destructor [-Werror,-Wdelete-non-virtual-dtor]
    delete Ptr;
    ^
llvm/lib/Transforms/Scalar/SampleProfile.cpp:212:3: note: in instantiation
of member function
'llvm::OwningPtr<<anonymous>::SampleProfile>::~OwningPtr' requested here
  SampleProfileLoader(StringRef Name = SampleProfileFile)
  ^
In file included from llvm/lib/Transforms/Scalar/SampleProfile.cpp:28:
llvm/include/llvm/ADT/OwningPtr.h:55:5: error: delete called on
'<anonymous>::SampleProfile' that has virtual functions but non-virtual
destructor [-Werror,-Wdelete-non-virtual-dtor]
    delete Tmp;
    ^
llvm/lib/Transforms/Scalar/SampleProfile.cpp:468:12: note: in instantiation
of member function 'llvm::OwningPtr<<anonymous>::SampleProfile>::reset'
requested here
  Profiler.reset(new SampleProfile(Filename));
           ^
3 errors generated.
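
For reference, a minimal sketch of the two usual ways to silence these diagnostics. The class names are hypothetical, and `std::unique_ptr` stands in for `OwningPtr` on the assumption that a C++11 compiler is available; neither fragment is taken from the patch or from r194568.

```cpp
#include <memory>
#include <string>
#include <utility>

// Option 1 (the shape of the r194568 fix): no virtual functions at all,
// so deleting through the static type is well defined.
class PlainProfile {
public:
  explicit PlainProfile(std::string F) : Filename(std::move(F)) {}
  std::string describe() const { return "plain:" + Filename; }

private:
  std::string Filename;
};

// Option 2: keep the virtual interface and give the base class a
// virtual destructor, so deleting through a base pointer is safe.
class ProfileBase {
public:
  virtual ~ProfileBase() = default;
  virtual std::string describe() const = 0;
};

class TextProfile : public ProfileBase {
public:
  explicit TextProfile(std::string F) : Filename(std::move(F)) {}
  std::string describe() const override { return "text:" + Filename; }

private:
  std::string Filename;
};

// Owns a derived object through a base-class smart pointer; the virtual
// destructor makes the implicit delete at scope exit correct.
std::string describeThroughBase() {
  std::unique_ptr<ProfileBase> P(new TextProfile("branch.prof"));
  return P->describe();
}
```

Option 2 would be the one to pick once other profilers actually subclass the reader; until then, dropping "virtual" is the smaller change.
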



> +  void printFunctionProfile(raw_ostream &OS, StringRef FName);
> +  void dumpFunctionProfile(StringRef FName);
> +
> +protected:
> +  typedef DenseMap<uint32_t, uint32_t> BodySampleMap;
> +  typedef DenseMap<BasicBlock *, uint32_t> BlockWeightMap;
> +
> +  /// \brief Representation of the runtime profile for a function.
> +  ///
> +  /// This data structure contains the runtime profile for a given
> +  /// function. It contains the total number of samples collected
> +  /// in the function and a map of samples collected in every statement.
> +  struct FunctionProfile {
> +    /// \brief Total number of samples collected inside this function.
> +    ///
> +    /// Samples are cumulative, they include all the samples collected
> +    /// inside this function and all its inlined callees.
> +    unsigned TotalSamples;
> +
> +    // \brief Total number of samples collected at the head of the function.
> +    unsigned TotalHeadSamples;
> +
> +    /// \brief Map line offsets to collected samples.
> +    ///
> +    /// Each entry in this map contains the number of samples
> +    /// collected at the corresponding line offset. All line locations
> +    /// are an offset from the start of the function.
> +    BodySampleMap BodySamples;
> +
> +    /// \brief Map basic blocks to their computed weights.
> +    ///
> +    /// The weight of a basic block is defined to be the maximum
> +    /// of all the instruction weights in that block.
> +    BlockWeightMap BlockWeights;
> +  };
> +
> +  uint32_t getInstWeight(Instruction &I, unsigned FirstLineno,
> +                         BodySampleMap &BodySamples);
> +  uint32_t computeBlockWeight(BasicBlock *B, unsigned FirstLineno,
> +                              BodySampleMap &BodySamples);
> +
> +  /// \brief Map every function to its associated profile.
> +  ///
> +  /// The profile of every function executed at runtime is collected
> +  /// in the structure FunctionProfile. This maps function objects
> +  /// to their corresponding profiles.
> +  StringMap<FunctionProfile> Profiles;
> +
> +  /// \brief Path name to the file holding the profile data.
> +  ///
> +  /// The format of this file is defined by each profiler
> +  /// independently. If possible, the profiler should have a text
> +  /// version of the profile format to be used in constructing test
> +  /// cases and debugging.
> +  StringRef Filename;
> +};
> +
> +/// \brief Loader class for text-based profiles.
> +///
> +/// This class defines a simple interface to read text files containing
> +/// profiles. It keeps track of line number information and location of
> +/// the file pointer. Users of this class are responsible for actually
> +/// parsing the lines returned by the readLine function.
> +///
> +/// TODO - This does not really belong here. It is a generic text file
> +/// reader. It should be moved to the Support library and made more general.
> +class ExternalProfileTextLoader {
> +public:
> +  ExternalProfileTextLoader(StringRef F) : Filename(F) {
> +    error_code EC;
> +    EC = MemoryBuffer::getFile(Filename, Buffer);
> +    if (EC)
> +      report_fatal_error("Could not open profile file " + Filename + ": " +
> +                         EC.message());
> +    FP = Buffer->getBufferStart();
> +    Lineno = 0;
> +  }
> +
> +  /// \brief Read a line from the mapped file.
> +  StringRef readLine() {
> +    size_t Length = 0;
> +    const char *start = FP;
> +    while (FP != Buffer->getBufferEnd() && *FP != '\n') {
> +      Length++;
> +      FP++;
> +    }
> +    if (FP != Buffer->getBufferEnd())
> +      FP++;
> +    Lineno++;
> +    return StringRef(start, Length);
> +  }
> +
> +  /// \brief Return true, if we've reached EOF.
> +  bool atEOF() const { return FP == Buffer->getBufferEnd(); }
> +
> +  /// \brief Report a parse error message and stop compilation.
> +  void reportParseError(Twine Msg) const {
> +    report_fatal_error(Filename + ":" + Twine(Lineno) + ": " + Msg + "\n");
> +  }
> +
> +private:
> +  /// \brief Memory buffer holding the text file.
> +  OwningPtr<MemoryBuffer> Buffer;
> +
> +  /// \brief Current position into the memory buffer.
> +  const char *FP;
> +
> +  /// \brief Current line number.
> +  int64_t Lineno;
> +
> +  /// \brief Path name to the profile file.
> +  StringRef Filename;
> +};
> +
> +/// \brief Sample profile pass.
> +///
> +/// This pass reads profile data from the file specified by
> +/// -sample-profile-file and annotates every affected function with the
> +/// profile information found in that file.
> +class SampleProfileLoader : public FunctionPass {
> +public:
> +  // Class identification, replacement for typeinfo
> +  static char ID;
> +
> +  SampleProfileLoader(StringRef Name = SampleProfileFile)
> +      : FunctionPass(ID), Profiler(0), Filename(Name) {
> +    initializeSampleProfileLoaderPass(*PassRegistry::getPassRegistry());
> +  }
> +
> +  virtual bool doInitialization(Module &M);
> +
> +  void dump() { Profiler->dump(); }
> +
> +  virtual const char *getPassName() const { return "Sample profile pass"; }
> +
> +  virtual bool runOnFunction(Function &F);
> +
> +  virtual void getAnalysisUsage(AnalysisUsage &AU) const {
> +    AU.setPreservesCFG();
> +  }
> +
> +protected:
> +  /// \brief Profile reader object.
> +  OwningPtr<SampleProfile> Profiler;
> +
> +  /// \brief Name of the profile file to load.
> +  StringRef Filename;
> +};
> +}
> +
> +/// \brief Print the function profile for \p FName on stream \p OS.
> +///
> +/// \param OS Stream to emit the output to.
> +/// \param FName Name of the function to print.
> +void SampleProfile::printFunctionProfile(raw_ostream &OS, StringRef FName) {
> +  FunctionProfile FProfile = Profiles[FName];
> +  OS << "Function: " << FName << ", " << FProfile.TotalSamples << ", "
> +     << FProfile.TotalHeadSamples << ", " << FProfile.BodySamples.size()
> +     << " sampled lines\n";
> +  for (BodySampleMap::const_iterator SI = FProfile.BodySamples.begin(),
> +                                     SE = FProfile.BodySamples.end();
> +       SI != SE; ++SI)
> +    OS << "\tline offset: " << SI->first
> +       << ", number of samples: " << SI->second << "\n";
> +  OS << "\n";
> +}
> +
> +/// \brief Dump the function profile for \p FName.
> +///
> +/// \param FName Name of the function to print.
> +void SampleProfile::dumpFunctionProfile(StringRef FName) {
> +  printFunctionProfile(dbgs(), FName);
> +}
> +
> +/// \brief Dump all the function profiles found.
> +void SampleProfile::dump() {
> +  for (StringMap<FunctionProfile>::const_iterator I = Profiles.begin(),
> +                                                  E = Profiles.end();
> +       I != E; ++I)
> +    dumpFunctionProfile(I->getKey());
> +}
> +
> +/// \brief Load samples from a text file.
> +///
> +/// The file is divided in two segments:
> +///
> +/// Symbol table (represented with the string "symbol table")
> +///    Number of symbols in the table
> +///    symbol 1
> +///    symbol 2
> +///    ...
> +///    symbol N
> +///
> +/// Function body profiles
> +///    function1:total_samples:total_head_samples:number_of_locations
> +///    location_offset_1: number_of_samples
> +///    location_offset_2: number_of_samples
> +///    ...
> +///    location_offset_N: number_of_samples
> +///
> +/// Function names must be mangled in order for the profile loader to
> +/// match them in the current translation unit.
> +///
> +/// Since this is a flat profile, a function that shows up more than
> +/// once gets all its samples aggregated across all its instances.
> +/// TODO - flat profiles are too imprecise to provide good optimization
> +/// opportunities. Convert them to context-sensitive profile.
> +///
> +/// This textual representation is useful to generate unit tests and
> +/// for debugging purposes, but it should not be used to generate
> +/// profiles for large programs, as the representation is extremely
> +/// inefficient.
> +void SampleProfile::loadText() {
> +  ExternalProfileTextLoader Loader(Filename);
> +
> +  // Read the symbol table.
> +  StringRef Line = Loader.readLine();
> +  if (Line != "symbol table")
> +    Loader.reportParseError("Expected 'symbol table', found " + Line);
> +  int NumSymbols;
> +  Line = Loader.readLine();
> +  if (Line.getAsInteger(10, NumSymbols))
> +    Loader.reportParseError("Expected a number, found " + Line);
> +  for (int I = 0; I < NumSymbols; I++) {
> +    StringRef FName = Loader.readLine();
> +    FunctionProfile &FProfile = Profiles[FName];
> +    FProfile.BodySamples.clear();
> +    FProfile.TotalSamples = 0;
> +    FProfile.TotalHeadSamples = 0;
> +  }
> +
> +  // Read the profile of each function. Since each function may be
> +  // mentioned more than once, and we are collecting flat profiles,
> +  // accumulate samples as we parse them.
> +  Regex HeadRE("^([^:]+):([0-9]+):([0-9]+):([0-9]+)$");
> +  Regex LineSample("^([0-9]+): ([0-9]+)$");
> +  while (!Loader.atEOF()) {
> +    SmallVector<StringRef, 4> Matches;
> +    Line = Loader.readLine();
> +    if (!HeadRE.match(Line, &Matches))
> +      Loader.reportParseError("Expected 'mangled_name:NUM:NUM:NUM', found " +
> +                              Line);
> +    assert(Matches.size() == 5);
> +    StringRef FName = Matches[1];
> +    unsigned NumSamples, NumHeadSamples, NumSampledLines;
> +    Matches[2].getAsInteger(10, NumSamples);
> +    Matches[3].getAsInteger(10, NumHeadSamples);
> +    Matches[4].getAsInteger(10, NumSampledLines);
> +    FunctionProfile &FProfile = Profiles[FName];
> +    FProfile.TotalSamples += NumSamples;
> +    FProfile.TotalHeadSamples += NumHeadSamples;
> +    BodySampleMap &SampleMap = FProfile.BodySamples;
> +    unsigned I;
> +    for (I = 0; I < NumSampledLines && !Loader.atEOF(); I++) {
> +      Line = Loader.readLine();
> +      if (!LineSample.match(Line, &Matches))
> +        Loader.reportParseError("Expected 'NUM: NUM', found " + Line);
> +      assert(Matches.size() == 3);
> +      unsigned LineOffset, NumSamples;
> +      Matches[1].getAsInteger(10, LineOffset);
> +      Matches[2].getAsInteger(10, NumSamples);
> +      SampleMap[LineOffset] += NumSamples;
> +    }
> +
> +    if (I < NumSampledLines)
> +      Loader.reportParseError("Unexpected end of file");
> +  }
> +}
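
The text format handled by loadText() can be exercised outside LLVM with a small standalone parser. The following is a sketch only, assuming standard-library stand-ins (`std::regex`, `std::map`, exceptions) in place of `llvm::Regex`, `StringMap` and `report_fatal_error`; `parseTextProfile` is a hypothetical name.

```cpp
#include <cstdint>
#include <map>
#include <regex>
#include <sstream>
#include <stdexcept>
#include <string>

struct FunctionProfile {
  unsigned long TotalSamples = 0;
  unsigned long TotalHeadSamples = 0;
  std::map<uint32_t, unsigned long> BodySamples; // line offset -> samples
};

// Parse a flat text profile: a symbol table followed by per-function records.
std::map<std::string, FunctionProfile>
parseTextProfile(const std::string &Text) {
  std::istringstream In(Text);
  std::string Line;
  std::map<std::string, FunctionProfile> Profiles;

  // Symbol table: header line, symbol count, then one name per line.
  if (!std::getline(In, Line) || Line != "symbol table")
    throw std::runtime_error("expected 'symbol table'");
  if (!std::getline(In, Line))
    throw std::runtime_error("expected a symbol count");
  int NumSymbols = std::stoi(Line);
  for (int I = 0; I < NumSymbols; ++I) {
    if (!std::getline(In, Line))
      throw std::runtime_error("unexpected end of file");
    Profiles[Line]; // create an empty profile for this symbol
  }

  // Function records; samples accumulate if a function appears twice.
  const std::regex HeadRE("^([^:]+):([0-9]+):([0-9]+):([0-9]+)$");
  const std::regex SampleRE("^([0-9]+): ([0-9]+)$");
  std::smatch M;
  while (std::getline(In, Line)) {
    if (!std::regex_match(Line, M, HeadRE))
      throw std::runtime_error("expected 'mangled_name:NUM:NUM:NUM'");
    FunctionProfile &FP = Profiles[M[1].str()];
    FP.TotalSamples += std::stoul(M[2].str());
    FP.TotalHeadSamples += std::stoul(M[3].str());
    unsigned long NumLines = std::stoul(M[4].str());
    for (unsigned long I = 0; I < NumLines; ++I) {
      if (!std::getline(In, Line))
        throw std::runtime_error("unexpected end of file");
      if (!std::regex_match(Line, M, SampleRE))
        throw std::runtime_error("expected 'NUM: NUM'");
      uint32_t Offset = static_cast<uint32_t>(std::stoul(M[1].str()));
      FP.BodySamples[Offset] += std::stoul(M[2].str());
    }
  }
  return Profiles;
}
```

Feeding it the contents of Inputs/branch.prof from this commit should yield a single profile for main with seven sampled lines.
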
> +
> +/// \brief Get the weight for an instruction.
> +///
> +/// The "weight" of an instruction \p Inst is the number of samples
> +/// collected on that instruction at runtime. To retrieve it, we
> +/// need to compute the line number of \p Inst relative to the start of its
> +/// function. We use \p FirstLineno to compute the offset. We then
> +/// look up the samples collected for \p Inst using \p BodySamples.
> +///
> +/// \param Inst Instruction to query.
> +/// \param FirstLineno Line number of the first instruction in the function.
> +/// \param BodySamples Map of relative source line locations to samples.
> +///
> +/// \returns The profiled weight of I.
> +uint32_t SampleProfile::getInstWeight(Instruction &Inst, unsigned FirstLineno,
> +                                      BodySampleMap &BodySamples) {
> +  unsigned LOffset = Inst.getDebugLoc().getLine() - FirstLineno + 1;
> +  return BodySamples.lookup(LOffset);
> +}
> +
> +/// \brief Compute the weight of a basic block.
> +///
> +/// The weight of basic block \p B is the maximum weight of all the
> +/// instructions in B.
> +///
> +/// \param B The basic block to query.
> +/// \param FirstLineno The line number for the first line in the
> +///     function holding B.
> +/// \param BodySamples The map containing all the samples collected in that
> +///     function.
> +///
> +/// \returns The computed weight of B.
> +uint32_t SampleProfile::computeBlockWeight(BasicBlock *B, unsigned FirstLineno,
> +                                           BodySampleMap &BodySamples) {
> +  // If we've computed B's weight before, return it.
> +  Function *F = B->getParent();
> +  FunctionProfile &FProfile = Profiles[F->getName()];
> +  std::pair<BlockWeightMap::iterator, bool> Entry =
> +      FProfile.BlockWeights.insert(std::make_pair(B, 0));
> +  if (!Entry.second)
> +    return Entry.first->second;
> +
> +  // Otherwise, compute and cache B's weight.
> +  uint32_t Weight = 0;
> +  for (BasicBlock::iterator I = B->begin(), E = B->end(); I != E; ++I) {
> +    uint32_t InstWeight = getInstWeight(*I, FirstLineno, BodySamples);
> +    if (InstWeight > Weight)
> +      Weight = InstWeight;
> +  }
> +  Entry.first->second = Weight;
> +  return Weight;
> +}
> +
> +/// \brief Generate branch weight metadata for all branches in \p F.
> +///
> +/// For every branch instruction B in \p F, we compute the weight of the
> +/// target block for each of the edges out of B. This is the weight
> +/// that we associate with that branch.
> +///
> +/// TODO - This weight assignment will most likely be wrong if the
> +/// target branch has more than two predecessors. This needs to be done
> +/// using some form of flow propagation.
> +///
> +/// Once all the branch weights are computed, we emit the MD_prof
> +/// metadata on B using the computed values.
> +///
> +/// \param F The function to query.
> +bool SampleProfile::emitAnnotations(Function &F) {
> +  bool Changed = false;
> +  FunctionProfile &FProfile = Profiles[F.getName()];
> +  unsigned FirstLineno = inst_begin(F)->getDebugLoc().getLine();
> +  MDBuilder MDB(F.getContext());
> +
> +  // Clear the block weights cache.
> +  FProfile.BlockWeights.clear();
> +
> +  // When we find a branch instruction: For each edge E out of the branch,
> +  // the weight of E is the weight of the target block.
> +  for (Function::iterator I = F.begin(), E = F.end(); I != E; ++I) {
> +    BasicBlock *B = I;
> +    TerminatorInst *TI = B->getTerminator();
> +    if (TI->getNumSuccessors() == 1)
> +      continue;
> +    if (!isa<BranchInst>(TI) && !isa<SwitchInst>(TI))
> +      continue;
> +
> +    SmallVector<uint32_t, 4> Weights;
> +    unsigned NSuccs = TI->getNumSuccessors();
> +    for (unsigned I = 0; I < NSuccs; ++I) {
> +      BasicBlock *Succ = TI->getSuccessor(I);
> +      uint32_t Weight =
> +          computeBlockWeight(Succ, FirstLineno, FProfile.BodySamples);
> +      Weights.push_back(Weight);
> +    }
> +
> +    TI->setMetadata(llvm::LLVMContext::MD_prof,
> +                    MDB.createBranchWeights(Weights));
> +    Changed = true;
> +  }
> +
> +  return Changed;
> +}
> +
> +char SampleProfileLoader::ID = 0;
> +INITIALIZE_PASS(SampleProfileLoader, "sample-profile", "Sample Profile loader",
> +                false, false)
> +
> +bool SampleProfileLoader::runOnFunction(Function &F) {
> +  return Profiler->emitAnnotations(F);
> +}
> +
> +bool SampleProfileLoader::doInitialization(Module &M) {
> +  Profiler.reset(new SampleProfile(Filename));
> +  Profiler->loadText();
> +  return true;
> +}
> +
> +FunctionPass *llvm::createSampleProfileLoaderPass() {
> +  return new SampleProfileLoader(SampleProfileFile);
> +}
> +
> +FunctionPass *llvm::createSampleProfileLoaderPass(StringRef Name) {
> +  return new SampleProfileLoader(Name);
> +}
>
> Modified: llvm/trunk/lib/Transforms/Scalar/Scalar.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/Scalar.cpp?rev=194566&r1=194565&r2=194566&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Transforms/Scalar/Scalar.cpp (original)
> +++ llvm/trunk/lib/Transforms/Scalar/Scalar.cpp Wed Nov 13 06:22:21 2013
> @@ -28,6 +28,7 @@ using namespace llvm;
>  /// ScalarOpts library.
>  void llvm::initializeScalarOpts(PassRegistry &Registry) {
>    initializeADCEPass(Registry);
> +  initializeSampleProfileLoaderPass(Registry);
>    initializeCodeGenPreparePass(Registry);
>    initializeConstantPropagationPass(Registry);
>    initializeCorrelatedValuePropagationPass(Registry);
>
> Added: llvm/trunk/test/Transforms/SampleProfile/Inputs/branch.prof
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SampleProfile/Inputs/branch.prof?rev=194566&view=auto
>
> ==============================================================================
> --- llvm/trunk/test/Transforms/SampleProfile/Inputs/branch.prof (added)
> +++ llvm/trunk/test/Transforms/SampleProfile/Inputs/branch.prof Wed Nov 13 06:22:21 2013
> @@ -0,0 +1,11 @@
> +symbol table
> +1
> +main
> +main:15680:0:7
> +0: 0
> +4: 0
> +7: 0
> +9: 10226
> +10: 2243
> +16: 0
> +18: 0
>
> Added: llvm/trunk/test/Transforms/SampleProfile/branch.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SampleProfile/branch.ll?rev=194566&view=auto
>
> ==============================================================================
> --- llvm/trunk/test/Transforms/SampleProfile/branch.ll (added)
> +++ llvm/trunk/test/Transforms/SampleProfile/branch.ll Wed Nov 13 06:22:21 2013
> @@ -0,0 +1,142 @@
> +; RUN: opt < %s -sample-profile -sample-profile-file=%S/Inputs/branch.prof | opt -analyze -branch-prob | FileCheck %s
> +
> +; Original C++ code for this test case:
> +;
> +; #include <stdio.h>
> +; #include <stdlib.h>
> +;
> +; int main(int argc, char *argv[]) {
> +;   if (argc < 2)
> +;     return 1;
> +;   double result;
> +;   int limit = atoi(argv[1]);
> +;   if (limit > 100) {
> +;     double s = 23.041968;
> +;     for (int u = 0; u < limit; u++) {
> +;       double x = s;
> +;       s = x + 3.049 + (double)u;
> +;       s -= s + 3.94 / x * 0.32;
> +;     }
> +;     result = s;
> +;   } else {
> +;     result = 0;
> +;   }
> +;   printf("result is %lf\n", result);
> +;   return 0;
> +; }
> +
> +@.str = private unnamed_addr constant [15 x i8] c"result is %lf\0A\00", align 1
> +
> +; Function Attrs: nounwind uwtable
> +define i32 @main(i32 %argc, i8** nocapture readonly %argv) #0 {
> +; CHECK: Printing analysis 'Branch Probability Analysis' for function 'main':
> +
> +entry:
> +  tail call void @llvm.dbg.value(metadata !{i32 %argc}, i64 0, metadata !13), !dbg !27
> +  tail call void @llvm.dbg.value(metadata !{i8** %argv}, i64 0, metadata !14), !dbg !27
> +  %cmp = icmp slt i32 %argc, 2, !dbg !28
> +  br i1 %cmp, label %return, label %if.end, !dbg !28
> +; CHECK: edge entry -> return probability is 1 / 2 = 50%
> +; CHECK: edge entry -> if.end probability is 1 / 2 = 50%
> +
> +if.end:                                           ; preds = %entry
> +  %arrayidx = getelementptr inbounds i8** %argv, i64 1, !dbg !30
> +  %0 = load i8** %arrayidx, align 8, !dbg !30, !tbaa !31
> +  %call = tail call i32 @atoi(i8* %0) #4, !dbg !30
> +  tail call void @llvm.dbg.value(metadata !{i32 %call}, i64 0, metadata !17), !dbg !30
> +  %cmp1 = icmp sgt i32 %call, 100, !dbg !35
> +  br i1 %cmp1, label %for.body, label %if.end6, !dbg !35
> +; CHECK: edge if.end -> for.body probability is 2243 / 2244 = 99.9554% [HOT edge]
> +; CHECK: edge if.end -> if.end6 probability is 1 / 2244 = 0.0445633%
> +
> +for.body:                                         ; preds = %if.end, %for.body
> +  %u.016 = phi i32 [ %inc, %for.body ], [ 0, %if.end ]
> +  %s.015 = phi double [ %sub, %for.body ], [ 0x40370ABE6A337A81, %if.end ]
> +  %add = fadd double %s.015, 3.049000e+00, !dbg !36
> +  %conv = sitofp i32 %u.016 to double, !dbg !36
> +  %add4 = fadd double %add, %conv, !dbg !36
> +  tail call void @llvm.dbg.value(metadata !{double %add4}, i64 0, metadata !18), !dbg !36
> +  %div = fdiv double 3.940000e+00, %s.015, !dbg !37
> +  %mul = fmul double %div, 3.200000e-01, !dbg !37
> +  %add5 = fadd double %add4, %mul, !dbg !37
> +  %sub = fsub double %add4, %add5, !dbg !37
> +  tail call void @llvm.dbg.value(metadata !{double %sub}, i64 0, metadata !18), !dbg !37
> +  %inc = add nsw i32 %u.016, 1, !dbg !38
> +  tail call void @llvm.dbg.value(metadata !{i32 %inc}, i64 0, metadata !21), !dbg !38
> +  %exitcond = icmp eq i32 %inc, %call, !dbg !38
> +  br i1 %exitcond, label %if.end6, label %for.body, !dbg !38
> +; CHECK: edge for.body -> if.end6 probability is 1 / 2244 = 0.0445633%
> +; CHECK: edge for.body -> for.body probability is 2243 / 2244 = 99.9554% [HOT edge]
> +
> +if.end6:                                          ; preds = %for.body, %if.end
> +  %result.0 = phi double [ 0.000000e+00, %if.end ], [ %sub, %for.body ]
> +  %call7 = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([15 x i8]* @.str, i64 0, i64 0), double %result.0), !dbg !39
> +  br label %return, !dbg !40
> +; CHECK: edge if.end6 -> return probability is 16 / 16 = 100% [HOT edge]
> +
> +return:                                           ; preds = %entry, %if.end6
> +  %retval.0 = phi i32 [ 0, %if.end6 ], [ 1, %entry ]
> +  ret i32 %retval.0, !dbg !41
> +}
> +
> +; Function Attrs: nounwind readonly
> +declare i32 @atoi(i8* nocapture) #1
> +
> +; Function Attrs: nounwind
> +declare i32 @printf(i8* nocapture readonly, ...) #2
> +
> +; Function Attrs: nounwind readnone
> +declare void @llvm.dbg.value(metadata, i64, metadata) #3
> +
> +attributes #0 = { nounwind uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
> +attributes #1 = { nounwind readonly "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
> +attributes #2 = { nounwind "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
> +attributes #3 = { nounwind readnone }
> +attributes #4 = { nounwind readonly }
> +
> +!llvm.dbg.cu = !{!0}
> +!llvm.module.flags = !{!25}
> +!llvm.ident = !{!26}
> +
> +!0 = metadata !{i32 786449, metadata !1, i32 4, metadata !"clang version 3.4 (trunk 192896) (llvm/trunk 192895)", i1 true, metadata !"", i32 0, metadata !2, metadata !2, metadata !3, metadata !2, metadata !2, metadata !""} ; [ DW_TAG_compile_unit ] [./branch.cc] [DW_LANG_C_plus_plus]
> +!1 = metadata !{metadata !"branch.cc", metadata !"."}
> +!2 = metadata !{i32 0}
> +!3 = metadata !{metadata !4}
> +!4 = metadata !{i32 786478, metadata !1, metadata !5, metadata !"main", metadata !"main", metadata !"", i32 4, metadata !6, i1 false, i1 true, i32 0, i32 0, null, i32 256, i1 true, i32 (i32, i8**)* @main, null, null, metadata !12, i32 4} ; [ DW_TAG_subprogram ] [line 4] [def] [main]
> +!5 = metadata !{i32 786473, metadata !1}          ; [ DW_TAG_file_type ] [./branch.cc]
> +!6 = metadata !{i32 786453, i32 0, null, metadata !"", i32 0, i64 0, i64 0, i64 0, i32 0, null, metadata !7, i32 0, null, null, null} ; [ DW_TAG_subroutine_type ] [line 0, size 0, align 0, offset 0] [from ]
> +!7 = metadata !{metadata !8, metadata !8, metadata !9}
> +!8 = metadata !{i32 786468, null, null, metadata !"int", i32 0, i64 32, i64 32, i64 0, i32 0, i32 5} ; [ DW_TAG_base_type ] [int] [line 0, size 32, align 32, offset 0, enc DW_ATE_signed]
> +!9 = metadata !{i32 786447, null, null, metadata !"", i32 0, i64 64, i64 64, i64 0, i32 0, metadata !10} ; [ DW_TAG_pointer_type ] [line 0, size 64, align 64, offset 0] [from ]
> +!10 = metadata !{i32 786447, null, null, metadata !"", i32 0, i64 64, i64 64, i64 0, i32 0, metadata !11} ; [ DW_TAG_pointer_type ] [line 0, size 64, align 64, offset 0] [from char]
> +!11 = metadata !{i32 786468, null, null, metadata !"char", i32 0, i64 8, i64 8, i64 0, i32 0, i32 6} ; [ DW_TAG_base_type ] [char] [line 0, size 8, align 8, offset 0, enc DW_ATE_signed_char]
> +!12 = metadata !{metadata !13, metadata !14, metadata !15, metadata !17, metadata !18, metadata !21, metadata !23}
> +!13 = metadata !{i32 786689, metadata !4, metadata !"argc", metadata !5, i32 16777220, metadata !8, i32 0, i32 0} ; [ DW_TAG_arg_variable ] [argc] [line 4]
> +!14 = metadata !{i32 786689, metadata !4, metadata !"argv", metadata !5, i32 33554436, metadata !9, i32 0, i32 0} ; [ DW_TAG_arg_variable ] [argv] [line 4]
> +!15 = metadata !{i32 786688, metadata !4, metadata !"result", metadata !5, i32 7, metadata !16, i32 0, i32 0} ; [ DW_TAG_auto_variable ] [result] [line 7]
> +!16 = metadata !{i32 786468, null, null, metadata !"double", i32 0, i64 64, i64 64, i64 0, i32 0, i32 4} ; [ DW_TAG_base_type ] [double] [line 0, size 64, align 64, offset 0, enc DW_ATE_float]
> +!17 = metadata !{i32 786688, metadata !4, metadata !"limit", metadata !5, i32 8, metadata !8, i32 0, i32 0} ; [ DW_TAG_auto_variable ] [limit] [line 8]
> +!18 = metadata !{i32 786688, metadata !19, metadata !"s", metadata !5, i32 10, metadata !16, i32 0, i32 0} ; [ DW_TAG_auto_variable ] [s] [line 10]
> +!19 = metadata !{i32 786443, metadata !1, metadata !20, i32 9, i32 0, i32 2} ; [ DW_TAG_lexical_block ] [./branch.cc]
> +!20 = metadata !{i32 786443, metadata !1, metadata !4, i32 9, i32 0, i32 1} ; [ DW_TAG_lexical_block ] [./branch.cc]
> +!21 = metadata !{i32 786688, metadata !22, metadata !"u", metadata !5, i32 11, metadata !8, i32 0, i32 0} ; [ DW_TAG_auto_variable ] [u] [line 11]
> +!22 = metadata !{i32 786443, metadata !1, metadata !19, i32 11, i32 0, i32 3} ; [ DW_TAG_lexical_block ] [./branch.cc]
> +!23 = metadata !{i32 786688, metadata !24, metadata !"x", metadata !5, i32 12, metadata !16, i32 0, i32 0} ; [ DW_TAG_auto_variable ] [x] [line 12]
> +!24 = metadata !{i32 786443, metadata !1, metadata !22, i32 11, i32 0, i32 4} ; [ DW_TAG_lexical_block ] [./branch.cc]
> +!25 = metadata !{i32 2, metadata !"Dwarf Version", i32 4}
> +!26 = metadata !{metadata !"clang version 3.4 (trunk 192896) (llvm/trunk 192895)"}
> +!27 = metadata !{i32 4, i32 0, metadata !4, null}
> +!28 = metadata !{i32 5, i32 0, metadata !29, null}
> +!29 = metadata !{i32 786443, metadata !1, metadata !4, i32 5, i32 0, i32 0} ; [ DW_TAG_lexical_block ] [./branch.cc]
> +!30 = metadata !{i32 8, i32 0, metadata !4, null} ; [ DW_TAG_imported_declaration ]
> +!31 = metadata !{metadata !32, metadata !32, i64 0}
> +!32 = metadata !{metadata !"any pointer", metadata !33, i64 0}
> +!33 = metadata !{metadata !"omnipotent char", metadata !34, i64 0}
> +!34 = metadata !{metadata !"Simple C/C++ TBAA"}
> +!35 = metadata !{i32 9, i32 0, metadata !20, null}
> +!36 = metadata !{i32 13, i32 0, metadata !24, null}
> +!37 = metadata !{i32 14, i32 0, metadata !24, null}
> +!38 = metadata !{i32 11, i32 0, metadata !22, null}
> +!39 = metadata !{i32 20, i32 0, metadata !4, null}
> +!40 = metadata !{i32 21, i32 0, metadata !4, null}
> +!41 = metadata !{i32 22, i32 0, metadata !4, null}



-- 
Alexey Samsonov, MSK