Automatic PGO - Initial implementation (1/N)

Evan Cheng evan.cheng at apple.com
Wed Sep 25 09:11:06 PDT 2013


Hi Diego,

I am curious. Why is this called automatic PGO? Why is it automatic?

Also, why is the auto-profile part done as a scalar transformation pass? Did you consider alternatives?

Thanks,

Evan

Sent from my iPad

> On Sep 24, 2013, at 7:07 AM, Diego Novillo <dnovillo at google.com> wrote:
> 
> This adds two options: -auto-profile and -auto-profile-file. The first
> enables a pass that reads the profile file named by the second and emits
> IR metadata reflecting that profile.
> 
> The profile file is assumed to have been generated by an external
> profile source. The profile information is converted into IR metadata,
> which is later used by the analysis routines to estimate block
> frequencies, edge weights and other related data.
> 
> External profile information files have no fixed format; each profiler
> is free to define its own. This includes both the on-disk representation
> of the profile and the kind of profile information stored in the file.
> A common kind of profile is based on sampling (e.g., perf), which
> essentially counts how many times each line of the program has been
> executed during the run.
> 
> The only requirement is that each profiler must provide a way to
> load its own profile format into internal data structures and
> then a method to convert that data into IR annotations.
> 
> The AutoProfile pass is organized as a scalar transformation. On
> startup, it reads the file given in -auto-profile-file to determine what
> kind of profile it contains.  This file is assumed to contain profile
> information for the whole application. The profile data in the file is
> read and incorporated into the internal state of the corresponding
> profiler.
> 
> To facilitate testing, I've organized the profilers to support two file
> formats: text and native. The native format is whatever on-disk
> representation the profiler wants to support. I think this will mostly
> be bitcode files, but it could be anything the profiler wants to
> support. To do this, every profiler must implement the
> AutoProfiler::loadNative() function.
> 
> The text format is mostly meant for debugging. Records are separated by
> newlines, but each profiler is free to interpret records as it sees fit.
> Profilers must implement the AutoProfiler::loadText() function.
> 
> Finally, the pass will call AutoProfiler::emitAnnotations() for each
> function in the current translation unit. This function needs to
> translate the loaded profile into IR metadata, which the analyzer will
> later be able to use.
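
A minimal sketch of the load-then-annotate flow described above, written against stand-in types so it compiles without LLVM. FakeFunction, ToyProfiler and annotateAll are hypothetical names used only for illustration; the actual patch works on llvm::Function and attaches IR metadata, and its interface is the AutoProfiler class in the diff.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Stand-in for an IR function: a name plus a bag of annotations.
// (Hypothetical; the real pass annotates llvm::Function with metadata.)
struct FakeFunction {
  std::string Name;
  std::map<std::string, unsigned> Annotations;
};

// Same shape as the proposed profiler interface: two loaders, one emitter.
class Profiler {
public:
  virtual ~Profiler() {}
  virtual bool loadNative() = 0;
  virtual bool loadText() = 0;
  virtual bool emitAnnotations(FakeFunction &F) = 0;
};

// Toy profiler that "loads" a fixed table of per-function sample totals.
class ToyProfiler : public Profiler {
public:
  bool loadNative() override { return false; } // not implemented in this sketch
  bool loadText() override {
    Totals["foo"] = 120;
    Totals["bar"] = 7;
    return true;
  }
  bool emitAnnotations(FakeFunction &F) override {
    std::map<std::string, unsigned>::const_iterator It = Totals.find(F.Name);
    if (It == Totals.end())
      return false; // no profile for this function, nothing changed
    F.Annotations["samples"] = It->second;
    return true;
  }

private:
  std::map<std::string, unsigned> Totals;
};

// Load the profile once, then annotate every function in the translation
// unit. Returns the number of functions that received an annotation.
unsigned annotateAll(Profiler &P, bool IsText, std::vector<FakeFunction> &TU) {
  bool Loaded = IsText ? P.loadText() : P.loadNative();
  if (!Loaded)
    return 0;
  unsigned Changed = 0;
  for (FakeFunction &F : TU)
    if (P.emitAnnotations(F))
      ++Changed;
  assert(Changed <= TU.size());
  return Changed;
}
```

The point of the split is that only the two loaders know anything about the on-disk format; the driver and the emitter see the in-memory profile only.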
> 
> This patch implements the first steps towards the above design. I've
> implemented a sample-based flat profiler. The format of the profile is
> fairly simplistic. Each sampled function contains a list of relative
> line locations (from the start of the function) together with a count
> representing how many samples were collected at that line during
> execution. I generate this profile using perf and a separate converter
> tool.
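
For readers who want to see the text format concretely, here is a standalone sketch of a parser for it, using std::regex instead of llvm::Regex. The record layout ("symbol table", a symbol count, the symbol names, then "name:total:head:nlines" headers each followed by "offset: samples" lines) is read off loadText() in the patch below; the function name parseTextProfile and the choice to return false on malformed input rather than exit are assumptions of this sketch.

```cpp
#include <cassert>
#include <map>
#include <regex>
#include <sstream>
#include <string>

struct FunctionProfile {
  unsigned TotalSamples = 0;
  unsigned TotalHeadSamples = 0;
  std::map<unsigned, unsigned> BodySamples; // line offset -> samples
};

typedef std::map<std::string, FunctionProfile> ProfileMap;

// Parse a text profile held in memory. Returns false on malformed input.
bool parseTextProfile(const std::string &Text, ProfileMap &Profiles) {
  std::istringstream In(Text);
  std::string Line;

  // Header: the literal "symbol table", a count, then that many names.
  if (!std::getline(In, Line) || Line != "symbol table")
    return false;
  static const std::regex NumRE("[0-9]+");
  if (!std::getline(In, Line) || !std::regex_match(Line, NumRE))
    return false;
  unsigned long NumSymbols = std::stoul(Line);
  for (unsigned long I = 0; I < NumSymbols; ++I) {
    if (!std::getline(In, Line))
      return false;
    (void)Profiles[Line]; // create an empty, zeroed profile for this symbol
  }

  // Body: a header per function record, then its sampled lines. A function
  // may appear more than once, so counts are accumulated.
  static const std::regex HeadRE("([^:]+):([0-9]+):([0-9]+):([0-9]+)");
  static const std::regex SampleRE("([0-9]+): ([0-9]+)");
  std::smatch M;
  while (std::getline(In, Line)) {
    if (!std::regex_match(Line, M, HeadRE))
      return false;
    assert(M.size() == 5);
    FunctionProfile &FP = Profiles[M[1].str()];
    FP.TotalSamples += std::stoul(M[2].str());
    FP.TotalHeadSamples += std::stoul(M[3].str());
    unsigned long NumLines = std::stoul(M[4].str());
    for (unsigned long I = 0; I < NumLines; ++I) {
      if (!std::getline(In, Line) || !std::regex_match(Line, M, SampleRE))
        return false;
      FP.BodySamples[std::stoul(M[1].str())] += std::stoul(M[2].str());
    }
  }
  return true;
}
```

Note that std::regex_match requires the whole line to match, so the patterns need no explicit ^ and $ anchors here.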
> 
> Currently, I have only implemented a text format for these profiles. I
> am interested in initial feedback to the whole approach before I send
> the other parts of the implementation for review.
> 
> This patch implements:
> 
> - The AutoProfile pass.
> - The base AutoProfiler class with the core interface.
> - A SampleBasedProfiler using the above interface. The profiler
>  attaches autoprofile.samples metadata to every IR instruction that
>  matches the profile.
> - A text loader class to assist the implementation of
>  AutoProfiler::loadText().
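
To illustrate how the sample-based profiler matches instructions against the profile, here is a small sketch of the line-offset lookup that emitAnnotations() performs in the patch: the offset is the instruction's line minus the line of the function's first instruction, plus one. FakeInst and annotate are hypothetical stand-ins; the patch reads the line from the instruction's DebugLoc and stores the count as metadata.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Stand-in for an IR instruction (hypothetical).
struct FakeInst {
  unsigned Line;    // source line, as debug info would report it
  bool HasSamples;  // whether a sample count was attached
  unsigned Samples; // the attached count, valid if HasSamples
};

// Attach sample counts to every instruction whose line offset appears in
// the function's profile. Returns the number of instructions annotated.
unsigned annotate(std::vector<FakeInst> &Insts,
                  const std::map<unsigned, unsigned> &BodySamples) {
  if (Insts.empty())
    return 0;
  // Offsets are 1-based relative to the first instruction's line.
  unsigned FirstLine = Insts.front().Line;
  unsigned Annotated = 0;
  for (FakeInst &I : Insts) {
    // As in the patch, a line smaller than FirstLine wraps around to a huge
    // offset and simply fails the lookup.
    unsigned Offset = I.Line - FirstLine + 1;
    std::map<unsigned, unsigned>::const_iterator It = BodySamples.find(Offset);
    if (It == BodySamples.end())
      continue;
    I.HasSamples = true;
    I.Samples = It->second;
    ++Annotated;
  }
  assert(Annotated <= Insts.size());
  return Annotated;
}
```

Every instruction that maps to the same source line gets the same count, which is one reason instructions in a single basic block can end up with different numbers of samples.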
> 
> Caveats and questions:
> 
> 1- I am almost certainly using the wrong APIs or using the right
>   APIs in unorthodox ways. Please point me to better
>   alternatives.
> 
> 2- I was surprised to learn that line number information is not
>   transferred into the IR unless we are emitting debug
>   information. For sample-based profiling, I'm going to need
>   line number information generated by the front-end
>   independently of debug info. Eric, is that possible?
> 
> 3- I have not included changes to the analyzer in this patch. I
>   want to keep it focused on profile loading and IR
>   annotation. In the analyzer, we will have propagation of
>   attributes and other fixes (e.g., from the samples it is
>   possible to have instructions in the same basic block
>   registered with different numbers of samples). I also have not
>   included changes to code motion to get rid of the autoprofile
>   information.
> 
> 4- I need to add test cases.
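
On caveat 3, one possible way to reconcile differing per-instruction counts within a basic block is to take the largest count as the weight of the block. This is an assumption made for illustration only; the patch does not include the analyzer changes and does not say which rule they will use. blockWeight is a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Given the sample counts attached to the instructions of one basic block,
// return a single weight for the block. This sketch picks the maximum, on
// the reasoning that every instruction in a block executes equally often,
// so the highest observed count is the least likely to be an undercount.
unsigned blockWeight(const std::vector<unsigned> &InstSamples) {
  unsigned Weight = 0;
  for (unsigned S : InstSamples)
    Weight = std::max(Weight, S);
  return Weight;
}
```

Other rules (sum, average, first instruction) are equally easy to plug in; the choice affects how the later frequency estimates behave.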
> 
> Mainly, I'm interested in making sure that this direction is
> generally useful. I haven't given a lot of thought to other types
> of profiling, but I'm certain any kind of tracing or other
> execution frequency profiles can be adapted. Things like value
> profiling may be a bit more involved, but mostly because I'm not
> sure how types and symbols are tracked in LLVM.
> 
> Thanks.  Diego.
> ---
> include/llvm/InitializePasses.h       |   1 +
> include/llvm/Transforms/Scalar.h      |   6 +
> lib/Transforms/Scalar/AutoProfile.cpp | 363 ++++++++++++++++++++++++++++++++++
> lib/Transforms/Scalar/CMakeLists.txt  |   1 +
> lib/Transforms/Scalar/Scalar.cpp      |   1 +
> 5 files changed, 372 insertions(+)
> create mode 100644 lib/Transforms/Scalar/AutoProfile.cpp
> 
> diff --git a/include/llvm/InitializePasses.h b/include/llvm/InitializePasses.h
> index 1b50bb2..c922228 100644
> --- a/include/llvm/InitializePasses.h
> +++ b/include/llvm/InitializePasses.h
> @@ -70,6 +70,7 @@ void initializeAliasDebuggerPass(PassRegistry&);
> void initializeAliasSetPrinterPass(PassRegistry&);
> void initializeAlwaysInlinerPass(PassRegistry&);
> void initializeArgPromotionPass(PassRegistry&);
> +void initializeAutoProfilePass(PassRegistry&);
> void initializeBarrierNoopPass(PassRegistry&);
> void initializeBasicAliasAnalysisPass(PassRegistry&);
> void initializeBasicCallGraphPass(PassRegistry&);
> diff --git a/include/llvm/Transforms/Scalar.h b/include/llvm/Transforms/Scalar.h
> index 51aeba4..9ee5b49 100644
> --- a/include/llvm/Transforms/Scalar.h
> +++ b/include/llvm/Transforms/Scalar.h
> @@ -354,6 +354,12 @@ FunctionPass *createLowerExpectIntrinsicPass();
> //
> FunctionPass *createPartiallyInlineLibCallsPass();
> 
> +//===----------------------------------------------------------------------===//
> +//
> +// AutoProfilePass - Loads profile data from disk and generates
> +// IR metadata to reflect the profile.
> +FunctionPass *createAutoProfilePass();
> +
> } // End llvm namespace
> 
> #endif
> diff --git a/lib/Transforms/Scalar/AutoProfile.cpp b/lib/Transforms/Scalar/AutoProfile.cpp
> new file mode 100644
> index 0000000..7a63409
> --- /dev/null
> +++ b/lib/Transforms/Scalar/AutoProfile.cpp
> @@ -0,0 +1,363 @@
> +//===- AutoProfile.cpp - Incorporate an external profile into the IR ------===//
> +//
> +//                      The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +//
> +// This file implements the Auto Profile transformation. This pass reads a
> +// profile file generated by an external profiling source and generates IR
> +// metadata to reflect the profile information in the given profile.
> +//
> +// TODO(dnovillo) Add more.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +#define DEBUG_TYPE "auto-profile"
> +
> +#include <cstdlib>
> +
> +#include "llvm/ADT/DenseMap.h"
> +#include "llvm/ADT/OwningPtr.h"
> +#include "llvm/ADT/StringMap.h"
> +#include "llvm/DebugInfo/DIContext.h"
> +#include "llvm/IR/Constants.h"
> +#include "llvm/IR/Function.h"
> +#include "llvm/IR/Instructions.h"
> +#include "llvm/IR/LLVMContext.h"
> +#include "llvm/IR/Metadata.h"
> +#include "llvm/IR/Module.h"
> +#include "llvm/Pass.h"
> +#include "llvm/Support/CommandLine.h"
> +#include "llvm/Support/Debug.h"
> +#include "llvm/Support/InstIterator.h"
> +#include "llvm/Support/MemoryBuffer.h"
> +#include "llvm/Support/Regex.h"
> +#include "llvm/Support/raw_ostream.h"
> +#include "llvm/Transforms/Scalar.h"
> +
> +using namespace llvm;
> +
> +// Command line option naming the profile file to load.
> +static cl::opt<std::string> AutoProfileFilename(
> +    "auto-profile-file", cl::init("autoprof.llvm"), cl::value_desc("filename"),
> +    cl::desc("Profile file loaded by -auto-profile"), cl::Hidden);
> +
> +namespace {
> +
> +// Base profiler abstract class. This defines the abstract interface
> +// that every profiler should respond to.
> +//
> +// TODO(dnovillo) - Eventually this class ought to move to a separate file.
> +// There will be several types of profile loaders. Having them all together in
> +// this file will get pretty messy.
> +class AutoProfiler {
> +public:
> +  AutoProfiler(std::string filename) : filename_(filename) {}
> +  virtual ~AutoProfiler() {}
> +
> +  // Load the profile from a file in the native format of this profile.
> +  virtual bool loadNative() = 0;
> +
> +  // Load the profile from a text file.
> +  virtual bool loadText() = 0;
> +
> +  // Dump this profile on stderr.
> +  virtual void dump() = 0;
> +
> +  // Modify the IR with annotations corresponding to the loaded profile.
> +  virtual bool emitAnnotations(Function &F) = 0;
> +
> +  // Instantiate an auto-profiler object based on the detected format of the
> +  // given file name. Set *is_text to true, if the file is in text format.
> +  static AutoProfiler *
> +  instantiateProfiler(const std::string filename, bool *is_text);
> +
> +protected:
> +  // Path name to the file holding the profile data.
> +  std::string filename_;
> +};
> +
> +
> +// Sample-based profiler. These profiles contain execution frequency
> +// information on the function bodies of the program.
> +class AutoProfileSampleBased : public AutoProfiler {
> +public:
> +  AutoProfileSampleBased(std::string filename)
> +      : AutoProfiler(filename), profiles_(0) {}
> +
> +  // Metadata kind for autoprofile.samples.
> +  static unsigned AutoProfileSamplesMDKind;
> +
> +  virtual void dump();
> +  virtual bool loadText();
> +  virtual bool loadNative() { llvm_unreachable("not implemented"); }
> +  virtual bool emitAnnotations(Function &F);
> +
> +  void dumpFunctionProfile(StringRef fn_name);
> +
> +protected:
> +  typedef DenseMap<uint32_t, uint32_t> BodySampleMap;
> +
> +  struct FunctionProfile {
> +    // Total number of samples collected inside this function. Samples
> +    // are cumulative, they include all the samples collected inside
> +    // this function and all its inlined callees.
> +    unsigned TotalSamples;
> +
> +    // Total number of samples collected at the head of the function.
> +    unsigned TotalHeadSamples;
> +
> +    // Map <line offset, samples> of line offset to samples collected
> +    // inside the function. Each entry in this map contains the number
> +    // of samples collected at the corresponding line offset. All line
> +    // locations are an offset from the start of the function.
> +    BodySampleMap BodySamples;
> +  };
> +
> +  typedef StringMap<FunctionProfile> FunctionProfileMap;
> +
> +  FunctionProfileMap profiles_;
> +};
> +
> +// Loader class for text-based profiles. These are mostly useful to
> +// generate unit tests and not much else.
> +class AutoProfileTextLoader {
> +public:
> +  AutoProfileTextLoader(std::string filename) : filename_(filename) {
> +    error_code ec;
> +    ec = MemoryBuffer::getFile(filename_, buffer_);
> +    if (ec)
> +      report_fatal_error("Could not open profile file " + filename_ + ": " +
> +                         ec.message());
> +    fp_ = buffer_->getBufferStart();
> +    linenum_ = 0;
> +  }
> +
> +  // Read a line from the mapped file. Update the current line and file pointer.
> +  StringRef readLine() {
> +    size_t length = 0;
> +    const char *start = fp_;
> +    while (fp_ != buffer_->getBufferEnd() && *fp_ != '\n') {
> +      length++;
> +      fp_++;
> +    }
> +    if (fp_ != buffer_->getBufferEnd())
> +      fp_++;
> +    linenum_++;
> +    return StringRef(start, length);
> +  }
> +
> +  // Return true, if we've reached EOF.
> +  bool atEOF() const {
> +    return fp_ == buffer_->getBufferEnd();
> +  }
> +
> +  void reportParseError(std::string msg) const {
> +    // TODO(dnovillo) - This is almost certainly the wrong way to emit
> +    // diagnostics and exit the compiler.
> +    errs() << filename_ << ":" << linenum_ << ": " << msg << "\n";
> +    exit(1);
> +  }
> +
> +private:
> +  OwningPtr<MemoryBuffer> buffer_;
> +  const char *fp_;
> +  size_t linenum_;
> +  std::string filename_;
> +};
> +
> +// Auto profile pass. This pass reads profile data from the file specified
> +// by -auto-profile-file and annotates every affected function with the
> +// profile information found in that file.
> +class AutoProfile : public FunctionPass {
> +public:
> +  // Class identification, replacement for typeinfo
> +  static char ID;
> +
> +  AutoProfile() : FunctionPass(ID), profiler_(0) {
> +    initializeAutoProfilePass(*PassRegistry::getPassRegistry());
> +  }
> +
> +  virtual bool doInitialization(Module &M);
> +
> +  ~AutoProfile() { delete profiler_; }
> +
> +  void dump() { profiler_->dump(); }
> +
> +  virtual const char *getPassName() const {
> +    return "Auto profile pass";
> +  }
> +
> +  virtual bool runOnFunction(Function &F);
> +
> +  virtual void getAnalysisUsage(AnalysisUsage &AU) const {
> +    AU.setPreservesCFG();
> +  }
> +
> +  bool loadProfile();
> +
> +protected:
> +  AutoProfiler *profiler_;
> +};
> +}
> +
> +// Dump the sample profile for the given function.
> +void AutoProfileSampleBased::dumpFunctionProfile(StringRef fn_name) {
> +  FunctionProfile fn_profile = profiles_[fn_name];
> +  errs() << "Function: " << fn_name << ", " << fn_profile.TotalSamples << ", "
> +         << fn_profile.TotalHeadSamples << ", " << fn_profile.BodySamples.size()
> +         << " sampled lines\n";
> +  for (BodySampleMap::const_iterator si = fn_profile.BodySamples.begin();
> +       si != fn_profile.BodySamples.end(); si++)
> +    errs() << "\tline offset: " << si->first
> +           << ", number of samples: " << si->second << "\n";
> +  errs() << "\n";
> +}
> +
> +// Dump all the collected function profiles.
> +void AutoProfileSampleBased::dump() {
> +  FunctionProfileMap::const_iterator it;
> +  for (it = profiles_.begin(); it != profiles_.end(); it++)
> +    dumpFunctionProfile(it->getKey());
> +}
> +
> +// Load a sample profile from a text file.
> +bool AutoProfileSampleBased::loadText() {
> +  AutoProfileTextLoader loader(filename_);
> +
> +  // Read the symbol table.
> +  std::string line = loader.readLine().str();
> +  if (line != "symbol table")
> +    loader.reportParseError("Expected 'symbol table', found " + line);
> +  Regex num("^[0-9]+$");
> +  line = loader.readLine().str();
> +  if (!num.match(line))
> +    loader.reportParseError("Expected a number, found " + line);
> +  int num_symbols = atoi(line.c_str());
> +  for (int i = 0; i < num_symbols; i++) {
> +    StringRef fn_name = loader.readLine();
> +    FunctionProfile &fn_profile = profiles_[fn_name];
> +    fn_profile.BodySamples.clear();
> +    fn_profile.TotalSamples = 0;
> +    fn_profile.TotalHeadSamples = 0;
> +  }
> +
> +  // Read the profile of each function. Since each function may be
> +  // mentioned more than once, and we are collecting flat profiles,
> +  // accumulate samples as we parse them.
> +  while (!loader.atEOF()) {
> +    SmallVector<StringRef, 4> matches;
> +    Regex head_re("^([^:]+):([0-9]+):([0-9]+):([0-9]+)$");
> +    line = loader.readLine().str();
> +    if (!head_re.match(line, &matches))
> +      loader.reportParseError("Expected 'mangled_name:NUM:NUM:NUM', found " +
> +                              line);
> +    assert(matches.size() == 5);
> +    StringRef fn_name = matches[1];
> +    unsigned num_samples = atoi(matches[2].str().c_str());
> +    unsigned num_head_samples = atoi(matches[3].str().c_str());
> +    unsigned num_sampled_lines = atoi(matches[4].str().c_str());
> +    FunctionProfile &fn_profile = profiles_[fn_name];
> +    fn_profile.TotalSamples += num_samples;
> +    fn_profile.TotalHeadSamples += num_head_samples;
> +    BodySampleMap &sample_map = fn_profile.BodySamples;
> +    unsigned i;
> +    for (i = 0; i < num_sampled_lines && !loader.atEOF(); i++) {
> +      Regex line_sample("^([0-9]+): ([0-9]+)$");
> +      line = loader.readLine().str();
> +      if (!line_sample.match(line, &matches))
> +        loader.reportParseError("Expected 'NUM: NUM', found " + line);
> +      assert(matches.size() == 3);
> +      unsigned line_offset = atoi(matches[1].str().c_str());
> +      unsigned num_samples = atoi(matches[2].str().c_str());
> +      sample_map[line_offset] += num_samples;
> +    }
> +
> +    if (i < num_sampled_lines)
> +      loader.reportParseError("Unexpected end of file");
> +  }
> +
> +  return true;
> +}
> +
> +// Annotate function F with the contents of the profile.
> +bool AutoProfileSampleBased::emitAnnotations(Function &F) {
> +  bool changed = false;
> +  StringRef name = F.getName();
> +  FunctionProfile &fn_profile = profiles_[name];
> +  BodySampleMap &body_samples = fn_profile.BodySamples;
> +  Instruction &first_inst = *(inst_begin(F));
> +  unsigned first_line = first_inst.getDebugLoc().getLine();
> +  LLVMContext &context = first_inst.getContext();
> +  for (inst_iterator i = inst_begin(F); i != inst_end(F); ++i) {
> +    Instruction &inst = *i;
> +    const DebugLoc &dloc = inst.getDebugLoc();
> +    unsigned loc_offset = dloc.getLine() - first_line + 1;
> +    if (body_samples.find(loc_offset) != body_samples.end()) {
> +      SmallVector<Value *, 1> sample_values;
> +      sample_values.push_back(ConstantInt::get(Type::getInt32Ty(context),
> +                                               body_samples[loc_offset]));
> +      MDNode *md = MDNode::get(context, sample_values);
> +      inst.setMetadata(AutoProfileSamplesMDKind, md);
> +      changed = true;
> +    }
> +  }
> +
> +  DEBUG(if (changed) {
> +              dbgs() << "\n\nInstructions changed in " << name << "\n";
> +              for (inst_iterator i = inst_begin(F); i != inst_end(F); ++i) {
> +                Instruction &inst = *i;
> +                MDNode *md = inst.getMetadata(AutoProfileSamplesMDKind);
> +                if (md) {
> +                  assert(md->getNumOperands() == 1);
> +                  ConstantInt *val = cast<ConstantInt>(md->getOperand(0));
> +                  dbgs() << inst << " (" << val->getValue().getZExtValue()
> +                         << " samples)\n";
> +                }
> +              }
> +            });
> +
> +  return changed;
> +}
> +
> +AutoProfiler *AutoProfiler::instantiateProfiler(const std::string filename,
> +                                                bool *is_text) {
> +  // TODO(dnovillo) - Implement file type detection and return the appropriate
> +  // AutoProfiler sub-class instance.
> +  *is_text = true;
> +  return new AutoProfileSampleBased(filename);
> +}
> +
> +unsigned AutoProfileSampleBased::AutoProfileSamplesMDKind = 0;
> +char AutoProfile::ID = 0;
> +INITIALIZE_PASS(AutoProfile, "auto-profile", "Auto Profile loader", false,
> +                false)
> +
> +bool AutoProfile::runOnFunction(Function& F) {
> +  return profiler_->emitAnnotations(F);
> +}
> +
> +bool AutoProfile::loadProfile() {
> +  bool is_text;
> +  profiler_ =
> +      AutoProfiler::instantiateProfiler(AutoProfileFilename, &is_text);
> +  if (!profiler_)
> +    return false;
> +
> +  return (is_text) ? profiler_->loadText() : profiler_->loadNative();
> +}
> +
> +bool AutoProfile::doInitialization(Module &M) {
> +  if (!loadProfile())
> +    return false;
> +  AutoProfileSampleBased::AutoProfileSamplesMDKind =
> +      M.getContext().getMDKindID("autoprofile.samples");
> +  return true;
> +}
> +
> +FunctionPass *llvm::createAutoProfilePass() {
> +  return new AutoProfile();
> +}
> diff --git a/lib/Transforms/Scalar/CMakeLists.txt b/lib/Transforms/Scalar/CMakeLists.txt
> index 3b89fd4..1093ef0 100644
> --- a/lib/Transforms/Scalar/CMakeLists.txt
> +++ b/lib/Transforms/Scalar/CMakeLists.txt
> @@ -1,5 +1,6 @@
> add_llvm_library(LLVMScalarOpts
>   ADCE.cpp
> +  AutoProfile.cpp
>   CodeGenPrepare.cpp
>   ConstantProp.cpp
>   CorrelatedValuePropagation.cpp
> diff --git a/lib/Transforms/Scalar/Scalar.cpp b/lib/Transforms/Scalar/Scalar.cpp
> index 0c3ffbc..542aa33 100644
> --- a/lib/Transforms/Scalar/Scalar.cpp
> +++ b/lib/Transforms/Scalar/Scalar.cpp
> @@ -28,6 +28,7 @@ using namespace llvm;
> /// ScalarOpts library.
> void llvm::initializeScalarOpts(PassRegistry &Registry) {
>   initializeADCEPass(Registry);
> +  initializeAutoProfilePass(Registry);
>   initializeCodeGenPreparePass(Registry);
>   initializeConstantPropagationPass(Registry);
>   initializeCorrelatedValuePropagationPass(Registry);
> -- 
> 1.8.4
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits


