Automatic PGO - Initial implementation (1/N)

Eric Christopher echristo at gmail.com
Wed Sep 25 09:18:28 PDT 2013


On Wed, Sep 25, 2013 at 9:11 AM, Evan Cheng <evan.cheng at apple.com> wrote:
> Hi Diego,
>
> I am curious. Why is this called automatic PGO? Why is it automatic?

Because it doesn't require instrumentation. :)

Better names welcome. Maybe "sample pgo"? *shrug* Auto just sounds cooler.

-eric

>
> Also, why is the auto-profile part done as a scalar transformation pass? Did you consider alternatives?
>
> Thanks,
>
> Evan
>
>> On Sep 24, 2013, at 7:07 AM, Diego Novillo <dnovillo at google.com> wrote:
>>
>> This adds two options: -auto-profile and -auto-profile-file. The
>> -auto-profile option causes the compiler to read a profile file and
>> emit IR metadata reflecting that profile.
>>
>> The profile file is assumed to have been generated by an external
>> profile source. The profile information is converted into IR metadata,
>> which is later used by the analysis routines to estimate block
>> frequencies, edge weights and other related data.
>>
>> External profile information files have no fixed format; each profiler
>> is free to define its own. This includes both the on-disk representation
>> of the profile and the kind of profile information stored in the file.
>> A common kind of profile is based on sampling (e.g., perf), which
>> essentially counts how many times each line of the program has been
>> executed during the run.
>>
>> The only requirement is that each profiler must provide a way to
>> load its own profile format into internal data structures and
>> then a method to convert that data into IR annotations.
>>
>> The AutoProfile pass is organized as a scalar transformation. On
>> startup, it reads the file given in -auto-profile-file to determine what
>> kind of profile it contains.  This file is assumed to contain profile
>> information for the whole application. The profile data in the file is
>> read and incorporated into the internal state of the corresponding
>> profiler.
>>
>> To facilitate testing, I've organized the profilers to support two file
>> formats: text and native. The native format is whatever on-disk
>> representation the profiler wants to support. I think this will mostly
>> be bitcode files, but it could be anything. To support the native
>> format, every profiler must implement the
>> AutoProfiler::loadNative() function.
>>
>> The text format is mostly meant for debugging. Records are separated by
>> newlines, but each profiler is free to interpret records as it sees fit.
>> Profilers must implement the AutoProfiler::loadText() function.
>>
>> Finally, the pass will call AutoProfiler::emitAnnotations() for each
>> function in the current translation unit. This function needs to
>> translate the loaded profile into IR metadata, which the analyzer will
>> later be able to use.
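
The load-then-annotate flow described above can be sketched with plain C++
stand-ins instead of LLVM types. This is only an illustration under stated
assumptions: ToyFunction, Profiler, ToySampleProfiler, annotateAll and the
file names in the usage note are hypothetical; only the three virtual method
names (loadNative, loadText, emitAnnotations) come from the patch.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an IR function; the real pass works on
// llvm::Function.
struct ToyFunction {
  std::string Name;
  std::vector<std::string> Annotations;
};

// Mirrors the three entry points named in the patch description.
class Profiler {
public:
  explicit Profiler(std::string Path) : Filename(std::move(Path)) {}
  virtual ~Profiler() = default;

  virtual bool loadNative() = 0;
  virtual bool loadText() = 0;
  virtual bool emitAnnotations(ToyFunction &F) = 0;

protected:
  std::string Filename;
};

// Toy profiler: "loading" only flips a flag, and annotating only records
// the metadata name. A real profiler would parse Filename here.
class ToySampleProfiler : public Profiler {
public:
  using Profiler::Profiler;

  bool loadNative() override { return false; } // not implemented in this sketch
  bool loadText() override {
    Loaded = true;
    return true;
  }
  bool emitAnnotations(ToyFunction &F) override {
    assert(Loaded && "profile must be loaded before annotating");
    F.Annotations.push_back("autoprofile.samples");
    return true;
  }

private:
  bool Loaded = false;
};

// Load the profile once, then annotate every function in the unit.
// Returns the number of functions that were changed.
unsigned annotateAll(Profiler &P, bool IsText, std::vector<ToyFunction> &Fns) {
  bool Ok = IsText ? P.loadText() : P.loadNative();
  if (!Ok)
    return 0;
  unsigned Changed = 0;
  for (ToyFunction &F : Fns)
    if (P.emitAnnotations(F))
      ++Changed;
  return Changed;
}
```

With a ToySampleProfiler built from a hypothetical "prof.txt" and two
functions, annotateAll(P, true, Fns) reports two changed functions; asking
for the native format reports none, because this sketch does not implement it.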
>>
>> This patch implements the first steps towards the above design. I've
>> implemented a sample-based flat profiler. The format of the profile is
>> fairly simplistic. Each sampled function contains a list of relative
>> line locations (from the start of the function) together with a count
>> representing how many samples were collected at that line during
>> execution. I generate this profile using perf and a separate converter
>> tool.
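
For readers who want to see the text format concretely, here is a rough
standalone parser for it, reconstructed from what loadText() in the patch
accepts: a "symbol table" line, a symbol count, the symbol names, then
"mangled_name:NUM:NUM:NUM" headers each followed by "NUM: NUM" records. It
uses only the standard library. The helper names (parseUnsigned, split, fail,
parseTextProfile, parseTextProfileString) and the 9-digit cap on numbers are
assumptions of this sketch, not part of the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the patch's FunctionProfile.
struct FunctionProfile {
  unsigned TotalSamples = 0;
  unsigned TotalHeadSamples = 0;
  std::map<uint32_t, uint32_t> BodySamples; // line offset -> sample count
};
using ProfileMap = std::map<std::string, FunctionProfile>;

// Parse a run of 1 to 9 decimal digits (the cap avoids overflow; it is an
// assumption of this sketch). Anything else is rejected.
static bool parseUnsigned(const std::string &S, unsigned &Out) {
  if (S.empty() || S.size() > 9)
    return false;
  for (char C : S)
    if (C < '0' || C > '9')
      return false;
  Out = static_cast<unsigned>(std::stoul(S));
  return true;
}

// Split S on every occurrence of Sep, keeping empty fields.
static std::vector<std::string> split(const std::string &S, char Sep) {
  std::vector<std::string> Out;
  std::string::size_type Start = 0, Pos;
  while ((Pos = S.find(Sep, Start)) != std::string::npos) {
    Out.push_back(S.substr(Start, Pos - Start));
    Start = Pos + 1;
  }
  Out.push_back(S.substr(Start));
  return Out;
}

static bool fail(std::string &Error, const std::string &Msg) {
  Error = Msg;
  return false;
}

// Read a whole profile. Samples for a function that appears more than once
// are accumulated, as in the patch.
bool parseTextProfile(std::istream &In, ProfileMap &Profiles,
                      std::string &Error) {
  std::string Line;
  if (!std::getline(In, Line) || Line != "symbol table")
    return fail(Error, "expected 'symbol table'");
  unsigned NumSymbols = 0;
  if (!std::getline(In, Line) || !parseUnsigned(Line, NumSymbols))
    return fail(Error, "expected a symbol count");
  for (unsigned I = 0; I < NumSymbols; ++I) {
    if (!std::getline(In, Line))
      return fail(Error, "unexpected end of file in symbol table");
    Profiles[Line] = FunctionProfile(); // start each symbol from zero
  }

  while (std::getline(In, Line)) {
    std::vector<std::string> Fields = split(Line, ':');
    unsigned Total = 0, Head = 0, NumLines = 0;
    if (Fields.size() != 4 || Fields[0].empty() ||
        !parseUnsigned(Fields[1], Total) || !parseUnsigned(Fields[2], Head) ||
        !parseUnsigned(Fields[3], NumLines))
      return fail(Error, "expected 'mangled_name:NUM:NUM:NUM'");
    assert(Fields.size() == 4); // mirrors the patch's post-match assertion
    FunctionProfile &FP = Profiles[Fields[0]];
    FP.TotalSamples += Total;
    FP.TotalHeadSamples += Head;
    for (unsigned I = 0; I < NumLines; ++I) {
      if (!std::getline(In, Line))
        return fail(Error, "unexpected end of file");
      std::string::size_type Sep = Line.find(": ");
      unsigned Offset = 0, Count = 0;
      if (Sep == std::string::npos ||
          !parseUnsigned(Line.substr(0, Sep), Offset) ||
          !parseUnsigned(Line.substr(Sep + 2), Count))
        return fail(Error, "expected 'NUM: NUM'");
      FP.BodySamples[Offset] += Count;
    }
  }
  return true;
}

// Convenience wrapper for parsing a profile held in memory.
bool parseTextProfileString(const std::string &Text, ProfileMap &Profiles,
                            std::string &Error) {
  std::istringstream In(Text);
  return parseTextProfile(In, Profiles, Error);
}
```

Feeding it a profile in which the same function appears in two headers
yields a single entry whose totals and per-line counts are the sums of both,
which is the flat-profile behaviour the patch aims for.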
>>
>> Currently, I have only implemented a text format for these profiles. I
>> am interested in initial feedback to the whole approach before I send
>> the other parts of the implementation for review.
>>
>> This patch implements:
>>
>> - The AutoProfile pass.
>> - The base AutoProfiler class with the core interface.
>> - A SampleBasedProfiler using the above interface. The profiler
>>  generates metadata autoprofile.samples on every IR instruction that
>>  matches the profiles.
>> - A text loader class to assist the implementation of
>>  AutoProfiler::loadText().
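
To make the "matches the profiles" step concrete: the patch computes each
instruction's line offset as its line minus the line of the function's first
instruction, plus one, and looks that offset up in the sample map. The sketch
below reproduces that lookup on toy data. ToyInst, lineOffset and
annotateInstructions are hypothetical names, and skipping instructions with
line 0 or with a line before the first one is a choice made for this sketch
(the patch as posted does not filter them).

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for an IR instruction carrying a source line.
struct ToyInst {
  unsigned Line;    // assumed convention: 0 means "no line information"
  bool HasSamples;  // set when a sample count was attached
  uint32_t Samples; // the attached count, valid only if HasSamples
};

// 1-based offset of Line from the function's first line, as in the patch.
static uint32_t lineOffset(unsigned Line, unsigned FirstLine) {
  assert(Line >= FirstLine && "caller must skip lines before the first line");
  return Line - FirstLine + 1;
}

// Attach a sample count to every instruction whose line offset appears in
// BodySamples. Returns true if any instruction was annotated.
bool annotateInstructions(std::vector<ToyInst> &Insts,
                          const std::map<uint32_t, uint32_t> &BodySamples) {
  if (Insts.empty())
    return false;
  unsigned FirstLine = Insts.front().Line;
  bool Changed = false;
  for (ToyInst &I : Insts) {
    if (I.Line == 0 || I.Line < FirstLine)
      continue; // no usable location for this instruction
    auto It = BodySamples.find(lineOffset(I.Line, FirstLine));
    if (It == BodySamples.end())
      continue; // the profile has no samples for this line
    I.HasSamples = true;
    I.Samples = It->second;
    Changed = true;
  }
  return Changed;
}
```

For instructions on lines 10, 11 and 13 and a sample map holding offsets 1
and 4, the first and third instructions are annotated and the second is left
alone.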
>>
>> Caveats and questions:
>>
>> 1- I am almost certainly using the wrong APIs or using the right
>>   APIs in unorthodox ways. Please point me to better
>>   alternatives.
>>
>> 2- I was surprised to learn that line number information is not
>>   transferred into the IR unless we are emitting debug
>>   information. For sample-based profiling, I'm going to need
>>   line number information generated by the front-end
>>   independently of debug info. Eric, is that possible?
>>
>> 3- I have not included in this patch changes to the analyzer. I
>>   want to keep it focused on the profile loading and IR
>>   annotation. In the analyzer, we will have propagation of
>>   attributes and other fixes (e.g., from the samples it is
>>   possible to have instructions in the same basic block
>>   registered with differing numbers of samples). I also have not
>>   included changes to code motion to get rid of the autoprofile
>>   information.
>>
>> 4- I need to add test cases.
>>
>> Mainly, I'm interested in making sure that this direction is
>> generally useful. I haven't given a lot of thought to other types
>> of profiling, but I'm certain any kind of tracing or other
>> execution frequency profiles can be adapted. Things like value
>> profiling may be a bit more involved, but mostly because I'm not
>> sure how the type and symbols are tracked in LLVM.
>>
>> Thanks.  Diego.
>> ---
>> include/llvm/InitializePasses.h       |   1 +
>> include/llvm/Transforms/Scalar.h      |   6 +
>> lib/Transforms/Scalar/AutoProfile.cpp | 363 ++++++++++++++++++++++++++++++++++
>> lib/Transforms/Scalar/CMakeLists.txt  |   1 +
>> lib/Transforms/Scalar/Scalar.cpp      |   1 +
>> 5 files changed, 372 insertions(+)
>> create mode 100644 lib/Transforms/Scalar/AutoProfile.cpp
>>
>> diff --git a/include/llvm/InitializePasses.h b/include/llvm/InitializePasses.h
>> index 1b50bb2..c922228 100644
>> --- a/include/llvm/InitializePasses.h
>> +++ b/include/llvm/InitializePasses.h
>> @@ -70,6 +70,7 @@ void initializeAliasDebuggerPass(PassRegistry&);
>> void initializeAliasSetPrinterPass(PassRegistry&);
>> void initializeAlwaysInlinerPass(PassRegistry&);
>> void initializeArgPromotionPass(PassRegistry&);
>> +void initializeAutoProfilePass(PassRegistry&);
>> void initializeBarrierNoopPass(PassRegistry&);
>> void initializeBasicAliasAnalysisPass(PassRegistry&);
>> void initializeBasicCallGraphPass(PassRegistry&);
>> diff --git a/include/llvm/Transforms/Scalar.h b/include/llvm/Transforms/Scalar.h
>> index 51aeba4..9ee5b49 100644
>> --- a/include/llvm/Transforms/Scalar.h
>> +++ b/include/llvm/Transforms/Scalar.h
>> @@ -354,6 +354,12 @@ FunctionPass *createLowerExpectIntrinsicPass();
>> //
>> FunctionPass *createPartiallyInlineLibCallsPass();
>>
>> +//===----------------------------------------------------------------------===//
>> +//
>> +// AutoProfilePass - Loads profile data from disk and generates
>> +// IR metadata to reflect the profile.
>> +FunctionPass *createAutoProfilePass();
>> +
>> } // End llvm namespace
>>
>> #endif
>> diff --git a/lib/Transforms/Scalar/AutoProfile.cpp b/lib/Transforms/Scalar/AutoProfile.cpp
>> new file mode 100644
>> index 0000000..7a63409
>> --- /dev/null
>> +++ b/lib/Transforms/Scalar/AutoProfile.cpp
>> @@ -0,0 +1,363 @@
>> +//===- AutoProfile.cpp - Incorporate an external profile into the IR ------===//
>> +//
>> +//                      The LLVM Compiler Infrastructure
>> +//
>> +// This file is distributed under the University of Illinois Open Source
>> +// License. See LICENSE.TXT for details.
>> +//
>> +//===----------------------------------------------------------------------===//
>> +//
>> +// This file implements the Auto Profile transformation. This pass reads a
>> +// profile file generated by an external profiling source and generates IR
>> +// metadata to reflect the profile information in the given profile.
>> +//
>> +// TODO(dnovillo) Add more.
>> +//
>> +//===----------------------------------------------------------------------===//
>> +
>> +#define DEBUG_TYPE "auto-profile"
>> +
>> +#include <cstdlib>
>> +
>> +#include "llvm/ADT/DenseMap.h"
>> +#include "llvm/ADT/OwningPtr.h"
>> +#include "llvm/ADT/StringMap.h"
>> +#include "llvm/DebugInfo/DIContext.h"
>> +#include "llvm/IR/Constants.h"
>> +#include "llvm/IR/Function.h"
>> +#include "llvm/IR/Instructions.h"
>> +#include "llvm/IR/LLVMContext.h"
>> +#include "llvm/IR/Metadata.h"
>> +#include "llvm/IR/Module.h"
>> +#include "llvm/Pass.h"
>> +#include "llvm/Support/CommandLine.h"
>> +#include "llvm/Support/Debug.h"
>> +#include "llvm/Support/InstIterator.h"
>> +#include "llvm/Support/MemoryBuffer.h"
>> +#include "llvm/Support/Regex.h"
>> +#include "llvm/Support/raw_ostream.h"
>> +#include "llvm/Transforms/Scalar.h"
>> +
>> +using namespace llvm;
>> +
>> +// Command line option naming the profile file to load.
>> +static cl::opt<std::string> AutoProfileFilename(
>> +    "auto-profile-file", cl::init("autoprof.llvm"), cl::value_desc("filename"),
>> +    cl::desc("Profile file loaded by -auto-profile"), cl::Hidden);
>> +
>> +namespace {
>> +
>> +// Base profiler abstract class. This defines the abstract interface
>> +// that every profiler should respond to.
>> +//
>> +// TODO(dnovillo) - Eventually this class ought to move to a separate file.
>> +// There will be several types of profile loaders. Having them all together in
>> +// this file will get pretty messy.
>> +class AutoProfiler {
>> +public:
>> +  AutoProfiler(std::string filename) : filename_(filename) {}
>> +  virtual ~AutoProfiler() {}
>> +
>> +  // Load the profile from a file in the native format of this profile.
>> +  virtual bool loadNative() = 0;
>> +
>> +  // Load the profile from a text file.
>> +  virtual bool loadText() = 0;
>> +
>> +  // Dump this profile on stderr.
>> +  virtual void dump() = 0;
>> +
>> +  // Modify the IR with annotations corresponding to the loaded profile.
>> +  virtual bool emitAnnotations(Function &F) = 0;
>> +
>> +  // Instantiate an auto-profiler object based on the detected format of the
>> +  // given file name. Set *is_text to true if the file is in text format.
>> +  static AutoProfiler *
>> +  instantiateProfiler(const std::string filename, bool *is_text);
>> +
>> +protected:
>> +  // Path name to the file holding the profile data.
>> +  std::string filename_;
>> +};
>> +
>> +
>> +// Sample-based profiler. These profiles contain execution frequency
>> +// information on the function bodies of the program.
>> +class AutoProfileSampleBased : public AutoProfiler {
>> +public:
>> +  AutoProfileSampleBased(std::string filename)
>> +      : AutoProfiler(filename), profiles_(0) {}
>> +
>> +  // Metadata kind for autoprofile.samples.
>> +  static unsigned AutoProfileSamplesMDKind;
>> +
>> +  virtual void dump();
>> +  virtual bool loadText();
>> +  virtual bool loadNative() { llvm_unreachable("not implemented"); }
>> +  virtual bool emitAnnotations(Function &F);
>> +
>> +  void dumpFunctionProfile(StringRef fn_name);
>> +
>> +protected:
>> +  typedef DenseMap<uint32_t, uint32_t> BodySampleMap;
>> +
>> +  struct FunctionProfile {
>> +    // Total number of samples collected inside this function. Samples
>> +    // are cumulative: they include all the samples collected inside
>> +    // this function and all its inlined callees.
>> +    unsigned TotalSamples;
>> +
>> +    // Total number of samples collected at the head of the function.
>> +    unsigned TotalHeadSamples;
>> +
>> +    // Map <line offset, samples> of line offset to samples collected
>> +    // inside the function. Each entry in this map contains the number
>> +    // of samples collected at the corresponding line offset. All line
>> +    // locations are an offset from the start of the function.
>> +    BodySampleMap BodySamples;
>> +  };
>> +
>> +  typedef StringMap<FunctionProfile> FunctionProfileMap;
>> +
>> +  FunctionProfileMap profiles_;
>> +};
>> +
>> +// Loader class for text-based profiles. These are mostly useful to
>> +// generate unit tests and not much else.
>> +class AutoProfileTextLoader {
>> +public:
>> +  AutoProfileTextLoader(std::string filename) : filename_(filename) {
>> +    error_code ec;
>> +    ec = MemoryBuffer::getFile(filename_, buffer_);
>> +    if (ec)
>> +      report_fatal_error("Could not open profile file " + filename_ + ": " +
>> +                         ec.message());
>> +    fp_ = buffer_->getBufferStart();
>> +    linenum_ = 0;
>> +  }
>> +
>> +  // Read a line from the mapped file. Update the current line and file pointer.
>> +  StringRef readLine() {
>> +    size_t length = 0;
>> +    const char *start = fp_;
>> +    while (fp_ != buffer_->getBufferEnd() && *fp_ != '\n') {
>> +      length++;
>> +      fp_++;
>> +    }
>> +    if (fp_ != buffer_->getBufferEnd())
>> +      fp_++;
>> +    linenum_++;
>> +    return StringRef(start, length);
>> +  }
>> +
>> +  // Return true, if we've reached EOF.
>> +  bool atEOF() const {
>> +    return fp_ == buffer_->getBufferEnd();
>> +  }
>> +
>> +  void reportParseError(std::string msg) const {
>> +    // TODO(dnovillo) - This is almost certainly the wrong way to emit
>> +    // diagnostics and exit the compiler.
>> +    errs() << filename_ << ":" << linenum_ << ": " << msg << "\n";
>> +    exit(1);
>> +  }
>> +
>> +private:
>> +  OwningPtr<MemoryBuffer> buffer_;
>> +  const char *fp_;
>> +  size_t linenum_;
>> +  std::string filename_;
>> +};
>> +
>> +// Auto profile pass. This pass reads profile data from the file specified
>> +// by -auto-profile-file and annotates every affected function with the
>> +// profile information found in that file.
>> +class AutoProfile : public FunctionPass {
>> +public:
>> +  // Class identification, replacement for typeinfo
>> +  static char ID;
>> +
>> +  AutoProfile() : FunctionPass(ID), profiler_(0) {
>> +    initializeAutoProfilePass(*PassRegistry::getPassRegistry());
>> +  }
>> +
>> +  virtual bool doInitialization(Module &M);
>> +
>> +  ~AutoProfile() { delete profiler_; }
>> +
>> +  void dump() { profiler_->dump(); }
>> +
>> +  virtual const char *getPassName() const {
>> +    return "Auto profile pass";
>> +  }
>> +
>> +  virtual bool runOnFunction(Function &F);
>> +
>> +  virtual void getAnalysisUsage(AnalysisUsage &AU) const {
>> +    AU.setPreservesCFG();
>> +  }
>> +
>> +  bool loadProfile();
>> +
>> +protected:
>> +  AutoProfiler *profiler_;
>> +};
>> +}
>> +
>> +// Dump the sample profile for the given function.
>> +void AutoProfileSampleBased::dumpFunctionProfile(StringRef fn_name) {
>> +  FunctionProfile fn_profile = profiles_[fn_name];
>> +  errs() << "Function: " << fn_name << ", " << fn_profile.TotalSamples << ", "
>> +         << fn_profile.TotalHeadSamples << ", " << fn_profile.BodySamples.size()
>> +         << " sampled lines\n";
>> +  for (BodySampleMap::const_iterator si = fn_profile.BodySamples.begin();
>> +       si != fn_profile.BodySamples.end(); si++)
>> +    errs() << "\tline offset: " << si->first
>> +           << ", number of samples: " << si->second << "\n";
>> +  errs() << "\n";
>> +}
>> +
>> +// Dump all the collected function profiles.
>> +void AutoProfileSampleBased::dump() {
>> +  FunctionProfileMap::const_iterator it;
>> +  for (it = profiles_.begin(); it != profiles_.end(); it++)
>> +    dumpFunctionProfile(it->getKey());
>> +}
>> +
>> +// Load a sample profile from a text file.
>> +bool AutoProfileSampleBased::loadText() {
>> +  AutoProfileTextLoader loader(filename_);
>> +
>> +  // Read the symbol table.
>> +  std::string line = loader.readLine().str();
>> +  if (line != "symbol table")
>> +    loader.reportParseError("Expected 'symbol table', found " + line);
>> +  Regex num("[0-9]+");
>> +  line = loader.readLine().str();
>> +  if (!num.match(line))
>> +    loader.reportParseError("Expected a number, found " + line);
>> +  int num_symbols = atoi(line.c_str());
>> +  for (int i = 0; i < num_symbols; i++) {
>> +    StringRef fn_name = loader.readLine();
>> +    FunctionProfile &fn_profile = profiles_[fn_name];
>> +    fn_profile.BodySamples.clear();
>> +    fn_profile.TotalSamples = 0;
>> +    fn_profile.TotalHeadSamples = 0;
>> +  }
>> +
>> +  // Read the profile of each function. Since each function may be
>> +  // mentioned more than once, and we are collecting flat profiles,
>> +  // accumulate samples as we parse them.
>> +  while (!loader.atEOF()) {
>> +    SmallVector<StringRef, 4> matches;
>> +    Regex head_re("^([^:]+):([0-9]+):([0-9]+):([0-9]+)$");
>> +    line = loader.readLine().str();
>> +    if (!head_re.match(line, &matches))
>> +      loader.reportParseError("Expected 'mangled_name:NUM:NUM:NUM', found " +
>> +                              line);
>> +    assert(matches.size() == 5);
>> +    StringRef fn_name = matches[1];
>> +    unsigned num_samples = atoi(matches[2].str().c_str());
>> +    unsigned num_head_samples = atoi(matches[3].str().c_str());
>> +    unsigned num_sampled_lines = atoi(matches[4].str().c_str());
>> +    FunctionProfile &fn_profile = profiles_[fn_name];
>> +    fn_profile.TotalSamples += num_samples;
>> +    fn_profile.TotalHeadSamples += num_head_samples;
>> +    BodySampleMap &sample_map = fn_profile.BodySamples;
>> +    unsigned i;
>> +    for (i = 0; i < num_sampled_lines && !loader.atEOF(); i++) {
>> +      Regex line_sample("^([0-9]+): ([0-9]+)$");
>> +      line = loader.readLine().str();
>> +      if (!line_sample.match(line, &matches))
>> +        loader.reportParseError("Expected 'NUM: NUM', found " + line);
>> +      assert(matches.size() == 3);
>> +      unsigned line_offset = atoi(matches[1].str().c_str());
>> +      unsigned num_samples = atoi(matches[2].str().c_str());
>> +      sample_map[line_offset] += num_samples;
>> +    }
>> +
>> +    if (i < num_sampled_lines)
>> +      loader.reportParseError("Unexpected end of file");
>> +  }
>> +
>> +  return true;
>> +}
>> +
>> +// Annotate function F with the contents of the profile.
>> +bool AutoProfileSampleBased::emitAnnotations(Function &F) {
>> +  bool changed = false;
>> +  StringRef name = F.getName();
>> +  FunctionProfile &fn_profile = profiles_[name];
>> +  BodySampleMap &body_samples = fn_profile.BodySamples;
>> +  Instruction &first_inst = *(inst_begin(F));
>> +  unsigned first_line = first_inst.getDebugLoc().getLine();
>> +  LLVMContext &context = first_inst.getContext();
>> +  for (inst_iterator i = inst_begin(F); i != inst_end(F); ++i) {
>> +    Instruction &inst = *i;
>> +    const DebugLoc &dloc = inst.getDebugLoc();
>> +    unsigned loc_offset = dloc.getLine() - first_line + 1;
>> +    if (body_samples.find(loc_offset) != body_samples.end()) {
>> +      SmallVector<Value *, 1> sample_values;
>> +      sample_values.push_back(ConstantInt::get(Type::getInt32Ty(context),
>> +                                               body_samples[loc_offset]));
>> +      MDNode *md = MDNode::get(context, sample_values);
>> +      inst.setMetadata(AutoProfileSamplesMDKind, md);
>> +      changed = true;
>> +    }
>> +  }
>> +
>> +  DEBUG(if (changed) {
>> +              dbgs() << "\n\nInstructions changed in " << name << "\n";
>> +              for (inst_iterator i = inst_begin(F); i != inst_end(F); ++i) {
>> +                Instruction &inst = *i;
>> +                MDNode *md = inst.getMetadata(AutoProfileSamplesMDKind);
>> +                if (md) {
>> +                  assert(md->getNumOperands() == 1);
>> +                  ConstantInt *val = dyn_cast<ConstantInt>(md->getOperand(0));
>> +                  dbgs() << inst << " (" << val->getValue().getZExtValue()
>> +                         << " samples)\n";
>> +                }
>> +              }
>> +            });
>> +
>> +  return changed;
>> +}
>> +
>> +AutoProfiler *AutoProfiler::instantiateProfiler(const std::string filename,
>> +                                                bool *is_text) {
>> +  // TODO(dnovillo) - Implement file type detection and return the appropriate
>> +  // AutoProfiler sub-class instance.
>> +  *is_text = true;
>> +  return new AutoProfileSampleBased(filename);
>> +}
>> +
>> +unsigned AutoProfileSampleBased::AutoProfileSamplesMDKind = 0;
>> +char AutoProfile::ID = 0;
>> +INITIALIZE_PASS(AutoProfile, "auto-profile", "Auto Profile loader", false,
>> +                false)
>> +
>> +bool AutoProfile::runOnFunction(Function& F) {
>> +  return profiler_->emitAnnotations(F);
>> +}
>> +
>> +bool AutoProfile::loadProfile() {
>> +  bool is_text;
>> +  profiler_ =
>> +      AutoProfiler::instantiateProfiler(AutoProfileFilename, &is_text);
>> +  if (!profiler_)
>> +    return false;
>> +
>> +  return (is_text) ? profiler_->loadText() : profiler_->loadNative();
>> +}
>> +
>> +bool AutoProfile::doInitialization(Module &M) {
>> +  if (!loadProfile())
>> +    return false;
>> +  AutoProfileSampleBased::AutoProfileSamplesMDKind =
>> +      M.getContext().getMDKindID("autoprofile.samples");
>> +  return true;
>> +}
>> +
>> +FunctionPass *llvm::createAutoProfilePass() {
>> +  return new AutoProfile();
>> +}
>> diff --git a/lib/Transforms/Scalar/CMakeLists.txt b/lib/Transforms/Scalar/CMakeLists.txt
>> index 3b89fd4..1093ef0 100644
>> --- a/lib/Transforms/Scalar/CMakeLists.txt
>> +++ b/lib/Transforms/Scalar/CMakeLists.txt
>> @@ -1,5 +1,6 @@
>> add_llvm_library(LLVMScalarOpts
>>   ADCE.cpp
>> +  AutoProfile.cpp
>>   CodeGenPrepare.cpp
>>   ConstantProp.cpp
>>   CorrelatedValuePropagation.cpp
>> diff --git a/lib/Transforms/Scalar/Scalar.cpp b/lib/Transforms/Scalar/Scalar.cpp
>> index 0c3ffbc..542aa33 100644
>> --- a/lib/Transforms/Scalar/Scalar.cpp
>> +++ b/lib/Transforms/Scalar/Scalar.cpp
>> @@ -28,6 +28,7 @@ using namespace llvm;
>> /// ScalarOpts library.
>> void llvm::initializeScalarOpts(PassRegistry &Registry) {
>>   initializeADCEPass(Registry);
>> +  initializeAutoProfilePass(Registry);
>>   initializeCodeGenPreparePass(Registry);
>>   initializeConstantPropagationPass(Registry);
>>   initializeCorrelatedValuePropagationPass(Registry);
>> --
>> 1.8.4
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits