Automatic PGO - Initial implementation (1/N)
Evan Cheng
evan.cheng at apple.com
Wed Sep 25 09:11:06 PDT 2013
Hi Diego,
I am curious. Why is this called automatic PGO? Why is it automatic?
Also, why is the auto-profile part done as a scalar transformation pass? Did you consider alternatives?
Thanks,
Evan
Sent from my iPad
> On Sep 24, 2013, at 7:07 AM, Diego Novillo <dnovillo at google.com> wrote:
>
> This adds two options: -auto-profile and -auto-profile-file. The
> -auto-profile option causes the compiler to read the profile file named
> by -auto-profile-file and emit IR metadata reflecting that profile.
>
> The profile file is assumed to have been generated by an external
> profile source. The profile information is converted into IR metadata,
> which is later used by the analysis routines to estimate block
> frequencies, edge weights and other related data.
>
> External profile information files have no fixed format; each profiler
> is free to define its own. This includes both the on-disk representation
> of the profile and the kind of profile information stored in the file.
> A common kind of profile is based on sampling (e.g., perf), which
> essentially counts how many samples were collected at each line of the
> program during the run.
>
> The only requirement is that each profiler must provide a way to
> load its own profile format into internal data structures and
> then a method to convert that data into IR annotations.
>
> The AutoProfile pass is organized as a scalar transformation. On
> startup, it reads the file given in -auto-profile-file to determine what
> kind of profile it contains. This file is assumed to contain profile
> information for the whole application. The profile data in the file is
> read and incorporated into the internal state of the corresponding
> profiler.
>
> To facilitate testing, I've organized the profilers to support two file
> formats: text and native. The native format is whatever on-disk
> representation the profiler wants to support. I think this will mostly
> be bitcode files, but it could be anything the profiler chooses.
> To support it, every profiler must implement the
> AutoProfiler::loadNative() function.
>
> The text format is mostly meant for debugging. Records are separated by
> newlines, but each profiler is free to interpret records as it sees fit.
> Profilers must implement the AutoProfiler::loadText() function.
>
> Finally, the pass will call AutoProfiler::emitAnnotations() for each
> function in the current translation unit. This function needs to
> translate the loaded profile into IR metadata, which the analyzer will
> later be able to use.
>
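A minimal standalone sketch of the flow described above: load a profile, then call emitAnnotations() once per function. It uses no LLVM headers. FunctionStub, ProfilerSketch, FlatSampleProfilerSketch and runPassSketch are hypothetical names introduced only for illustration, and the hard-coded profile stands in for real file parsing. The line-offset formula (line - first line + 1) follows the emitAnnotations() code in the patch below.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::Function: a name, the source line of each
// instruction, and a slot for the per-instruction sample annotation.
struct FunctionStub {
  std::string Name;
  std::vector<unsigned> InstLines;
  std::map<std::size_t, unsigned> SampleMetadata; // instruction index -> samples
};

// Mirrors the three entry points named in the message (illustrative only).
class ProfilerSketch {
public:
  virtual ~ProfilerSketch() = default;
  virtual bool loadNative() = 0;
  virtual bool loadText() = 0;
  virtual bool emitAnnotations(FunctionStub &F) = 0;
};

class FlatSampleProfilerSketch : public ProfilerSketch {
public:
  bool loadNative() override { return false; } // not implemented in this sketch

  bool loadText() override {
    // Assumption: hard-coded data instead of reading a file.
    // Function "foo": line offset 1 has 10 samples, offset 3 has 42.
    Profiles["foo"] = {{1, 10}, {3, 42}};
    return true;
  }

  bool emitAnnotations(FunctionStub &F) override {
    auto It = Profiles.find(F.Name);
    if (It == Profiles.end() || F.InstLines.empty())
      return false;
    bool Changed = false;
    // As in the patch, the first instruction's line marks the function start.
    unsigned FirstLine = F.InstLines.front();
    for (std::size_t I = 0; I < F.InstLines.size(); ++I) {
      unsigned Offset = F.InstLines[I] - FirstLine + 1;
      auto S = It->second.find(Offset);
      if (S != It->second.end()) {
        F.SampleMetadata[I] = S->second;
        Changed = true;
      }
    }
    return Changed;
  }

private:
  std::map<std::string, std::map<unsigned, unsigned>> Profiles;
};

// Driver: load once, then annotate every function in the translation unit.
bool runPassSketch(ProfilerSketch &P, bool IsText,
                   std::vector<FunctionStub> &TU) {
  if (!(IsText ? P.loadText() : P.loadNative()))
    return false;
  bool Changed = false;
  for (FunctionStub &F : TU)
    Changed |= P.emitAnnotations(F);
  return Changed;
}
```

With a function "foo" whose instructions sit on lines 20, 21 and 22, this sketch annotates the first and third instructions and leaves functions absent from the profile untouched.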
> This patch implements the first steps towards the above design. I've
> implemented a sample-based flat profiler. The format of the profile is
> fairly simplistic. Each sampled function contains a list of relative
> line locations (from the start of the function) together with a count
> representing how many samples were collected at that line during
> execution. I generate this profile using perf and a separate converter
> tool.
>
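For reference, the text format can be read off the loadText() code in the patch below: a "symbol table" line, a symbol count, one function name per line, then records of the form name:total:head:num_lines, each followed by num_lines lines of the form "offset: samples". The following is a standalone sketch of such a parser using only the C++ standard library. The names parseTextProfile, ProfileMap, toUnsigned and fail are hypothetical, and unlike the patch it returns an error string instead of exiting.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <regex>
#include <sstream>
#include <string>

struct FunctionProfile {
  unsigned TotalSamples = 0;
  unsigned TotalHeadSamples = 0;
  std::map<unsigned, unsigned> BodySamples; // line offset -> samples
};
using ProfileMap = std::map<std::string, FunctionProfile>;

static unsigned toUnsigned(const std::string &S) {
  return static_cast<unsigned>(std::stoul(S));
}

static bool fail(std::string &Err, const std::string &Msg) {
  Err = Msg;
  return false;
}

// Parses the flat sample profile text format; accumulates repeated records.
bool parseTextProfile(std::istream &In, ProfileMap &Out, std::string &Err) {
  static const std::regex NumRe("[0-9]+");
  static const std::regex HeadRe("([^:]+):([0-9]+):([0-9]+):([0-9]+)");
  static const std::regex LineRe("([0-9]+): ([0-9]+)");
  std::string Line;
  std::smatch M;

  if (!std::getline(In, Line) || Line != "symbol table")
    return fail(Err, "expected 'symbol table'");
  if (!std::getline(In, Line) || !std::regex_match(Line, NumRe))
    return fail(Err, "expected a number");
  unsigned NumSymbols = toUnsigned(Line);
  for (unsigned I = 0; I < NumSymbols; ++I) {
    if (!std::getline(In, Line))
      return fail(Err, "unexpected end of file");
    Out[Line]; // create an empty profile for this symbol
  }

  while (std::getline(In, Line)) {
    if (!std::regex_match(Line, M, HeadRe))
      return fail(Err, "expected 'mangled_name:NUM:NUM:NUM'");
    FunctionProfile &FP = Out[M[1].str()];
    FP.TotalSamples += toUnsigned(M[2].str());
    FP.TotalHeadSamples += toUnsigned(M[3].str());
    unsigned NumLines = toUnsigned(M[4].str());
    for (unsigned I = 0; I < NumLines; ++I) {
      if (!std::getline(In, Line))
        return fail(Err, "unexpected end of file");
      if (!std::regex_match(Line, M, LineRe))
        return fail(Err, "expected 'NUM: NUM'");
      FP.BodySamples[toUnsigned(M[1].str())] += toUnsigned(M[2].str());
    }
  }
  return true;
}
```

A made-up input such as "symbol table", "1", "foo", "foo:100:5:2", "1: 60", "3: 40" (one item per line) yields a single profile for foo with two sampled line offsets. Repeated records for the same function accumulate, matching the flat-profile behavior in the patch.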
> Currently, I have only implemented a text format for these profiles. I
> am interested in initial feedback to the whole approach before I send
> the other parts of the implementation for review.
>
> This patch implements:
>
> - The AutoProfile pass.
> - The base AutoProfiler class with the core interface.
> - A SampleBasedProfiler using the above interface. The profiler
> generates metadata autoprofile.samples on every IR instruction that
> matches the profiles.
> - A text loader class to assist the implementation of
> AutoProfiler::loadText().
>
> Caveats and questions:
>
> 1- I am almost certainly using the wrong APIs or using the right
> APIs in unorthodox ways. Please point me to better
> alternatives.
>
> 2- I was surprised to learn that line number information is not
> transferred into the IR unless we are emitting debug
> information. For sample-based profiling, I'm going to need
> line number information generated by the front-end
> independently of debug info. Eric, is that possible?
>
> 3- I have not included in this patch changes to the analyzer. I
> want to keep it focused on profile loading and IR
> annotation. In the analyzer, we will have propagation of
> attributes and other fixes (e.g., from the samples it is
> possible to have instructions in the same basic block
> registered with differing numbers of samples). I also have not
> included changes to code motion to get rid of the autoprofile
> information.
>
> 4- I need to add test cases.
>
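On caveat 3, instructions in one basic block can carry different sample counts even though they all execute the same number of times. One plausible way to reconcile them, which the message does not specify and is assumed here purely for illustration, is to take the maximum count seen in the block as the block's weight. The name blockWeightSketch is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: InstSamples holds the sample count attached to each
// instruction of a single basic block (0 for instructions with no samples).
// Returns the maximum as the block's weight, or 0 for an empty block.
unsigned blockWeightSketch(const std::vector<unsigned> &InstSamples) {
  if (InstSamples.empty())
    return 0;
  return *std::max_element(InstSamples.begin(), InstSamples.end());
}
```

Taking the maximum rather than the sum or average avoids diluting the weight with instructions that happened to attract few samples, but other policies are equally defensible.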
> Mainly, I'm interested in making sure that this direction is
> generally useful. I haven't given a lot of thought to other types
> of profiling, but I'm certain any kind of tracing or other
> execution frequency profiles can be adapted. Things like value
> profiling may be a bit more involved, but mostly because I'm not
> sure how the type and symbols are tracked in LLVM.
>
> Thanks. Diego.
> ---
> include/llvm/InitializePasses.h | 1 +
> include/llvm/Transforms/Scalar.h | 6 +
> lib/Transforms/Scalar/AutoProfile.cpp | 363 ++++++++++++++++++++++++++++++++++
> lib/Transforms/Scalar/CMakeLists.txt | 1 +
> lib/Transforms/Scalar/Scalar.cpp | 1 +
> 5 files changed, 372 insertions(+)
> create mode 100644 lib/Transforms/Scalar/AutoProfile.cpp
>
> diff --git a/include/llvm/InitializePasses.h b/include/llvm/InitializePasses.h
> index 1b50bb2..c922228 100644
> --- a/include/llvm/InitializePasses.h
> +++ b/include/llvm/InitializePasses.h
> @@ -70,6 +70,7 @@ void initializeAliasDebuggerPass(PassRegistry&);
> void initializeAliasSetPrinterPass(PassRegistry&);
> void initializeAlwaysInlinerPass(PassRegistry&);
> void initializeArgPromotionPass(PassRegistry&);
> +void initializeAutoProfilePass(PassRegistry&);
> void initializeBarrierNoopPass(PassRegistry&);
> void initializeBasicAliasAnalysisPass(PassRegistry&);
> void initializeBasicCallGraphPass(PassRegistry&);
> diff --git a/include/llvm/Transforms/Scalar.h b/include/llvm/Transforms/Scalar.h
> index 51aeba4..9ee5b49 100644
> --- a/include/llvm/Transforms/Scalar.h
> +++ b/include/llvm/Transforms/Scalar.h
> @@ -354,6 +354,12 @@ FunctionPass *createLowerExpectIntrinsicPass();
> //
> FunctionPass *createPartiallyInlineLibCallsPass();
>
> +//===----------------------------------------------------------------------===//
> +//
> +// AutoProfilePass - Loads profile data from disk and generates
> +// IR metadata to reflect the profile.
> +FunctionPass *createAutoProfilePass();
> +
> } // End llvm namespace
>
> #endif
> diff --git a/lib/Transforms/Scalar/AutoProfile.cpp b/lib/Transforms/Scalar/AutoProfile.cpp
> new file mode 100644
> index 0000000..7a63409
> --- /dev/null
> +++ b/lib/Transforms/Scalar/AutoProfile.cpp
> @@ -0,0 +1,363 @@
> +//===- AutoProfile.cpp - Incorporate an external profile into the IR ------===//
> +//
> +// The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +//
> +// This file implements the Auto Profile transformation. This pass reads a
> +// profile file generated by an external profiling source and generates IR
> +// metadata to reflect the profile information in the given profile.
> +//
> +// TODO(dnovillo) Add more.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +#define DEBUG_TYPE "auto-profile"
> +
> +#include <cstdlib>
> +
> +#include "llvm/ADT/DenseMap.h"
> +#include "llvm/ADT/OwningPtr.h"
> +#include "llvm/ADT/StringMap.h"
> +#include "llvm/DebugInfo/DIContext.h"
> +#include "llvm/IR/Constants.h"
> +#include "llvm/IR/Function.h"
> +#include "llvm/IR/Instructions.h"
> +#include "llvm/IR/LLVMContext.h"
> +#include "llvm/IR/Metadata.h"
> +#include "llvm/IR/Module.h"
> +#include "llvm/Pass.h"
> +#include "llvm/Support/CommandLine.h"
> +#include "llvm/Support/Debug.h"
> +#include "llvm/Support/InstIterator.h"
> +#include "llvm/Support/MemoryBuffer.h"
> +#include "llvm/Support/Regex.h"
> +#include "llvm/Support/raw_ostream.h"
> +#include "llvm/Transforms/Scalar.h"
> +
> +using namespace llvm;
> +
> +// Command line option naming the profile file to load.
> +static cl::opt<std::string> AutoProfileFilename(
> + "auto-profile-file", cl::init("autoprof.llvm"), cl::value_desc("filename"),
> + cl::desc("Profile file loaded by -auto-profile"), cl::Hidden);
> +
> +namespace {
> +
> +// Base profiler abstract class. This defines the abstract interface
> +// that every profiler should respond to.
> +//
> +// TODO(dnovillo) - Eventually this class ought to move to a separate file.
> +// There will be several types of profile loaders. Having them all together in
> +// this file will get pretty messy.
> +class AutoProfiler {
> +public:
> + AutoProfiler(std::string filename) : filename_(filename) {}
> + virtual ~AutoProfiler() {}
> +
> + // Load the profile from a file in the native format of this profile.
> + virtual bool loadNative() = 0;
> +
> + // Load the profile from a text file.
> + virtual bool loadText() = 0;
> +
> + // Dump this profile on stderr.
> + virtual void dump() = 0;
> +
> + // Modify the IR with annotations corresponding to the loaded profile.
> + virtual bool emitAnnotations(Function &F) = 0;
> +
> + // Instantiate an auto-profiler object based on the detected format of the
> + // given file name. Set *is_text to true if the file is in text format.
> + static AutoProfiler *
> + instantiateProfiler(const std::string filename, bool *is_text);
> +
> +protected:
> + // Path name to the file holding the profile data.
> + std::string filename_;
> +};
> +
> +
> +// Sample-based profiler. These profiles contain execution frequency
> +// information on the function bodies of the program.
> +class AutoProfileSampleBased : public AutoProfiler {
> +public:
> + AutoProfileSampleBased(std::string filename)
> + : AutoProfiler(filename), profiles_(0) {}
> +
> + // Metadata kind for autoprofile.samples.
> + static unsigned AutoProfileSamplesMDKind;
> +
> + virtual void dump();
> + virtual bool loadText();
> + virtual bool loadNative() { llvm_unreachable("not implemented"); }
> + virtual bool emitAnnotations(Function &F);
> +
> + void dumpFunctionProfile(StringRef fn_name);
> +
> +protected:
> + typedef DenseMap<uint32_t, uint32_t> BodySampleMap;
> +
> + struct FunctionProfile {
> + // Total number of samples collected inside this function. Samples
> + // are cumulative, they include all the samples collected inside
> + // this function and all its inlined callees.
> + unsigned TotalSamples;
> +
> + // Total number of samples collected at the head of the function.
> + unsigned TotalHeadSamples;
> +
> + // Map <line offset, samples> of line offset to samples collected
> + // inside the function. Each entry in this map contains the number
> + // of samples collected at the corresponding line offset. All line
> + // locations are an offset from the start of the function.
> + BodySampleMap BodySamples;
> + };
> +
> + typedef StringMap<FunctionProfile> FunctionProfileMap;
> +
> + FunctionProfileMap profiles_;
> +};
> +
> +// Loader class for text-based profiles. These are mostly useful to
> +// generate unit tests and not much else.
> +class AutoProfileTextLoader {
> +public:
> + AutoProfileTextLoader(std::string filename) : filename_(filename) {
> + error_code ec;
> + ec = MemoryBuffer::getFile(filename_, buffer_);
> + if (ec)
> + report_fatal_error("Could not open profile file " + filename_ + ": " +
> + ec.message());
> + fp_ = buffer_->getBufferStart();
> + linenum_ = 0;
> + }
> +
> + // Read a line from the mapped file. Update the current line and file pointer.
> + StringRef readLine() {
> + size_t length = 0;
> + const char *start = fp_;
> + while (fp_ != buffer_->getBufferEnd() && *fp_ != '\n') {
> + length++;
> + fp_++;
> + }
> + if (fp_ != buffer_->getBufferEnd())
> + fp_++;
> + linenum_++;
> + return StringRef(start, length);
> + }
> +
> + // Return true, if we've reached EOF.
> + bool atEOF() const {
> + return fp_ == buffer_->getBufferEnd();
> + }
> +
> + void reportParseError(std::string msg) const {
> + // TODO(dnovillo) - This is almost certainly the wrong way to emit
> + // diagnostics and exit the compiler.
> + errs() << filename_ << ":" << linenum_ << ": " << msg << "\n";
> + exit(1);
> + }
> +
> +private:
> + OwningPtr<MemoryBuffer> buffer_;
> + const char *fp_;
> + size_t linenum_;
> + std::string filename_;
> +};
> +
> +// Auto profile pass. This pass reads profile data from the file specified
> +// by -auto-profile-file and annotates every affected function with the
> +// profile information found in that file.
> +class AutoProfile : public FunctionPass {
> +public:
> + // Class identification, replacement for typeinfo
> + static char ID;
> +
> + AutoProfile() : FunctionPass(ID), profiler_(0) {
> + initializeAutoProfilePass(*PassRegistry::getPassRegistry());
> + }
> +
> + virtual bool doInitialization(Module &M);
> +
> + ~AutoProfile() { delete profiler_; }
> +
> + void dump() { profiler_->dump(); }
> +
> + virtual const char *getPassName() const {
> + return "Auto profile pass";
> + }
> +
> + virtual bool runOnFunction(Function &F);
> +
> + virtual void getAnalysisUsage(AnalysisUsage &AU) const {
> + AU.setPreservesCFG();
> + }
> +
> + bool loadProfile();
> +
> +protected:
> + AutoProfiler *profiler_;
> +};
> +}
> +
> +// Dump the sample profile for the given function.
> +void AutoProfileSampleBased::dumpFunctionProfile(StringRef fn_name) {
> + FunctionProfile fn_profile = profiles_[fn_name];
> + errs() << "Function: " << fn_name << ", " << fn_profile.TotalSamples << ", "
> + << fn_profile.TotalHeadSamples << ", " << fn_profile.BodySamples.size()
> + << " sampled lines\n";
> + for (BodySampleMap::const_iterator si = fn_profile.BodySamples.begin();
> + si != fn_profile.BodySamples.end(); si++)
> + errs() << "\tline offset: " << si->first
> + << ", number of samples: " << si->second << "\n";
> + errs() << "\n";
> +}
> +
> +// Dump all the collected function profiles.
> +void AutoProfileSampleBased::dump() {
> + FunctionProfileMap::const_iterator it;
> + for (it = profiles_.begin(); it != profiles_.end(); it++)
> + dumpFunctionProfile(it->getKey());
> +}
> +
> +// Load a sample profile from a text file.
> +bool AutoProfileSampleBased::loadText() {
> + AutoProfileTextLoader loader(filename_);
> +
> + // Read the symbol table.
> + std::string line = loader.readLine().str();
> + if (line != "symbol table")
> + loader.reportParseError("Expected 'symbol table', found " + line);
> + Regex num("[0-9]+");
> + line = loader.readLine().str();
> + if (!num.match(line))
> + loader.reportParseError("Expected a number, found " + line);
> + int num_symbols = atoi(line.c_str());
> + for (int i = 0; i < num_symbols; i++) {
> + StringRef fn_name = loader.readLine();
> + FunctionProfile &fn_profile = profiles_[fn_name];
> + fn_profile.BodySamples.clear();
> + fn_profile.TotalSamples = 0;
> + fn_profile.TotalHeadSamples = 0;
> + }
> +
> + // Read the profile of each function. Since each function may be
> + // mentioned more than once, and we are collecting flat profiles,
> + // accumulate samples as we parse them.
> + while (!loader.atEOF()) {
> + SmallVector<StringRef, 4> matches;
> + Regex head_re("^([^:]+):([0-9]+):([0-9]+):([0-9]+)$");
> + line = loader.readLine().str();
> + if (!head_re.match(line, &matches))
> + loader.reportParseError("Expected 'mangled_name:NUM:NUM:NUM', found " +
> + line);
> + assert(matches.size() == 5);
> + StringRef fn_name = matches[1];
> + unsigned num_samples = atoi(matches[2].str().c_str());
> + unsigned num_head_samples = atoi(matches[3].str().c_str());
> + unsigned num_sampled_lines = atoi(matches[4].str().c_str());
> + FunctionProfile &fn_profile = profiles_[fn_name];
> + fn_profile.TotalSamples += num_samples;
> + fn_profile.TotalHeadSamples += num_head_samples;
> + BodySampleMap &sample_map = fn_profile.BodySamples;
> + unsigned i;
> + for (i = 0; i < num_sampled_lines && !loader.atEOF(); i++) {
> + Regex line_sample("^([0-9]+): ([0-9]+)$");
> + line = loader.readLine().str();
> + if (!line_sample.match(line, &matches))
> + loader.reportParseError("Expected 'NUM: NUM', found " + line);
> + assert(matches.size() == 3);
> + unsigned line_offset = atoi(matches[1].str().c_str());
> + unsigned num_samples = atoi(matches[2].str().c_str());
> + sample_map[line_offset] += num_samples;
> + }
> +
> + if (i < num_sampled_lines)
> + loader.reportParseError("Unexpected end of file");
> + }
> +
> + return true;
> +}
> +
> +// Annotate function F with the contents of the profile.
> +bool AutoProfileSampleBased::emitAnnotations(Function &F) {
> + bool changed = false;
> + StringRef name = F.getName();
> + FunctionProfile &fn_profile = profiles_[name];
> + BodySampleMap &body_samples = fn_profile.BodySamples;
> + Instruction &first_inst = *(inst_begin(F));
> + unsigned first_line = first_inst.getDebugLoc().getLine();
> + LLVMContext &context = first_inst.getContext();
> + for (inst_iterator i = inst_begin(F); i != inst_end(F); ++i) {
> + Instruction &inst = *i;
> + const DebugLoc &dloc = inst.getDebugLoc();
> + unsigned loc_offset = dloc.getLine() - first_line + 1;
> + if (body_samples.find(loc_offset) != body_samples.end()) {
> + SmallVector<Value *, 1> sample_values;
> + sample_values.push_back(ConstantInt::get(Type::getInt32Ty(context),
> + body_samples[loc_offset]));
> + MDNode *md = MDNode::get(context, sample_values);
> + inst.setMetadata(AutoProfileSamplesMDKind, md);
> + changed = true;
> + }
> + }
> +
> + DEBUG(if (changed) {
> + dbgs() << "\n\nInstructions changed in " << name << "\n";
> + for (inst_iterator i = inst_begin(F); i != inst_end(F); ++i) {
> + Instruction &inst = *i;
> + MDNode *md = inst.getMetadata(AutoProfileSamplesMDKind);
> + if (md) {
> + assert(md->getNumOperands() == 1);
> + ConstantInt *val = cast<ConstantInt>(md->getOperand(0));
> + dbgs() << inst << " (" << val->getValue().getZExtValue()
> + << " samples)\n";
> + }
> + }
> + });
> +
> + return changed;
> +}
> +
> +AutoProfiler *AutoProfiler::instantiateProfiler(const std::string filename,
> + bool *is_text) {
> + // TODO(dnovillo) - Implement file type detection and return the appropriate
> + // AutoProfiler sub-class instance.
> + *is_text = true;
> + return new AutoProfileSampleBased(filename);
> +}
> +
> +unsigned AutoProfileSampleBased::AutoProfileSamplesMDKind = 0;
> +char AutoProfile::ID = 0;
> +INITIALIZE_PASS(AutoProfile, "auto-profile", "Auto Profile loader", false,
> + false)
> +
> +bool AutoProfile::runOnFunction(Function& F) {
> + return profiler_->emitAnnotations(F);
> +}
> +
> +bool AutoProfile::loadProfile() {
> + bool is_text;
> + profiler_ =
> + AutoProfiler::instantiateProfiler(AutoProfileFilename, &is_text);
> + if (!profiler_)
> + return false;
> +
> + return (is_text) ? profiler_->loadText() : profiler_->loadNative();
> +}
> +
> +bool AutoProfile::doInitialization(Module &M) {
> + if (!loadProfile())
> + return false;
> + AutoProfileSampleBased::AutoProfileSamplesMDKind =
> + M.getContext().getMDKindID("autoprofile.samples");
> + return true;
> +}
> +
> +FunctionPass *llvm::createAutoProfilePass() {
> + return new AutoProfile();
> +}
> diff --git a/lib/Transforms/Scalar/CMakeLists.txt b/lib/Transforms/Scalar/CMakeLists.txt
> index 3b89fd4..1093ef0 100644
> --- a/lib/Transforms/Scalar/CMakeLists.txt
> +++ b/lib/Transforms/Scalar/CMakeLists.txt
> @@ -1,5 +1,6 @@
> add_llvm_library(LLVMScalarOpts
> ADCE.cpp
> + AutoProfile.cpp
> CodeGenPrepare.cpp
> ConstantProp.cpp
> CorrelatedValuePropagation.cpp
> diff --git a/lib/Transforms/Scalar/Scalar.cpp b/lib/Transforms/Scalar/Scalar.cpp
> index 0c3ffbc..542aa33 100644
> --- a/lib/Transforms/Scalar/Scalar.cpp
> +++ b/lib/Transforms/Scalar/Scalar.cpp
> @@ -28,6 +28,7 @@ using namespace llvm;
> /// ScalarOpts library.
> void llvm::initializeScalarOpts(PassRegistry &Registry) {
> initializeADCEPass(Registry);
> + initializeAutoProfilePass(Registry);
> initializeCodeGenPreparePass(Registry);
> initializeConstantPropagationPass(Registry);
> initializeCorrelatedValuePropagationPass(Registry);
> --
> 1.8.4
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits