LTO API: lto_codegen_optimize and lto_codegen_compile_optimized

Mon Feb 2 11:33:48 PST 2015

Please clang-format the patch.
Please follow the llvm code style for new code (funcitonName, VariableName).

LGTM with that.

On 2 February 2015 at 14:07, Manman Ren <mren at apple.com> wrote:
> Hi all,
>
> This updated patch should address all the review comments.
>
> lto_api_version is added for ld64 to decide if the new api
> (lto_codegen_optimize and lto_codegen_compile_optimize) is available.
>
> Thanks,
> Manman
>
>
> On Jan 30, 2015, at 12:58 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com>
> wrote:
>
>
>
> On Jan 30, 2015, at 12:01 PM, Manman Ren <mren at apple.com> wrote:
>
> We currently have lto_codegen_compile that runs IR passes and codegen
> passes. If we dump the bitcode file after lto_codegen_compile, codegen
> passes can modify IR (one example is StackProtector pass).
> If we run “llc lto.opt.bc”, we will get assertion failure because llc runs
> StackProtector pass again.
>
> To enable dumping bitcode file right before running the codegen passes, this
> patch splits lto_codegen_compile in 2
> lto_codegen_optimize
> lto_codegen_compile_optimize
>
> Linker can choose to dump the bitcode file before calling
> lto_codegen_compile_optimize and we can debug the codegen passes from the
> dumped bitcode file.
>
> I initially had a different proposal, thanks to Rafael for suggesting this
> solution!
>
> Cheers,
> Manman
>
> <lto.patch>
>
>
> +kledzik (see discussion about lto_codegen_optimize() below).
>
> diff --git a/include/llvm-c/lto.h b/include/llvm-c/lto.h
> index 3f30d6d..46ebc79 100644
> --- a/include/llvm-c/lto.h
> +++ b/include/llvm-c/lto.h
> @@ -40,7 +40,7 @@ typedef bool lto_bool_t;
>  * @{
>  */
>
> -#define LTO_API_VERSION 11
> +#define LTO_API_VERSION 12
>
> /**
>  * \since prior to LTO_API_VERSION=3
> @@ -484,6 +484,27 @@ lto_codegen_compile(lto_code_gen_t cg, size_t* length);
> extern lto_bool_t
> lto_codegen_compile_to_file(lto_code_gen_t cg, const char** name);
>
>
> Please update `lto_codegen_compile_to_file()` and
> `lto_codegen_compile()` to be described in terms of the new functions.
>
>
> +/**
> + * Runs optimization for the merged module. Returns true on error.
> + *
> + * \since LTO_API_VERSION=12
> + */
> +extern lto_bool_t
> +lto_codegen_optimize(lto_code_gen_t cg);
>
>
> It's not clear to me how the linker can determine at runtime whether
> this function is available in the `dlopen()`'ed libLTO.dylib.
>
> IIRC, the ld64 checks for runtime-availability of the new API by calling
> the function, and assuming it doesn't exist if the function returns NULL
> (the return value of functions that don't exist at runtime).  Usually
> the flow is something like:
>
>        Result = nullptr;
>    #ifdef LTO_API_VERSION >= XYZ
>        Result = lto_new_api(...);
>        if (!Result)
>    #else
>        Result = lto_old_api(...)
>    #endif
>        // deal with Result
>
> That scheme doesn't work for functions that can return 0 as a valid
> result.
>
> Simply switching the error condition (0 on failure) won't work either,
> since then on error with the new API, you'd run the old API.
>
> I can see two possibilities:
>
> 1. Change `lto_codegen_optimize()` to return `1` on success and `2` on
>    failure.  Then `0` will mark that the API isn't available.
>
> 2. Add an `lto_api_version()` function that returns the runtime API
>    version.
>
> The latter option seems cleaner to me.
>
> @Nick: any thoughts on this?
>
> Then the linker code would look something like this:
>
>        Result = nullptr;
>    #ifdef LTO_API_VERSION >= XYZ
>        if (lto_api_version() >= XYZ)
>          Result = lto_new_api(...);
>        else
>    #else
>        Result = lto_old_api(...)
>    #endif
>        // deal with Result
>
> +
> +/**
> + * Generates code for the optimized merged module into one native object
> file.
> + * On success returns a pointer to a generated mach-o/ELF buffer and
> + * length set to the buffer size.  The buffer is owned by the
> + * lto_code_gen_t and will be freed when lto_codegen_dispose()
> + * is called, or lto_codegen_compile() is called again.
> + * On failure, returns NULL (check lto_get_error_message() for details).
>
>
> This comment should specify that it doesn't run IR-level optimizations.
>
> + *
> + * \since LTO_API_VERSION=12
> + */
> +extern const void*
> +lto_codegen_compile_optimized(lto_code_gen_t cg, size_t* length);
> +
>
> /**
>  * Sets options to help debug codegen bugs.
> diff --git a/include/llvm/LTO/LTOCodeGenerator.h
> b/include/llvm/LTO/LTOCodeGenerator.h
> index 0c9ce4a..ebba400 100644
> --- a/include/llvm/LTO/LTOCodeGenerator.h
> +++ b/include/llvm/LTO/LTOCodeGenerator.h
> @@ -117,6 +117,19 @@ struct LTOCodeGenerator {
>                       bool disableVectorization,
>                       std::string &errMsg);
>
> +  // Optimizes the merged module. Returns true on success.
> +  bool optimize(bool disableOpt,
> +                bool disableInline,
> +                bool disableGVNLoadPRE,
> +                bool disableVectorization,
> +                std::string &errMsg);
> +
> +  // Compiles the merged optimized module into a single object file. It
> brings the
> +  // object to a buffer, and returns the buffer to the caller. Return NULL
> if the
> +  // compilation was not successful.
> +  const void *compile_optimized(size_t *length,
> +                                std::string &errMsg);
>
>
> The function names here aren't in the C API, so they should follow the
> regular LLVM coding conventions.  Maybe `compileOptimized()`?
>
> +
>   void setDiagnosticHandler(lto_diagnostic_handler_t, void *);
>
>   LLVMContext &getContext() { return Context; }
> @@ -124,9 +137,8 @@ struct LTOCodeGenerator {
> private:
>   void initializeLTOPasses();
>
> -  bool generateObjectFile(raw_ostream &out, bool disableOpt, bool
> disableInline,
> -                          bool disableGVNLoadPRE, bool
> disableVectorization,
> -                          std::string &errMsg);
> +  bool compile_optimized(raw_ostream &out, std::string &errMsg);
> +  bool compile_optimized_to_file(const char** name, std::string &errMsg);
>
>
> Same here about function names.
>
>   void applyScopeRestrictions();
>   void applyRestriction(GlobalValue &GV, ArrayRef<StringRef> Libcalls,
>                         std::vector<const char *> &MustPreserveList,
> diff --git a/lib/LTO/LTOCodeGenerator.cpp b/lib/LTO/LTOCodeGenerator.cpp
> index d937556..2fc9ee0 100644
> --- a/lib/LTO/LTOCodeGenerator.cpp
> +++ b/lib/LTO/LTOCodeGenerator.cpp
> @@ -201,12 +201,8 @@ bool LTOCodeGenerator::writeMergedModules(const char
> *path,
>   return true;
> }
>
> -bool LTOCodeGenerator::compile_to_file(const char** name,
> -                                       bool disableOpt,
> -                                       bool disableInline,
> -                                       bool disableGVNLoadPRE,
> -                                       bool disableVectorization,
> -                                       std::string& errMsg) {
> +bool LTOCodeGenerator::compile_optimized_to_file(const char** name,
> +                                                 std::string& errMsg) {
>   // make unique temp .o file to put generated object file
>   SmallString<128> Filename;
>   int FD;
> @@ -221,8 +217,7 @@ bool LTOCodeGenerator::compile_to_file(const char**
> name,
>   tool_output_file objFile(Filename.c_str(), FD);
>
>   bool genResult =
> -      generateObjectFile(objFile.os(), disableOpt, disableInline,
> -                         disableGVNLoadPRE, disableVectorization, errMsg);
> +      compile_optimized(objFile.os(), errMsg);
>   objFile.os().close();
>   if (objFile.os().has_error()) {
>     objFile.os().clear_error();
> @@ -241,15 +236,10 @@ bool LTOCodeGenerator::compile_to_file(const char**
> name,
>   return true;
> }
>
> -const void* LTOCodeGenerator::compile(size_t* length,
> -                                      bool disableOpt,
> -                                      bool disableInline,
> -                                      bool disableGVNLoadPRE,
> -                                      bool disableVectorization,
> -                                      std::string& errMsg) {
> +const void* LTOCodeGenerator::compile_optimized(size_t* length,
> +                                                std::string& errMsg) {
>   const char *name;
> -  if (!compile_to_file(&name, disableOpt, disableInline, disableGVNLoadPRE,
> -                       disableVectorization, errMsg))
> +  if (!compile_optimized_to_file(&name, errMsg))
>     return nullptr;
>
>   // read .o file into memory buffer
> @@ -272,6 +262,32 @@ const void* LTOCodeGenerator::compile(size_t* length,
>   return NativeObjectFile->getBufferStart();
> }
>
> +
> +bool LTOCodeGenerator::compile_to_file(const char** name,
> +                                       bool disableOpt,
> +                                       bool disableInline,
> +                                       bool disableGVNLoadPRE,
> +                                       bool disableVectorization,
> +                                       std::string& errMsg) {
> +  if (!optimize(disableOpt, disableInline, disableGVNLoadPRE,
> disableVectorization, errMsg))
>
>
> 80-column?
>
> +    return false;
> +
> +  return compile_optimized_to_file(name, errMsg);
> +}
> +
> +const void* LTOCodeGenerator::compile(size_t* length,
> +                                      bool disableOpt,
> +                                      bool disableInline,
> +                                      bool disableGVNLoadPRE,
> +                                      bool disableVectorization,
> +                                      std::string& errMsg) {
> +  if (!optimize(disableOpt, disableInline, disableGVNLoadPRE,
> +                       disableVectorization, errMsg))
>
>
> Whitespace looks strange here.  clang-format?
>
> +    return nullptr;
> +
> +  return compile_optimized(length, errMsg);
> +}
> +
> bool LTOCodeGenerator::determineTarget(std::string &errMsg) {
>   if (TargetMach)
>     return true;
> @@ -458,12 +474,11 @@ void LTOCodeGenerator::applyScopeRestrictions() {
> }
>
> /// Optimize merged modules using various IPO passes
> -bool LTOCodeGenerator::generateObjectFile(raw_ostream &out,
> -                                          bool DisableOpt,
> -                                          bool DisableInline,
> -                                          bool DisableGVNLoadPRE,
> -                                          bool DisableVectorization,
> -                                          std::string &errMsg) {
> +bool LTOCodeGenerator::optimize(bool DisableOpt,
> +                                bool DisableInline,
> +                                bool DisableGVNLoadPRE,
> +                                bool DisableVectorization,
> +                                std::string &errMsg) {
>   if (!this->determineTarget(errMsg))
>     return false;
>
> @@ -493,6 +508,22 @@ bool LTOCodeGenerator::generateObjectFile(raw_ostream
> &out,
>
>   PMB.populateLTOPassManager(passes, TargetMach);
>
> +  // Run our queue of passes all at once now, efficiently.
> +  passes.run(*mergedModule);
> +
> +  return true;
> +}
> +
> +bool LTOCodeGenerator::compile_optimized(raw_ostream &out,
> +                                         std::string &errMsg) {
> +  if (!this->determineTarget(errMsg))
> +    return false;
> +
> +  Module *mergedModule = IRLinker.getModule();
> +
> +  // Mark which symbols can not be internalized
> +  this->applyScopeRestrictions();
> +
>   PassManager codeGenPasses;
>
>   codeGenPasses.add(new DataLayoutPass());
> @@ -509,9 +540,6 @@ bool LTOCodeGenerator::generateObjectFile(raw_ostream
> &out,
>     return false;
>   }
>
> -  // Run our queue of passes all at once now, efficiently.
> -  passes.run(*mergedModule);
> -
>   // Run the code generator, and write assembly file
>   codeGenPasses.run(*mergedModule);
>
> diff --git a/tools/lto/lto.cpp b/tools/lto/lto.cpp
> index ec0372e..b4576cf 100644
> --- a/tools/lto/lto.cpp
> +++ b/tools/lto/lto.cpp
> @@ -289,6 +289,26 @@ const void *lto_codegen_compile(lto_code_gen_t cg,
> size_t *length) {
>                              sLastErrorString);
> }
>
> +bool lto_codegen_optimize(lto_code_gen_t cg) {
> +  if (!parsedOptions) {
> +    unwrap(cg)->parseCodeGenDebugOptions();
> +    lto_add_attrs(cg);
> +    parsedOptions = true;
> +  }
> +  return !unwrap(cg)->optimize(DisableOpt, DisableInline,
> +                               DisableGVNLoadPRE, DisableLTOVectorization,
> +                               sLastErrorString);
> +}
> +
> +const void *lto_codegen_compile_optimized(lto_code_gen_t cg, size_t
> *length) {
> +  if (!parsedOptions) {
> +    unwrap(cg)->parseCodeGenDebugOptions();
> +    lto_add_attrs(cg);
> +    parsedOptions = true;
> +  }
> +  return unwrap(cg)->compile_optimized(length, sLastErrorString);
> +}
> +
> bool lto_codegen_compile_to_file(lto_code_gen_t cg, const char **name) {
>   if (!parsedOptions) {
>     unwrap(cg)->parseCodeGenDebugOptions();
> diff --git a/tools/lto/lto.exports b/tools/lto/lto.exports
> index 8ba9d28..82277fb 100644
> --- a/tools/lto/lto.exports
> +++ b/tools/lto/lto.exports
> @@ -37,6 +37,8 @@ lto_codegen_set_assembler_args
> lto_codegen_set_assembler_path
> lto_codegen_set_cpu
> lto_codegen_compile_to_file
> +lto_codegen_optimize
> +lto_codegen_compile_optimized
> LLVMCreateDisasm
> LLVMCreateDisasmCPU
> LLVMDisasmDispose
>
>
>