LTO API: lto_codegen_optimize and lto_codegen_compile_optimized

Tue Feb 3 10:41:59 PST 2015

After fixing the coding style, committed in r228000.

Thanks,
Manman

> On Feb 2, 2015, at 11:33 AM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
> 
> Please clang-format the patch.
> Please follow the llvm code style for new code (funcitonName, VariableName).
> 
> LGTM with that.
> 
> 
> On 2 February 2015 at 14:07, Manman Ren <mren at apple.com> wrote:
>> Hi all,
>> 
>> This updated patch should address all the review comments.
>> 
>> lto_api_version is added for ld64 to decide if the new api
>> (lto_codegen_optimize and lto_codegen_compile_optimize) is available.
>> 
>> Thanks,
>> Manman
>> 
>> 
>> On Jan 30, 2015, at 12:58 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com>
>> wrote:
>> 
>> 
>> 
>> On Jan 30, 2015, at 12:01 PM, Manman Ren <mren at apple.com> wrote:
>> 
>> We currently have lto_codegen_compile that runs IR passes and codegen
>> passes. If we dump the bitcode file after lto_codegen_compile, codegen
>> passes can modify IR (one example is StackProtector pass).
>> If we run “llc lto.opt.bc”, we will get assertion failure because llc runs
>> StackProtector pass again.
>> 
>> To enable dumping bitcode file right before running the codegen passes, this
>> patch splits lto_codegen_compile in 2
>> lto_codegen_optimize
>> lto_codegen_compile_optimize
>> 
>> Linker can choose to dump the bitcode file before calling
>> lto_codegen_compile_optimize and we can debug the codegen passes from the
>> dumped bitcode file.
>> 
>> I initially had a different proposal, thanks to Rafael for suggesting this
>> solution!
>> 
>> Cheers,
>> Manman
>> 
>> <lto.patch>
>> 
>> 
>> +kledzik (see discussion about lto_codegen_optimize() below).
>> 
>> diff --git a/include/llvm-c/lto.h b/include/llvm-c/lto.h
>> index 3f30d6d..46ebc79 100644
>> --- a/include/llvm-c/lto.h
>> +++ b/include/llvm-c/lto.h
>> @@ -40,7 +40,7 @@ typedef bool lto_bool_t;
>> * @{
>> */
>> 
>> -#define LTO_API_VERSION 11
>> +#define LTO_API_VERSION 12
>> 
>> /**
>> * \since prior to LTO_API_VERSION=3
>> @@ -484,6 +484,27 @@ lto_codegen_compile(lto_code_gen_t cg, size_t* length);
>> extern lto_bool_t
>> lto_codegen_compile_to_file(lto_code_gen_t cg, const char** name);
>> 
>> 
>> Please update `lto_codegen_compile_to_file()` and
>> `lto_codegen_compile()` to be described in terms of the new functions.
>> 
>> 
>> +/**
>> + * Runs optimization for the merged module. Returns true on error.
>> + *
>> + * \since LTO_API_VERSION=12
>> + */
>> +extern lto_bool_t
>> +lto_codegen_optimize(lto_code_gen_t cg);
>> 
>> 
>> It's not clear to me how the linker can determine at runtime whether
>> this function is available in the `dlopen()`'ed libLTO.dylib.
>> 
>> IIRC, the ld64 checks for runtime-availability of the new API by calling
>> the function, and assuming it doesn't exist if the function returns NULL
>> (the return value of functions that don't exist at runtime).  Usually
>> the flow is something like:
>> 
>>       Result = nullptr;
>>   #ifdef LTO_API_VERSION >= XYZ
>>       Result = lto_new_api(...);
>>       if (!Result)
>>   #else
>>       Result = lto_old_api(...)
>>   #endif
>>       // deal with Result
>> 
>> That scheme doesn't work for functions that can return 0 as a valid
>> result.
>> 
>> Simply switching the error condition (0 on failure) won't work either,
>> since then on error with the new API, you'd run the old API.
>> 
>> I can see two possibilities:
>> 
>> 1. Change `lto_codegen_optimize()` to return `1` on success and `2` on
>>   failure.  Then `0` will mark that the API isn't available.
>> 
>> 2. Add an `lto_api_version()` function that returns the runtime API
>>   version.
>> 
>> The latter option seems cleaner to me.
>> 
>> @Nick: any thoughts on this?
>> 
>> Then the linker code would look something like this:
>> 
>>       Result = nullptr;
>>   #ifdef LTO_API_VERSION >= XYZ
>>       if (lto_api_version() >= XYZ)
>>         Result = lto_new_api(...);
>>       else
>>   #else
>>       Result = lto_old_api(...)
>>   #endif
>>       // deal with Result
>> 
>> +
>> +/**
>> + * Generates code for the optimized merged module into one native object
>> file.
>> + * On success returns a pointer to a generated mach-o/ELF buffer and
>> + * length set to the buffer size.  The buffer is owned by the
>> + * lto_code_gen_t and will be freed when lto_codegen_dispose()
>> + * is called, or lto_codegen_compile() is called again.
>> + * On failure, returns NULL (check lto_get_error_message() for details).
>> 
>> 
>> This comment should specify that it doesn't run IR-level optimizations.
>> 
>> + *
>> + * \since LTO_API_VERSION=12
>> + */
>> +extern const void*
>> +lto_codegen_compile_optimized(lto_code_gen_t cg, size_t* length);
>> +
>> 
>> /**
>> * Sets options to help debug codegen bugs.
>> diff --git a/include/llvm/LTO/LTOCodeGenerator.h
>> b/include/llvm/LTO/LTOCodeGenerator.h
>> index 0c9ce4a..ebba400 100644
>> --- a/include/llvm/LTO/LTOCodeGenerator.h
>> +++ b/include/llvm/LTO/LTOCodeGenerator.h
>> @@ -117,6 +117,19 @@ struct LTOCodeGenerator {
>>                      bool disableVectorization,
>>                      std::string &errMsg);
>> 
>> +  // Optimizes the merged module. Returns true on success.
>> +  bool optimize(bool disableOpt,
>> +                bool disableInline,
>> +                bool disableGVNLoadPRE,
>> +                bool disableVectorization,
>> +                std::string &errMsg);
>> +
>> +  // Compiles the merged optimized module into a single object file. It
>> brings the
>> +  // object to a buffer, and returns the buffer to the caller. Return NULL
>> if the
>> +  // compilation was not successful.
>> +  const void *compile_optimized(size_t *length,
>> +                                std::string &errMsg);
>> 
>> 
>> The function names here aren't in the C API, so they should follow the
>> regular LLVM coding conventions.  Maybe `compileOptimized()`?
>> 
>> +
>>  void setDiagnosticHandler(lto_diagnostic_handler_t, void *);
>> 
>>  LLVMContext &getContext() { return Context; }
>> @@ -124,9 +137,8 @@ struct LTOCodeGenerator {
>> private:
>>  void initializeLTOPasses();
>> 
>> -  bool generateObjectFile(raw_ostream &out, bool disableOpt, bool
>> disableInline,
>> -                          bool disableGVNLoadPRE, bool
>> disableVectorization,
>> -                          std::string &errMsg);
>> +  bool compile_optimized(raw_ostream &out, std::string &errMsg);
>> +  bool compile_optimized_to_file(const char** name, std::string &errMsg);
>> 
>> 
>> Same here about function names.
>> 
>>  void applyScopeRestrictions();
>>  void applyRestriction(GlobalValue &GV, ArrayRef<StringRef> Libcalls,
>>                        std::vector<const char *> &MustPreserveList,
>> diff --git a/lib/LTO/LTOCodeGenerator.cpp b/lib/LTO/LTOCodeGenerator.cpp
>> index d937556..2fc9ee0 100644
>> --- a/lib/LTO/LTOCodeGenerator.cpp
>> +++ b/lib/LTO/LTOCodeGenerator.cpp
>> @@ -201,12 +201,8 @@ bool LTOCodeGenerator::writeMergedModules(const char
>> *path,
>>  return true;
>> }
>> 
>> -bool LTOCodeGenerator::compile_to_file(const char** name,
>> -                                       bool disableOpt,
>> -                                       bool disableInline,
>> -                                       bool disableGVNLoadPRE,
>> -                                       bool disableVectorization,
>> -                                       std::string& errMsg) {
>> +bool LTOCodeGenerator::compile_optimized_to_file(const char** name,
>> +                                                 std::string& errMsg) {
>>  // make unique temp .o file to put generated object file
>>  SmallString<128> Filename;
>>  int FD;
>> @@ -221,8 +217,7 @@ bool LTOCodeGenerator::compile_to_file(const char**
>> name,
>>  tool_output_file objFile(Filename.c_str(), FD);
>> 
>>  bool genResult =
>> -      generateObjectFile(objFile.os(), disableOpt, disableInline,
>> -                         disableGVNLoadPRE, disableVectorization, errMsg);
>> +      compile_optimized(objFile.os(), errMsg);
>>  objFile.os().close();
>>  if (objFile.os().has_error()) {
>>    objFile.os().clear_error();
>> @@ -241,15 +236,10 @@ bool LTOCodeGenerator::compile_to_file(const char**
>> name,
>>  return true;
>> }
>> 
>> -const void* LTOCodeGenerator::compile(size_t* length,
>> -                                      bool disableOpt,
>> -                                      bool disableInline,
>> -                                      bool disableGVNLoadPRE,
>> -                                      bool disableVectorization,
>> -                                      std::string& errMsg) {
>> +const void* LTOCodeGenerator::compile_optimized(size_t* length,
>> +                                                std::string& errMsg) {
>>  const char *name;
>> -  if (!compile_to_file(&name, disableOpt, disableInline, disableGVNLoadPRE,
>> -                       disableVectorization, errMsg))
>> +  if (!compile_optimized_to_file(&name, errMsg))
>>    return nullptr;
>> 
>>  // read .o file into memory buffer
>> @@ -272,6 +262,32 @@ const void* LTOCodeGenerator::compile(size_t* length,
>>  return NativeObjectFile->getBufferStart();
>> }
>> 
>> +
>> +bool LTOCodeGenerator::compile_to_file(const char** name,
>> +                                       bool disableOpt,
>> +                                       bool disableInline,
>> +                                       bool disableGVNLoadPRE,
>> +                                       bool disableVectorization,
>> +                                       std::string& errMsg) {
>> +  if (!optimize(disableOpt, disableInline, disableGVNLoadPRE,
>> disableVectorization, errMsg))
>> 
>> 
>> 80-column?
>> 
>> +    return false;
>> +
>> +  return compile_optimized_to_file(name, errMsg);
>> +}
>> +
>> +const void* LTOCodeGenerator::compile(size_t* length,
>> +                                      bool disableOpt,
>> +                                      bool disableInline,
>> +                                      bool disableGVNLoadPRE,
>> +                                      bool disableVectorization,
>> +                                      std::string& errMsg) {
>> +  if (!optimize(disableOpt, disableInline, disableGVNLoadPRE,
>> +                       disableVectorization, errMsg))
>> 
>> 
>> Whitespace looks strange here.  clang-format?
>> 
>> +    return nullptr;
>> +
>> +  return compile_optimized(length, errMsg);
>> +}
>> +
>> bool LTOCodeGenerator::determineTarget(std::string &errMsg) {
>>  if (TargetMach)
>>    return true;
>> @@ -458,12 +474,11 @@ void LTOCodeGenerator::applyScopeRestrictions() {
>> }
>> 
>> /// Optimize merged modules using various IPO passes
>> -bool LTOCodeGenerator::generateObjectFile(raw_ostream &out,
>> -                                          bool DisableOpt,
>> -                                          bool DisableInline,
>> -                                          bool DisableGVNLoadPRE,
>> -                                          bool DisableVectorization,
>> -                                          std::string &errMsg) {
>> +bool LTOCodeGenerator::optimize(bool DisableOpt,
>> +                                bool DisableInline,
>> +                                bool DisableGVNLoadPRE,
>> +                                bool DisableVectorization,
>> +                                std::string &errMsg) {
>>  if (!this->determineTarget(errMsg))
>>    return false;
>> 
>> @@ -493,6 +508,22 @@ bool LTOCodeGenerator::generateObjectFile(raw_ostream
>> &out,
>> 
>>  PMB.populateLTOPassManager(passes, TargetMach);
>> 
>> +  // Run our queue of passes all at once now, efficiently.
>> +  passes.run(*mergedModule);
>> +
>> +  return true;
>> +}
>> +
>> +bool LTOCodeGenerator::compile_optimized(raw_ostream &out,
>> +                                         std::string &errMsg) {
>> +  if (!this->determineTarget(errMsg))
>> +    return false;
>> +
>> +  Module *mergedModule = IRLinker.getModule();
>> +
>> +  // Mark which symbols can not be internalized
>> +  this->applyScopeRestrictions();
>> +
>>  PassManager codeGenPasses;
>> 
>>  codeGenPasses.add(new DataLayoutPass());
>> @@ -509,9 +540,6 @@ bool LTOCodeGenerator::generateObjectFile(raw_ostream
>> &out,
>>    return false;
>>  }
>> 
>> -  // Run our queue of passes all at once now, efficiently.
>> -  passes.run(*mergedModule);
>> -
>>  // Run the code generator, and write assembly file
>>  codeGenPasses.run(*mergedModule);
>> 
>> diff --git a/tools/lto/lto.cpp b/tools/lto/lto.cpp
>> index ec0372e..b4576cf 100644
>> --- a/tools/lto/lto.cpp
>> +++ b/tools/lto/lto.cpp
>> @@ -289,6 +289,26 @@ const void *lto_codegen_compile(lto_code_gen_t cg,
>> size_t *length) {
>>                             sLastErrorString);
>> }
>> 
>> +bool lto_codegen_optimize(lto_code_gen_t cg) {
>> +  if (!parsedOptions) {
>> +    unwrap(cg)->parseCodeGenDebugOptions();
>> +    lto_add_attrs(cg);
>> +    parsedOptions = true;
>> +  }
>> +  return !unwrap(cg)->optimize(DisableOpt, DisableInline,
>> +                               DisableGVNLoadPRE, DisableLTOVectorization,
>> +                               sLastErrorString);
>> +}
>> +
>> +const void *lto_codegen_compile_optimized(lto_code_gen_t cg, size_t
>> *length) {
>> +  if (!parsedOptions) {
>> +    unwrap(cg)->parseCodeGenDebugOptions();
>> +    lto_add_attrs(cg);
>> +    parsedOptions = true;
>> +  }
>> +  return unwrap(cg)->compile_optimized(length, sLastErrorString);
>> +}
>> +
>> bool lto_codegen_compile_to_file(lto_code_gen_t cg, const char **name) {
>>  if (!parsedOptions) {
>>    unwrap(cg)->parseCodeGenDebugOptions();
>> diff --git a/tools/lto/lto.exports b/tools/lto/lto.exports
>> index 8ba9d28..82277fb 100644
>> --- a/tools/lto/lto.exports
>> +++ b/tools/lto/lto.exports
>> @@ -37,6 +37,8 @@ lto_codegen_set_assembler_args
>> lto_codegen_set_assembler_path
>> lto_codegen_set_cpu
>> lto_codegen_compile_to_file
>> +lto_codegen_optimize
>> +lto_codegen_compile_optimized
>> LLVMCreateDisasm
>> LLVMCreateDisasmCPU
>> LLVMDisasmDispose
>> 
>> 
>>