LTO API: lto_codegen_optimize and lto_codegen_compile_optimized

Fri Jan 30 14:17:15 PST 2015

> On 2015 Jan 30, at 13:26, Nick Kledzik <kledzik at apple.com> wrote:
> 
> 
> On Jan 30, 2015, at 12:58 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> 
>> 
>> 
>>> On Jan 30, 2015, at 12:01 PM, Manman Ren <mren at apple.com> wrote:
>>> 
>>> We currently have lto_codegen_compile that runs IR passes and codegen passes. If we dump the bitcode file after lto_codegen_compile, codegen passes can modify IR (one example is StackProtector pass).
>>> If we run “llc lto.opt.bc”, we will get assertion failure because llc runs StackProtector pass again.
>>> 
>>> To enable dumping bitcode file right before running the codegen passes, this patch splits lto_codegen_compile in 2
>>> lto_codegen_optimize
>>> lto_codegen_compile_optimize
>>> 
>>> Linker can choose to dump the bitcode file before calling lto_codegen_compile_optimize and we can debug the codegen passes from the dumped bitcode file.
>>> 
>>> I initially had a different proposal, thanks to Rafael for suggesting this solution!
>>> 
>>> Cheers,
>>> Manman
>>> 
>>> <lto.patch>
>> 
>> +kledzik (see discussion about lto_codegen_optimize() below).
> 
> See if I understand this.  Instead of calling lto_codegen_compile() the linker would call lto_codegen_optimize() then lto_codegen_compile_optimized().  But the linker only needs to use the two new calls if it plans to write out the bitcode (-r or -save-temps) via lto_codegen_write_merged_modules().
> 
> If that is the only usage, why not combine the three into one function:
> 
>  extern const void*
>  lto_codegen_compile_and_write_bitcode(lto_code_gen_t cg, const char *path, size_t* length);

I think the proposal in the patch is cleaner from the standpoint of
"this is the current API", since it keeps the concerns separated.

However, for clients (like ld64) that depend on runtime availability
checks, that might be the wrong goal.  (IIRC, Manman's initial
proposal was similar to Nick's suggestion.)

I could be convinced either way.

> 
>> 
>>> 
>>> +/**
>>> + * Runs optimization for the merged module. Returns true on error.
>>> + *
>>> + * \since LTO_API_VERSION=12
>>> + */
>>> +extern lto_bool_t
>>> +lto_codegen_optimize(lto_code_gen_t cg);
>> 
>> It's not clear to me how the linker can determine at runtime whether
>> this function is available in the `dlopen()`'ed libLTO.dylib.
>> 
>> IIRC, the ld64 checks for runtime-availability of the new API by calling
>> the function, and assuming it doesn't exist if the function returns NULL
>> (the return value of functions that don't exist at runtime).  Usually
>> the flow is something like:
>> 
>>       Result = nullptr;
>>   #ifdef LTO_API_VERSION >= XYZ
>>       Result = lto_new_api(...);
>>       if (!Result)
>>   #else
>>       Result = lto_old_api(...)
>>   #endif
>>       // deal with Result
>> 
>> That scheme doesn't work for functions that can return 0 as a valid
>> result.
>> 
>> Simply switching the error condition (0 on failure) won't work either,
>> since then on error with the new API, you'd run the old API.
>> 
>> I can see two possibilities:
>> 
>> 1. Change `lto_codegen_optimize()` to return `1` on success and `2` on
>>   failure.  Then `0` will mark that the API isn't available.
>> 
>> 2. Add an `lto_api_version()` function that returns the runtime API
>>   version.
>> 
>> The latter option seems cleaner to me.
>> 
>> @Nick: any thoughts on this?
>> 
>> Then the linker code would look something like this:
>> 
>>       Result = nullptr;
>>   #ifdef LTO_API_VERSION >= XYZ
>>       if (lto_api_version() >= XYZ)
>>         Result = lto_new_api(...);
>>       else
>>   #else
>>       Result = lto_old_api(...)
>>   #endif
>>       // deal with Result
> If we needed to support functions that return 0 as a valid result, then yes, we’d need something like this.
> 
> -Nick