[PATCH] instcombine: Only create library call simplifier once
Nadav Rotem
nrotem at apple.com
Mon Mar 11 14:36:17 PDT 2013
Hi Meador,
I see nice compile time improvements with your patch (when compiling LLVM itself). LGTM!
Thanks,
Nadav
On Mar 11, 2013, at 9:26 AM, Michael Ilseman <milseman at apple.com> wrote:
> This is exactly the kind of fast-path for intrinsics I was hoping for and avoids cross-module caching. Awesome!
>
> Nadav brings up some great suggestions, and I hope that it makes sense to add a max-length to the TLI to further exit early. Are targets allowed to map lib calls from function names beginning with "_Z"?
>
> On Mar 11, 2013, at 9:01 AM, Nadav Rotem <nrotem at apple.com> wrote:
>
>> Hi Meador.
>>
>> Thanks for working on this. The new approach looks good to me, and I will benchmark it later today.
>>
>> I think that we can do a few things to make it a little faster. First, we can detect prefixes that are used by C++ mangled functions, such as "_Z", and exit early. We can also check whether the length of the incoming string is greater than that of any library function that we optimize.
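The two early exits suggested above can be sketched as a standalone filter. This is a minimal illustration, not the actual LLVM API: `mayBeLibcall` and the length bound `kMaxLibcallNameLen` are hypothetical names, and the bound's value is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumed upper bound on the length of any libcall name in the
// optimization table (illustrative value, not taken from LLVM).
static const std::size_t kMaxLibcallNameLen = 23;

// Hypothetical fast-path filter: return false when the name cannot
// possibly match a known libcall, so the full lookup is skipped.
bool mayBeLibcall(const std::string &Name) {
  if (Name.size() > kMaxLibcallNameLen)
    return false; // longer than any entry in the table
  if (Name.compare(0, 2, "_Z") == 0)
    return false; // Itanium C++ mangling prefix, never a C libcall
  return true;    // fall through to the full lookup
}
```

A caller would invoke this before any per-call simplification work, e.g. `if (!mayBeLibcall(Callee->getName())) return nullptr;` in spirit.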
>>
>> Thanks,
>> Nadav
>>
>> On Mar 11, 2013, at 6:29 AM, Meador Inge <meadori at codesourcery.com> wrote:
>>
>>> On 03/07/2013 07:37 PM, Nadav Rotem wrote:
>>>
>>>> Hi Meador,
>>>>
>>>> I spoke with Bill about this and I now think that your proposal to initialize
>>>> the SimplifyLibCalls on the first invocation of runOnFunction is correct. The
>>>> only exception is LTO builds. In LTO builds the TargetData and TLI can change
>>>> per function. So, on each runOnFunction we need to check whether the current TLI
>>>> and DataLayout are the same as on the previous run. If they are not, then we
>>>> need to re-initialize the SimplifyLibCalls, rehash the table, etc.
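The per-run invalidation check described above amounts to caching the TLI and DataLayout seen on the previous run and rebuilding state when either changes. Below is a hedged sketch with placeholder types and names (`Simplifier`, `InitCount`, raw `const void *` pointers standing in for TargetLibraryInfo/DataLayout); none of this is the actual pass interface.

```cpp
#include <cassert>

// Hypothetical sketch of the LTO-aware re-initialization check.
// Pointers stand in for the TargetLibraryInfo and DataLayout objects.
struct Simplifier {
  const void *LastTLI = nullptr;
  const void *LastDL = nullptr;
  int InitCount = 0; // counts (re-)initializations, for illustration

  void runOnFunction(const void *TLI, const void *DL) {
    // In an LTO build TLI/DataLayout may differ per function, so compare
    // against the previous run's values and rebuild only when they change.
    if (TLI != LastTLI || DL != LastDL) {
      ++InitCount; // stands in for re-initializing and rehashing the table
      LastTLI = TLI;
      LastDL = DL;
    }
  }
};
```

The design point is that the common (non-LTO) case pays only two pointer comparisons per function rather than rebuilding the table every run.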
>>>
>>> Hi Nadav,
>>>
>>> After thinking about this a little more, caching the LibCallSimplifier object on
>>> the first call makes me a little nervous, and I agree with the original
>>> objections to it. This motivated me to try something different. The attached
>>> patch drops the hash table and does an explicit lookup for each of the lib call
>>> simplifier optimizers.
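The "explicit lookup" idea can be illustrated with a minimal dispatcher, assuming nothing about the real LibCallSimplifier interface: `Call` and `lookupOptimization` below are hypothetical stand-ins, and the returned strings merely name the optimization that would be selected.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a call instruction's relevant state.
struct Call {
  std::string CalleeName;
};

// Sketch of an explicit per-name lookup replacing a per-instance hash
// table: the callee name is matched directly against the known libcalls,
// so no table is built or destroyed when the simplifier is created.
const char *lookupOptimization(const Call &CI) {
  const std::string &N = CI.CalleeName;
  if (N == "strlen") return "StrLenOpt";
  if (N == "strcpy") return "StrCpyOpt";
  if (N == "memcpy") return "MemCpyOpt";
  return nullptr; // no simplification known for this callee
}
```

In the real patch an intrinsic fast path would run before any string matching; the sketch only shows the table-free dispatch itself.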
>>>
>>> This should alleviate Michael's concerns about caching the object in
>>> runOnFunction, short-circuit the intrinsic case Michael is interested in,
>>> avoid repeatedly building and destroying the hash table, and benefit all
>>> clients of LibCallSimplifier.
>>>
>>> There is a slight overhead for the new lookup function, but it is still much
>>> better than the current approach. On a benchmark containing 100,000 calls
>>> where *none* of them are simplified I noticed a 30% speedup. On a benchmark
>>> containing 100,000 calls where *all* of them are simplified I noticed an 8%
>>> speedup. The original LibCallSimplifier caching patch also obtained a 30%
>>> speedup in the case where nothing was simplified and a 13% speedup when all
>>> calls were simplified. I am comfortable with the new patch being slightly
>>> slower in some cases (these numbers are averages; I saw a few runs where the
>>> two approaches gave equivalent speedups).
>>>
>>> Comments? Would you all mind trying this patch for the benchmarks that you all
>>> are interested in? I personally like the new patch better.
>>>
>>> --
>>> Meador Inge
>>> CodeSourcery / Mentor Embedded
>>> <0001-LibCallSimplifier-optimize-speed-for-short-lived-ins.patch>