[PATCH] instcombine: Only create library call simplifier once
Michael Ilseman
milseman at apple.com
Mon Mar 11 09:26:37 PDT 2013
This is exactly the kind of fast-path for intrinsics I was hoping for and avoids cross-module caching. Awesome!
Nadav brings up some great suggestions, and I hope that it makes sense to add a max-length to the TLI to further exit early. Are targets allowed to map lib calls from function names beginning with "_Z"?
On Mar 11, 2013, at 9:01 AM, Nadav Rotem <nrotem at apple.com> wrote:
> Hi Meador.
>
> Thanks for working on this. The new approach looks good to me, and I will benchmark it later today.
>
> I think that we can do a few things to make it a little faster. First, we can detect prefixes that are used by c++ mangled functions, such as "_Z" and exit early. We can also check if the length of the incoming string is greater than any library function call that we optimize.
>
> Thanks,
> Nadav
>
> On Mar 11, 2013, at 6:29 AM, Meador Inge <meadori at codesourcery.com> wrote:
>
>> On 03/07/2013 07:37 PM, Nadav Rotem wrote:
>>
>>> Hi Meador,
>>>
>>> I spoke with Bill about this and I now think that your proposal to initialize
>>> the SimplifyLibCalls on the first invocation of runOnFunction is correct. The
>>> only exception is LTO builds. In an LTO build we can change the TargetData and
>>> TLI per function. So, on each runOnFunction we need to check whether the current
>>> TLI and DataLayout are the same as in the previous run. If they are not, then we
>>> need to re-initialize the SimplifyLibCalls, rehash the table, etc.
>>
>> Hi Nadav,
>>
>> After thinking about this a little more, caching the LibCallSimplifier object on
>> the first call makes me a little nervous, and I agree with the original
>> objections to it. This motivated me to try something different. The attached
>> patch drops the hash table and does an explicit lookup for each of the lib call
>> simplifier optimizers.
>>
>> This should alleviate Michael's concerns about caching the object in
>> runOnFunction, short-circuit the intrinsic case Michael is interested in,
>> avoid repeatedly building and destroying the hash table, and benefit all
>> clients of LibCallSimplifier.
>>
>> There is a slight overhead for the new lookup function, but it is still much
>> better than the current approach. On a benchmark containing 100,000 calls
>> where *none* of them are simplified I noticed a 30% speedup. On a benchmark
>> containing 100,000 calls where *all* of them are simplified I noticed an 8%
>> speedup. The original LibCallSimplifier caching patch also obtained a 30%
>> speedup in the case where nothing was simplified and a 13% speedup when all
>> calls were simplified. I am comfortable with the new patch being slightly
>> slower in some cases (these figures are averages; I saw a few runs where the
>> two approaches gave equivalent speedups).
>>
>> Comments? Would you all mind trying this patch on the benchmarks you are
>> interested in? I personally like the new patch better.
>>
>> --
>> Meador Inge
>> CodeSourcery / Mentor Embedded
>> <0001-LibCallSimplifier-optimize-speed-for-short-lived-ins.patch>
>