[PATCH] D25344: Add a fast path to alignTo.
Rafael Espíndola via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 6 15:34:48 PDT 2016
It is marginally faster on an "Intel(R) Xeon(R) CPU E5-2697 v2" in every
benchmark except firefox:
benchmark             master        patch         ratio
firefox               6.521730037   6.53065974    1.00136922304x slower
chromium              4.381491021   4.372600839   1.00203315654x faster
chromium fast         1.847313003   1.840066086   1.0039384004x faster
the gold plugin       0.326036955   0.323885574   1.00664241069x faster
clang                 0.550480887   0.547021193   1.00632460688x faster
llvm-as               0.03225211    0.031885871   1.01148593369x faster
the gold plugin fsds  0.355666359   0.353588254   1.00587718901x faster
clang fsds            0.633038735   0.629498967   1.0056231514x faster
llvm-as fsds          0.030099552   0.0297617     1.0113519053x faster
scylla                2.908191778   2.900006405   1.00282253618x faster
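
For readers without the attachment, here is a minimal sketch of what a fast path in an alignTo-style helper could look like. It is not the code from D25344: the names `alignToSketch` and `isPowerOfTwoSketch` are made up for illustration, and overflow of `Value + Align - 1` is ignored. The idea is only that a power-of-two alignment can use a mask, while anything else needs a division.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when exactly one bit of X is set.
inline bool isPowerOfTwoSketch(uint64_t X) {
  return X != 0 && (X & (X - 1)) == 0;
}

// Hypothetical sketch of an alignTo with a fast path. Returns the smallest
// multiple of Align that is >= Value. Overflow is not handled.
inline uint64_t alignToSketch(uint64_t Value, uint64_t Align) {
  assert(Align != 0 && "alignment must be non-zero");
  // Fast path: for a power of two, rounding up is an add and a mask.
  if (isPowerOfTwoSketch(Align))
    return (Value + Align - 1) & ~(Align - 1);
  // General path: any other alignment needs a division.
  return (Value + Align - 1) / Align * Align;
}
```

Both paths give the same answer for power-of-two alignments, so the branch only changes which instructions run, not the result.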
Cheers,
Rafael
On 6 October 2016 at 17:58, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
> The attached patch passes all tests. I will benchmark to see if it
> makes any difference.
>
> I also noticed a missing optimization. It would be nice if we could
> keep a single function but have the optimizer take care of it, so I
> tried
>
> uint64_t foo(uint64_t Value, uint64_t Align) {
>   return alignToNonP2(Value, 1 << Align);
> }
>
> but it still produces
>
> define i64 @_Z3foomm(i64 %Value, i64 %Align) local_unnamed_addr #0 {
> entry:
>   %sh_prom = trunc i64 %Align to i32
>   %shl = shl i32 1, %sh_prom
>   %conv = sext i32 %shl to i64
>   %add.i = add i64 %Value, -1
>   %sub.i = add i64 %add.i, %conv
>   %div.i = urem i64 %sub.i, %conv
>   %add2.i = sub i64 %sub.i, %div.i
>   ret i64 %add2.i
> }
>
> Changing 1 to 1ULL does cause us to optimize it
>
> define i64 @_Z3foomm(i64 %Value, i64 %Align) local_unnamed_addr #0 {
> entry:
>   %shl = shl i64 1, %Align
>   %add.i = add i64 %Value, -1
>   %sub.i = add i64 %add.i, %shl
>   %.not = sub i64 0, %shl
>   %add2.i = and i64 %sub.i, %.not
>   ret i64 %add2.i
> }
>
>
> Cheers,
> Rafael
>
> On 6 October 2016 at 17:00, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
>> On 6 October 2016 at 16:39, Davide Italiano <dccitaliano at gmail.com> wrote:
>>> On Thu, Oct 6, 2016 at 1:37 PM, Rui Ueyama <ruiu at google.com> wrote:
>>>> Or to make alignTo accept only powers of two and fix code that passes
>>>> non-powers-of-two.
>>>>
>>>
>>> Do you know how many of these cases are in LLVM and if they are legitimate?
>>
>> Interesting idea. I added an assert and I am running the tests.
>>
>> Cheers,
>> Rafael
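
The `1 << Align` versus `1ULL << Align` difference in the quoted message can be reproduced with a small stand-alone sketch. The thread does not say why the optimizer gives up on the first form; one plausible reading is that the 32-bit shift result is sign-extended, so a value with only bit 31 set widens to 0xFFFFFFFF80000000, which is not a power of two. `alignToNonP2Sketch` below is a hypothetical stand-in written to mirror the quoted IR, not the real `alignToNonP2`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for alignToNonP2, mirroring the quoted IR:
// add Align - 1, then subtract the remainder. Works for any non-zero Align.
inline uint64_t alignToNonP2Sketch(uint64_t Value, uint64_t Align) {
  assert(Align != 0 && "alignment must be non-zero");
  uint64_t Sum = Value + Align - 1;
  return Sum - Sum % Align;
}

// `1` is an int, so the shift is done in 32 bits and the result is
// sign-extended when converted to uint64_t.
inline uint64_t fooInt(uint64_t Value, uint64_t Shift) {
  return alignToNonP2Sketch(Value, 1 << Shift);
}

// `1ULL` keeps the shift in 64 bits; every in-range result has one bit set.
inline uint64_t fooULL(uint64_t Value, uint64_t Shift) {
  return alignToNonP2Sketch(Value, 1ULL << Shift);
}

// True when exactly one bit of X is set.
inline bool isPowerOfTwoSketch(uint64_t X) {
  return X != 0 && (X & (X - 1)) == 0;
}

// The 64-bit value obtained by sign-extending an i32 with only bit 31 set.
inline uint64_t signExtendedBit31() {
  return static_cast<uint64_t>(static_cast<int64_t>(INT32_MIN));
}
```

For small shift amounts the two functions agree; they differ only in what the compiler can prove about the alignment operand.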