[PATCH] D25344: Add a fast path to alignTo.
David Majnemer via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 6 15:47:05 PDT 2016
Is alignTo actually showing up in profiles of real world code?
On Thu, Oct 6, 2016 at 3:34 PM, Rafael Espíndola via llvm-commits <
llvm-commits at lists.llvm.org> wrote:
> It is marginally faster on an "Intel(R) Xeon(R) CPU E5-2697 v2":
>
> benchmark              master        patch         change
> firefox                6.521730037   6.53065974    1.00136922304x slower
> chromium               4.381491021   4.372600839   1.00203315654x faster
> chromium fast          1.847313003   1.840066086   1.0039384004x faster
> the gold plugin        0.326036955   0.323885574   1.00664241069x faster
> clang                  0.550480887   0.547021193   1.00632460688x faster
> llvm-as                0.03225211    0.031885871   1.01148593369x faster
> the gold plugin fsds   0.355666359   0.353588254   1.00587718901x faster
> clang fsds             0.633038735   0.629498967   1.0056231514x faster
> llvm-as fsds           0.030099552   0.0297617     1.0113519053x faster
> scylla                 2.908191778   2.900006405   1.00282253618x faster
>
> Cheers,
> Rafael
>
>
> On 6 October 2016 at 17:58, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
> > The attached patch passes all tests. I will benchmark to see if it
> > makes any difference.
> >
> > I also noticed a missing optimization. It would be nice if we could
> > keep a single function but have the optimizer take care of it, so I
> > tried:
> >
> > uint64_t foo(uint64_t Value, uint64_t Align) {
> >   return alignToNonP2(Value, 1 << Align);
> > }
> >
> > but it still produces:
> >
> > define i64 @_Z3foomm(i64 %Value, i64 %Align) local_unnamed_addr #0 {
> > entry:
> >   %sh_prom = trunc i64 %Align to i32
> >   %shl = shl i32 1, %sh_prom
> >   %conv = sext i32 %shl to i64
> >   %add.i = add i64 %Value, -1
> >   %sub.i = add i64 %add.i, %conv
> >   %div.i = urem i64 %sub.i, %conv
> >   %add2.i = sub i64 %sub.i, %div.i
> >   ret i64 %add2.i
> > }
> >
> > Changing 1 to 1ULL does cause us to optimize it:
> >
> > define i64 @_Z3foomm(i64 %Value, i64 %Align) local_unnamed_addr #0 {
> > entry:
> >   %shl = shl i64 1, %Align
> >   %add.i = add i64 %Value, -1
> >   %sub.i = add i64 %add.i, %shl
> >   %.not = sub i64 0, %shl
> >   %add2.i = and i64 %sub.i, %.not
> >   ret i64 %add2.i
> > }
> >
> >
> > Cheers,
> > Rafael
> >
> > On 6 October 2016 at 17:00, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
> >> On 6 October 2016 at 16:39, Davide Italiano <dccitaliano at gmail.com> wrote:
> >>> On Thu, Oct 6, 2016 at 1:37 PM, Rui Ueyama <ruiu at google.com> wrote:
> >>>> Or to make alignTo accept only powers of two and fix code that passes
> >>>> non-powers-of-two.
> >>>>
> >>>
> >>> Do you know how many of these cases are in LLVM and if they are
> >>> legitimate?
> >>
> >> Interesting idea. I added an assert and I am running the tests.
> >>
> >> Cheers,
> >> Rafael
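For readers following the quoted IR, here is a minimal C++ sketch of the two rounding strategies being compared. It is not the code from D25344: alignToNonP2 is only named in the thread, so its body here is an assumption, and alignToP2, isPowerOf2, alignToWithFastPath and alignByShift are hypothetical names made up for illustration. The division form corresponds to the urem/sub sequence in the first IR listing; the mask form corresponds to the sub/and sequence in the second.

```cpp
#include <cassert>
#include <cstdint>

// Assumed general version: round Value up to a multiple of any non-zero
// Align using division. (The thread names alignToNonP2 but does not show it.)
inline uint64_t alignToNonP2(uint64_t Value, uint64_t Align) {
  assert(Align != 0 && "Align must be non-zero");
  return (Value + Align - 1) / Align * Align;
}

// Hypothetical helper: true if X has exactly one bit set.
inline bool isPowerOf2(uint64_t X) { return X != 0 && (X & (X - 1)) == 0; }

// Hypothetical power-of-two version: no division, only add and mask.
// ~(Align - 1) equals -Align, which is the "sub i64 0, %shl" in the IR.
inline uint64_t alignToP2(uint64_t Value, uint64_t Align) {
  assert(isPowerOf2(Align) && "Align must be a power of two");
  return (Value + Align - 1) & ~(Align - 1);
}

// One possible shape for a run-time fast path (an assumption, not the patch).
inline uint64_t alignToWithFastPath(uint64_t Value, uint64_t Align) {
  return isPowerOf2(Align) ? alignToP2(Value, Align)
                           : alignToNonP2(Value, Align);
}

// The shift is done in 64 bits (1ULL), matching the version that optimized.
// With a plain "1", the shift is computed as a 32-bit int and then widened.
inline uint64_t alignByShift(uint64_t Value, unsigned Shift) {
  assert(Shift < 64 && "shift amount out of range");
  return alignToNonP2(Value, 1ULL << Shift);
}
```

A possible reading of why "1 << Align" blocks the fold: the int result is sign-extended to 64 bits, and for a shift of 31 the widened value is 0xFFFFFFFF80000000, which is not a power of two, so the optimizer may not be able to prove the divisor is one. With 1ULL the divisor is visibly a single 64-bit shift.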