[libcxx-commits] [PATCH] D70640: Optimize std::midpoint for signed integers

Jorg Brown via libcxx-commits libcxx-commits at lists.llvm.org
Mon Nov 25 04:58:25 PST 2019


> Maybe I would add explanation why the term (__a & __b) + ((__a ^ __b) >> 1)
> calculates the floored average.

FWIW, it's mentioned in the second edition of Hacker's Delight.
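
For anyone following along, here is a minimal sketch of why the expression
works. The names floored_average and check_floored_average are made up for
illustration; this is not the libc++ code. It assumes >> on a negative signed
value is an arithmetic shift (guaranteed since C++20, implementation-defined
but near-universal before that).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the libc++ implementation.
// Bit by bit, a_i + b_i == (a_i ^ b_i) + 2 * (a_i & b_i), so
//   a + b == (a ^ b) + 2 * (a & b).
// Halving: 2 * (a & b) is even, so only (a ^ b) needs the shift, and
// an arithmetic right shift rounds toward negative infinity (floor).
constexpr std::int32_t floored_average(std::int32_t a, std::int32_t b) {
    return (a & b) + ((a ^ b) >> 1);
}

// Small self-check covering mixed signs and the extreme values.
inline void check_floored_average() {
    assert(floored_average(2, 4) == 3);
    assert(floored_average(-3, 4) == 0);    // floor(0.5)
    assert(floored_average(-3, -4) == -4);  // floor(-3.5)
    assert(floored_average(INT32_MAX, INT32_MAX) == INT32_MAX);
    assert(floored_average(INT32_MIN, INT32_MAX) == -1);  // floor(-0.5)
}
```

Neither term can overflow on its own, and their sum lies between a and b, so
the naive (a + b) / 2 overflow is avoided.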

= - = - =

I'm a bit wary of a change that produces more instructions but happens to
execute more quickly on Intel CPUs... though of course most CPUs running
this code are likely to be x86.

Note that if the integer type in question is narrower than int64_t, then a
much simpler algorithm can be used:

return __a + (__b - int64_t{__a}) / 2;

This is indeed faster, as seen here =>
http://quick-bench.com/BP71mRtaL1f3lCAR6n8mdhAN7MM
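
Spelled out for a 32-bit type, the widening approach might look like the
sketch below. The name midpoint_widened and the choice of int32_t are
assumptions for illustration, not what the patch does.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for 32-bit inputs, not the libc++ implementation.
constexpr std::int32_t midpoint_widened(std::int32_t a, std::int32_t b) {
    // The difference is computed in 64 bits, so it cannot overflow.
    // Integer division truncates toward zero, so the result rounds
    // toward a, which is the rounding std::midpoint specifies.
    return static_cast<std::int32_t>(a + (b - std::int64_t{a}) / 2);
}

// Small self-check, including the extremes where a + b would overflow.
inline void check_midpoint_widened() {
    assert(midpoint_widened(0, 5) == 2);  // rounds toward a == 0
    assert(midpoint_widened(5, 0) == 3);  // rounds toward a == 5
    assert(midpoint_widened(INT32_MIN, INT32_MAX) == -1);
    assert(midpoint_widened(INT32_MAX, INT32_MIN) == 0);
}
```

The obvious limitation is that this needs a wider type to exist, so it does
not help for the widest integer type itself.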

-- Jorg

On Sun, Nov 24, 2019 at 4:21 AM Andrej Korman via Phabricator <
reviews at reviews.llvm.org> wrote:

> Aj0SK added a comment.
>
> Maybe I would add explanation why the term
>
>   (__a & __b) + ((__a ^ __b) >> 1)
>
> calculates the floored average. The sum of a and b can be rewritten as
> a + b = (a ^ b) + ((a & b) << 1). We then have to divide a + b by 2, which
> can also be done by shifting right by 1. This gives
> (a + b) / 2 = ((a ^ b) >> 1) + (a & b).
>
>
> Repository:
>   rG LLVM Github Monorepo
>
> CHANGES SINCE LAST ACTION
>   https://reviews.llvm.org/D70640/new/
>
> https://reviews.llvm.org/D70640
>
>
>
>