[cfe-dev] [llvm-dev] Portable multiplication 64 x 64 -> 128 for int128 reimplementation

Paweł Bylica via cfe-dev cfe-dev at lists.llvm.org
Sun Dec 30 13:45:49 PST 2018

Hi Arthur, Craig,

Thanks for you comments about GCC/Clang intrinsics. I never considered
using them, but they might be better alternative to inline assembly.
Is there a one for regular MUL?

Anyway, I want to go the opposite direction. If I can I relay on compiler's
optimizations. If I want to use MULX in Clang I do it like that:

unsigned long mulx(unsigned long x, unsigned long y, unsigned long* hi)
auto p = (unsigned __int128){x} * y;
*hi = static_cast<unsigned long>(p >> 64);
return static_cast<unsigned long>(p);


If compiled with -mbmi2 -mtune=generic it just uses MULX instruction.

mulx(unsigned long, unsigned long, unsigned long*):
mov rcx, rdx
mov rdx, rsi
mulx rdx, rax, rdi
mov qword ptr [rcx], rdx

What I want to do it move it further - rewrite the above mulx() helper
without using __int128 type in a way that a compiler would recognize that
it should use MUL/MULX instruction.

A possible implementation looks like

uint64_t mul_full_64_generic(uint64_t x, uint64_t y, uint64_t* hi)
uint64_t xl = x & 0xffffffff;
uint64_t xh = x >> 32;
uint64_t yl = y & 0xffffffff;
uint64_t yh = y >> 32;

uint64_t t = xl * yl;
uint64_t l = t & 0xffffffff;
uint64_t h = t >> 32;

t = xh * yl;
t += h;
h = t >> 32;

t = xl * yh + (t & 0xffffffff);
l |= t << 32;
*hi = xh * yh + h + (t >> 32);
return l;

As expected, Clang is not able to match this pattern currently.

If we want to implement this optimization in Clang, there are some
questions I have:
1. Can we prove this pattern is equivalent of MUL 64x64 -> 128?
2. What pass this optimization should be added to?
3. Can this pattern be split into smaller ones? E.g. UMULH.


On Sun, Dec 30, 2018 at 2:34 AM Craig Topper <craig.topper at gmail.com> wrote:

> _mulx_u64 only exists when the target is x86_64. That's still not very
> portable. I'm not opposed to removing the bmi2 check, but gcc also has the
> same check so it doesn't improve portability much.
> ~Craig
> On Sat, Dec 29, 2018 at 4:44 PM Arthur O'Dwyer via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>> Hi Pawel,
>> There is the _mulx_u64 intrinsic, but it currently requires the hardware
>> flag "-mbmi2".
>> https://github.com/Quuxplusone/WideIntProofOfConcept/blob/master/wider.h#L89-L99
>> On Clang 3.8.1 and earlier, the _addcarry_u64 and _subborrow_u64
>> intrinsics required the hardware flag `-madx`, even though they didn't use
>> the hardware ADX/ADOX instructions. Modern GCC and Clang permit the use of
>> these intrinsics (to generate ADC) even in the absence of `-madx`.
>> I think it would be a very good idea for Clang to support _mulx_u64 (to
>> generate MUL) even in the absence of `-mbmi2`.
>> –Arthur
>> On Sat, Dec 29, 2018 at 6:03 PM Paweł Bylica via cfe-dev <
>> cfe-dev at lists.llvm.org> wrote:
>>> Hi,
>>> For some maybe dumb reasons I try to write a portable version of int128.
>>> What is very valuable for this implementation is access to MUL
>>> instruction on x86 which provides full 64 x 64 -> 128 bit multiplication.
>>> An equally useful on ARM would be UMULH instruction.
>>> Well, the way you can access this on clang / GCC is to use __int128 type
>>> or use inline assembly. MSVC provides an intrinsic for this instruction.
>>> This defeats the idea of portable int128 reimplementation and makes
>>> constexpr implementation of multiplication at least inconvenient.
>>> Maybe there is a hope for me in LLVM. Is there any pattern matcher that
>>> is producing MUL instruction of bigger type?
>>> If not, would it be good idea to teach LLVM about it?
>>> Bests,
>>> Paweł
>>> _______________________________________________
>>> cfe-dev mailing list
>>> cfe-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20181230/ab95ee15/attachment.html>

More information about the cfe-dev mailing list