[llvm-dev] __parityti2(), __paritydi2() and __paritysi2() vs. __builtin_parity
Stefan Kanthak via llvm-dev
llvm-dev at lists.llvm.org
Tue Dec 4 11:47:50 PST 2018
Hi @ll,
compiler-rt/lib/builtins/parityti2.c
compiler-rt/lib/builtins/paritydi2.c
compiler-rt/lib/builtins/paritysi2.c
implement the parity function matryoshka-style, each wider variant calling
the next narrower one:
si_int
__paritysi2(si_int a)
{
    su_int x = (su_int)a;
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    return (0x6996 >> (x & 0xF)) & 1; // see optimisation below!
}

si_int
__paritydi2(di_int a)
{
    dwords x;
    x.all = a;
    return __paritysi2(x.s.high ^ x.s.low);
}

si_int
__parityti2(ti_int a)
{
    twords x;
    x.all = a;
    return __paritydi2(x.s.high ^ x.s.low);
}
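The folding trick in __paritysi2() can be tried in isolation. The following is a minimal sketch, not compiler-rt code: the names parity32_fold, parity32_naive and parity32_selfcheck are hypothetical, and uint32_t stands in for su_int. The constant 0x6996 acts as a 16-entry lookup table, since its bit i is the parity of i for i = 0..15.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-alone version of the 32-bit fold (not compiler-rt code). */
int parity32_fold(uint32_t x)
{
    x ^= x >> 16;   /* XOR the upper half into the lower 16 bits */
    x ^= x >> 8;    /* ... then into the low byte */
    x ^= x >> 4;    /* ... then into the low nibble */
    /* Bit i of 0x6996 is the parity of i, for i = 0..15. */
    return (0x6996 >> (x & 0xF)) & 1;
}

/* Reference implementation: XOR all bits together, one at a time. */
int parity32_naive(uint32_t x)
{
    int p = 0;
    while (x != 0) {
        p ^= (int)(x & 1u);
        x >>= 1;
    }
    return p;
}

/* Compare both versions on a spread of pseudo-random inputs. */
void parity32_selfcheck(void)
{
    uint32_t x = 1;
    for (int i = 0; i < 100000; ++i) {
        assert(parity32_fold(x) == parity32_naive(x));
        x = x * 1664525u + 1013904223u;  /* simple LCG to vary the bits */
    }
}
```

Calling parity32_selfcheck() from a test driver is enough to exercise the fold against the bit-by-bit reference.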
Questions:
~~~~~~~~~~
1. are these functions still needed, given that __builtin_parity is available?
2. will the optimiser "inline" the internal function calls (as part of LTO)?
If NOT, they should be inlined manually!
JFTR: if the 3 functions are part of a single source or compilation unit,
they are inlined by the compiler!
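For context on question 1: GCC and Clang provide __builtin_parity() for unsigned int and __builtin_parityll() for unsigned long long. The sketch below assumes one of those compilers; the wrapper names are hypothetical. Whether the builtin is expanded inline or lowered to a call into the runtime library depends on compiler and target, and is not settled by this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wrappers around the GCC/Clang parity builtins. */
int parity32_builtin(uint32_t x)
{
    return __builtin_parity(x);
}

int parity64_builtin(uint64_t x)
{
    return __builtin_parityll(x);
}

/* The 64-bit parity equals the 32-bit parity of (high XOR low),
   which is the identity the matryoshka functions rely on. */
void parity_builtin_selfcheck(void)
{
    uint64_t x = 0x0123456789ABCDEFull;
    for (int i = 0; i < 1000; ++i) {
        uint32_t folded = (uint32_t)(x >> 32) ^ (uint32_t)x;
        assert(parity64_builtin(x) == parity32_builtin(folded));
        x = x * 6364136223846793005ull + 1442695040888963407ull;  /* LCG step */
    }
}
```

Comparing the code generated for these wrappers with the code generated for the library functions is one way to approach question 1 on a given target.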
Yes, parity is seldom used, so this optimisation may not seem necessary.
Manually inlined, the two wider functions become:
si_int
__paritydi2(di_int a)
{
    su_int x = (su_int)a;
    x ^= (du_int)a >> 32;
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    return (0x69966996 >> x) & 1;
}

si_int
__parityti2(ti_int a)
{
    du_int x = (du_int)a;
    x ^= (tu_int)a >> 64;
    x ^= x >> 32;
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    return (0x69966996 >> x) & 1;
}
CAVEAT: the last right-shift is undefined behaviour in C whenever the shift
count x is 32 or more (assuming a 32-bit int); the optimisation shown here
only works on CPUs which perform shifts modulo word-size!
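A portable variant keeps the 0x69966996 constant but masks the shift count explicitly, so the shift is defined on every target. This is a minimal sketch under stated assumptions, not a proposed patch: the function names are hypothetical, uint32_t and uint64_t stand in for su_int and du_int, and the 128-bit value is passed as two 64-bit halves so that no 128-bit integer type is needed. On CPUs whose shift instruction already reduces the count modulo 32, a compiler can typically drop the explicit mask again, but that is an expectation and has not been verified here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable counterpart of the inlined 64-bit version. */
int parity64_masked(uint64_t a)
{
    uint32_t x = (uint32_t)a ^ (uint32_t)(a >> 32);
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    /* The count is limited to 0..31, so the shift is always defined.
       Bits 16..31 of the constant repeat bits 0..15, so bit 4 of x
       does not change the result. */
    return (int)((0x69966996u >> (x & 31u)) & 1u);
}

/* Hypothetical portable counterpart of the inlined 128-bit version;
   the value is passed as two 64-bit halves. */
int parity128_masked(uint64_t high, uint64_t low)
{
    uint64_t x = high ^ low;
    x ^= x >> 32;
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    return (int)((0x69966996u >> (x & 31u)) & 1u);
}

/* The parity of a 128-bit value is the XOR of the parities of its halves. */
void parity_masked_selfcheck(void)
{
    uint64_t h = 0x0123456789ABCDEFull, l = 0xFEDCBA9876543210ull;
    for (int i = 0; i < 1000; ++i) {
        assert(parity128_masked(h, l) ==
               (parity64_masked(h) ^ parity64_masked(l)));
        h = h * 6364136223846793005ull + 1442695040888963407ull;  /* LCG step */
        l = l * 2862933555777941757ull + 3037000493ull;           /* LCG step */
    }
}
```

Masking with 31 rather than 0xF is a deliberate choice here: it matches the 32-bit repeated constant from the inlined versions and coincides with what a modulo-32 hardware shift does anyway.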
stay tuned
Stefan Kanthak