[llvm-dev] __parityti2(), __paritydi2() and __paritysi2() vs. __builtin_parity

Stefan Kanthak via llvm-dev llvm-dev at lists.llvm.org
Tue Dec 4 11:47:50 PST 2018


Hi @ll,

compiler-rt/lib/builtins/parityti2.c
compiler-rt/lib/builtins/paritydi2.c
compiler-rt/lib/builtins/paritysi2.c

implement the parity function as a matryoshka, each wider variant calling
the next narrower one:

si_int
__paritysi2(si_int a)
{
    su_int x = (su_int)a;
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    return (0x6996 >> (x & 0xF)) & 1; // see optimisation below!
}
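
JFTR, the constant 0x6996 in the return statement above is a 16-entry lookup
table packed into one integer: bit i holds the parity of the 4-bit value i.
A minimal sketch that rebuilds it follows; the helper names
nibble_parity_slow() and build_nibble_parity_table() are made up for this
illustration and are not part of compiler-rt.

```c
/* Hypothetical helper (not in compiler-rt): parity of a 4-bit value,
   computed the slow way by XOR-ing its bits one at a time. */
unsigned nibble_parity_slow(unsigned n)
{
    unsigned p = 0;
    for (int i = 0; i < 4; ++i)
        p ^= (n >> i) & 1u;
    return p;
}

/* Hypothetical helper: pack the 16 nibble parities into one integer,
   with bit i set exactly when the value i has an odd number of 1 bits. */
unsigned build_nibble_parity_table(void)
{
    unsigned table = 0;
    for (unsigned i = 0; i < 16; ++i)
        table |= nibble_parity_slow(i) << i;
    return table;
}
```

Shifting this table right by the folded nibble and masking with 1 is thus a
table lookup without a memory access.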

si_int
__paritydi2(di_int a)
{
    dwords x;
    x.all = a;
    return __paritysi2(x.s.high ^ x.s.low);
}

si_int
__parityti2(ti_int a)
{
    twords x;
    x.all = a;
    return __paritydi2(x.s.high ^ x.s.low);
}
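
For comparison, GCC and Clang expose the same operation as __builtin_parity
and __builtin_parityll. The following sketch restates the nesting above with
<stdint.h> types and checks it against those builtins. The names parity32(),
parity64() and agrees_with_builtins() are hypothetical stand-ins, and the
code assumes a GCC-compatible compiler with a 32-bit unsigned int; the
128-bit variant is left out because __int128 is not available everywhere.

```c
#include <stdint.h>

/* Stand-in for __paritysi2 (hypothetical name): fold 32 bits down to a
   nibble, then look the nibble up in the 0x6996 table. */
int parity32(uint32_t x)
{
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    return (0x6996 >> (x & 0xF)) & 1;
}

/* Stand-in for __paritydi2 (hypothetical name): XOR the two halves and
   delegate to the 32-bit routine, as the original does. */
int parity64(uint64_t x)
{
    return parity32((uint32_t)(x >> 32) ^ (uint32_t)x);
}

/* Returns 1 if both stand-ins agree with the compiler builtins for v.
   Assumes unsigned int is 32 bits wide. */
int agrees_with_builtins(uint64_t v)
{
    return parity32((uint32_t)v) == __builtin_parity((unsigned)v)
        && parity64(v) == __builtin_parityll(v);
}
```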

Questions:
~~~~~~~~~~

1. are these functions still needed, given that __builtin_parity is available?

2. will the optimiser "inline" the internal function calls (as part of LTO)?

   If NOT, they should be inlined manually!

   JFTR: if the 3 functions are part of a single source or compilation unit,
         they are inlined by the compiler!

   Yes, parity is seldom used, so this optimisation may not seem necessary.

   Manually inlined, the two wider functions read:

si_int
__paritydi2(di_int a)
{
    su_int x = (su_int)a;
    x ^= (du_int)a >> 32;
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    return (0x69966996 >> x) & 1; // x is NOT masked: see CAVEAT below!
}

si_int
__parityti2(ti_int a)
{
    du_int x = (du_int)a;
    x ^= (tu_int)a >> 64;
    x ^= x >> 32;
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    return (0x69966996 >> x) & 1; // x is NOT masked: see CAVEAT below!
}

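CAVEAT: the last right-shift is undefined behaviour in C whenever x is not
        less than the width of int (typically 32); the optimisation shown
        here only works on CPUs which perform shifts modulo word-size, i.e.
        reduce the shift count modulo 32!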
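
A portable variant keeps the shift count in range with an explicit mask.
The sketch below shows two ways to do that for the 64-bit case; the names
parity64_portable() and parity64_mask31() are hypothetical, and uint32_t and
uint64_t stand in for su_int and du_int. On targets whose shift instruction
already reduces the count modulo 32, a compiler may be able to drop the
"& 31" again, but that is an expectation, not something verified here.

```c
#include <stdint.h>

/* Hypothetical name: inlined 64-bit parity with the original 16-bit table.
   The mask keeps the shift count in 0..15, so the shift is always defined. */
int parity64_portable(uint64_t a)
{
    uint32_t x = (uint32_t)a ^ (uint32_t)(a >> 32);
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    return (0x6996 >> (x & 0xF)) & 1;
}

/* Hypothetical name: same fold, but with the doubled table 0x69966996.
   Bit i of that constant equals bit (i & 15) of 0x6996, so bit 4 of the
   count does not change the result; the mask keeps the count in 0..31. */
int parity64_mask31(uint64_t a)
{
    uint32_t x = (uint32_t)a ^ (uint32_t)(a >> 32);
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    return (0x69966996u >> (x & 31)) & 1;
}
```

Both functions return the same value for every input; they differ only in
which mask the compiler has to emit or can eliminate.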

stay tuned
Stefan Kanthak

