[PATCH] [LegalizeVectors] Improve vector CTPOP expansion
Bruno Cardoso Lopes
bruno.cardoso at gmail.com
Mon May 25 11:44:40 PDT 2015
Hi chandlerc, hfinkel, nadav, delena,
This patch is a follow-up to the vector CTPOP work started in http://reviews.llvm.org/D6531
It modifies the current target-independent vector CTPOP expansion to implement a parallel version of the algorithm presented in http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel
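For readers who don't have the bithacks page at hand, here is a rough scalar sketch of that bit-counting scheme for a single 32-bit lane. It is not the code from the patch, which builds the equivalent operations as DAG nodes on whole vectors. The names popcount32, popcountV4 and V4 are made up for illustration, and V4 is only a stand-in for something like <4 x i32>.

```cpp
#include <array>
#include <cstdint>

// Parallel bit count for one 32-bit lane, following the bithacks
// "CountBitsSetParallel" scheme. Hypothetical helper, not from the patch.
inline uint32_t popcount32(uint32_t v) {
  v = v - ((v >> 1) & 0x55555555u);                 // per-2-bit counts
  v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u); // per-4-bit counts
  v = (v + (v >> 4)) & 0x0F0F0F0Fu;                 // per-byte counts
  return (v * 0x01010101u) >> 24;                   // sum the four bytes
}

// Assumed stand-in for a <4 x i32> vector: every lane goes through the
// same shift/mask/add sequence, which is why the expansion maps well
// onto plain vector integer instructions.
using V4 = std::array<uint32_t, 4>;

inline V4 popcountV4(V4 v) {
  for (uint32_t &lane : v)
    lane = popcount32(lane);
  return v;
}
```

In a real vector expansion each shift, AND, add and multiply above would be applied to all lanes at once rather than lane by lane; the loop here only shows that no lane depends on another.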
A new TLI hook lets the target decide, for a given vector type, whether to use the unrolled CTPOP expansion or the algorithm implemented in this patch. This is especially useful for x86, where unrolling, parallel bit math, and custom lowering each give the best performance for different types. It looks like this could benefit other targets as well. PowerPC folks, maybe this could show gains for vector types pre-POWER8?
The patch depends on http://reviews.llvm.org/D6531, which must be applied first so that the tests run cleanly.