[PATCH] [x86] Implement a faster vector population count based on the PSHUFB in-register LUT technique.
resistor at mac.com
Thu May 28 09:53:42 PDT 2015
I believe the same approach would work on ARM64, which also has byte-wise vector popcounts and can do interleave-with-zero. Do you think it would be worthwhile to find a way to share the core of this approach?
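
For anyone following along, here is a rough scalar model in C of the nibble-LUT idea, to make the shareable "core" concrete. It is only a sketch and not the code from the patch: the names NIBBLE_POPCNT, popcount_bytes16 and popcount_u16x8 are made up for illustration, and the plain loops stand in for what would be a single PSHUFB per nibble on x86. The pairwise add in the second function is just one possible way to widen byte counts to 16-bit lanes.

```c
#include <stddef.h>
#include <stdint.h>

/* Popcount of every 4-bit value. In the vector version this 16-entry
 * table would live in one register and be indexed by a byte shuffle. */
static const uint8_t NIBBLE_POPCNT[16] = {
    0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4
};

/* Scalar model of a byte-wise popcount over a 16-byte "register":
 * split each byte into nibbles, look both up, add the results. */
static void popcount_bytes16(const uint8_t in[16], uint8_t out[16]) {
    for (size_t i = 0; i < 16; ++i) {
        uint8_t lo = in[i] & 0x0F;   /* low nibble indexes the table */
        uint8_t hi = in[i] >> 4;     /* high nibble indexes the table */
        out[i] = (uint8_t)(NIBBLE_POPCNT[lo] + NIBBLE_POPCNT[hi]);
    }
}

/* Widen to eight 16-bit lanes by summing the two byte counts that make
 * up each lane (hypothetical helper; one possible widening step). */
static void popcount_u16x8(const uint8_t in[16], uint16_t out[8]) {
    uint8_t bytes[16];
    popcount_bytes16(in, bytes);
    for (size_t i = 0; i < 8; ++i)
        out[i] = (uint16_t)(bytes[2 * i] + bytes[2 * i + 1]);
}
```

On a target with a native byte-wise vector popcount, only popcount_bytes16 would presumably change; the widening step is the part that looks common to both.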