[PATCH] D159283: Add intrinsic to count trailing zero elements in a vector

Paul Walker via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 1 04:41:20 PDT 2023


paulwalker-arm added a comment.

The specific rationale for mirroring cttz's second operand, beyond simple consistency, is that it can reduce the range of the result enough to lead to better code generation. I'll use SVE as an example, although for SVE we don't actually rely on the common expansion. The largest supported SVE vector register is 256 bytes long, so a vector can have up to 256 elements. This means cttz_elt has a return value of 0 <= ret <= 256, which is 257 distinct values and one more than 8 bits can represent. For this scenario the common expansion will require 16-bit element types. However, for cases where an all-zero input results in poison, the range is reduced by 1 (0 <= ret <= 255), meaning 8-bit element types can be used.
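
To make the arithmetic concrete, here is a rough sketch of the width calculation. The helper name resultBits and its parameters are made up for illustration and are not part of the patch or of LLVM; it assumes at least one element and only considers power-of-two widths starting at 8 bits.

```cpp
#include <cstdint>

// Hypothetical helper (not LLVM API): smallest power-of-two integer width,
// at least 8 bits, that can hold every possible cttz_elt result for a
// vector of up to maxElts elements. Assumes maxElts >= 1.
constexpr unsigned resultBits(std::uint64_t maxElts, bool zeroIsPoison) {
  // When an all-zero input is poison, the result never reaches maxElts.
  std::uint64_t maxResult = zeroIsPoison ? maxElts - 1 : maxElts;
  unsigned bits = 8;
  while (bits < 64 && (maxResult >> bits) != 0)
    bits *= 2;
  return bits;
}

// 256 elements: 0..256 needs 16-bit elements, 0..255 fits in 8-bit elements.
static_assert(resultBits(256, false) == 16, "result range 0..256 needs i16");
static_assert(resultBits(256, true) == 8, "result range 0..255 fits in i8");
```

The same reasoning applies at any power-of-two element count: the poison variant only saves a width step when the maximum element count is exactly one more than the narrower type can hold.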


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159283/new/

https://reviews.llvm.org/D159283


