[PATCH][X86] __builtin_ctz/clz sometimed defined for zero input
chisophugis at gmail.com
Thu Oct 23 19:02:37 PDT 2014
If I understand correctly, this patch is trying to change the meaning of
__builtin_ctz (et al.) under some extremely specific conditions. I don't
think that is the right direction since it will cause surprising undefined
behavior bugs across platforms. The intrinsic is documented to have
undefined behavior in the 0 case (everywhere I looked, including our
internal docs); a user that relies on the 0 case has a bug. It would be
nice to add a UBSan check for this undefined behavior though to help users
fix their code.
It would be better to just ensure that we always generate optimal code in
the presence of a manual guard for the 0 case. For example, in the
middle-end we could fold a manual 0 guard followed by @llvm.ctlz.*(X, true)
into @llvm.ctlz.*(X, false).
-- Sean Silva
On Thu, Oct 23, 2014 at 4:40 PM, Robinson, Paul <
Paul_Robinson at playstation.sony.com> wrote:
> In general, count-zeros instructions are undefined for a zero input value.
> However the X86 TZCNT and LZCNT instructions do return the bit-width on a
> zero input, so make Clang tell LLVM so.
> One quirk is that these instructions aren't necessarily both defined, so
> also create a separate predicate so we can do the right thing for all CPUs.
> cfe-commits mailing list
> cfe-commits at cs.uiuc.edu
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the cfe-commits