[PATCH][X86] __builtin_ctz/clz sometimes defined for zero input

Fri Oct 24 05:15:27 PDT 2014

Hi Sean,

On Fri, Oct 24, 2014 at 3:02 AM, Sean Silva <chisophugis at gmail.com> wrote:
>
> If I understand correctly, this patch is trying to change the meaning of
> __builtin_ctz (et al.) under some extremely specific conditions. I don't
> think that is the right direction since it will cause surprising undefined
> behavior bugs across platforms.

Paul's fix would only affect x86 behavior.
It simply reflects the fact that on some x86 CPUs, ctz/clz is defined on
zero (i.e. we have an instruction for it).
I don't think it can cause surprising undefined behavior across platforms.

> The intrinsic is documented to have
> undefined behavior in the 0 case (everywhere I looked, including our
> internal docs); a user that relies on the 0 case has a bug. It would be nice
> to add a UBSan check for this undefined behavior though to help users fix
> their code.

The intrinsic is documented to have the following behavior
(http://llvm.org/docs/LangRef.html#llvm-cttz-intrinsic):
"If src == 0 then the result is the size in bits of the type of src if
is_zero_undef == 0 and undef otherwise."
Where 'is_zero_undef' is the second argument to the intrinsic.
So the intrinsic is not documented to always have undefined behavior
in the zero case.
Also I don't think that relying on the 0 case is a bug. It is a very
reasonable assumption on all modern x86 architectures.
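
To make the LangRef wording above concrete, here is a minimal C sketch of
what the is_zero_undef == 0 form computes for a 32-bit value. The helper
name cttz32_zero_defined is hypothetical (it is not part of Clang or LLVM),
and the loop only models the documented result, not how any target lowers it.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only: models the documented result
   of llvm.cttz on a 32-bit value when is_zero_undef == 0. */
static unsigned cttz32_zero_defined(uint32_t src)
{
    /* Zero input: the result is the size in bits of the type of src. */
    if (src == 0)
        return (unsigned)(sizeof(src) * CHAR_BIT);

    /* Non-zero input: count the zero bits below the lowest set bit. */
    unsigned n = 0;
    while ((src & 1u) == 0) {
        src >>= 1;
        ++n;
    }
    return n;
}
```

With is_zero_undef == 1 the zero case has no defined result, so a model of
that form would have no meaningful value to return for src == 0.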

> It would be better to just ensure that we always generate optimal code in
> the presence of a manual guard for the 0 case. For example, in the
> middle-end we could fold a manual 0 guard followed by @llvm.ctlz.*(X, true)
> into @llvm.ctlz.*(X, false).

I agree that the codegen should be improved in that case.
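
For reference, the manual guard being discussed looks roughly like the
sketch below. The wrapper names ctz_or_width and clz_or_width are
hypothetical; only __builtin_ctz and __builtin_clz are real GCC/Clang
builtins. Whether the compiler folds the guard into a single instruction
depends on the target and on the optimizer, which is exactly the codegen
question here.

```c
#include <limits.h>

/* Hypothetical wrappers, for illustration only: guard the zero case by
   hand so the builtin is never called with an undefined input. */
static int ctz_or_width(unsigned x)
{
    /* Return the bit width for zero, otherwise defer to the builtin. */
    return x ? __builtin_ctz(x) : (int)(sizeof(x) * CHAR_BIT);
}

static int clz_or_width(unsigned x)
{
    /* Same guard for the leading-zero count. */
    return x ? __builtin_clz(x) : (int)(sizeof(x) * CHAR_BIT);
}
```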

My concern is that your suggested approach would force people to
always guard calls to __builtin_ctz/__builtin_clz against zero.
From a customer point of view, the compiler knows exactly whether ctz and
clz are defined on zero. Your approach basically pushes the problem onto
the customer by forcing them to guard all the calls to ctz/clz against
zero. We've already had a number of customer queries/complaints about
this and I personally don't think it is unreasonable to have ctz/clz
defined on zero on our target (and other x86 targets where the
behavior on zero is clearly defined).

-Andrea

>
> -- Sean Silva
>
> On Thu, Oct 23, 2014 at 4:40 PM, Robinson, Paul
> <Paul_Robinson at playstation.sony.com> wrote:
>>
>> In general, count-zeros instructions are undefined for a zero input value.
>> However the X86 TZCNT and LZCNT instructions do return the bit-width on a
>> zero input, so make Clang tell LLVM so.
>> One quirk is that these instructions aren't necessarily both defined, so
>> also create a separate predicate so we can do the right thing for all
>> CPUs.
>> --paulr
>>
>>
>> _______________________________________________
>> cfe-commits mailing list
>> cfe-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>