[PATCH][X86] __builtin_ctz/clz sometimed defined for zero input

Sean Silva chisophugis at gmail.com
Mon Oct 27 19:07:09 PDT 2014


On Mon, Oct 27, 2014 at 5:20 PM, Richard Smith <richard at metafoo.co.uk>
wrote:

> On Sun, Oct 26, 2014 at 10:24 PM, Chandler Carruth <chandlerc at google.com>
> wrote:
>
>> Sorry for the late replies, but as the last person to work on the CLZ /
>> CTZ stuff, I have some small opinions here. =]
>>
>> On Sun, Oct 26, 2014 at 9:52 PM, David Majnemer <david.majnemer at gmail.com
>> > wrote:
>>
>>> I think Paul's patch is fine.
>>>
>>
>> I firmly disagree.
>>
>>
>>>
>>> Having `__builtin_ctz` asking the target if `isCLZForZeroUndef` seems
>>> busted and this patch seems to fix that just fine.
>>>
>>
>> Yes, that is busted and we should fix that aspect regardless. But
>> changing the behavior on x86 seems actively harmful to portability.
>>
>>
>> So, let's look at the history.
>>
>> I went and taught LLVM to produce optimal code in cases where
>> undef-at-zero was acceptable and only instructions with undef-at-zero were
>> available. In order to generate reasonable code for these scenarios, Clang
>> and LLVM had to respect the contract specified for the GCC __builtin_clz
>> and __builtin_ctz. That spec[1] *very* clearly states that if the input is
>> zero the result is undefined.
>>
>> 1: https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
>>
>> However, after this Bob changed ARM to not follow this spec citing a
>> radar only in r149086. I objected at the time, and continue to feel this is
>> the *wrong* approach for ARM. We should not be deviating from the clear,
>> documented, and widely followed spec here. However, I had no real basis for
>> arguing about the behavior of ARM at the time, and I doubt that I have much
>> say in the behavior there even now. =] That said, on x86, I am firmly
>> opposed to deviating from the clear and unambiguous spec.
>>
>> If folks want a builtin to access unambiguously defined-at-zero CLZ or
>> CTZ behavior, we should define *new* builtins which have a clear spec here.
>> However, I'm unpersuaded that this is better or more portable. The
>> alternative code seems quite palatable: (x == 0 ? 32 : __builtin_clz(x))
>>
>> If LLVM fails to turn this into the maximally efficient code on some
>> architecture, then much as others have suggested, we should enhance LLVM.
>>
>
> I agree with Chandler on all points here.
>
> __builtin_clz and __builtin_ctz is clearly and unambiguously defined by
> GCC's specification for them, and we should not be providing additional
> definedness  guarantees. If we want a defined-for-zero-input form of these
> intrinsics, they should have a different name (or an additional argument,
> or be written as "x == 0 ? 32 : __builtin_clz(x)" or similar).
>
> It is a major feature of Clang that we try not to lock people in by
> providing incompatible extensions.
>

Curiously, it looks like GCC is depending on definedness for 0 in its own
headers:
https://github.com/gcc-mirror/gcc/blob/master/gcc/config/i386/lzcntintrin.h

*shrug*

-- Sean Silva


>
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20141027/3c5aeadf/attachment.html>


More information about the cfe-commits mailing list