[cfe-commits] r147263 - in /cfe/trunk: include/clang/Basic/BuiltinsX86.def lib/CodeGen/CGBuiltin.cpp lib/Headers/bmiintrin.h lib/Headers/immintrin.h lib/Headers/lzcntintrin.h lib/Headers/x86intrin.h test/CodeGen/bmi-builtins.c test/CodeGen/lzcnt-

Chandler Carruth chandlerc at google.com
Sun Dec 25 04:38:28 PST 2011


On Sat, Dec 24, 2011 at 10:25 PM, Craig Topper <craig.topper at gmail.com>wrote:

> Author: ctopper
> Date: Sun Dec 25 00:25:37 2011
> New Revision: 147263
>
> URL: http://llvm.org/viewvc/llvm-project?rev=147263&view=rev
> Log:
> Add intrinsics for lzcnt and tzcnt instructions.
>

I have similar concerns as Benjamin, these appear to be the wrong
semantics. More specific comments below.

+// LZCNT
> +BUILTIN(__builtin_clzs, "UsUs", "")
>

What has this to do with lzcnt? This is just a short version of
__builtin_clz, which we can implement for x86 w/o lzcnt....


> +
> +// BMI
> +BUILTIN(__builtin_ctzs, "UsUs", "")
>

Same question. I don't understand your comments or why these builtins are
grouped separately.

+  case X86::BI__builtin_clzs: {
> +    Value *ArgValue = EmitScalarExpr(E->getArg(0));
> +
> +    llvm::Type *ArgType = ArgValue->getType();
> +    Value *F = CGM.getIntrinsic(Intrinsic::ctlz, ArgType);
> +
> +    llvm::Type *ResultType = ConvertType(E->getType());
> +    Value *Result = Builder.CreateCall2(F, ArgValue, Builder.getTrue());
>

If you're intending to provide builtins which model the semantics of LZCNT
and TZCNT (a noble goal), then this is wrong. You want False here as
Benjamin alluded to, as both of those instructions provide *defined*
behavior for zero inputs.

I feel like this patch has two intermixed components:

1) extend the existing __builtin_clz?(...) intrinsics to support shorts
2) add intrinsics based around the LZCNT and TZCNT instructions.

#1 is fine, but should be commented as having the same bizarre semantics as
the other __builtin_clz functions, and grouped with them. There is nothing
specific to LZCNT or TZCNT here.

#2 is actually fine with the existing intrinsics, but you need to define
them a bit differently:

+static __inline__ unsigned short __attribute__((__always_inline__,
> __nodebug__))
> +__tzcnt16(unsigned short __X)
> +{
> +  return __builtin_ctzs(__X);
> +}
> +
> +static __inline__ unsigned int __attribute__((__always_inline__,
> __nodebug__))
> +__tzcnt32(unsigned int __X)
> +{
> +  return __builtin_ctz(__X);
> +}
> +
> +#ifdef __x86_64__
> +static __inline__ unsigned long long __attribute__((__always_inline__,
> __nodebug__))
> +__tzcnt64(unsigned long long __X)
> +{
> +  return __builtin_ctzll(__X);
> +}
>

These all need to follow the pattern:

  return (__X == 0) ? sizeof(X) * 8 : __builtin_ctz*(__X);

As otherwise, a zero input produces an undefined result.

+static __inline__ unsigned short __attribute__((__always_inline__,
> __nodebug__))
> +__lzcnt16(unsigned short __X)
> +{
> +  return __builtin_clzs(__X);
> +}
> +
> +static __inline__ unsigned int __attribute__((__always_inline__,
> __nodebug__))
> +__lzcnt32(unsigned int __X)
> +{
> +  return __builtin_clz(__X);
> +}
> +
> +#ifdef __x86_64__
> +static __inline__ unsigned long long __attribute__((__always_inline__,
> __nodebug__))
> +__lzcnt64(unsigned long long __X)
> +{
> +  return __builtin_clzll(__X);
> +}
>

These also need the ?: treatment to give them a defined result for zero
inputs.

==============================================================================
> --- cfe/trunk/test/CodeGen/bmi-builtins.c (added)
> +++ cfe/trunk/test/CodeGen/bmi-builtins.c Sun Dec 25 00:25:37 2011
> @@ -0,0 +1,24 @@
> +// RUN: %clang_cc1 %s -O3 -triple=x86_64-apple-darwin -target-feature
> +bmi -S -o - | FileCheck %s
> +
> +// Don't include mm_malloc.h, it's system specific.
> +#define __MM_MALLOC_H
> +
> +#include <x86intrin.h>
> +
> +unsigned short test__tzcnt16(unsigned short __X)
> +{
> +  // CHECK: tzcntw
> +  return __tzcnt16(__X);
>

Please do not test x86 instructions inside the Clang test suite. These
tests need to be defined in terms of the LLVM IR the produce, not in terms
of the x86 instructions they produce, even though they are x86-specific
intrinsics. Otherwise, random and unrelated optimizations in LLVM may at
any time break these tests without cause.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20111225/2362b9ba/attachment.html>


More information about the cfe-commits mailing list