[cfe-dev] [PATCH] [OpenCL] Conversions for ternary operations between scalar and vector
Sahasrabuddhe, Sameer
sameer.sahasrabuddhe at amd.com
Tue Dec 16 20:44:55 PST 2014
On 12/16/2014 10:26 PM, Colin Riddell wrote:
>
>> ... or there is a conversion in section 6.2.1 Implicit
>> Conversions that
>> can be applied to one of the expressions to make their types
>> match, ...
>>
>> Which expression should be implicitly converted? As far as I can make
>> out, there was no need to mention Section 6.2.1 at this point in the
>> spec itself, especially because Section 6.2.6 is meant to clarify
>> exactly this choice: "The purpose is to determine a common real type
>> for the operands and result.". This section cites the C99 spec when
>> both operands are scalars: "Otherwise, if all operands are scalar,
>> the usual arithmetic conversions apply, per section 6.3.1.8 of the
>> C99 standard."
> Either exp2 or exp3 should be converted - which does mean they need to
> be evaluated. So as far as I can see, we are on the same understanding.
More about this in the comments to the patch inlined below. I have also
attached an updated version of my writeup. The intention is to document
all the details of the behaviour (to be) implemented in Clang and not
just type conversions. This update is mostly a cleanup with some
enhancements to the table of examples at the end. Now every negative
example points out the reason for the error.
> My latest patch provides the following:
>
> * Tests(there are 15 in total, now):
I have attached the tests that I have been working on. These correspond
to the table of examples in the writeup. There might be overlap with
your tests, but I decided to dump whatever I had so you can take a look
too. I have not checked the negative tests with pristine Clang, but the
following fail with your patch: 02, 03, 04, 05, 08, 09 and 10. But first
we have to agree that the tests are valid!
> * Code in SemaExpr::CheckConditionalOperands():
Comments inlined below.
> diff --git a/lib/Sema/SemaExpr.cpp b/lib/Sema/SemaExpr.cpp
> index 04497f3..e37e0c5 100644
> --- a/lib/Sema/SemaExpr.cpp
> +++ b/lib/Sema/SemaExpr.cpp
> @@ -5817,13 +5817,27 @@ QualType Sema::CheckConditionalOperands(ExprResult &Cond, ExprResult &LHS,
> QualType LHSTy = LHS.get()->getType();
> QualType RHSTy = RHS.get()->getType();
>
> - // If the condition is a vector, and both operands are scalar,
> - // attempt to implicity convert them to the vector type to act like the
> - // built in select. (OpenCL v1.1 s6.3.i)
> - if (getLangOpts().OpenCL && CondTy->isVectorType())
> - if (checkConditionalConvertScalarsToVectors(*this, LHS, RHS, CondTy))
> - return QualType();
> -
> + if (getLangOpts().OpenCL) {
> + if (CondTy->isVectorType()){
> + if (CondTy->getAs<VectorType>()->getElementType()->isFloatingType())
This needs to be checked for the scalar operator too.
> + Diag(Cond.get()->getLocStart(), diag::err_use_of_float_ternary_cond)
> + << CondTy;
> + if (LHSTy != RHSTy){
> + if (LHSTy->isVectorType() && RHSTy->isScalarType())
> + RHS = ImpCastExprToType(RHS.get(), LHSTy, CK_IntegralCast);
> + else if (LHSTy->isScalarType() && RHSTy->isVectorType())
> + LHS = ImpCastExprToType(LHS.get(), RHSTy, CK_IntegralCast);
> + }
This should be moved to the function "Sema::UsualArithmeticConversions"
in the same file. That function implements C99, but needs to be enhanced
for Section 6.2.6 of the OpenCL 1.2 spec, which provides additional
conversion rules for vector operands. Once we have that, we don't need
to worry about LHS and RHS; instead, we just use ResTy, which is computed
just before the current point in the code.
> + else if (LHSTy->isScalarType() && RHSTy->isScalarType()){
If we make the above fix, then we only need to check if ResTy is scalar
at this point.
> + // Now attempt to implicity convert them to the vector type to act like the
> + // built in select. (OpenCL v1.1 s6.3.i)
> + if (checkConditionalConvertScalarsToVectors(*this, LHS, RHS, CondTy))
This is still casting the element type of the result to the element type
of the condition, which is exactly what the original problem was! It
should do the following instead:
1. Check that the bit-size of scalar ResTy is the same as the bit-size
of the element type in CondTy
2. Expand ResTy to match the vector width of CondTy.
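To make the difference concrete, here is a minimal stand-alone sketch of the two steps. It is a toy C++ model, not Clang code: Vec2, widenToCondition and castToConditionElement are hypothetical names, the vector width is fixed at 2, and it assumes float and std::int32_t are both 32 bits wide.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 2-element OpenCL vector; not a Clang type.
template <typename T> using Vec2 = std::array<T, 2>;

// The two steps from the list above, for a scalar result type Res and a
// condition whose element type is CondElem.
template <typename CondElem, typename Res>
bool widenToCondition(Res scalar, Vec2<Res> &out) {
  // Step 1: the bit-size of ResTy must equal that of CondTy's element type.
  if (sizeof(Res) != sizeof(CondElem))
    return false;  // the caller would emit a diagnostic here
  // Step 2: expand to the condition's vector width, keeping Res as the
  // element type.
  out = Vec2<Res>{{scalar, scalar}};
  return true;
}

// The behaviour being objected to: the value is converted to the element
// type of the condition, so a float operand loses its fractional part.
template <typename CondElem, typename Res>
Vec2<CondElem> castToConditionElement(Res scalar) {
  const CondElem converted = static_cast<CondElem>(scalar);
  return Vec2<CondElem>{{converted, converted}};
}

// Usage example: an int2-like condition with a float operand.
inline void example() {
  Vec2<float> v{};
  assert((widenToCondition<std::int32_t, float>(1.5f, v)) && v[0] == 1.5f);
}
```

With a float operand and an int2-like condition, the first function keeps 1.5f in every lane, while the second one truncates it to 1.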
> + return QualType();
> + return CondTy;
> + }
> + }
> + }
> +
I wonder whether control should ever reach beyond this point for OpenCL!
> // If both operands have arithmetic type, do the usual arithmetic conversions
> // to find a common type: C99 6.5.15p3,5.
> if (LHSTy->isArithmeticType() && RHSTy->isArithmeticType()) {
>
Sameer.
-------------- next part --------------
// RUN: %clang_cc1 %s -verify -pedantic -fsyntax-only
typedef char char2 __attribute__((ext_vector_type(2)));
typedef char char3 __attribute__((ext_vector_type(3)));
typedef int int2 __attribute__((ext_vector_type(2)));
typedef float float2 __attribute__((ext_vector_type(2)));
kernel void test(global char3 *srcA, global char2 *srcB, global char *srcC,
global char3 *dst)
{
int tid = 1;
dst[tid] = srcC[tid] ? srcA[tid] : srcB[tid]; // expected-error {{can't convert between vector values of different size ('char3' (vector of 3 'char' values) and 'char2' (vector of 2 'char' values))}}
}
-------------- next part --------------
// RUN: %clang_cc1 %s -verify -pedantic -fsyntax-only
typedef char char2 __attribute__((ext_vector_type(2)));
typedef char char3 __attribute__((ext_vector_type(3)));
typedef int int2 __attribute__((ext_vector_type(2)));
typedef float float2 __attribute__((ext_vector_type(2)));
// vector condition, vector operand with mismatching width
kernel void test(global char3 *srcA, global char *srcB, global char2 *srcC,
global char3 *dst)
{
int tid = 1;
dst[tid] = srcC[tid] ? srcA[tid] : srcB[tid]; // expected-error {{can't convert between vector values of different size ('char3' (vector of 3 'char' values) and 'char2' (vector of 2 'char' values))}}
}
-------------- next part --------------
// RUN: %clang_cc1 %s -verify -pedantic -fsyntax-only
typedef char char2 __attribute__((ext_vector_type(2)));
typedef char char3 __attribute__((ext_vector_type(3)));
typedef int int2 __attribute__((ext_vector_type(2)));
typedef float float2 __attribute__((ext_vector_type(2)));
// vector condition, scalar operands whose bit width differs from the condition's element type
kernel void test(global int *srcA, global int *srcB, global char2 *srcC,
global char2 *dst)
{
int tid = 1;
dst[tid] = srcC[tid] ? srcA[tid] : srcB[tid]; // expected-error {{vector ternary operator requires element types with the same bit width}}
}
-------------- next part --------------
// RUN: %clang_cc1 %s -verify -pedantic -fsyntax-only
typedef char char2 __attribute__((ext_vector_type(2)));
typedef char char3 __attribute__((ext_vector_type(3)));
typedef int int2 __attribute__((ext_vector_type(2)));
typedef float float2 __attribute__((ext_vector_type(2)));
// vector condition, vector operands with mismatching element bit width
kernel void test(global int2 *srcA, global int2 *srcB, global char2 *srcC,
global int2 *dst)
{
int tid = 1;
dst[tid] = srcC[tid] ? srcA[tid] : srcB[tid]; // expected-error {{vector ternary operator requires element types with the same bit width}}
}
-------------- next part --------------
// RUN: %clang_cc1 %s -verify -pedantic -fsyntax-only
typedef char char2 __attribute__((ext_vector_type(2)));
typedef char char3 __attribute__((ext_vector_type(3)));
typedef int int2 __attribute__((ext_vector_type(2)));
typedef float float2 __attribute__((ext_vector_type(2)));
// vector condition, scalar operands whose bit width differs from the condition's element type
kernel void test(global char *srcA, global char *srcB, global int2 *srcC,
global int2 *dst)
{
int tid = 1;
dst[tid] = srcC[tid] ? srcA[tid] : srcB[tid]; // expected-error {{vector ternary operator requires element types with the same bit width}}
}
-------------- next part --------------
// RUN: %clang_cc1 %s -verify -pedantic -fsyntax-only
typedef char char2 __attribute__((ext_vector_type(2)));
typedef char char3 __attribute__((ext_vector_type(3)));
typedef int int2 __attribute__((ext_vector_type(2)));
typedef float float2 __attribute__((ext_vector_type(2)));
// vector condition, vector operands with mismatching element types
kernel void test(global int2 *srcA, global float2 *srcB, global int2 *srcC,
global float2 *dst)
{
int tid = 1;
dst[tid] = srcC[tid] ? srcA[tid] : srcB[tid]; // expected-error {{can't convert between vector values of different size ('int2' (vector of 2 'int' values) and 'float2' (vector of 2 'float' values))}}
}
-------------- next part --------------
// RUN: %clang_cc1 %s -verify -pedantic -fsyntax-only
typedef char char2 __attribute__((ext_vector_type(2)));
typedef char char3 __attribute__((ext_vector_type(3)));
typedef int int2 __attribute__((ext_vector_type(2)));
typedef float float2 __attribute__((ext_vector_type(2)));
// vector condition, vector operand mixed with a scalar of higher rank
kernel void test(global int2 *srcA, global float *srcB, global int2 *srcC,
global float2 *dst)
{
int tid = 1;
dst[tid] = srcC[tid] ? srcA[tid] : srcB[tid]; // expected-error {{can't convert between vector values of different size ('int2' (vector of 2 'int' values) and 'float')}}
}
-------------- next part --------------
// RUN: %clang_cc1 %s -verify -pedantic -fsyntax-only
typedef char char2 __attribute__((ext_vector_type(2)));
typedef char char3 __attribute__((ext_vector_type(3)));
typedef int int2 __attribute__((ext_vector_type(2)));
typedef float float2 __attribute__((ext_vector_type(2)));
// vector condition, vector operands with mismatching element bit width
kernel void test(global char2 *srcA, global char2 *srcB, global int2 *srcC,
global char2 *dst)
{
int tid = 1;
dst[tid] = srcC[tid] ? srcA[tid] : srcB[tid]; // expected-error {{vector ternary operator requires element types with the same bit width}}
}
-------------- next part --------------
// RUN: %clang_cc1 %s -verify -pedantic -fsyntax-only
typedef char char2 __attribute__((ext_vector_type(2)));
typedef char char3 __attribute__((ext_vector_type(3)));
typedef int int2 __attribute__((ext_vector_type(2)));
typedef float float2 __attribute__((ext_vector_type(2)));
// floating point scalar condition
kernel void test(global float *srcA, global float *srcB, global float *srcC,
global float *dst)
{
int tid = 1;
dst[tid] = srcC[tid] ? srcA[tid] : srcB[tid]; // expected-error {{condition in a ternary operator cannot be a floating point type}}
}
-------------- next part --------------
// RUN: %clang_cc1 %s -verify -pedantic -fsyntax-only
typedef char char2 __attribute__((ext_vector_type(2)));
typedef char char3 __attribute__((ext_vector_type(3)));
typedef int int2 __attribute__((ext_vector_type(2)));
typedef float float2 __attribute__((ext_vector_type(2)));
// floating point scalar condition with mixed vector and scalar operands
kernel void test(global float2 *srcA, global float *srcB, global float *srcC,
global float2 *dst)
{
int tid = 1;
dst[tid] = srcC[tid] ? srcA[tid] : srcB[tid]; // expected-error {{condition in a ternary operator cannot be a floating point type}}
}
-------------- next part --------------
// RUN: %clang_cc1 %s -verify -pedantic -fsyntax-only
// expected-no-diagnostics
typedef char char2 __attribute__((ext_vector_type(2)));
typedef char char3 __attribute__((ext_vector_type(3)));
typedef int int2 __attribute__((ext_vector_type(2)));
typedef float float2 __attribute__((ext_vector_type(2)));
// all scalars, but widths do not match.
kernel void test01(global char *srcA, global int *srcB, global char *srcC,
global int *dst)
{
int tid = 1;
int destVal = srcC[tid] ? srcA[tid] : srcB[tid];
dst[tid] = destVal;
}
kernel void test02(global char *srcA, global char *srcB, global int *srcC,
global char *dst)
{
int tid = 1;
char destVal = srcC[tid] ? srcA[tid] : srcB[tid];
dst[tid] = destVal;
}
// mixed-width vectors and scalars
kernel void test03(global char *srcA, global int2 *srcB, global char *srcC,
global int2 *dst)
{
int tid = 1;
int2 destVal = srcC[tid] ? srcA[tid] : srcB[tid];
dst[tid] = destVal;
}
// uniform vectors
kernel void test04(global char2 *srcA, global char2 *srcB, global char2 *srcC,
global char2 *dst)
{
char tid = 1;
char2 destVal = srcC[tid] ? srcA[tid] : srcB[tid];
dst[tid] = destVal;
}
// vector condition and mixed scalar operands
kernel void test05(global int *srcA, global char *srcB, global int2 *srcC,
global int2 *dst)
{
char tid = 1;
int2 destVal = srcC[tid] ? srcA[tid] : srcB[tid];
dst[tid] = destVal;
}
/*
kernel void test06(global float *srcA, global float *srcB, global int2 *srcC,
global float2 *dst)
{
char tid = 1;
float2 destVal = srcC[tid] ? srcA[tid] : srcB[tid];
dst[tid] = destVal;
}
kernel void test07(global int *srcA, global float *srcB, global int2 *srcC,
global float2 *dst)
{
char tid = 1;
float2 destVal = srcC[tid] ? srcA[tid] : srcB[tid];
dst[tid] = destVal;
}
*/
// vector condition and mixed operands
kernel void test08(global int *srcA, global float2 *srcB, global int2 *srcC,
global float2 *dst)
{
char tid = 1;
float2 destVal = srcC[tid] ? srcA[tid] : srcB[tid];
dst[tid] = destVal;
}
-------------- next part --------------
Implementing OpenCL ternary operator with vector operands
=========================================================
Author: Sameer Sahasrabuddhe <sameer.sahasrabuddhe at amd.com>
Date: 2014-12-17
The ternary operator (?:) in OpenCL 1.2 has the following form:
    result = exp1 ? exp2 : exp3
The operator first evaluates exp1 and compares its value to zero. If
the value is not equal to zero, the operator evaluates exp2;
otherwise it evaluates exp3. The situation when one or more of
the three operands is a vector needs clarification.
Three places in the OpenCL 1.2 spec are relevant here:
1. Page 220, which defines the behaviour of the ternary operator.
2. Section 6.2.6, which defines the usual arithmetic conversions.
3. Page 268, which defines the select builtin.
In particular, this is an attempt to interpret page 220 in the
broadest possible way:
"The second and third expressions can be any type, as long their
types match, or there is a conversion <snip> that can be applied to
one of the expressions to make their types match <snip>. This
resulting matching type is the type of the entire expression."
The two snips refer to the applicable conversion rules that can be
used to determine the resulting matching type.
The semantics described on page 220 state that when exp1 is a vector,
then the operator is "equivalent to calling the select builtin". The
nature of this equivalence is open to interpretation. In particular,
if exp1 is a vector and either one or both of exp2 and exp3 are
scalar, then page 220 leaves room for performing "usual arithmetic
conversions" on exp2 and exp3.
It seems reasonable to assume that the intended meaning of the
"equivalence" refers to two aspects:
1. If exp1 is a vector, then both exp2 and exp3 need to be evaluated,
   since it is to be treated like a function call. This can cause
   surprising results, but it is required for an element-wise
   construction of the result vector as seen below.
2. All operands exp1, exp2, and exp3 are assumed to be vectors of the
   same length (possibly after usual arithmetic conversions), and the
   resulting vector is computed as:
       result[i] = exp1[i] ? exp2[i] : exp3[i]
   for every element index "i" into the vectors.
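The two aspects can be illustrated with a small stand-alone sketch. This is a C++ model of the semantics and not OpenCL or Clang code: Vec, msbSet and ternaryVector are hypothetical names, and the condition is assumed to have an integer element type. The per-element test looks at the most significant bit, following the select builtin; for the usual 0 / -1 masks this agrees with a comparison against zero.

```cpp
#include <array>
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for an OpenCL vector of N elements.
template <typename T, std::size_t N> using Vec = std::array<T, N>;

// True if the most significant bit of an integer element is set.
template <typename C> bool msbSet(C value) {
  using U = typename std::make_unsigned<C>::type;
  return ((static_cast<U>(value) >> (sizeof(C) * CHAR_BIT - 1)) & 1u) != 0;
}

// Element-wise selection. Both operand vectors are ordinary function
// arguments, so the caller has fully evaluated exp2 and exp3 before any
// element is chosen.
template <typename C, typename T, std::size_t N>
Vec<T, N> ternaryVector(const Vec<C, N> &cond, const Vec<T, N> &ifTrue,
                        const Vec<T, N> &ifFalse) {
  static_assert(sizeof(C) == sizeof(T), "element bit widths must match");
  Vec<T, N> result{};
  for (std::size_t i = 0; i < N; ++i)
    result[i] = msbSet(cond[i]) ? ifTrue[i] : ifFalse[i];
  return result;
}

// Usage example: an int2-like mask selecting between two float2-like values.
inline void example() {
  const Vec<std::int32_t, 2> mask = {{-1, 0}};
  const Vec<float, 2> r = ternaryVector(mask, Vec<float, 2>{{1.0f, 2.0f}},
                                        Vec<float, 2>{{3.0f, 4.0f}});
  assert(r[0] == 1.0f && r[1] == 4.0f);
}
```

Because the operands are passed as arguments, any side effects in exp2 and exp3 happen regardless of the mask, which is the "function call" behaviour described in point 1.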
Section 6.2.6 "Usual Arithmetic Conversions" seems to say that for a
given operator, either all operands are of matching vector types, or
at most one operand is a vector. This can be overly restrictive for
the ternary operator because exp1 is treated specially as seen below.
Page 220 seems to override this by saying that the result type is
determined only by exp2 and exp3. But this is insufficient, because
exp1 can affect the vector size of the result as seen below.
The three pieces in the OpenCL 1.2 spec could be synthesised into a
consistent behaviour for the ternary operator as follows:
1. If one or more operands are vectors, then their lengths must match.
2. The element type of exp1 cannot be a floating point type.
3. If exp1 is a scalar:
   1. No restrictions apply to the types of exp2 and exp3 with respect
      to exp1.
   2. Either exp2 or exp3 is evaluated depending on the value of exp1.
   3. The value of exp1 is compared to 0 for selection.
4. If exp1 is a vector:
   1. The number of bits in the element types of all vector types must
      match. Scalar types will be handled by usual arithmetic
      conversions. See point 7 below for additional rules.
   2. Both exp2 and exp3 are evaluated.
   3. The MSB of exp1 is examined for selection.
5. The result type is obtained by usual arithmetic conversions on exp2
   and exp3, as specified in Section 6.2.6.
6. If exp1 is a scalar, no further type conversions are required.
7. If exp1 is a vector, and the result type is a scalar, then
   1. The scalar result type must have the same number of bits as the
      element type in exp1, as required by the select builtin. Page
      220 does not provide for further changing either the result type
      or exp1 to match the number of bits.
   2. The scalar result type is expanded to a vector having the same
      length as exp1. The spec does not actually say this, and it
      could be considered undefined behaviour. It is a useful but
      tricky mix-and-match between all three referred sections to
      arrive at a "reasonable behaviour" for the ternary operator.
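The rules above can be sketched as a small stand-alone type checker. This is a toy C++ model under several assumptions, not Clang's implementation: Kind, Type, Outcome, rankOf and checkTernary are hypothetical names, signedness and C99 integer promotion are ignored, and the rank order is reduced to "any floating point type outranks any integer type, then wider outranks narrower". Errors are reported by the number of the rule that was violated.

```cpp
#include <cassert>
#include <string>

// Toy model of an OpenCL type: element kind, element width, vector length.
enum class Kind { Int, Float };

struct Type {
  Kind kind;
  int bits;   // width of one element, in bits
  int lanes;  // 1 for a scalar, N for an N-element vector
  bool isVector() const { return lanes > 1; }
};

struct Outcome {
  std::string error;  // empty on success, else the violated rule number
  Type type;          // result type; meaningful only when error is empty
  bool evalBoth;      // true if both exp2 and exp3 are evaluated
};

// Simplified rank used by the conversions in rule 5.
static int rankOf(const Type &t) {
  return (t.kind == Kind::Float ? 1000 : 0) + t.bits;
}

static Outcome fail(const char *rule) {
  return Outcome{rule, Type{Kind::Int, 0, 0}, false};
}

Outcome checkTernary(Type c, Type a, Type b) {
  // Rule 1: every vector operand must have the same length.
  const Type operands[] = {c, a, b};
  int lanes = 1;
  for (const Type &t : operands) {
    if (!t.isVector()) continue;
    if (lanes != 1 && lanes != t.lanes) return fail("1");
    lanes = t.lanes;
  }
  // Rule 2: the element type of exp1 cannot be floating point.
  if (c.kind == Kind::Float) return fail("2");
  // Rule 4.1: with a vector exp1, vector operands need matching element bits.
  if (c.isVector() && ((a.isVector() && a.bits != c.bits) ||
                       (b.isVector() && b.bits != c.bits)))
    return fail("4.1");
  // Rule 5: usual arithmetic conversions on exp2 and exp3.
  Type res = a;
  if (a.isVector() && b.isVector()) {
    if (a.kind != b.kind || a.bits != b.bits) return fail("5");
  } else if (a.isVector() || b.isVector()) {
    const Type &vec = a.isVector() ? a : b;
    const Type &scalar = a.isVector() ? b : a;
    if (rankOf(scalar) > rankOf(vec)) return fail("5");
    res = vec;
  } else {
    res = rankOf(a) >= rankOf(b) ? a : b;
  }
  // Rule 7: a scalar result under a vector exp1 must match in bits (7.1)
  // and is then expanded to the length of exp1 (7.2).
  if (c.isVector() && !res.isVector()) {
    if (res.bits != c.bits) return fail("7.1");
    res.lanes = c.lanes;
  }
  return Outcome{"", res, c.isVector()};
}

// Usage example: exp1 = int2, exp2 = int, exp3 = char gives int2.
inline void example() {
  const Type int2{Kind::Int, 32, 2}, intTy{Kind::Int, 32, 1},
      charTy{Kind::Int, 8, 1};
  const Outcome r = checkTernary(int2, intTy, charTy);
  assert(r.error.empty() && r.type.lanes == 2 && r.evalBoth);
}
```

The order of the checks is a design choice of this sketch: the length and condition checks run first, so an expression that violates several rules is reported under the lowest-numbered one.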
Some notable examples:
exp1     exp2    exp3     result   eval     reason
--------+-------+--------+--------+--------+--------
char     char    int      int      either
int      char    char     char     either
char     char    int2     int2     either
char     char2   char3    error    -        1
char2    char2   char2    char2    both
char2    char3   char     error    -        1
char2    int     int      error    -        7.1
char2    int2    int2     error    -        4.1
int2     char    char     error    -        7.1
int2     int     char     int2     both
int2     float   float    float2   both
int2     int     float    float2   both
int2     int2    float2   error    -        5
int2     int2    float    error    -        5
int2     int     float2   float2   both
int2     char2   char2    error    -        4.1
float    float   float    error    -        2
float2   float   float    error    -        2