[cfe-commits] [OpenCL patch] Clang fails on nested vector literals
Tanya Lattner
lattner at apple.com
Thu Sep 29 16:29:08 PDT 2011
On Sep 28, 2011, at 9:21 AM, Eli Friedman wrote:
> On Wed, Sep 28, 2011 at 6:32 AM, Anton Lokhmotov
> <Anton.Lokhmotov at arm.com> wrote:
>> Hi Eli,
>>
>>>> This code is necessary for Clang to recognise that a vector literal
>>>> with a nested constant initializer is constant.
>>>
>>> All of your tests pass without those changes... are they necessary for
>>> some other reason I'm missing?
>>
>> Without those changes, Clang thinks that the right-hand side of:
>>
>> __constant int4 i_1_2_1 = (int4)(1,(int2)(2,3),4);
>>
>> is not a compile-time constant. This contradicts the OpenCL spec (rev44,
>> p187):
>>
>> "Variables in the program scope or the outermost scope of kernel functions
>> can be declared in the __constant address space. These variables are
>> required to be initialized and the values used to initialize these variables
>> must be a compile time constant."
>>
>> The tests pass only because Clang currently does not enforce this behaviour
>> (but we will submit another patch for that). We include the tests in this
>> patch because the problem occurs specifically on nested vector literals.
>
> Those changes look simply wrong; please take them out of this patch.
> We can discuss them again with the patch where they make a difference.
>
Yes, these changes should be moved to the other patch since its a separate issue from the nested vector literals.
>>> Our code is more generic, as it allows nested vector initializers
>>> (see OpenCL 6.1.6). The added tests cover all possible uses.
>>>
>>> Your patch breaks the following:
>>>
>>> typedef int i4 __attribute((vector_size(16)));
>>> i4 x = (i4){1,2};
>>>
>>> It takes out the code that would otherwise pad the vector with zeros.
>>
>> You are right. But why was this behaviour used in the first place?
>>
>> Both OpenCL 1.1 (http://www.khronos.org/registry/cl/specs/opencl-1.1.pdf,
>> p164) and AltiVec PIM
>> (http://www.freescale.com/files/32bit/doc/ref_manual/ALTIVECPIM.pdf, p26)
>> say that the number of elements in the source must be equal to the number of
>> elements in the result vector or one. GCC vector extensions
>> (http://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html) are also silent
>> on the issue.
>>
>> Could you please point to a specification that says that missing vector
>> elements must be filled with zeros?
>
> No spec, but it's gcc-compatible for my given test case.
In response to Eli's comment, how about doing something like this after you are done looping over the initializers:
+
+ // Handle the case where the number of elements is less than the number
+ // of initializers and insert zeros.
+ for (unsigned i = Elements.size(); i < VT->getNumElements(); ++i) {
+ if (EltTy->isIntegerType())
+ Elements.push_back(APValue(Info.Ctx.MakeIntValue(0, EltTy)));
+ else
+ Elements.push_back(APValue(APFloat::getZero(Info.Ctx.getFloatTypeSemantics(EltTy))));
+ }
+
I think that will take care of the gcc-compatibilty issue. Otherwise, the patch looks good to me as long as you remove the VisitCompoundLiteralExpr and first set of changes that Eli mentioned.
Also, it would be great to see some comments about the vector initializer. There were some good comments before you can modify/reuse.
-Tanya
More information about the cfe-commits
mailing list