[cfe-dev] Question about "CodeGenFunction::EmitLoadOfScalar" with vector type of 3 elements
jingu@codeplay.com via cfe-dev
cfe-dev at lists.llvm.org
Thu Mar 9 04:03:12 PST 2017
Hi Anastasia,
I appreciate your response. I think we need to keep the vec3/vec4
handling in "ScalarExprEmitter::VisitAsTypeExpr", as we want to
maintain the features of the OpenCL source language. If LLVM had an IR
intrinsic for __builtin_astype, we could emit it and let LLVM's CodeGen
handle it. I have found one other location that treats vec3 specially:
"CodeGenFunction::EmitStoreOfScalar". I have simply added a Clang
CodeGen option to preserve vec3. I have attached the diff file and a
test. If I missed something, please let me know.
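
To illustrate why the astype conversion between vec3 and vec4 is a
source-language matter rather than a codegen shortcut, here is a rough
standalone sketch in standard C++. It assumes a 4-byte float, and the
names Float3Model, Float4Model, asFloat4 and checkLanesPreserved are
made up for this example; they are not Clang or OpenCL APIs. It only
models the fact that a 3-component vector occupies the storage of a
4-component one, so a bit-for-bit reinterpretation keeps the first
three lanes.

```cpp
#include <cassert>
#include <cstring>

// Illustrative stand-ins only: real OpenCL float3/float4 are built-in
// vector types. alignas(16) models the 4-element alignment of vec3.
struct alignas(16) Float3Model { float x, y, z; };
struct alignas(16) Float4Model { float x, y, z, w; };

static_assert(sizeof(float) == 4, "sketch assumes a 4-byte float");
static_assert(sizeof(Float3Model) == sizeof(Float4Model),
              "a vec3 occupies the same storage as a vec4");

// Models a vec3 -> vec4 astype: a bit-for-bit reinterpretation.
// The fourth lane comes from padding and must not be relied upon.
inline Float4Model asFloat4(const Float3Model &v) {
  Float4Model out;
  std::memcpy(&out, &v, sizeof(out));
  return out;
}

// Usage: the first three lanes survive the reinterpretation.
inline void checkLanesPreserved() {
  Float3Model v{1.0f, 2.0f, 3.0f};
  Float4Model r = asFloat4(v);
  assert(r.x == 1.0f && r.y == 2.0f && r.z == 3.0f);
}
```
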
Thanks,
JinGu Kang
On 08/03/17 13:05, Anastasia Stulova wrote:
>
> I think the problem is that the borderline for IR being target
> independent is very vague in general. In this case specifically the
> issue is that the Spec is very explicit about treating this as a
> 4-element aligned type. However, I agree this lowering could be done
> later as well. The approach to condition this on the Target property
> sounds reasonable. I think we have other places in Clang where vec3 is
> treated as vec4 (e.g. ScalarExprEmitter::VisitAsTypeExpr). Those
> would have to be handled too. Feel free to propose a prototype.
>
> Cheers,
>
> Anastasia
>
> From: cfe-dev [mailto:cfe-dev-bounces at lists.llvm.org] On Behalf Of
> jingu at codeplay.com via cfe-dev
> Sent: 08 March 2017 11:01
> To: aleksey.bader at gmail.com
> Cc: 'cfe-dev at lists.llvm.org' (cfe-dev at lists.llvm.org)
> Subject: Re: [cfe-dev] Question about
> "CodeGenFunction::EmitLoadOfScalar" with vector type of 3 elements
>
> Hi Alexey,
>
> I appreciate your response. My colleague and I are implementing a
> transformation pass between LLVM IR and another IR, and we want to
> keep the 3-component vector types in our target IR. As you mentioned,
> the 4-component vector type conversion code is not a problem in
> itself. But I usually expect Clang to generate target-independent
> LLVM IR, apart from target-specific properties such as calling
> convention and memory layout of variables. Clang could keep the
> 3-component vector type operations and LLVM's codegen could handle
> them according to the target. At present, we're having to undo
> Clang's transformation of vec3 -> vec4 to recreate the original type
> information, which is unfortunate. Would it be possible to add an
> option to control the behaviour?
>
> Thanks,
>
> JinGu Kang
>
> On 07/03/17 18:19, aleksey.bader at gmail.com wrote:
>
> Hi JinGu,
>
> I don't think it should be a problem for OpenCL. 3-component
> vector is aligned as 4-component vector (see section 6.1.5
> "Alignment of Type" of OpenCL C kernel language specification v2.0).
>
> AFAIK, almost all existing OpenCL compilers are based on Clang and
> there seem to be no problems with handling load/store operations
> this way.
>
> Could you elaborate on the case where this approach doesn't work?
>
> Thanks,
>
> Alexey
>
> On Mon, Mar 6, 2017 at 6:47 PM, jingu at codeplay.com via cfe-dev
> <cfe-dev at lists.llvm.org> wrote:
>
> Hi All,
>
>
> I have a question about "CodeGenFunction::EmitLoadOfScalar". I am
> compiling code with 3-element vector types such as int3 or float3.
> Clang converts each such vector load into a load of a 4-element
> vector type because of the following code in
> "CodeGenFunction::EmitLoadOfScalar":
>
>     // For better performance, handle vector loads differently.
>     if (Ty->isVectorType()) {
>       const llvm::Type *EltTy = Addr.getElementType();
>
>       const auto *VTy = cast<llvm::VectorType>(EltTy);
>
>       // Handle vectors of size 3 like size 4 for better performance.
>       if (VTy->getNumElements() == 3) {
>
>         // Bitcast to vec4 type.
>         llvm::VectorType *vec4Ty =
>             llvm::VectorType::get(VTy->getElementType(), 4);
>         Address Cast = Builder.CreateElementBitCast(Addr, vec4Ty,
>                                                     "castToVec4");
>         // Now load value.
>         llvm::Value *V = Builder.CreateLoad(Cast, Volatile, "loadVec4");
>
> A 4-element vector load can end up as an aligned vector load, which
> is usually better. But it is not good for targets or languages, such
> as OpenCL, that support 3-element vector types natively. Could we
> handle this case in "CodeGenFunction::EmitLoadOfScalar", for example
> with "if (!getLangOpts().OpenCL)" or with a target-specific property
> on TargetCodeGenInfo?
>
> If I missed something, please let me know.
>
> Thanks,
> JinGu Kang
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vec3.diff
Type: text/x-patch
Size: 6234 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20170309/04db77d1/attachment.bin>
-------------- next part --------------
// RUN: %clang_cc1 %s -emit-llvm -o - -triple spir-unknown-unknown -preserve-vec3-type | FileCheck %s
typedef float float3 __attribute__((ext_vector_type(3)));
typedef float float4 __attribute__((ext_vector_type(4)));
void kernel foo(global float3 *a, global float3 *b) {
  // CHECK: %[[LOAD_A:.*]] = load <3 x float>, <3 x float> addrspace(1)* %a
  // CHECK: store <3 x float> %[[LOAD_A]], <3 x float> addrspace(1)* %b
  *b = *a;
}
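
The behaviour the test above checks can be summarised with a small
standalone model in standard C++. This is only a sketch of the idea,
not the attached patch: memoryEltCount and its PreserveVec3 parameter
are hypothetical names invented for this illustration and are not
Clang APIs. The model returns the number of elements used for the
memory access of a vector, widening 3 to 4 unless the option is set.

```cpp
// Hypothetical model of the load/store width decision. Neither the
// function nor its parameters exist in Clang; they only mirror the
// idea of gating the vec3 -> vec4 widening on an option.
constexpr unsigned memoryEltCount(unsigned NumElts, bool PreserveVec3) {
  // By default a 3-element vector is accessed as a 4-element vector.
  return (NumElts == 3 && !PreserveVec3) ? 4u : NumElts;
}

// Compile-time checks of the model.
static_assert(memoryEltCount(3, false) == 4,
              "default: vec3 is loaded and stored as vec4");
static_assert(memoryEltCount(3, true) == 3,
              "option set: vec3 is kept, as the CHECK lines expect");
static_assert(memoryEltCount(4, false) == 4,
              "other widths are unaffected");
static_assert(memoryEltCount(2, true) == 2,
              "other widths are unaffected by the option");
```

With the option set, the model keeps three elements, which corresponds
to the <3 x float> load and store that the CHECK lines look for.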