[cfe-dev] Question about "CodeGenFunction::EmitLoadOfScalar" with vector type of 3 elements
Anastasia Stulova via cfe-dev
cfe-dev at lists.llvm.org
Thu Mar 9 11:08:40 PST 2017
Cool, could you please resend the patch to cfe-commits with an “[OpenCL]” prefix in the subject? Or, if possible, create a review on Phabricator: http://llvm.org/docs/Phabricator.html.
Thanks!
Anastasia
From: jingu at codeplay.com [mailto:jingu at codeplay.com]
Sent: 09 March 2017 12:03
To: Anastasia Stulova; aleksey.bader at gmail.com
Cc: 'cfe-dev at lists.llvm.org' (cfe-dev at lists.llvm.org); nd
Subject: Re: [cfe-dev] Question about "CodeGenFunction::EmitLoadOfScalar" with vector type of 3 elements
Hi Anastasia,
I appreciate your response. I think we need to keep the conversion between vec3 and vec4 in "ScalarExprEmitter::VisitAsTypeExpr", as we want to maintain the features of the OpenCL source language. If LLVM had an IR intrinsic for __builtin_astype, we could generate it and let LLVM's CodeGen handle it. I have found another location that handles vec3: "CodeGenFunction::EmitStoreOfScalar". I have simply added a Clang CodeGen option to preserve vec3. I have attached the diff file and a test. If I missed something, please let me know.
Thanks,
JinGu Kang
On 08/03/17 13:05, Anastasia Stulova wrote:
I think the problem is that the boundary for IR being target independent is very vague in general. In this case specifically, the issue is that the spec is very explicit about treating this as a 4-element-aligned type. However, I agree this lowering could be done later as well. The approach of conditioning this on a target property sounds reasonable. I think we have other places in Clang where vec3 is treated as vec4 (e.g. ScalarExprEmitter::VisitAsTypeExpr). Those would have to be handled too. Feel free to propose a prototype.
Cheers,
Anastasia
From: cfe-dev [mailto:cfe-dev-bounces at lists.llvm.org] On Behalf Of jingu at codeplay.com via cfe-dev
Sent: 08 March 2017 11:01
To: aleksey.bader at gmail.com
Cc: 'cfe-dev at lists.llvm.org' (cfe-dev at lists.llvm.org)
Subject: Re: [cfe-dev] Question about "CodeGenFunction::EmitLoadOfScalar" with vector type of 3 elements
Hi Alexey,
I appreciate your response. My colleague and I are implementing a transformation pass between LLVM IR and another IR, and we want to keep the 3-component vector types in our target IR. As you mentioned, the 4-component vector type conversion code is not a problem. But I usually expect Clang to generate target-independent LLVM IR, apart from target-specific properties such as calling convention, memory layout of variables, etc. Clang can keep the 3-component vector type operations, and LLVM's codegen can handle them according to the target. At present, we have to undo Clang's vec3 -> vec4 transformation to recreate the original type information, which is unfortunate. Would it be possible to add an option to control this behaviour?
Thanks,
JinGu Kang
On 07/03/17 18:19, aleksey.bader at gmail.com wrote:
Hi JinGu,
I don't think it should be a problem for OpenCL. A 3-component vector is aligned as a 4-component vector (see section 6.1.5 "Alignment of Type" of the OpenCL C kernel language specification v2.0).
AFAIK, almost all existing OpenCL compilers are based on Clang, and there seem to be no problems with handling load/store operations this way.
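The alignment rule mentioned above can be illustrated with a small host-side sketch. The `Float3` and `Float4` structs below are hypothetical stand-ins for OpenCL's float3 and float4, not the real OpenCL types, and the sketch assumes a 4-byte `float`. It only shows why a 4-element load from a 3-element vector stays inside the object's storage when the type is aligned like a 4-element vector.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for OpenCL float3: three floats, but aligned
// (and therefore padded) like a four-element vector.
struct alignas(4 * sizeof(float)) Float3 {
  float x, y, z;
};

// Hypothetical stand-in for OpenCL float4.
struct alignas(4 * sizeof(float)) Float4 {
  float x, y, z, w;
};

// The padded 3-element type occupies the same storage as the 4-element one,
// so reading four lanes from a Float3 address does not leave the object.
static_assert(sizeof(Float3) == sizeof(Float4),
              "3-element vector is padded to the 4-element size");
static_assert(alignof(Float3) == alignof(Float4),
              "3-element vector shares the 4-element alignment");

// Helpers so the layout can also be inspected at run time.
constexpr std::size_t float3Size() { return sizeof(Float3); }
constexpr std::size_t float3Align() { return alignof(Float3); }
```

The fourth lane of such a load holds padding, so its value is not meaningful; only the first three lanes carry data.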
Could you elaborate on the case where this approach doesn't work?
Thanks,
Alexey
On Mon, Mar 6, 2017 at 6:47 PM, jingu at codeplay.com via cfe-dev <cfe-dev at lists.llvm.org> wrote:
Hi All,
I have a question about "CodeGenFunction::EmitLoadOfScalar". I am compiling code with 3-element vector types such as int3 or float3. Clang converts the vector load into a load of a 4-element vector type because of the following code in "CodeGenFunction::EmitLoadOfScalar":
  // For better performance, handle vector loads differently.
  if (Ty->isVectorType()) {
    const llvm::Type *EltTy = Addr.getElementType();

    const auto *VTy = cast<llvm::VectorType>(EltTy);

    // Handle vectors of size 3 like size 4 for better performance.
    if (VTy->getNumElements() == 3) {

      // Bitcast to vec4 type.
      llvm::VectorType *vec4Ty = llvm::VectorType::get(VTy->getElementType(),
                                                       4);
      Address Cast = Builder.CreateElementBitCast(Addr, vec4Ty, "castToVec4");
      // Now load value.
      llvm::Value *V = Builder.CreateLoad(Cast, Volatile, "loadVec4");
A 4-element vector load can end up as an aligned vector load, which is usually better. But it is not good for other targets or for languages like OpenCL that support 3-element vector types natively. Could we handle this situation in "CodeGenFunction::EmitLoadOfScalar", for example with "if (!getLangOpts().OpenCL)" or with a target-specific property on TargetCodeGenInfo?
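To make the request concrete, here is a minimal standalone model of the decision being asked for. It is not Clang code: `LoweringOptions`, `preserveVec3`, and `memoryAccessWidth` are hypothetical names invented for illustration, and whether the switch would come from a language option, a CodeGen option, or TargetCodeGenInfo is left open.

```cpp
#include <cassert>

// Hypothetical model of the requested switch; none of these names exist
// in Clang.
struct LoweringOptions {
  bool preserveVec3 = false;  // assumed flag: keep 3-element vectors as-is
};

// Returns the number of elements the emitted load/store would use.
// By default a 3-element vector is widened to 4 elements, mirroring the
// EmitLoadOfScalar excerpt above; with preserveVec3 set, it stays at 3.
unsigned memoryAccessWidth(unsigned numElements, const LoweringOptions &opts) {
  if (numElements == 3 && !opts.preserveVec3)
    return 4;
  return numElements;
}
```

With default options, `memoryAccessWidth(3, opts)` returns 4; with `preserveVec3` set it returns 3, and other vector widths are unaffected either way.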
If I missed something, please let me know.
Thanks,
JinGu Kang