[cfe-dev] Question about "CodeGenFunction::EmitLoadOfScalar" with vector type of 3 elements

Mon Mar 6 07:47:51 PST 2017

Hi All,

I have a question about "CodeGenFunction::EmitLoadOfScalar". I am
compiling code that uses 3-element vector types such as int3 or float3.
Clang turns a load of such a vector into a load of a 4-element vector,
because "CodeGenFunction::EmitLoadOfScalar" contains the following
code:

  // For better performance, handle vector loads differently.
  if (Ty->isVectorType()) {
    const llvm::Type *EltTy = Addr.getElementType();

    const auto *VTy = cast<llvm::VectorType>(EltTy);

    // Handle vectors of size 3 like size 4 for better performance.
    if (VTy->getNumElements() == 3) {

      // Bitcast to vec4 type.
      llvm::VectorType *vec4Ty =
          llvm::VectorType::get(VTy->getElementType(), 4);
      Address Cast =
          Builder.CreateElementBitCast(Addr, vec4Ty, "castToVec4");
      // Now load value.
      llvm::Value *V = Builder.CreateLoad(Cast, Volatile, "loadVec4");
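
To make the effect of the excerpt concrete, here is a small standalone
sketch of what the widened load does to the data. It is only a model
and does not use Clang or LLVM: the names Vec3, Vec4, PaddedVec3,
loadVec3AsVec4 and loadVec3Native are hypothetical. It assumes the
3-element vector is stored with one padding lane, so that it has the
size and alignment of a 4-element vector, as OpenCL specifies for
3-component vectors.

```cpp
#include <array>
#include <cstring>

// Hypothetical stand-ins for vector values; not Clang or LLVM types.
using Vec3 = std::array<float, 3>;
using Vec4 = std::array<float, 4>;

// Models a float3 object in memory: three elements plus one padding
// lane, giving the same 16-byte size and alignment as a float4.
struct alignas(16) PaddedVec3 {
  float lanes[4];
};

// Models the widened path from the excerpt: read all four lanes in one
// access (the "loadVec4" step), then keep lanes 0..2 and drop the
// padding lane.
Vec3 loadVec3AsVec4(const PaddedVec3 &mem) {
  Vec4 wide;
  std::memcpy(wide.data(), mem.lanes, sizeof(wide));
  return Vec3{wide[0], wide[1], wide[2]};
}

// Models a native 3-element load: only lanes 0..2 are read.
Vec3 loadVec3Native(const PaddedVec3 &mem) {
  return Vec3{mem.lanes[0], mem.lanes[1], mem.lanes[2]};
}
```

Both functions return the same three values. The difference is only in
how much memory is read and which instructions a backend would pick,
which is why the choice matters for a target with native 3-element
vector support.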

A 4-element vector load can end up as an aligned vector load, which is
usually better. But it is not good for targets or languages, such as
OpenCL, that support 3-element vector types natively. Could we handle
this case in "CodeGenFunction::EmitLoadOfScalar", for example with a
check like "if (!getLangOpts().OpenCL)" or with a target-specific
property on TargetCodeGenInfo?
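
As a rough illustration of the kind of check I have in mind, here is a
standalone sketch of the decision logic. It is not a patch and none of
it is existing Clang API: SketchLangOptions, SketchTargetInfo,
hasNativeVec3 and shouldWidenVec3Load are hypothetical names that only
stand in for getLangOpts() and a possible TargetCodeGenInfo property.
The sketch combines both conditions, although either one alone might
be enough.

```cpp
// Hypothetical stand-in for the language options; not clang::LangOptions.
struct SketchLangOptions {
  bool OpenCL = false;
};

// Hypothetical stand-in for a target hook; not clang's TargetCodeGenInfo.
struct SketchTargetInfo {
  // Assumed property: true if the target handles 3-element vectors natively.
  virtual bool hasNativeVec3() const { return false; }
  virtual ~SketchTargetInfo() = default;
};

// Example of a target that opts out of the vec3-to-vec4 widening.
struct NativeVec3Target : SketchTargetInfo {
  bool hasNativeVec3() const override { return true; }
};

// Decides whether a vector load should take the "treat vec3 as vec4" path.
bool shouldWidenVec3Load(unsigned numElements,
                         const SketchLangOptions &opts,
                         const SketchTargetInfo &target) {
  if (numElements != 3)
    return false; // Only 3-element vectors are candidates for widening.
  if (opts.OpenCL)
    return false; // Language-based opt-out, as in !getLangOpts().OpenCL.
  if (target.hasNativeVec3())
    return false; // Target-based opt-out via the assumed property.
  return true;    // Otherwise keep the current behaviour.
}
```

With this shape, the existing behaviour stays the default and only
OpenCL or a target that overrides the property gets a plain 3-element
load.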

If I missed something, please let me know.

Thanks,
JinGu Kang