[llvm] [NVPTX] Copy kernel arguments as byte array (PR #110356)
Artem Belevich via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 2 14:30:47 PDT 2024
================
@@ -623,13 +623,33 @@ void NVPTXLowerArgs::handleByValParam(const NVPTXTargetMachine &TM,
Value *ArgInParam = new AddrSpaceCastInst(
Arg, PointerType::get(Arg->getContext(), ADDRESS_SPACE_PARAM),
Arg->getName(), FirstInst);
+ // Create an opaque type of same size as StructType but without padding
+ // holes as this could have been a union.
+ const auto StructBytes = *AllocA->getAllocationSize(DL);
+ SmallVector<Type *, 5> ChunkTypes;
+ if (StructBytes >= 16) {
+ Type *IntType = Type::getInt64Ty(Func->getContext());
+ Type *ChunkType = VectorType::get(IntType, 2, false);
+ Type *OpaqueType = StructBytes < 32 ? ChunkType :
+ ArrayType::get(ChunkType, StructBytes / 16);
+ ChunkTypes.push_back(OpaqueType);
+ }
+ for (const auto ChunkBytes: {8, 4, 2, 1}) {
+ if (StructBytes & ChunkBytes) {
+ Type *ChunkType = Type::getIntNTy(Func->getContext(), 8 * ChunkBytes);
+ ChunkTypes.push_back(ChunkType);
+ }
+ }
+ Type * OpaqueType = ChunkTypes.size() == 1 ? ChunkTypes[0] :
+ StructType::create(ChunkTypes);
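
For readers following the hunk above, its size decomposition can be sketched outside of LLVM roughly as follows. This is only an illustration under stated assumptions: describeChunks is a hypothetical helper that is not part of the patch or of LLVM, and it returns textual IR type names rather than building Type objects. It mirrors the quoted logic: one 16-byte <2 x i64> chunk (or an array of them) for sizes of 16 bytes and up, then one integer per set bit among 8, 4, 2, 1 for the remainder.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not in LLVM): names the chunk types the quoted hunk
// would select for an aggregate of structBytes bytes.
std::vector<std::string> describeChunks(std::uint64_t structBytes) {
  std::vector<std::string> chunks;

  // Sizes of 16 bytes or more get a leading <2 x i64> chunk, or an array of
  // such chunks once the size reaches 32 bytes.
  if (structBytes >= 16) {
    const std::string vec = "<2 x i64>";
    if (structBytes < 32) {
      chunks.push_back(vec);
    } else {
      chunks.push_back("[" + std::to_string(structBytes / 16) + " x " + vec + "]");
    }
  }

  // The low four bits of the size select the trailing i64/i32/i16/i8 chunks,
  // so the chunk sizes add up to exactly structBytes.
  for (std::uint64_t chunkBytes : {8u, 4u, 2u, 1u}) {
    if (structBytes & chunkBytes) {
      chunks.push_back("i" + std::to_string(8 * chunkBytes));
    }
  }
  return chunks;
}
```

Under this sketch a 24-byte aggregate maps to a <2 x i64> followed by an i64, and a 7-byte aggregate maps to i32, i16, i8. When only one chunk results, the hunk uses that type directly instead of wrapping it in a struct.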
----------------
Artem-B wrote:
Do you still have the IR that didn't get vectorized? If it should be vectorizable, that's the problem that we should fix, rather than complicate things here.
I also see a regression in load/store vectorization in some other code I'm looking at, so I suspect we may have a problem lurking there.
https://github.com/llvm/llvm-project/pull/110356