[llvm] [NVPTX] Copy kernel arguments as byte array (PR #110356)

Artem Belevich via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 3 10:30:52 PDT 2024


================
@@ -623,13 +623,33 @@ void NVPTXLowerArgs::handleByValParam(const NVPTXTargetMachine &TM,
     Value *ArgInParam = new AddrSpaceCastInst(
         Arg, PointerType::get(Arg->getContext(), ADDRESS_SPACE_PARAM),
         Arg->getName(), FirstInst);
+    // Create an opaque type of same size as StructType but without padding
+    // holes as this could have been a union.
+    const auto StructBytes = *AllocA->getAllocationSize(DL);
+    SmallVector<Type *, 5> ChunkTypes;
+    if (StructBytes >= 16) {
+        Type *IntType = Type::getInt64Ty(Func->getContext());
+        Type *ChunkType = VectorType::get(IntType, 2, false);
+        Type *OpaqueType = StructBytes < 32 ? ChunkType :
+                           ArrayType::get(ChunkType, StructBytes / 16);
+        ChunkTypes.push_back(OpaqueType);
+    }
+    for (const auto ChunkBytes: {8, 4, 2, 1}) {
+      if (StructBytes & ChunkBytes) {
+          Type *ChunkType = Type::getIntNTy(Func->getContext(), 8 * ChunkBytes);
+          ChunkTypes.push_back(ChunkType);
+      }
+    }
+    Type * OpaqueType = ChunkTypes.size() == 1 ? ChunkTypes[0] :
+                        StructType::create(ChunkTypes);
----------------
Artem-B wrote:

Interesting. I think the problem is that `SROA` just sticks 'align 1' on all 1-byte loads/stores, because it technically does not need any higher alignment, and LSV (the load/store vectorizer) takes it at face value, even though the base pointer is aligned.

That looks like something we may want to fix, eventually, but it's probably going to take some time. 

Can we just use `memcpy` instead of a load/store with a custom type? That should copy the data we need *and* convey proper alignment info to the code that does the copying.
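
For context, here is a rough standalone model of the layout the hunk above builds. It is plain C++, not LLVM API, and `chunkSizes` is a made-up name for illustration. It flattens the `<2 x i64>` array into individual 16-byte entries, which the patch does not do. The point is only that the chunks always add up to `StructBytes`, so a single `memcpy` of that many bytes would cover the same data without needing a custom type:

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of the PR): byte size of each chunk the
// hunk would use to cover an aggregate of structBytes bytes.
std::vector<uint64_t> chunkSizes(uint64_t structBytes) {
  std::vector<uint64_t> chunks;

  // The hunk covers structBytes / 16 sixteen-byte units with one <2 x i64>
  // or an array of them; they are listed one by one here for clarity.
  for (uint64_t i = 0; i < structBytes / 16; ++i)
    chunks.push_back(16);

  // Each remaining low bit of the size selects one i64/i32/i16/i8 chunk.
  for (uint64_t chunkBytes : {8, 4, 2, 1})
    if (structBytes & chunkBytes)
      chunks.push_back(chunkBytes);

  // Sanity check: the chunks cover the aggregate exactly, with no holes.
  assert(std::accumulate(chunks.begin(), chunks.end(), uint64_t{0}) ==
         structBytes);
  return chunks;
}
```

For example, a 23-byte aggregate decomposes into chunks of 16, 4, 2 and 1 bytes.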

https://github.com/llvm/llvm-project/pull/110356

