[PATCH] D89404: Preserve param alignment in NVPTXLowerArgs pass.
Justin Lebar via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 14 09:59:50 PDT 2020
jlebar created this revision.
jlebar added a reviewer: tra.
Herald added subscribers: llvm-commits, hiraditya, jholewinski.
Herald added a project: LLVM.
jlebar requested review of this revision.
NVPTXLowerArgs takes a load of an argument and converts it to the
following sequence:

  * make an alloca with the appropriate alignment
  * store into the alloca
  * addrspacecast to the param address space
  * load from the alloca in the param addrspace

The bug here is that the pass did not preserve the alloca's alignment in
the final load. LLVM doesn't know that an NVPTX addrspacecast preserves
pointer alignment, so it doesn't propagate the alignment for us.

The impact of this bug is that param loads would sometimes be lowered as
a series of u8 loads, because we were incorrectly assuming everything has
alignment 1.
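
To illustrate why a lost alignment turns one wide load into many byte
loads, here is a minimal standalone sketch. It is not the NVPTX backend's
actual lowering logic: `splitAccess` is a hypothetical helper that picks
the widest naturally aligned chunk (up to 8 bytes) that the known base
alignment can guarantee at each offset.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper, not LLVM code: split an access of `sizeBytes` bytes
// into chunks, given that the base address is only known to be a multiple
// of `alignBytes` (assumed to be a power of two).
std::vector<unsigned> splitAccess(uint64_t sizeBytes, uint64_t alignBytes) {
  std::vector<unsigned> widths;
  uint64_t offset = 0;
  while (offset < sizeBytes) {
    unsigned width = 8; // widest scalar load considered in this sketch (u64)
    // Shrink the chunk until it fits in the remaining bytes and both the
    // base alignment and the current offset guarantee natural alignment.
    while (width > 1 &&
           (width > sizeBytes - offset || alignBytes % width != 0 ||
            offset % width != 0))
      width /= 2;
    widths.push_back(width);
    offset += width;
  }
  return widths;
}
```

Under this model, `splitAccess(8, 8)` yields a single 8-byte chunk, while
`splitAccess(8, 1)` yields eight 1-byte chunks. That mirrors the symptom
described above: a u64 param load degrading into a series of u8 loads.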
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D89404
Files:
llvm/lib/Target/NVPTX/NVPTXLowerArgs.cpp
llvm/test/CodeGen/NVPTX/lower-args.ll
Index: llvm/test/CodeGen/NVPTX/lower-args.ll
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/NVPTX/lower-args.ll
@@ -0,0 +1,27 @@
+; RUN: opt < %s -S -nvptx-lower-args | FileCheck %s --check-prefix IR
+; RUN: llc < %s -S -mcpu=sm_20 | FileCheck %s --check-prefix PTX
+
+target datalayout = "e-i64:64-i128:128-v16:16-v32:32-n16:32:64"
+target triple = "nvptx64-nvidia-cuda"
+
+%class.outer = type <{ %class.inner, i32, [4 x i8] }>
+%class.inner = type { i32*, i32* }
+
+; Check that nvptx-lower-args preserves arg alignment
+define void @load_alignment(%class.outer* nocapture readonly byval(%class.outer) align 8 %arg) {
+entry:
+; IR: load %class.outer, %class.outer addrspace(101)*
+; IR-SAME: align 8
+; PTX: ld.param.u64
+; PTX-NOT: ld.param.u8
+ %arg.idx = getelementptr %class.outer, %class.outer* %arg, i64 0, i32 0, i32 0
+ %arg.idx.val = load i32*, i32** %arg.idx, align 8
+ %arg.idx1 = getelementptr %class.outer, %class.outer* %arg, i64 0, i32 0, i32 1
+ %arg.idx1.val = load i32*, i32** %arg.idx1, align 8
+ %arg.idx2 = getelementptr %class.outer, %class.outer* %arg, i64 0, i32 1
+ %arg.idx2.val = load i32, i32* %arg.idx2, align 8
+ %arg.idx.val.val = load i32, i32* %arg.idx.val, align 4
+ %add.i = add nsw i32 %arg.idx.val.val, %arg.idx2.val
+ store i32 %add.i, i32* %arg.idx1.val, align 4
+ ret void
+}
Index: llvm/lib/Target/NVPTX/NVPTXLowerArgs.cpp
===================================================================
--- llvm/lib/Target/NVPTX/NVPTXLowerArgs.cpp
+++ llvm/lib/Target/NVPTX/NVPTXLowerArgs.cpp
@@ -172,8 +172,12 @@
Value *ArgInParam = new AddrSpaceCastInst(
Arg, PointerType::get(StructType, ADDRESS_SPACE_PARAM), Arg->getName(),
FirstInst);
+ // Be sure to propagate alignment to this load; LLVM doesn't know that NVPTX
+ // addrspacecast preserves alignment. Since params are constant, this load is
+ // definitely not volatile.
LoadInst *LI =
- new LoadInst(StructType, ArgInParam, Arg->getName(), FirstInst);
+ new LoadInst(StructType, ArgInParam, Arg->getName(),
+ /*isVolatile=*/false, AllocA->getAlign(), FirstInst);
new StoreInst(LI, AllocA, FirstInst);
}