[PATCH] D89404: Preserve param alignment in NVPTXLowerArgs pass.

Justin Lebar via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 14 09:59:50 PDT 2020


jlebar created this revision.
jlebar added a reviewer: tra.
Herald added subscribers: llvm-commits, hiraditya, jholewinski.
Herald added a project: LLVM.
jlebar requested review of this revision.

NVPTXLowerArgs takes a load of an argument and converts it to

  alloca with appropriate alignment
  addrspacecast of the argument to the param address space
  load from the param address space
  store of the loaded value into the alloca

The bug here is that the load did not preserve the alloca's alignment.
LLVM doesn't know that NVPTX addrspacecast preserves pointer alignment,
so it doesn't propagate the alignment for us.

The impact of this bug is that sometimes param loads would be lowered as
a series of u8 loads, because we're incorrectly assuming everything has
alignment 1.
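
To illustrate why a lost alignment turns into byte loads, here is a toy
model in standalone C++.  It is only a sketch under stated assumptions:
splitLoad is a hypothetical helper, it is not LLVM or NVPTX backend code,
and it assumes power-of-two alignments and a maximum piece width of 8
bytes.  It splits a load into the widest pieces that are provably aligned
given only the known alignment of the base pointer.

```cpp
#include <cstdint>
#include <vector>

// Toy model, NOT the NVPTX backend: split a load of `size` bytes from a
// pointer whose known alignment is `align` (assumed to be a power of two)
// into pieces that are provably naturally aligned, at most 8 bytes each.
// Returns the width in bytes of each piece, in address order.
std::vector<unsigned> splitLoad(uint64_t size, uint64_t align) {
  std::vector<unsigned> widths;
  uint64_t offset = 0;
  while (offset < size) {
    unsigned width = 8;
    // Halve the width until the piece fits in the remaining bytes and the
    // address base+offset is provably a multiple of the width.
    while (width > 1 && (width > size - offset || width > align ||
                         offset % width != 0))
      width /= 2;
    widths.push_back(width);
    offset += width;
  }
  return widths;
}
```

In this model splitLoad(8, 8) yields a single 8-byte piece, while
splitLoad(8, 1) yields eight 1-byte pieces, which mirrors the ld.param.u64
versus ld.param.u8 difference that the test in this patch checks for.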


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D89404

Files:
  llvm/lib/Target/NVPTX/NVPTXLowerArgs.cpp
  llvm/test/CodeGen/NVPTX/lower-args.ll


Index: llvm/test/CodeGen/NVPTX/lower-args.ll
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/NVPTX/lower-args.ll
@@ -0,0 +1,27 @@
+; RUN: opt < %s -S -nvptx-lower-args | FileCheck %s --check-prefix IR
+; RUN: llc < %s -mcpu=sm_20 | FileCheck %s --check-prefix PTX
+
+target datalayout = "e-i64:64-i128:128-v16:16-v32:32-n16:32:64"
+target triple = "nvptx64-nvidia-cuda"
+
+%class.outer = type <{ %class.inner, i32, [4 x i8] }>
+%class.inner = type { i32*, i32* }
+
+; Check that nvptx-lower-args preserves arg alignment
+define void @load_alignment(%class.outer* nocapture readonly byval(%class.outer) align 8 %arg) {
+entry:
+; IR: load %class.outer, %class.outer addrspace(101)*
+; IR-SAME: align 8
+; PTX: ld.param.u64
+; PTX-NOT: ld.param.u8
+  %arg.idx = getelementptr %class.outer, %class.outer* %arg, i64 0, i32 0, i32 0
+  %arg.idx.val = load i32*, i32** %arg.idx, align 8
+  %arg.idx1 = getelementptr %class.outer, %class.outer* %arg, i64 0, i32 0, i32 1
+  %arg.idx1.val = load i32*, i32** %arg.idx1, align 8
+  %arg.idx2 = getelementptr %class.outer, %class.outer* %arg, i64 0, i32 1
+  %arg.idx2.val = load i32, i32* %arg.idx2, align 8
+  %arg.idx.val.val = load i32, i32* %arg.idx.val, align 4
+  %add.i = add nsw i32 %arg.idx.val.val, %arg.idx2.val
+  store i32 %add.i, i32* %arg.idx1.val, align 4
+  ret void
+}
Index: llvm/lib/Target/NVPTX/NVPTXLowerArgs.cpp
===================================================================
--- llvm/lib/Target/NVPTX/NVPTXLowerArgs.cpp
+++ llvm/lib/Target/NVPTX/NVPTXLowerArgs.cpp
@@ -172,8 +172,12 @@
   Value *ArgInParam = new AddrSpaceCastInst(
       Arg, PointerType::get(StructType, ADDRESS_SPACE_PARAM), Arg->getName(),
       FirstInst);
+  // Be sure to propagate alignment to this load; LLVM doesn't know that NVPTX
+  // addrspacecast preserves alignment.  Since params are constant, this load is
+  // definitely not volatile.
   LoadInst *LI =
-      new LoadInst(StructType, ArgInParam, Arg->getName(), FirstInst);
+      new LoadInst(StructType, ArgInParam, Arg->getName(),
+                   /*isVolatile=*/false, AllocA->getAlign(), FirstInst);
   new StoreInst(LI, AllocA, FirstInst);
 }
 

