[llvm] aa23e49 - [NVPTX] Fix generating permute bytes from register pair when the initial values are undefined (#74437)

via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 16 11:05:46 PST 2024


Author: mmoadeli
Date: 2024-01-16T11:05:41-08:00
New Revision: aa23e493f2b363982a472fe38caffc69d907402c

URL: https://github.com/llvm/llvm-project/commit/aa23e493f2b363982a472fe38caffc69d907402c
DIFF: https://github.com/llvm/llvm-project/commit/aa23e493f2b363982a472fe38caffc69d907402c.diff

LOG: [NVPTX] Fix generating permute bytes from register pair when the initial values are undefined (#74437)

When generating the permute bytes for the prmt instruction, an undefined
initial mask element (represented as -1) sets every bit of the 32-bit
selector (0xFFFFFFFF). Because the remaining elements are combined into
the selector with a bitwise OR, they can no longer change any bit.
Consequently, the final selector remains a constant -1, irrespective of
the actual mask values that follow the initial element. This change
skips undefined mask elements when building the selector.
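
The effect can be reproduced outside LLVM with a minimal sketch. The
helper name buildPrmtSelector, its SkipUndef flag, and the sample masks
are hypothetical and exist only for illustration; the loop mirrors the
shape of the selector loop in LowerVECTOR_SHUFFLE under the assumption of
a four-element mask with one 4-bit nibble per element.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-alone helper (not an LLVM API). It packs one 4-bit
// selector nibble per shuffle-mask element, as the patched loop does.
// SkipUndef == false models the old behaviour, true models the fix.
// Assumes at most 8 mask elements so the shift stays below 32 bits.
uint32_t buildPrmtSelector(const std::vector<int> &Mask, bool SkipUndef) {
  uint32_t Selector = 0;
  for (std::size_t Idx = 0; Idx < Mask.size(); ++Idx) {
    // -1 is the placeholder for an undefined mask element.
    if (SkipUndef && Mask[Idx] == -1)
      continue;
    // Without the check, an undef at index 0 contributes 0xFFFFFFFF,
    // and no later OR can clear those bits.
    Selector |= static_cast<uint32_t>(Mask[Idx]) << (Idx * 4);
  }
  return Selector;
}
```

With the illustrative mask {-1, 4, 0, 0}, the unguarded variant yields
0xFFFFFFFF (printed as -1), while the guarded variant yields 0x40 (64).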

Added: 
    llvm/test/CodeGen/NVPTX/shuffle-vec-undef-init.ll

Modified: 
    llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
index 6b4d2168e749385..c2552a95ad2103f 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
@@ -2387,8 +2387,10 @@ SDValue NVPTXTargetLowering::LowerVECTOR_SHUFFLE(SDValue Op,
   const ShuffleVectorSDNode *SVN = cast<ShuffleVectorSDNode>(Op.getNode());
   SDValue V2 = Op.getOperand(1);
   uint32_t Selector = 0;
-  for (auto I : llvm::enumerate(SVN->getMask()))
-    Selector |= (I.value() << (I.index() * 4));
+  for (auto I : llvm::enumerate(SVN->getMask())) {
+    if (I.value() != -1) // -1 is a placeholder for undef.
+      Selector |= (I.value() << (I.index() * 4));
+  }
 
   SDLoc DL(Op);
   return DAG.getNode(NVPTXISD::PRMT, DL, MVT::v4i8, V1, V2,

diff  --git a/llvm/test/CodeGen/NVPTX/shuffle-vec-undef-init.ll b/llvm/test/CodeGen/NVPTX/shuffle-vec-undef-init.ll
new file mode 100644
index 000000000000000..4f147f28e1a57e4
--- /dev/null
+++ b/llvm/test/CodeGen/NVPTX/shuffle-vec-undef-init.ll
@@ -0,0 +1,18 @@
+; RUN: llc < %s -march=nvptx -mcpu=sm_20 -verify-machineinstrs | FileCheck %s  
+; RUN: llc < %s -march=nvptx -mcpu=sm_20 -verify-machineinstrs | FileCheck %s   -check-prefix=CHECK-FOUND
+
+define void @kernel_func(ptr %in.vec, ptr %out.vec0) nounwind {
+  entry:
+  %wide.vec = load <32 x i8>, ptr %in.vec, align 64
+  %vec0 = shufflevector <32 x i8> %wide.vec, <32 x i8> undef, <4 x i32> <i32 0, i32 8, i32 16, i32 24>
+  store <4 x i8> %vec0, ptr %out.vec0, align 64
+  ret void
+
+; CHECK-FOUND: prmt.b32 	{{.*}} 16384;
+; CHECK-FOUND: prmt.b32 	{{.*}} 64;
+; CHECK-FOUND: prmt.b32 	{{.*}} 30224;
+
+; CHECK:  @kernel_func
+; CHECK-NOT: 	prmt.b32 	{{.*}} -1;
+; CHECK:  -- End function
+}


        

