[PATCH] ARMEB: Fix trunc store for vector types

Fri Jun 13 08:19:56 PDT 2014

Hi all,

The ARM backend transforms a trunc store to shuffle and store operations.
A vector (or multiple scalars) is packed in one (wide) place to be stored in less memory operations.
The shuffle operation is utilized to extract some values (narrowed elements) to mimic the trunc operation.
Currently, the shuffle operation assumes little endian byte order.

This patch is calculating the shuffle indices for the least significant data (trunc) based upon the "higher side" of vector element.

Please review.

Thanks,
Christian

http://reviews.llvm.org/D4135

Files:
  lib/Target/ARM/ARMISelLowering.cpp
  test/CodeGen/ARM/big-endian-neon-trunc-store.ll

Index: lib/Target/ARM/ARMISelLowering.cpp
===================================================================

--- lib/Target/ARM/ARMISelLowering.cpp
+++ lib/Target/ARM/ARMISelLowering.cpp
@@ -8464,7 +8464,8 @@
     SDLoc DL(St);
     SDValue WideVec = DAG.getNode(ISD::BITCAST, DL, WideVecVT, StVal);
     SmallVector<int, 8> ShuffleVec(NumElems * SizeRatio, -1);
-    for (unsigned i = 0; i < NumElems; ++i) ShuffleVec[i] = i * SizeRatio;
+    for (unsigned i = 0; i < NumElems; ++i)
+      ShuffleVec[i] = TLI.isBigEndian() ? (i+1) * SizeRatio - 1 : i * SizeRatio;
 
     // Can't shuffle using an illegal type.
     if (!TLI.isTypeLegal(WideVecVT)) return SDValue();
Index: test/CodeGen/ARM/big-endian-neon-trunc-store.ll
===================================================================
--- test/CodeGen/ARM/big-endian-neon-trunc-store.ll
+++ test/CodeGen/ARM/big-endian-neon-trunc-store.ll
@@ -0,0 +1,26 @@
+; RUN: llc < %s -mtriple armeb-eabi -mattr v7,neon -o - | FileCheck %s
+
+define void @vector_trunc_store_2i64_to_2i16( <2 x i64>* %loadaddr, <2 x i16>* %storeaddr ) {
+; CHECK-LABEL: vector_trunc_store_2i64_to_2i16:
+; CHECK:       vmovn.i64  [[REG:d[0-9]+]]
+; CHECK:       vrev32.16  [[REG]], [[REG]]
+; CHECK:       vuzp.16    [[REG]], [[REG2:d[0-9]+]]
+; CHECK:       vrev32.16  [[REG]], [[REG2]]
+  %1 = load <2 x i64>* %loadaddr
+  %2 = trunc <2 x i64> %1 to <2 x i16>
+  store <2 x i16> %2, <2 x i16>* %storeaddr
+  ret void
+}
+
+define void @vector_trunc_store_4i32_to_4i8( <4 x i32>* %loadaddr, <4 x i8>* %storeaddr ) {
+; CHECK-LABEL: vector_trunc_store_4i32_to_4i8:
+; CHECK:       vmovn.i32 [[REG:d[0-9]+]]
+; CHECK:       vrev16.8  [[REG]], [[REG]]
+; CHECK:       vuzp.8    [[REG]], [[REG2:d[0-9]+]]
+; CHECK:       vrev32.8  [[REG]], [[REG2]]
+  %1 = load <4 x i32>* %loadaddr
+  %2 = trunc <4 x i32> %1 to <4 x i8>
+  store <4 x i8> %2, <4 x i8>* %storeaddr
+  ret void
+}
+
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D4135.10391.patch
Type: text/x-patch
Size: 1908 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140613/48181a40/attachment.bin>