<div dir="rtl"><div dir="ltr">Hi Ulrich,</div><div dir="ltr"><br></div><div dir="ltr">Visual C++ warns that:</div><div dir="ltr"><br></div><div dir="ltr"> lib\Target\SystemZ\SystemZISelLowering.cpp(3799): warning C4334: '<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)<br></div><div dir="ltr"><br></div><div dir="ltr">The code is:</div><div dir="ltr"><br></div><div dir="ltr"> Mask |= 1 << ((E - I - 1) * BytesPerElement + J);</div><div dir="ltr"><br></div><div dir="ltr">Maybe 1 should be a 64-bit constant such as 1ULL (1L would not be enough here, since long is 32 bits with Visual C++). We're trying to keep the build warning-free. Thanks!</div><div dir="ltr"><br></div><div dir="ltr">Yaron</div><div dir="ltr"><br></div></div><div class="gmail_extra"><br><div class="gmail_quote"><div dir="ltr">2015-05-05 22:25 GMT+03:00 Ulrich Weigand <span dir="ltr"><<a href="mailto:ulrich.weigand@de.ibm.com" target="_blank">ulrich.weigand@de.ibm.com</a>></span>:</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Author: uweigand<br>
Date: Tue May 5 14:25:42 2015<br>
New Revision: 236521<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=236521&view=rev" target="_blank">http://llvm.org/viewvc/llvm-project?rev=236521&view=rev</a><br>
Log:<br>
[SystemZ] Add CodeGen support for integer vector types<br>
<br>
This is the first of a series of patches to add CodeGen support exploiting<br>
the instructions of the z13 vector facility. This patch adds support<br>
for the native integer vector types (v16i8, v8i16, v4i32, v2i64).<br>
<br>
When the vector facility is present, we default to the new vector ABI.<br>
This is characterized by two major differences:<br>
- Vector types are passed/returned in vector registers<br>
(except for unnamed arguments of a variable-argument list function).<br>
- Vector types are at most 8-byte aligned.<br>
<br>
The reason for the choice of 8-byte vector alignment is that the hardware<br>
is able to efficiently load vectors at 8-byte alignment, and the ABI only<br>
guarantees 8-byte alignment of the stack pointer, so requiring any higher<br>
alignment for vectors would require dynamic stack re-alignment code.<br>
<br>
However, for compatibility with old code that may use vector types, when<br>
*not* using the vector facility, the old alignment rules (vector types<br>
are naturally aligned) remain in use.<br>
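<br>
As an illustration only (not part of the patch), the two alignment rules<br>
above can be sketched as a single function. The name vectorAlignBytes is<br>
hypothetical, and the sketch assumes power-of-two vector sizes, for which<br>
natural alignment equals the size:<br>
<br>
```cpp
#include <algorithm>

// Hypothetical helper: ABI alignment in bytes for a vector of SizeBytes.
// With the vector ABI, alignment is capped at 8 bytes; otherwise vectors
// stay naturally aligned (alignment == size for power-of-two sizes).
inline unsigned vectorAlignBytes(unsigned SizeBytes, bool UsesVectorABI) {
  return UsesVectorABI ? std::min(SizeBytes, 8u) : SizeBytes;
}
```
Under this model a 16-byte vector is 8-byte aligned with the vector ABI<br>
and 16-byte aligned without it.<br>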
<br>
These alignment rules are implemented not only at the C language level<br>
(in clang), but also at the LLVM IR level. This is done<br>
by selecting a different DataLayout string depending on whether the<br>
vector ABI is in effect or not.<br>
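<br>
A minimal sketch of what such a selection might look like follows. The<br>
function name sketchDataLayout and the non-vector components of the string<br>
are assumptions made for illustration; only the idea of adding a 64-bit<br>
alignment entry for 128-bit vectors is taken from the description above:<br>
<br>
```cpp
#include <string>

// Hypothetical sketch: build a DataLayout-style string whose vector
// alignment component depends on whether the vector ABI is in effect.
// The surrounding components are placeholders chosen for illustration.
inline std::string sketchDataLayout(bool UsesVectorABI) {
  std::string Layout = "E-i64:64-f128:64"; // big-endian, illustrative
  if (UsesVectorABI)
    Layout += "-v128:64"; // 128-bit vectors aligned to 64 bits
  Layout += "-n32:64";    // native integer widths, illustrative
  return Layout;
}
```
When the entry is omitted, 128-bit vectors keep the default natural<br>
alignment, which matches the old rules.<br>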
<br>
Based on a patch by Richard Sandiford.<br>
<br>
<br>
Added:<br>
llvm/trunk/test/CodeGen/SystemZ/frame-19.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-abi-align.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-abs-01.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-abs-02.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-abs-03.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-abs-04.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-add-01.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-and-01.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-and-02.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-and-03.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-args-01.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-args-02.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-args-03.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-cmp-01.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-cmp-02.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-cmp-03.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-cmp-04.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-combine-01.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-const-01.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-const-02.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-const-03.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-const-04.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-const-07.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-const-08.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-const-09.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-const-10.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-const-13.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-const-14.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-const-15.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-const-16.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-ctlz-01.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-ctpop-01.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-cttz-01.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-div-01.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-max-01.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-max-02.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-max-03.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-max-04.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-min-01.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-min-02.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-min-03.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-min-04.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-move-01.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-move-02.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-move-03.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-move-04.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-move-05.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-move-06.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-move-07.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-move-08.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-move-09.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-move-10.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-move-11.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-move-12.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-move-13.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-move-14.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-mul-01.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-mul-02.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-neg-01.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-or-01.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-or-02.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-perm-01.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-perm-02.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-perm-03.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-perm-04.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-perm-05.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-perm-06.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-perm-07.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-perm-08.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-perm-09.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-perm-10.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-perm-11.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-shift-01.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-shift-02.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-shift-03.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-shift-04.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-shift-05.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-shift-06.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-shift-07.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-sub-01.ll<br>
llvm/trunk/test/CodeGen/SystemZ/vec-xor-01.ll<br>
Modified:<br>
llvm/trunk/lib/Target/SystemZ/SystemZ.h<br>
llvm/trunk/lib/Target/SystemZ/SystemZAsmPrinter.cpp<br>
llvm/trunk/lib/Target/SystemZ/SystemZCallingConv.h<br>
llvm/trunk/lib/Target/SystemZ/SystemZCallingConv.td<br>
llvm/trunk/lib/Target/SystemZ/SystemZISelDAGToDAG.cpp<br>
llvm/trunk/lib/Target/SystemZ/SystemZISelLowering.cpp<br>
llvm/trunk/lib/Target/SystemZ/SystemZISelLowering.h<br>
llvm/trunk/lib/Target/SystemZ/SystemZInstrFormats.td<br>
llvm/trunk/lib/Target/SystemZ/SystemZInstrInfo.cpp<br>
llvm/trunk/lib/Target/SystemZ/SystemZInstrVector.td<br>
llvm/trunk/lib/Target/SystemZ/SystemZOperators.td<br>
llvm/trunk/lib/Target/SystemZ/SystemZTargetMachine.cpp<br>
llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp<br>
llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h<br>
<br>
Modified: llvm/trunk/lib/Target/SystemZ/SystemZ.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZ.h?rev=236521&r1=236520&r2=236521&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZ.h?rev=236521&r1=236520&r2=236521&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/SystemZ/SystemZ.h (original)<br>
+++ llvm/trunk/lib/Target/SystemZ/SystemZ.h Tue May 5 14:25:42 2015<br>
@@ -87,6 +87,13 @@ const unsigned IPM_CC = 28;<br>
const unsigned PFD_READ = 1;<br>
const unsigned PFD_WRITE = 2;<br>
<br>
+// Number of bits in a vector register.<br>
+const unsigned VectorBits = 128;<br>
+<br>
+// Number of bytes in a vector register (and consequently the number of<br>
+// bytes in a general permute vector).<br>
+const unsigned VectorBytes = VectorBits / 8;<br>
+<br>
// Return true if Val fits an LLILL operand.<br>
static inline bool isImmLL(uint64_t Val) {<br>
return (Val & ~0x000000000000ffffULL) == 0;<br>
<br>
Modified: llvm/trunk/lib/Target/SystemZ/SystemZAsmPrinter.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZAsmPrinter.cpp?rev=236521&r1=236520&r2=236521&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZAsmPrinter.cpp?rev=236521&r1=236520&r2=236521&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/SystemZ/SystemZAsmPrinter.cpp (original)<br>
+++ llvm/trunk/lib/Target/SystemZ/SystemZAsmPrinter.cpp Tue May 5 14:25:42 2015<br>
@@ -151,6 +151,13 @@ void SystemZAsmPrinter::EmitInstruction(<br>
LoweredMI = lowerRIEfLow(MI, SystemZ::RISBLG);<br>
break;<br>
<br>
+ case SystemZ::VLVGP32:<br>
+ LoweredMI = MCInstBuilder(SystemZ::VLVGP)<br>
+ .addReg(MI->getOperand(0).getReg())<br>
+ .addReg(SystemZMC::getRegAsGR64(MI->getOperand(1).getReg()))<br>
+ .addReg(SystemZMC::getRegAsGR64(MI->getOperand(2).getReg()));<br>
+ break;<br>
+<br>
#define LOWER_LOW(NAME) \<br>
case SystemZ::NAME##64: LoweredMI = lowerRILow(MI, SystemZ::NAME); break<br>
<br>
<br>
Modified: llvm/trunk/lib/Target/SystemZ/SystemZCallingConv.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZCallingConv.h?rev=236521&r1=236520&r2=236521&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZCallingConv.h?rev=236521&r1=236520&r2=236521&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/SystemZ/SystemZCallingConv.h (original)<br>
+++ llvm/trunk/lib/Target/SystemZ/SystemZCallingConv.h Tue May 5 14:25:42 2015<br>
@@ -10,6 +10,9 @@<br>
#ifndef LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZCALLINGCONV_H<br>
#define LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZCALLINGCONV_H<br>
<br>
+#include "llvm/ADT/SmallVector.h"<br>
+#include "llvm/CodeGen/CallingConvLower.h"<br>
+<br>
namespace llvm {<br>
namespace SystemZ {<br>
const unsigned NumArgGPRs = 5;<br>
@@ -18,6 +21,47 @@ namespace SystemZ {<br>
const unsigned NumArgFPRs = 4;<br>
extern const unsigned ArgFPRs[NumArgFPRs];<br>
} // end namespace SystemZ<br>
+<br>
+class SystemZCCState : public CCState {<br>
+private:<br>
+ /// Records whether the value was a fixed argument.<br>
+ /// See ISD::OutputArg::IsFixed.<br>
+ SmallVector<bool, 4> ArgIsFixed;<br>
+<br>
+public:<br>
+ SystemZCCState(CallingConv::ID CC, bool isVarArg, MachineFunction &MF,<br>
+ SmallVectorImpl<CCValAssign> &locs, LLVMContext &C)<br>
+ : CCState(CC, isVarArg, MF, locs, C) {}<br>
+<br>
+ void AnalyzeFormalArguments(const SmallVectorImpl<ISD::InputArg> &Ins,<br>
+ CCAssignFn Fn) {<br>
+ // Formal arguments are always fixed.<br>
+ ArgIsFixed.clear();<br>
+ for (unsigned i = 0; i < Ins.size(); ++i)<br>
+ ArgIsFixed.push_back(true);<br>
+<br>
+ CCState::AnalyzeFormalArguments(Ins, Fn);<br>
+ }<br>
+<br>
+ void AnalyzeCallOperands(const SmallVectorImpl<ISD::OutputArg> &Outs,<br>
+ CCAssignFn Fn) {<br>
+ // Record whether the call operand was a fixed argument.<br>
+ ArgIsFixed.clear();<br>
+ for (unsigned i = 0; i < Outs.size(); ++i)<br>
+ ArgIsFixed.push_back(Outs[i].IsFixed);<br>
+<br>
+ CCState::AnalyzeCallOperands(Outs, Fn);<br>
+ }<br>
+<br>
+ // This version of AnalyzeCallOperands in the base class is not usable<br>
+ // since we must provide a means of accessing ISD::OutputArg::IsFixed.<br>
+ void AnalyzeCallOperands(const SmallVectorImpl<MVT> &Outs,<br>
+ SmallVectorImpl<ISD::ArgFlagsTy> &Flags,<br>
+ CCAssignFn Fn) = delete;<br>
+<br>
+ bool IsFixed(unsigned ValNo) { return ArgIsFixed[ValNo]; }<br>
+};<br>
+<br>
} // end namespace llvm<br>
<br>
#endif<br>
<br>
Modified: llvm/trunk/lib/Target/SystemZ/SystemZCallingConv.td<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZCallingConv.td?rev=236521&r1=236520&r2=236521&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZCallingConv.td?rev=236521&r1=236520&r2=236521&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/SystemZ/SystemZCallingConv.td (original)<br>
+++ llvm/trunk/lib/Target/SystemZ/SystemZCallingConv.td Tue May 5 14:25:42 2015<br>
@@ -12,6 +12,15 @@<br>
class CCIfExtend<CCAction A><br>
: CCIf<"ArgFlags.isSExt() || ArgFlags.isZExt()", A>;<br>
<br>
+class CCIfSubtarget<string F, CCAction A><br>
+ : CCIf<!strconcat("static_cast<const SystemZSubtarget&>"<br>
+ "(State.getMachineFunction().getSubtarget()).", F),<br>
+ A>;<br>
+<br>
+// Match if this specific argument is a fixed (i.e. named) argument.<br>
+class CCIfFixed<CCAction A><br>
+ : CCIf<"static_cast<SystemZCCState *>(&State)->IsFixed(ValNo)", A>;<br>
+<br>
//===----------------------------------------------------------------------===//<br>
// z/Linux return value calling convention<br>
//===----------------------------------------------------------------------===//<br>
@@ -31,7 +40,12 @@ def RetCC_SystemZ : CallingConv<[<br>
// doesn't care about the ABI. All floating-point argument registers<br>
// are call-clobbered, so we can use all of them here.<br>
CCIfType<[f32], CCAssignToReg<[F0S, F2S, F4S, F6S]>>,<br>
- CCIfType<[f64], CCAssignToReg<[F0D, F2D, F4D, F6D]>><br>
+ CCIfType<[f64], CCAssignToReg<[F0D, F2D, F4D, F6D]>>,<br>
+<br>
+ // Similarly for vectors, with V24 being the ABI-compliant choice.<br>
+ CCIfSubtarget<"hasVector()",<br>
+ CCIfType<[v16i8, v8i16, v4i32, v2i64],<br>
+ CCAssignToReg<[V24, V26, V28, V30, V25, V27, V29, V31]>>><br>
<br>
// ABI-compliant code returns long double by reference, but that conversion<br>
// is left to higher-level code. Perhaps we could add an f128 definition<br>
@@ -60,6 +74,17 @@ def CC_SystemZ : CallingConv<[<br>
CCIfType<[f32], CCAssignToReg<[F0S, F2S, F4S, F6S]>>,<br>
CCIfType<[f64], CCAssignToReg<[F0D, F2D, F4D, F6D]>>,<br>
<br>
+ // The first 8 named vector arguments are passed in V24-V31.<br>
+ CCIfSubtarget<"hasVector()",<br>
+ CCIfType<[v16i8, v8i16, v4i32, v2i64],<br>
+ CCIfFixed<CCAssignToReg<[V24, V26, V28, V30,<br>
+ V25, V27, V29, V31]>>>>,<br>
+<br>
+ // Other vector arguments are passed in 8-byte-aligned 16-byte stack slots.<br>
+ CCIfSubtarget<"hasVector()",<br>
+ CCIfType<[v16i8, v8i16, v4i32, v2i64],<br>
+ CCAssignToStack<16, 8>>>,<br>
+<br>
// Other arguments are passed in 8-byte-aligned 8-byte stack slots.<br>
CCIfType<[i32, i64, f32, f64], CCAssignToStack<8, 8>><br>
]>;<br>
<br>
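For readers following the calling-convention rules above, here is a rough<br>
standalone model of how vector arguments would be placed. It is a sketch<br>
under simplifying assumptions: only vector arguments are modelled, stack<br>
offsets start at 0, and ArgLoc and assignVectorArgs are hypothetical names:<br>
<br>
```cpp
#include <cstddef>
#include <string>
#include <vector>

// Where a vector argument ends up: a register name, or a stack offset.
struct ArgLoc {
  std::string Reg;   // empty when the argument is passed on the stack
  unsigned StackOff; // meaningful only when Reg is empty
};

// Simplified model: named (fixed) vector arguments take the next free
// register in the order used by CC_SystemZ; everything else gets a
// 16-byte stack slot.
inline std::vector<ArgLoc> assignVectorArgs(const std::vector<bool> &IsFixed) {
  static const char *const Regs[] = {"V24", "V26", "V28", "V30",
                                     "V25", "V27", "V29", "V31"};
  std::vector<ArgLoc> Locs;
  std::size_t NextReg = 0;
  unsigned NextOff = 0;
  for (bool Fixed : IsFixed) {
    if (Fixed && NextReg < 8) {
      Locs.push_back({Regs[NextReg++], 0});
    } else {
      // 16-byte slot; offsets stay multiples of 8 in this model.
      Locs.push_back({"", NextOff});
      NextOff += 16;
    }
  }
  return Locs;
}
```
For example, with arguments (named, unnamed, named) the model assigns V24,<br>
stack offset 0 and V26 respectively.<br>
<br>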
Modified: llvm/trunk/lib/Target/SystemZ/SystemZISelDAGToDAG.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZISelDAGToDAG.cpp?rev=236521&r1=236520&r2=236521&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZISelDAGToDAG.cpp?rev=236521&r1=236520&r2=236521&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/SystemZ/SystemZISelDAGToDAG.cpp (original)<br>
+++ llvm/trunk/lib/Target/SystemZ/SystemZISelDAGToDAG.cpp Tue May 5 14:25:42 2015<br>
@@ -255,6 +255,13 @@ class SystemZDAGToDAGISel : public Selec<br>
Addr, Base, Disp, Index);<br>
}<br>
<br>
+ // Try to match Addr as an address with a base, 12-bit displacement<br>
+ // and index, where the index is element Elem of a vector.<br>
+ // Return true on success, storing the base, displacement and vector<br>
+ // in Base, Disp and Index respectively.<br>
+ bool selectBDVAddr12Only(SDValue Addr, SDValue Elem, SDValue &Base,<br>
+ SDValue &Disp, SDValue &Index) const;<br>
+<br>
// Check whether (or Op (and X InsertMask)) is effectively an insertion<br>
// of X into bits InsertMask of some Y != Op. Return true if so and<br>
// set Op to that Y.<br>
@@ -292,6 +299,12 @@ class SystemZDAGToDAGISel : public Selec<br>
SDNode *splitLargeImmediate(unsigned Opcode, SDNode *Node, SDValue Op0,<br>
uint64_t UpperVal, uint64_t LowerVal);<br>
<br>
+ // Try to use gather instruction Opcode to implement vector insertion N.<br>
+ SDNode *tryGather(SDNode *N, unsigned Opcode);<br>
+<br>
+ // Try to use scatter instruction Opcode to implement store Store.<br>
+ SDNode *tryScatter(StoreSDNode *Store, unsigned Opcode);<br>
+<br>
// Return true if Load and Store are loads and stores of the same size<br>
// and are guaranteed not to overlap. Such operations can be implemented<br>
// using block (SS-format) instructions.<br>
@@ -645,6 +658,30 @@ bool SystemZDAGToDAGISel::selectBDXAddr(<br>
return true;<br>
}<br>
<br>
+bool SystemZDAGToDAGISel::selectBDVAddr12Only(SDValue Addr, SDValue Elem,<br>
+ SDValue &Base,<br>
+ SDValue &Disp,<br>
+ SDValue &Index) const {<br>
+ SDValue Regs[2];<br>
+ if (selectBDXAddr12Only(Addr, Regs[0], Disp, Regs[1]) &&<br>
+ Regs[0].getNode() && Regs[1].getNode()) {<br>
+ for (unsigned int I = 0; I < 2; ++I) {<br>
+ Base = Regs[I];<br>
+ Index = Regs[1 - I];<br>
+ // We can't tell here whether the index vector has the right type<br>
+ // for the access; the caller needs to do that instead.<br>
+ if (Index.getOpcode() == ISD::ZERO_EXTEND)<br>
+ Index = Index.getOperand(0);<br>
+ if (Index.getOpcode() == ISD::EXTRACT_VECTOR_ELT &&<br>
+ Index.getOperand(1) == Elem) {<br>
+ Index = Index.getOperand(0);<br>
+ return true;<br>
+ }<br>
+ }<br>
+ }<br>
+ return false;<br>
+}<br>
+<br>
bool SystemZDAGToDAGISel::detectOrAndInsertion(SDValue &Op,<br>
uint64_t InsertMask) const {<br>
// We're only interested in cases where the insertion is into some operand<br>
@@ -984,6 +1021,71 @@ SDNode *SystemZDAGToDAGISel::splitLargeI<br>
return Or.getNode();<br>
}<br>
<br>
+SDNode *SystemZDAGToDAGISel::tryGather(SDNode *N, unsigned Opcode) {<br>
+ SDValue ElemV = N->getOperand(2);<br>
+ auto *ElemN = dyn_cast<ConstantSDNode>(ElemV);<br>
+ if (!ElemN)<br>
+ return 0;<br>
+<br>
+ unsigned Elem = ElemN->getZExtValue();<br>
+ EVT VT = N->getValueType(0);<br>
+ if (Elem >= VT.getVectorNumElements())<br>
+ return 0;<br>
+<br>
+ auto *Load = dyn_cast<LoadSDNode>(N->getOperand(1));<br>
+ if (!Load || !Load->hasOneUse())<br>
+ return 0;<br>
+ if (Load->getMemoryVT().getSizeInBits() !=<br>
+ Load->getValueType(0).getSizeInBits())<br>
+ return 0;<br>
+<br>
+ SDValue Base, Disp, Index;<br>
+ if (!selectBDVAddr12Only(Load->getBasePtr(), ElemV, Base, Disp, Index) ||<br>
+ Index.getValueType() != VT.changeVectorElementTypeToInteger())<br>
+ return 0;<br>
+<br>
+ SDLoc DL(Load);<br>
+ SDValue Ops[] = {<br>
+ N->getOperand(0), Base, Disp, Index,<br>
+ CurDAG->getTargetConstant(Elem, DL, MVT::i32), Load->getChain()<br>
+ };<br>
+ SDNode *Res = CurDAG->getMachineNode(Opcode, DL, VT, MVT::Other, Ops);<br>
+ ReplaceUses(SDValue(Load, 1), SDValue(Res, 1));<br>
+ return Res;<br>
+}<br>
+<br>
+SDNode *SystemZDAGToDAGISel::tryScatter(StoreSDNode *Store, unsigned Opcode) {<br>
+ SDValue Value = Store->getValue();<br>
+ if (Value.getOpcode() != ISD::EXTRACT_VECTOR_ELT)<br>
+ return 0;<br>
+ if (Store->getMemoryVT().getSizeInBits() !=<br>
+ Value.getValueType().getSizeInBits())<br>
+ return 0;<br>
+<br>
+ SDValue ElemV = Value.getOperand(1);<br>
+ auto *ElemN = dyn_cast<ConstantSDNode>(ElemV);<br>
+ if (!ElemN)<br>
+ return 0;<br>
+<br>
+ SDValue Vec = Value.getOperand(0);<br>
+ EVT VT = Vec.getValueType();<br>
+ unsigned Elem = ElemN->getZExtValue();<br>
+ if (Elem >= VT.getVectorNumElements())<br>
+ return 0;<br>
+<br>
+ SDValue Base, Disp, Index;<br>
+ if (!selectBDVAddr12Only(Store->getBasePtr(), ElemV, Base, Disp, Index) ||<br>
+ Index.getValueType() != VT.changeVectorElementTypeToInteger())<br>
+ return 0;<br>
+<br>
+ SDLoc DL(Store);<br>
+ SDValue Ops[] = {<br>
+ Vec, Base, Disp, Index, CurDAG->getTargetConstant(Elem, DL, MVT::i32),<br>
+ Store->getChain()<br>
+ };<br>
+ return CurDAG->getMachineNode(Opcode, DL, MVT::Other, Ops);<br>
+}<br>
+<br>
bool SystemZDAGToDAGISel::canUseBlockOperation(StoreSDNode *Store,<br>
LoadSDNode *Load) const {<br>
// Check that the two memory operands have the same size.<br>
@@ -1120,6 +1222,26 @@ SDNode *SystemZDAGToDAGISel::Select(SDNo<br>
}<br>
break;<br>
}<br>
+<br>
+ case ISD::INSERT_VECTOR_ELT: {<br>
+ EVT VT = Node->getValueType(0);<br>
+ unsigned ElemBitSize = VT.getVectorElementType().getSizeInBits();<br>
+ if (ElemBitSize == 32)<br>
+ ResNode = tryGather(Node, SystemZ::VGEF);<br>
+ else if (ElemBitSize == 64)<br>
+ ResNode = tryGather(Node, SystemZ::VGEG);<br>
+ break;<br>
+ }<br>
+<br>
+ case ISD::STORE: {<br>
+ auto *Store = cast<StoreSDNode>(Node);<br>
+ unsigned ElemBitSize = Store->getValue().getValueType().getSizeInBits();<br>
+ if (ElemBitSize == 32)<br>
+ ResNode = tryScatter(Store, SystemZ::VSCEF);<br>
+ else if (ElemBitSize == 64)<br>
+ ResNode = tryScatter(Store, SystemZ::VSCEG);<br>
+ break;<br>
+ }<br>
}<br>
<br>
// Select the default instruction<br>
<br>
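The base/index matching done by selectBDVAddr12Only above can be modelled<br>
outside LLVM roughly as follows. This is a simplified sketch: Node and<br>
matchVectorIndex are hypothetical stand-ins for SDValue and the selector,<br>
and the element is compared as a plain integer rather than as a DAG node:<br>
<br>
```cpp
// Minimal stand-in for a DAG node; only what the matcher needs.
struct Node {
  enum Kind { Other, ZeroExtend, ExtractElt } K;
  const Node *Operand; // input of ZeroExtend, or the vector of ExtractElt
  unsigned Elem;       // element number, used by ExtractElt only
};

// Given the two register operands of an address, try each one as the
// index. The index must be element Elem of some vector, optionally
// zero-extended. On success, Base is the other register and Vec is the
// vector feeding the index.
inline bool matchVectorIndex(const Node *Regs[2], unsigned Elem,
                             const Node *&Base, const Node *&Vec) {
  for (unsigned I = 0; I < 2; ++I) {
    const Node *Index = Regs[1 - I];
    if (Index->K == Node::ZeroExtend)
      Index = Index->Operand;
    if (Index->K == Node::ExtractElt && Index->Elem == Elem) {
      Base = Regs[I];
      Vec = Index->Operand;
      return true;
    }
  }
  return false;
}
```
As in the patch, checking that the vector has the right type for the access<br>
is left to the caller.<br>
<br>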
Modified: llvm/trunk/lib/Target/SystemZ/SystemZISelLowering.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZISelLowering.cpp?rev=236521&r1=236520&r2=236521&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZISelLowering.cpp?rev=236521&r1=236520&r2=236521&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/SystemZ/SystemZISelLowering.cpp (original)<br>
+++ llvm/trunk/lib/Target/SystemZ/SystemZISelLowering.cpp Tue May 5 14:25:42 2015<br>
@@ -96,6 +96,13 @@ SystemZTargetLowering::SystemZTargetLowe<br>
addRegisterClass(MVT::f64, &SystemZ::FP64BitRegClass);<br>
addRegisterClass(MVT::f128, &SystemZ::FP128BitRegClass);<br>
<br>
+ if (Subtarget.hasVector()) {<br>
+ addRegisterClass(MVT::v16i8, &SystemZ::VR128BitRegClass);<br>
+ addRegisterClass(MVT::v8i16, &SystemZ::VR128BitRegClass);<br>
+ addRegisterClass(MVT::v4i32, &SystemZ::VR128BitRegClass);<br>
+ addRegisterClass(MVT::v2i64, &SystemZ::VR128BitRegClass);<br>
+ }<br>
+<br>
// Compute derived properties from the register classes<br>
computeRegisterProperties(Subtarget.getRegisterInfo());<br>
<br>
@@ -111,7 +118,7 @@ SystemZTargetLowering::SystemZTargetLowe<br>
setSchedulingPreference(Sched::RegPressure);<br>
<br>
setBooleanContents(ZeroOrOneBooleanContent);<br>
- setBooleanVectorContents(ZeroOrOneBooleanContent); // FIXME: Is this correct?<br>
+ setBooleanVectorContents(ZeroOrNegativeOneBooleanContent);<br>
<br>
// Instructions are strings of 2-byte aligned 2-byte values.<br>
setMinFunctionAlignment(2);<br>
@@ -250,6 +257,76 @@ SystemZTargetLowering::SystemZTargetLowe<br>
// Handle prefetches with PFD or PFDRL.<br>
setOperationAction(ISD::PREFETCH, MVT::Other, Custom);<br>
<br>
+ for (MVT VT : MVT::vector_valuetypes()) {<br>
+ // Assume by default that all vector operations need to be expanded.<br>
+ for (unsigned Opcode = 0; Opcode < ISD::BUILTIN_OP_END; ++Opcode)<br>
+ if (getOperationAction(Opcode, VT) == Legal)<br>
+ setOperationAction(Opcode, VT, Expand);<br>
+<br>
+ // Likewise all truncating stores and extending loads.<br>
+ for (MVT InnerVT : MVT::vector_valuetypes()) {<br>
+ setTruncStoreAction(VT, InnerVT, Expand);<br>
+ setLoadExtAction(ISD::SEXTLOAD, VT, InnerVT, Expand);<br>
+ setLoadExtAction(ISD::ZEXTLOAD, VT, InnerVT, Expand);<br>
+ setLoadExtAction(ISD::EXTLOAD, VT, InnerVT, Expand);<br>
+ }<br>
+<br>
+ if (isTypeLegal(VT)) {<br>
+ // These operations are legal for anything that can be stored in a<br>
+ // vector register, even if there is no native support for the format<br>
+ // as such.<br>
+ setOperationAction(ISD::LOAD, VT, Legal);<br>
+ setOperationAction(ISD::STORE, VT, Legal);<br>
+ setOperationAction(ISD::VSELECT, VT, Legal);<br>
+ setOperationAction(ISD::BITCAST, VT, Legal);<br>
+ setOperationAction(ISD::UNDEF, VT, Legal);<br>
+<br>
+ // Likewise, except that we need to replace the nodes with something<br>
+ // more specific.<br>
+ setOperationAction(ISD::BUILD_VECTOR, VT, Custom);<br>
+ setOperationAction(ISD::VECTOR_SHUFFLE, VT, Custom);<br>
+ }<br>
+ }<br>
+<br>
+ // Handle integer vector types.<br>
+ for (MVT VT : MVT::integer_vector_valuetypes()) {<br>
+ if (isTypeLegal(VT)) {<br>
+ // These operations have direct equivalents.<br>
+ setOperationAction(ISD::EXTRACT_VECTOR_ELT, VT, Legal);<br>
+ setOperationAction(ISD::INSERT_VECTOR_ELT, VT, Legal);<br>
+ setOperationAction(ISD::ADD, VT, Legal);<br>
+ setOperationAction(ISD::SUB, VT, Legal);<br>
+ if (VT != MVT::v2i64)<br>
+ setOperationAction(ISD::MUL, VT, Legal);<br>
+ setOperationAction(ISD::AND, VT, Legal);<br>
+ setOperationAction(ISD::OR, VT, Legal);<br>
+ setOperationAction(ISD::XOR, VT, Legal);<br>
+ setOperationAction(ISD::CTPOP, VT, Custom);<br>
+ setOperationAction(ISD::CTTZ, VT, Legal);<br>
+ setOperationAction(ISD::CTLZ, VT, Legal);<br>
+ setOperationAction(ISD::CTTZ_ZERO_UNDEF, VT, Custom);<br>
+ setOperationAction(ISD::CTLZ_ZERO_UNDEF, VT, Custom);<br>
+<br>
+ // Convert a GPR scalar to a vector by inserting it into element 0.<br>
+ setOperationAction(ISD::SCALAR_TO_VECTOR, VT, Custom);<br>
+<br>
+ // Detect shifts by a scalar amount and convert them into<br>
+ // V*_BY_SCALAR.<br>
+ setOperationAction(ISD::SHL, VT, Custom);<br>
+ setOperationAction(ISD::SRA, VT, Custom);<br>
+ setOperationAction(ISD::SRL, VT, Custom);<br>
+<br>
+ // At present ROTL isn't matched by DAGCombiner. ROTR should be<br>
+ // converted into ROTL.<br>
+ setOperationAction(ISD::ROTL, VT, Expand);<br>
+ setOperationAction(ISD::ROTR, VT, Expand);<br>
+<br>
+ // Map SETCCs onto one of VCE, VCH or VCHL, swapping the operands<br>
+ // and inverting the result as necessary.<br>
+ setOperationAction(ISD::SETCC, VT, Custom);<br>
+ }<br>
+ }<br>
+<br>
// Handle floating-point types.<br>
for (unsigned I = MVT::FIRST_FP_VALUETYPE;<br>
I <= MVT::LAST_FP_VALUETYPE;<br>
@@ -304,6 +381,8 @@ SystemZTargetLowering::SystemZTargetLowe<br>
<br>
// Codes for which we want to perform some z-specific combinations.<br>
setTargetDAGCombine(ISD::SIGN_EXTEND);<br>
+ setTargetDAGCombine(ISD::STORE);<br>
+ setTargetDAGCombine(ISD::EXTRACT_VECTOR_ELT);<br>
<br>
// Handle intrinsics.<br>
setOperationAction(ISD::INTRINSIC_W_CHAIN, MVT::Other, Custom);<br>
@@ -703,7 +782,7 @@ LowerFormalArguments(SDValue Chain, Call<br>
<br>
// Assign locations to all of the incoming arguments.<br>
SmallVector<CCValAssign, 16> ArgLocs;<br>
- CCState CCInfo(CallConv, IsVarArg, MF, ArgLocs, *DAG.getContext());<br>
+ SystemZCCState CCInfo(CallConv, IsVarArg, MF, ArgLocs, *DAG.getContext());<br>
CCInfo.AnalyzeFormalArguments(Ins, CC_SystemZ);<br>
<br>
unsigned NumFixedGPRs = 0;<br>
@@ -735,6 +814,12 @@ LowerFormalArguments(SDValue Chain, Call<br>
NumFixedFPRs += 1;<br>
RC = &SystemZ::FP64BitRegClass;<br>
break;<br>
+ case MVT::v16i8:<br>
+ case MVT::v8i16:<br>
+ case MVT::v4i32:<br>
+ case MVT::v2i64:<br>
+ RC = &SystemZ::VR128BitRegClass;<br>
+ break;<br>
}<br>
<br>
unsigned VReg = MRI.createVirtualRegister(RC);<br>
@@ -842,7 +927,7 @@ SystemZTargetLowering::LowerCall(CallLow<br>
<br>
// Analyze the operands of the call, assigning locations to each operand.<br>
SmallVector<CCValAssign, 16> ArgLocs;<br>
- CCState ArgCCInfo(CallConv, IsVarArg, MF, ArgLocs, *DAG.getContext());<br>
+ SystemZCCState ArgCCInfo(CallConv, IsVarArg, MF, ArgLocs, *DAG.getContext());<br>
ArgCCInfo.AnalyzeCallOperands(Outs, CC_SystemZ);<br>
<br>
// We don't support GuaranteedTailCallOpt, only automatically-detected<br>
@@ -1809,12 +1894,78 @@ static SDValue emitSETCC(SelectionDAG &D<br>
return Result;<br>
}<br>
<br>
+// Return the SystemZISD vector comparison operation for CC, or 0 if it cannot<br>
+// be done directly.<br>
+static unsigned getVectorComparison(ISD::CondCode CC) {<br>
+ switch (CC) {<br>
+ case ISD::SETEQ:<br>
+ return SystemZISD::VICMPE;<br>
+<br>
+ case ISD::SETGT:<br>
+ return SystemZISD::VICMPH;<br>
+<br>
+ case ISD::SETUGT:<br>
+ return SystemZISD::VICMPHL;<br>
+<br>
+ default:<br>
+ return 0;<br>
+ }<br>
+}<br>
+<br>
+// Return the SystemZISD vector comparison operation for CC or its inverse,<br>
+// or 0 if neither can be done directly. Indicate in Invert whether the<br>
+// result is for the inverse of CC.<br>
+static unsigned getVectorComparisonOrInvert(ISD::CondCode CC, bool &Invert) {<br>
+ if (unsigned Opcode = getVectorComparison(CC)) {<br>
+ Invert = false;<br>
+ return Opcode;<br>
+ }<br>
+<br>
+ CC = ISD::getSetCCInverse(CC, true);<br>
+ if (unsigned Opcode = getVectorComparison(CC)) {<br>
+ Invert = true;<br>
+ return Opcode;<br>
+ }<br>
+<br>
+ return 0;<br>
+}<br>
+<br>
+// Lower a vector comparison of type CC between CmpOp0 and CmpOp1, producing<br>
+// an integer mask of type VT.<br>
+static SDValue lowerVectorSETCC(SelectionDAG &DAG, SDLoc DL, EVT VT,<br>
+ ISD::CondCode CC, SDValue CmpOp0,<br>
+ SDValue CmpOp1) {<br>
+ bool Invert = false;<br>
+ SDValue Cmp;<br>
+ // It doesn't really matter whether we try the inversion or the swap first,<br>
+ // since there are no cases where both work.<br>
+ if (unsigned Opcode = getVectorComparisonOrInvert(CC, Invert))<br>
+ Cmp = DAG.getNode(Opcode, DL, VT, CmpOp0, CmpOp1);<br>
+ else {<br>
+ CC = ISD::getSetCCSwappedOperands(CC);<br>
+ if (unsigned Opcode = getVectorComparisonOrInvert(CC, Invert))<br>
+ Cmp = DAG.getNode(Opcode, DL, VT, CmpOp1, CmpOp0);<br>
+ else<br>
+ llvm_unreachable("Unhandled comparison");<br>
+ }<br>
+ if (Invert) {<br>
+ SDValue Mask = DAG.getNode(SystemZISD::BYTE_MASK, DL, MVT::v16i8,<br>
+ DAG.getConstant(65535, DL, MVT::i32));<br>
+ Mask = DAG.getNode(ISD::BITCAST, DL, VT, Mask);<br>
+ Cmp = DAG.getNode(ISD::XOR, DL, VT, Cmp, Mask);<br>
+ }<br>
+ return Cmp;<br>
+}<br>
+<br>
SDValue SystemZTargetLowering::lowerSETCC(SDValue Op,<br>
SelectionDAG &DAG) const {<br>
SDValue CmpOp0 = Op.getOperand(0);<br>
SDValue CmpOp1 = Op.getOperand(1);<br>
ISD::CondCode CC = cast<CondCodeSDNode>(Op.getOperand(2))->get();<br>
SDLoc DL(Op);<br>
+ EVT VT = Op.getValueType();<br>
+ if (VT.isVector())<br>
+ return lowerVectorSETCC(DAG, DL, VT, CC, CmpOp0, CmpOp1);<br>
<br>
Comparison C(getCmp(DAG, CmpOp0, CmpOp1, CC, DL));<br>
SDValue Glue = emitCmp(DAG, DL, C);<br>
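<br>
The invert-or-swap search in lowerVectorSETCC can be illustrated with a<br>
small standalone model. The enums and the planCompare helper below are<br>
hypothetical and cover integer conditions only; they mirror the idea that<br>
just equal, signed-greater and unsigned-greater are available directly:<br>
<br>
```cpp
#include <initializer_list>

enum class CC { EQ, NE, GT, GE, LT, LE, UGT, UGE, ULT, ULE };
enum class Op { None, VICMPE, VICMPH, VICMPHL };

// Comparisons that exist directly.
inline Op direct(CC C) {
  switch (C) {
  case CC::EQ:  return Op::VICMPE;
  case CC::GT:  return Op::VICMPH;
  case CC::UGT: return Op::VICMPHL;
  default:      return Op::None;
  }
}

// Logical negation of a condition.
inline CC inverse(CC C) {
  switch (C) {
  case CC::EQ:  return CC::NE;
  case CC::NE:  return CC::EQ;
  case CC::GT:  return CC::LE;
  case CC::LE:  return CC::GT;
  case CC::GE:  return CC::LT;
  case CC::LT:  return CC::GE;
  case CC::UGT: return CC::ULE;
  case CC::ULE: return CC::UGT;
  case CC::UGE: return CC::ULT;
  default:      return CC::UGE; // ULT
  }
}

// Condition that holds for (b, a) exactly when C holds for (a, b).
inline CC swapped(CC C) {
  switch (C) {
  case CC::GT:  return CC::LT;
  case CC::LT:  return CC::GT;
  case CC::GE:  return CC::LE;
  case CC::LE:  return CC::GE;
  case CC::UGT: return CC::ULT;
  case CC::ULT: return CC::UGT;
  case CC::UGE: return CC::ULE;
  case CC::ULE: return CC::UGE;
  default:      return C; // EQ and NE are symmetric
  }
}

struct Plan { Op Opcode; bool Swap; bool Invert; };

// Try the condition as given, then with swapped operands; in each case
// accept either the direct comparison or the inverse of one.
inline Plan planCompare(CC C) {
  for (bool Swap : {false, true}) {
    CC Cur = Swap ? swapped(C) : C;
    if (direct(Cur) != Op::None)
      return {direct(Cur), Swap, false};
    if (direct(inverse(Cur)) != Op::None)
      return {direct(inverse(Cur)), Swap, true};
  }
  return {Op::None, false, false};
}
```
In this model a signed "greater or equal" becomes an inverted VICMPH with<br>
swapped operands, i.e. a >= b is computed as !(b > a).<br>
<br>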
@@ -2146,6 +2297,13 @@ SDValue SystemZTargetLowering::lowerBITC<br>
EVT InVT = In.getValueType();<br>
EVT ResVT = Op.getValueType();<br>
<br>
+ // Convert loads directly. This is normally done by DAGCombiner,<br>
+ // but we need this case for bitcasts that are created during lowering<br>
+ // and which are then lowered themselves.<br>
+ if (auto *LoadN = dyn_cast<LoadSDNode>(In))<br>
+ return DAG.getLoad(ResVT, DL, LoadN->getChain(), LoadN->getBasePtr(),<br>
+ LoadN->getMemOperand());<br>
+<br>
if (InVT == MVT::i32 && ResVT == MVT::f32) {<br>
SDValue In64;<br>
if (Subtarget.hasHighWord()) {<br>
@@ -2421,11 +2579,44 @@ SDValue SystemZTargetLowering::lowerOR(S<br>
SDValue SystemZTargetLowering::lowerCTPOP(SDValue Op,<br>
SelectionDAG &DAG) const {<br>
EVT VT = Op.getValueType();<br>
- int64_t OrigBitSize = VT.getSizeInBits();<br>
SDLoc DL(Op);<br>
+ Op = Op.getOperand(0);<br>
+<br>
+ // Handle vector types via VPOPCT.<br>
+ if (VT.isVector()) {<br>
+ Op = DAG.getNode(ISD::BITCAST, DL, MVT::v16i8, Op);<br>
+ Op = DAG.getNode(SystemZISD::POPCNT, DL, MVT::v16i8, Op);<br>
+ switch (VT.getVectorElementType().getSizeInBits()) {<br>
+ case 8:<br>
+ break;<br>
+ case 16: {<br>
+ Op = DAG.getNode(ISD::BITCAST, DL, VT, Op);<br>
+ SDValue Shift = DAG.getConstant(8, DL, MVT::i32);<br>
+ SDValue Tmp = DAG.getNode(SystemZISD::VSHL_BY_SCALAR, DL, VT, Op, Shift);<br>
+ Op = DAG.getNode(ISD::ADD, DL, VT, Op, Tmp);<br>
+ Op = DAG.getNode(SystemZISD::VSRL_BY_SCALAR, DL, VT, Op, Shift);<br>
+ break;<br>
+ }<br>
+ case 32: {<br>
+ SDValue Tmp = DAG.getNode(SystemZISD::BYTE_MASK, DL, MVT::v16i8,<br>
+ DAG.getConstant(0, DL, MVT::i32));<br>
+ Op = DAG.getNode(SystemZISD::VSUM, DL, VT, Op, Tmp);<br>
+ break;<br>
+ }<br>
+ case 64: {<br>
+ SDValue Tmp = DAG.getNode(SystemZISD::BYTE_MASK, DL, MVT::v16i8,<br>
+ DAG.getConstant(0, DL, MVT::i32));<br>
+ Op = DAG.getNode(SystemZISD::VSUM, DL, MVT::v4i32, Op, Tmp);<br>
+ Op = DAG.getNode(SystemZISD::VSUM, DL, VT, Op, Tmp);<br>
+ break;<br>
+ }<br>
+ default:<br>
+ llvm_unreachable("Unexpected type");<br>
+ }<br>
+ return Op;<br>
+ }<br>
<br>
// Get the known-zero mask for the operand.<br>
- Op = Op.getOperand(0);<br>
APInt KnownZero, KnownOne;<br>
DAG.computeKnownBits(Op, KnownZero, KnownOne);<br>
unsigned NumSignificantBits = (~KnownZero).getActiveBits();<br>
@@ -2433,6 +2624,7 @@ SDValue SystemZTargetLowering::lowerCTPO<br>
return DAG.getConstant(0, DL, VT);<br>
<br>
// Skip known-zero high parts of the operand.<br>
+ int64_t OrigBitSize = VT.getSizeInBits();<br>
int64_t BitSize = (int64_t)1 << Log2_32_Ceil(NumSignificantBits);<br>
BitSize = std::min(BitSize, OrigBitSize);<br>
<br>
@@ -2698,6 +2890,837 @@ SystemZTargetLowering::lowerINTRINSIC_W_<br>
return SDValue();<br>
}<br>
<br>
+namespace {<br>
+// Says that SystemZISD operation Opcode can be used to perform the equivalent<br>
+// of a VPERM with permute vector Bytes. If Opcode takes three operands,<br>
+// Operand is the constant third operand, otherwise it is the number of<br>
+// bytes in each element of the result.<br>
+struct Permute {<br>
+ unsigned Opcode;<br>
+ unsigned Operand;<br>
+ unsigned char Bytes[SystemZ::VectorBytes];<br>
+};<br>
+}<br>
+<br>
+static const Permute PermuteForms[] = {<br>
+ // VMRHG<br>
+ { SystemZISD::MERGE_HIGH, 8,<br>
+ { 0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23 } },<br>
+ // VMRHF<br>
+ { SystemZISD::MERGE_HIGH, 4,<br>
+ { 0, 1, 2, 3, 16, 17, 18, 19, 4, 5, 6, 7, 20, 21, 22, 23 } },<br>
+ // VMRHH<br>
+ { SystemZISD::MERGE_HIGH, 2,<br>
+ { 0, 1, 16, 17, 2, 3, 18, 19, 4, 5, 20, 21, 6, 7, 22, 23 } },<br>
+ // VMRHB<br>
+ { SystemZISD::MERGE_HIGH, 1,<br>
+ { 0, 16, 1, 17, 2, 18, 3, 19, 4, 20, 5, 21, 6, 22, 7, 23 } },<br>
+ // VMRLG<br>
+ { SystemZISD::MERGE_LOW, 8,<br>
+ { 8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31 } },<br>
+ // VMRLF<br>
+ { SystemZISD::MERGE_LOW, 4,<br>
+ { 8, 9, 10, 11, 24, 25, 26, 27, 12, 13, 14, 15, 28, 29, 30, 31 } },<br>
+ // VMRLH<br>
+ { SystemZISD::MERGE_LOW, 2,<br>
+ { 8, 9, 24, 25, 10, 11, 26, 27, 12, 13, 28, 29, 14, 15, 30, 31 } },<br>
+ // VMRLB<br>
+ { SystemZISD::MERGE_LOW, 1,<br>
+ { 8, 24, 9, 25, 10, 26, 11, 27, 12, 28, 13, 29, 14, 30, 15, 31 } },<br>
+ // VPKG<br>
+ { SystemZISD::PACK, 4,<br>
+ { 4, 5, 6, 7, 12, 13, 14, 15, 20, 21, 22, 23, 28, 29, 30, 31 } },<br>
+ // VPKF<br>
+ { SystemZISD::PACK, 2,<br>
+ { 2, 3, 6, 7, 10, 11, 14, 15, 18, 19, 22, 23, 26, 27, 30, 31 } },<br>
+ // VPKH<br>
+ { SystemZISD::PACK, 1,<br>
+ { 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31 } },<br>
+ // VPDI V1, V2, 4 (low half of V1, high half of V2)<br>
+ { SystemZISD::PERMUTE_DWORDS, 4,<br>
+ { 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23 } },<br>
+ // VPDI V1, V2, 1 (high half of V1, low half of V2)<br>
+ { SystemZISD::PERMUTE_DWORDS, 1,<br>
+ { 0, 1, 2, 3, 4, 5, 6, 7, 24, 25, 26, 27, 28, 29, 30, 31 } }<br>
+};<br>
+<br>
+// Called after matching a vector shuffle against a particular pattern.<br>
+// Both the original shuffle and the pattern have two vector operands.<br>
+// OpNos[0] is the operand of the original shuffle that should be used for<br>
+// operand 0 of the pattern, or -1 if operand 0 of the pattern can be anything.<br>
+// OpNos[1] is the same for operand 1 of the pattern. Resolve these -1s and<br>
+// set OpNo0 and OpNo1 to the shuffle operands that should actually be used<br>
+// for operands 0 and 1 of the pattern.<br>
+static bool chooseShuffleOpNos(int *OpNos, unsigned &OpNo0, unsigned &OpNo1) {<br>
+ if (OpNos[0] < 0) {<br>
+ if (OpNos[1] < 0)<br>
+ return false;<br>
+ OpNo0 = OpNo1 = OpNos[1];<br>
+ } else if (OpNos[1] < 0) {<br>
+ OpNo0 = OpNo1 = OpNos[0];<br>
+ } else {<br>
+ OpNo0 = OpNos[0];<br>
+ OpNo1 = OpNos[1];<br>
+ }<br>
+ return true;<br>
+}<br>
+<br>
+// Bytes is a VPERM-like permute vector, except that -1 is used for<br>
+// undefined bytes. Return true if the VPERM can be implemented using P.<br>
+// When returning true set OpNo0 to the VPERM operand that should be<br>
+// used for operand 0 of P and likewise OpNo1 for operand 1 of P.<br>
+//<br>
+// For example, if swapping the VPERM operands allows P to match, OpNo0<br>
+// will be 1 and OpNo1 will be 0. If instead Bytes only refers to one<br>
+// operand, but rewriting it to use two duplicated operands allows it to<br>
+// match P, then OpNo0 and OpNo1 will be the same.<br>
+static bool matchPermute(const SmallVectorImpl<int> &Bytes, const Permute &P,<br>
+ unsigned &OpNo0, unsigned &OpNo1) {<br>
+ int OpNos[] = { -1, -1 };<br>
+ for (unsigned I = 0; I < SystemZ::VectorBytes; ++I) {<br>
+ int Elt = Bytes[I];<br>
+ if (Elt >= 0) {<br>
+ // Make sure that the two permute vectors use the same suboperand<br>
+ // byte number. Only the operand numbers (the high bits) are<br>
+ // allowed to differ.<br>
+ if ((Elt ^ P.Bytes[I]) & (SystemZ::VectorBytes - 1))<br>
+ return false;<br>
+ int ModelOpNo = P.Bytes[I] / SystemZ::VectorBytes;<br>
+ int RealOpNo = unsigned(Elt) / SystemZ::VectorBytes;<br>
+ // Make sure that the operand mappings are consistent with previous<br>
+ // elements.<br>
+ if (OpNos[ModelOpNo] == 1 - RealOpNo)<br>
+ return false;<br>
+ OpNos[ModelOpNo] = RealOpNo;<br>
+ }<br>
+ }<br>
+ return chooseShuffleOpNos(OpNos, OpNo0, OpNo1);<br>
+}<br>
+<br>
+// As above, but search for a matching permute.<br>
+static const Permute *matchPermute(const SmallVectorImpl<int> &Bytes,<br>
+ unsigned &OpNo0, unsigned &OpNo1) {<br>
+ for (auto &P : PermuteForms)<br>
+ if (matchPermute(Bytes, P, OpNo0, OpNo1))<br>
+ return &P;<br>
+ return nullptr;<br>
+}<br>
+<br>
+// Bytes is a VPERM-like permute vector, except that -1 is used for<br>
+// undefined bytes. This permute is an operand of an outer permute.<br>
+// See whether redistributing the -1 bytes gives a shuffle that can be<br>
+// implemented using P. If so, set Transform to a VPERM-like permute vector<br>
+// that, when applied to the result of P, gives the original permute in Bytes.<br>
+static bool matchDoublePermute(const SmallVectorImpl<int> &Bytes,<br>
+ const Permute &P,<br>
+ SmallVectorImpl<int> &Transform) {<br>
+ unsigned To = 0;<br>
+ for (unsigned From = 0; From < SystemZ::VectorBytes; ++From) {<br>
+ int Elt = Bytes[From];<br>
+ if (Elt < 0)<br>
+ // Byte number From of the result is undefined.<br>
+ Transform[From] = -1;<br>
+ else {<br>
+ while (P.Bytes[To] != Elt) {<br>
+ To += 1;<br>
+ if (To == SystemZ::VectorBytes)<br>
+ return false;<br>
+ }<br>
+ Transform[From] = To;<br>
+ }<br>
+ }<br>
+ return true;<br>
+}<br>
+<br>
+// As above, but search for a matching permute.<br>
+static const Permute *matchDoublePermute(const SmallVectorImpl<int> &Bytes,<br>
+ SmallVectorImpl<int> &Transform) {<br>
+ for (auto &P : PermuteForms)<br>
+ if (matchDoublePermute(Bytes, P, Transform))<br>
+ return &P;<br>
+ return nullptr;<br>
+}<br>
+<br>
+// Convert the mask of the given VECTOR_SHUFFLE into a byte-level mask,<br>
+// as if it had type vNi8.<br>
+static void getVPermMask(ShuffleVectorSDNode *VSN,<br>
+ SmallVectorImpl<int> &Bytes) {<br>
+ EVT VT = VSN->getValueType(0);<br>
+ unsigned NumElements = VT.getVectorNumElements();<br>
+ unsigned BytesPerElement = VT.getVectorElementType().getStoreSize();<br>
+ Bytes.resize(NumElements * BytesPerElement, -1);<br>
+ for (unsigned I = 0; I < NumElements; ++I) {<br>
+ int Index = VSN->getMaskElt(I);<br>
+ if (Index >= 0)<br>
+ for (unsigned J = 0; J < BytesPerElement; ++J)<br>
+ Bytes[I * BytesPerElement + J] = Index * BytesPerElement + J;<br>
+ }<br>
+}<br>
+<br>
+// Bytes is a VPERM-like permute vector, except that -1 is used for<br>
+// undefined bytes. See whether bytes [Start, Start + BytesPerElement) of<br>
+// the result come from a contiguous sequence of bytes from one input.<br>
+// Set Base to the selector for the first byte if so.<br>
+static bool getShuffleInput(const SmallVectorImpl<int> &Bytes, unsigned Start,<br>
+ unsigned BytesPerElement, int &Base) {<br>
+ Base = -1;<br>
+ for (unsigned I = 0; I < BytesPerElement; ++I) {<br>
+ if (Bytes[Start + I] >= 0) {<br>
+ unsigned Elem = Bytes[Start + I];<br>
+ if (Base < 0) {<br>
+ Base = Elem - I;<br>
+ // Make sure the bytes would come from one input operand.<br>
+ if (unsigned(Base) % Bytes.size() + BytesPerElement > Bytes.size())<br>
+ return false;<br>
+ } else if (unsigned(Base) != Elem - I)<br>
+ return false;<br>
+ }<br>
+ }<br>
+ return true;<br>
+}<br>
+<br>
+// Bytes is a VPERM-like permute vector, except that -1 is used for<br>
+// undefined bytes. Return true if it can be performed using VSLDI.<br>
+// When returning true, set StartIndex to the shift amount and OpNo0<br>
+// and OpNo1 to the VPERM operands that should be used as the first<br>
+// and second shift operand respectively.<br>
+static bool isShlDoublePermute(const SmallVectorImpl<int> &Bytes,<br>
+ unsigned &StartIndex, unsigned &OpNo0,<br>
+ unsigned &OpNo1) {<br>
+ int OpNos[] = { -1, -1 };<br>
+ int Shift = -1;<br>
+ for (unsigned I = 0; I < 16; ++I) {<br>
+ int Index = Bytes[I];<br>
+ if (Index >= 0) {<br>
+ int ExpectedShift = (Index - I) % SystemZ::VectorBytes;<br>
+ int ModelOpNo = unsigned(ExpectedShift + I) / SystemZ::VectorBytes;<br>
+ int RealOpNo = unsigned(Index) / SystemZ::VectorBytes;<br>
+ if (Shift < 0)<br>
+ Shift = ExpectedShift;<br>
+ else if (Shift != ExpectedShift)<br>
+ return false;<br>
+ // Make sure that the operand mappings are consistent with previous<br>
+ // elements.<br>
+ if (OpNos[ModelOpNo] == 1 - RealOpNo)<br>
+ return false;<br>
+ OpNos[ModelOpNo] = RealOpNo;<br>
+ }<br>
+ }<br>
+ StartIndex = Shift;<br>
+ return chooseShuffleOpNos(OpNos, OpNo0, OpNo1);<br>
+}<br>
+<br>
+// Create a node that performs P on operands Op0 and Op1, casting the<br>
+// operands to the appropriate type. The type of the result is determined by P.<br>
+static SDValue getPermuteNode(SelectionDAG &DAG, SDLoc DL,<br>
+ const Permute &P, SDValue Op0, SDValue Op1) {<br>
+ // VPDI (PERMUTE_DWORDS) always operates on v2i64s. The input<br>
+ // elements of a PACK are twice as wide as the outputs.<br>
+ unsigned InBytes = (P.Opcode == SystemZISD::PERMUTE_DWORDS ? 8 :<br>
+ P.Opcode == SystemZISD::PACK ? P.Operand * 2 :<br>
+ P.Operand);<br>
+ // Cast both operands to the appropriate type.<br>
+ MVT InVT = MVT::getVectorVT(MVT::getIntegerVT(InBytes * 8),<br>
+ SystemZ::VectorBytes / InBytes);<br>
+ Op0 = DAG.getNode(ISD::BITCAST, DL, InVT, Op0);<br>
+ Op1 = DAG.getNode(ISD::BITCAST, DL, InVT, Op1);<br>
+ SDValue Op;<br>
+ if (P.Opcode == SystemZISD::PERMUTE_DWORDS) {<br>
+ SDValue Op2 = DAG.getConstant(P.Operand, DL, MVT::i32);<br>
+ Op = DAG.getNode(SystemZISD::PERMUTE_DWORDS, DL, InVT, Op0, Op1, Op2);<br>
+ } else if (P.Opcode == SystemZISD::PACK) {<br>
+ MVT OutVT = MVT::getVectorVT(MVT::getIntegerVT(P.Operand * 8),<br>
+ SystemZ::VectorBytes / P.Operand);<br>
+ Op = DAG.getNode(SystemZISD::PACK, DL, OutVT, Op0, Op1);<br>
+ } else {<br>
+ Op = DAG.getNode(P.Opcode, DL, InVT, Op0, Op1);<br>
+ }<br>
+ return Op;<br>
+}<br>
+<br>
+// Bytes is a VPERM-like permute vector, except that -1 is used for<br>
+// undefined bytes. Implement it on operands Ops[0] and Ops[1] using<br>
+// VSLDI or VPERM.<br>
+static SDValue getGeneralPermuteNode(SelectionDAG &DAG, SDLoc DL, SDValue *Ops,<br>
+ const SmallVectorImpl<int> &Bytes) {<br>
+ for (unsigned I = 0; I < 2; ++I)<br>
+ Ops[I] = DAG.getNode(ISD::BITCAST, DL, MVT::v16i8, Ops[I]);<br>
+<br>
+ // First see whether VSLDI can be used.<br>
+ unsigned StartIndex, OpNo0, OpNo1;<br>
+ if (isShlDoublePermute(Bytes, StartIndex, OpNo0, OpNo1))<br>
+ return DAG.getNode(SystemZISD::SHL_DOUBLE, DL, MVT::v16i8, Ops[OpNo0],<br>
+ Ops[OpNo1], DAG.getConstant(StartIndex, DL, MVT::i32));<br>
+<br>
+ // Fall back on VPERM. Construct an SDNode for the permute vector.<br>
+ SDValue IndexNodes[SystemZ::VectorBytes];<br>
+ for (unsigned I = 0; I < SystemZ::VectorBytes; ++I)<br>
+ if (Bytes[I] >= 0)<br>
+ IndexNodes[I] = DAG.getConstant(Bytes[I], DL, MVT::i32);<br>
+ else<br>
+ IndexNodes[I] = DAG.getUNDEF(MVT::i32);<br>
+ SDValue Op2 = DAG.getNode(ISD::BUILD_VECTOR, DL, MVT::v16i8, IndexNodes);<br>
+ return DAG.getNode(SystemZISD::PERMUTE, DL, MVT::v16i8, Ops[0], Ops[1], Op2);<br>
+}<br>
+<br>
+namespace {<br>
+// Describes a general N-operand vector shuffle.<br>
+struct GeneralShuffle {<br>
+ GeneralShuffle(EVT vt) : VT(vt) {}<br>
+ void addUndef();<br>
+ void add(SDValue, unsigned);<br>
+ SDValue getNode(SelectionDAG &, SDLoc);<br>
+<br>
+ // The operands of the shuffle.<br>
+ SmallVector<SDValue, SystemZ::VectorBytes> Ops;<br>
+<br>
+ // Index I is -1 if byte I of the result is undefined. Otherwise the<br>
+ // result comes from byte Bytes[I] % SystemZ::VectorBytes of operand<br>
+ // Bytes[I] / SystemZ::VectorBytes.<br>
+ SmallVector<int, SystemZ::VectorBytes> Bytes;<br>
+<br>
+ // The type of the shuffle result.<br>
+ EVT VT;<br>
+};<br>
+}<br>
+<br>
+// Add an extra undefined element to the shuffle.<br>
+void GeneralShuffle::addUndef() {<br>
+ unsigned BytesPerElement = VT.getVectorElementType().getStoreSize();<br>
+ for (unsigned I = 0; I < BytesPerElement; ++I)<br>
+ Bytes.push_back(-1);<br>
+}<br>
+<br>
+// Add an extra element to the shuffle, taking it from element Elem of Op.<br>
+// A null Op indicates a vector input whose value will be calculated later;<br>
+// there is at most one such input per shuffle and it always has the same<br>
+// type as the result.<br>
+void GeneralShuffle::add(SDValue Op, unsigned Elem) {<br>
+ unsigned BytesPerElement = VT.getVectorElementType().getStoreSize();<br>
+<br>
+ // The source vector can have wider elements than the result,<br>
+ // either through an explicit TRUNCATE or because of type legalization.<br>
+ // We want the least significant part.<br>
+ EVT FromVT = Op.getNode() ? Op.getValueType() : VT;<br>
+ unsigned FromBytesPerElement = FromVT.getVectorElementType().getStoreSize();<br>
+ assert(FromBytesPerElement >= BytesPerElement &&<br>
+ "Invalid EXTRACT_VECTOR_ELT");<br>
+ unsigned Byte = ((Elem * FromBytesPerElement) % SystemZ::VectorBytes +<br>
+ (FromBytesPerElement - BytesPerElement));<br>
+<br>
+ // Look through things like shuffles and bitcasts.<br>
+ while (Op.getNode()) {<br>
+ if (Op.getOpcode() == ISD::BITCAST)<br>
+ Op = Op.getOperand(0);<br>
+ else if (Op.getOpcode() == ISD::VECTOR_SHUFFLE && Op.hasOneUse()) {<br>
+ // See whether the bytes we need come from a contiguous part of one<br>
+ // operand.<br>
+ SmallVector<int, SystemZ::VectorBytes> OpBytes;<br>
+ getVPermMask(cast<ShuffleVectorSDNode>(Op), OpBytes);<br>
+ int NewByte;<br>
+ if (!getShuffleInput(OpBytes, Byte, BytesPerElement, NewByte))<br>
+ break;<br>
+ if (NewByte < 0) {<br>
+ addUndef();<br>
+ return;<br>
+ }<br>
+ Op = Op.getOperand(unsigned(NewByte) / SystemZ::VectorBytes);<br>
+ Byte = unsigned(NewByte) % SystemZ::VectorBytes;<br>
+ } else if (Op.getOpcode() == ISD::UNDEF) {<br>
+ addUndef();<br>
+ return;<br>
+ } else<br>
+ break;<br>
+ }<br>
+<br>
+ // Make sure that the source of the extraction is in Ops.<br>
+ unsigned OpNo = 0;<br>
+ for (; OpNo < Ops.size(); ++OpNo)<br>
+ if (Ops[OpNo] == Op)<br>
+ break;<br>
+ if (OpNo == Ops.size())<br>
+ Ops.push_back(Op);<br>
+<br>
+ // Add the element to Bytes.<br>
+ unsigned Base = OpNo * SystemZ::VectorBytes + Byte;<br>
+ for (unsigned I = 0; I < BytesPerElement; ++I)<br>
+ Bytes.push_back(Base + I);<br>
+}<br>
+<br>
+// Return SDNodes for the completed shuffle.<br>
+SDValue GeneralShuffle::getNode(SelectionDAG &DAG, SDLoc DL) {<br>
+ assert(Bytes.size() == SystemZ::VectorBytes && "Incomplete vector");<br>
+<br>
+ if (Ops.size() == 0)<br>
+ return DAG.getUNDEF(VT);<br>
+<br>
+ // Make sure that there are at least two shuffle operands.<br>
+ if (Ops.size() == 1)<br>
+ Ops.push_back(DAG.getUNDEF(MVT::v16i8));<br>
+<br>
+ // Create a tree of shuffles, deferring the root node until after the loop.<br>
+ // Try to redistribute the undefined elements of non-root nodes so that<br>
+ // the non-root shuffles match something like a pack or merge, then adjust<br>
+ // the parent node's permute vector to compensate for the new order.<br>
+ // Among other things, this copes with vectors like <2 x i16> that were<br>
+ // padded with undefined elements during type legalization.<br>
+ //<br>
+ // In the best case this redistribution will lead to the whole tree<br>
+ // using packs and merges. It should rarely be a loss in other cases.<br>
+ unsigned Stride = 1;<br>
+ for (; Stride * 2 < Ops.size(); Stride *= 2) {<br>
+ for (unsigned I = 0; I < Ops.size() - Stride; I += Stride * 2) {<br>
+ SDValue SubOps[] = { Ops[I], Ops[I + Stride] };<br>
+<br>
+ // Create a mask for just these two operands.<br>
+ SmallVector<int, SystemZ::VectorBytes> NewBytes(SystemZ::VectorBytes);<br>
+ for (unsigned J = 0; J < SystemZ::VectorBytes; ++J) {<br>
+ unsigned OpNo = unsigned(Bytes[J]) / SystemZ::VectorBytes;<br>
+ unsigned Byte = unsigned(Bytes[J]) % SystemZ::VectorBytes;<br>
+ if (OpNo == I)<br>
+ NewBytes[J] = Byte;<br>
+ else if (OpNo == I + Stride)<br>
+ NewBytes[J] = SystemZ::VectorBytes + Byte;<br>
+ else<br>
+ NewBytes[J] = -1;<br>
+ }<br>
+ // See if it would be better to reorganize NewBytes to avoid using VPERM.<br>
+ SmallVector<int, SystemZ::VectorBytes> NewBytesMap(SystemZ::VectorBytes);<br>
+ if (const Permute *P = matchDoublePermute(NewBytes, NewBytesMap)) {<br>
+ Ops[I] = getPermuteNode(DAG, DL, *P, SubOps[0], SubOps[1]);<br>
+ // Applying NewBytesMap to Ops[I] gets back to NewBytes.<br>
+ for (unsigned J = 0; J < SystemZ::VectorBytes; ++J) {<br>
+ if (NewBytes[J] >= 0) {<br>
+ assert(unsigned(NewBytesMap[J]) < SystemZ::VectorBytes &&<br>
+ "Invalid double permute");<br>
+ Bytes[J] = I * SystemZ::VectorBytes + NewBytesMap[J];<br>
+ } else<br>
+ assert(NewBytesMap[J] < 0 && "Invalid double permute");<br>
+ }<br>
+ } else {<br>
+ // Just use NewBytes on the operands.<br>
+ Ops[I] = getGeneralPermuteNode(DAG, DL, SubOps, NewBytes);<br>
+ for (unsigned J = 0; J < SystemZ::VectorBytes; ++J)<br>
+ if (NewBytes[J] >= 0)<br>
+ Bytes[J] = I * SystemZ::VectorBytes + J;<br>
+ }<br>
+ }<br>
+ }<br>
+<br>
+ // Now we just have 2 inputs. Put the second operand in Ops[1].<br>
+ if (Stride > 1) {<br>
+ Ops[1] = Ops[Stride];<br>
+ for (unsigned I = 0; I < SystemZ::VectorBytes; ++I)<br>
+ if (Bytes[I] >= int(SystemZ::VectorBytes))<br>
+ Bytes[I] -= (Stride - 1) * SystemZ::VectorBytes;<br>
+ }<br>
+<br>
+ // Look for an instruction that can do the permute without resorting<br>
+ // to VPERM.<br>
+ unsigned OpNo0, OpNo1;<br>
+ SDValue Op;<br>
+ if (const Permute *P = matchPermute(Bytes, OpNo0, OpNo1))<br>
+ Op = getPermuteNode(DAG, DL, *P, Ops[OpNo0], Ops[OpNo1]);<br>
+ else<br>
+ Op = getGeneralPermuteNode(DAG, DL, &Ops[0], Bytes);<br>
+ return DAG.getNode(ISD::BITCAST, DL, VT, Op);<br>
+}<br>
+<br>
+// Extend GPR scalars Op0 and Op1 to doublewords and return a v2i64<br>
+// vector for them.<br>
+static SDValue joinDwords(SelectionDAG &DAG, SDLoc DL, SDValue Op0,<br>
+ SDValue Op1) {<br>
+ if (Op0.getOpcode() == ISD::UNDEF && Op1.getOpcode() == ISD::UNDEF)<br>
+ return DAG.getUNDEF(MVT::v2i64);<br>
+ // If one of the two inputs is undefined then replicate the other one,<br>
+ // in order to avoid using another register unnecessarily.<br>
+ if (Op0.getOpcode() == ISD::UNDEF)<br>
+ Op0 = Op1 = DAG.getNode(ISD::ANY_EXTEND, DL, MVT::i64, Op1);<br>
+ else if (Op1.getOpcode() == ISD::UNDEF)<br>
+ Op0 = Op1 = DAG.getNode(ISD::ANY_EXTEND, DL, MVT::i64, Op0);<br>
+ else {<br>
+ Op0 = DAG.getNode(ISD::ANY_EXTEND, DL, MVT::i64, Op0);<br>
+ Op1 = DAG.getNode(ISD::ANY_EXTEND, DL, MVT::i64, Op1);<br>
+ }<br>
+ return DAG.getNode(SystemZISD::JOIN_DWORDS, DL, MVT::v2i64, Op0, Op1);<br>
+}<br>
+<br>
+// Try to represent constant BUILD_VECTOR node BVN using a<br>
+// SystemZISD::BYTE_MASK-style mask. Store the mask value in Mask<br>
+// on success.<br>
+static bool tryBuildVectorByteMask(BuildVectorSDNode *BVN, uint64_t &Mask) {<br>
+ EVT ElemVT = BVN->getValueType(0).getVectorElementType();<br>
+ unsigned BytesPerElement = ElemVT.getStoreSize();<br>
+ for (unsigned I = 0, E = BVN->getNumOperands(); I != E; ++I) {<br>
+ SDValue Op = BVN->getOperand(I);<br>
+ if (Op.getOpcode() != ISD::UNDEF) {<br>
+ uint64_t Value;<br>
+ if (Op.getOpcode() == ISD::Constant)<br>
+ Value = dyn_cast<ConstantSDNode>(Op)->getZExtValue();<br>
+ else if (Op.getOpcode() == ISD::ConstantFP)<br>
+ Value = (dyn_cast<ConstantFPSDNode>(Op)->getValueAPF().bitcastToAPInt()<br>
+ .getZExtValue());<br>
+ else<br>
+ return false;<br>
+ for (unsigned J = 0; J < BytesPerElement; ++J) {<br>
+ uint64_t Byte = (Value >> (J * 8)) & 0xff;<br>
+ if (Byte == 0xff)<br>
+ Mask |= 1 << ((E - I - 1) * BytesPerElement + J);<br>
+ else if (Byte != 0)<br>
+ return false;<br>
+ }<br>
+ }<br>
+ }<br>
+ return true;<br>
+}<br>
+<br>
+// Try to load a vector constant in which BitsPerElement-bit value Value<br>
+// is replicated to fill the vector. VT is the type of the resulting<br>
+// constant, which may have elements of a different size from BitsPerElement.<br>
+// Return the SDValue of the constant on success, otherwise return<br>
+// an empty value.<br>
+static SDValue tryBuildVectorReplicate(SelectionDAG &DAG,<br>
+ const SystemZInstrInfo *TII,<br>
+ SDLoc DL, EVT VT, uint64_t Value,<br>
+ unsigned BitsPerElement) {<br>
+ // Signed 16-bit values can be replicated using VREPI.<br>
+ int64_t SignedValue = SignExtend64(Value, BitsPerElement);<br>
+ if (isInt<16>(SignedValue)) {<br>
+ MVT VecVT = MVT::getVectorVT(MVT::getIntegerVT(BitsPerElement),<br>
+ SystemZ::VectorBits / BitsPerElement);<br>
+ SDValue Op = DAG.getNode(SystemZISD::REPLICATE, DL, VecVT,<br>
+ DAG.getConstant(SignedValue, DL, MVT::i32));<br>
+ return DAG.getNode(ISD::BITCAST, DL, VT, Op);<br>
+ }<br>
+ // See whether rotating the constant left some N places gives a value that<br>
+ // is one less than a power of 2 (i.e. all zeros followed by all ones).<br>
+ // If so we can use VGM.<br>
+ unsigned Start, End;<br>
+ if (TII->isRxSBGMask(Value, BitsPerElement, Start, End)) {<br>
+ // isRxSBGMask returns the bit numbers for a full 64-bit value,<br>
+ // with 0 denoting 1 << 63 and 63 denoting 1. Convert them to<br>
+ // bit numbers for a BitsPerElement-bit value, so that 0 denotes<br>
+ // 1 << (BitsPerElement-1).<br>
+ Start -= 64 - BitsPerElement;<br>
+ End -= 64 - BitsPerElement;<br>
+ MVT VecVT = MVT::getVectorVT(MVT::getIntegerVT(BitsPerElement),<br>
+ SystemZ::VectorBits / BitsPerElement);<br>
+ SDValue Op = DAG.getNode(SystemZISD::ROTATE_MASK, DL, VecVT,<br>
+ DAG.getConstant(Start, DL, MVT::i32),<br>
+ DAG.getConstant(End, DL, MVT::i32));<br>
+ return DAG.getNode(ISD::BITCAST, DL, VT, Op);<br>
+ }<br>
+ return SDValue();<br>
+}<br>
+<br>
+// If a BUILD_VECTOR contains some EXTRACT_VECTOR_ELTs, it's usually<br>
+// better to use VECTOR_SHUFFLEs on them, only using BUILD_VECTOR for<br>
+// the non-EXTRACT_VECTOR_ELT elements. See if the given BUILD_VECTOR<br>
+// would benefit from this representation and return it if so.<br>
+static SDValue tryBuildVectorShuffle(SelectionDAG &DAG,<br>
+ BuildVectorSDNode *BVN) {<br>
+ EVT VT = BVN->getValueType(0);<br>
+ unsigned NumElements = VT.getVectorNumElements();<br>
+<br>
+ // Represent the BUILD_VECTOR as an N-operand VECTOR_SHUFFLE-like operation<br>
+ // on byte vectors. If there are non-EXTRACT_VECTOR_ELT elements that still<br>
+ // need a BUILD_VECTOR, add an additional placeholder operand for that<br>
+ // BUILD_VECTOR and store its operands in ResidueOps.<br>
+ GeneralShuffle GS(VT);<br>
+ SmallVector<SDValue, SystemZ::VectorBytes> ResidueOps;<br>
+ bool FoundOne = false;<br>
+ for (unsigned I = 0; I < NumElements; ++I) {<br>
+ SDValue Op = BVN->getOperand(I);<br>
+ if (Op.getOpcode() == ISD::TRUNCATE)<br>
+ Op = Op.getOperand(0);<br>
+ if (Op.getOpcode() == ISD::EXTRACT_VECTOR_ELT &&<br>
+ Op.getOperand(1).getOpcode() == ISD::Constant) {<br>
+ unsigned Elem = cast<ConstantSDNode>(Op.getOperand(1))->getZExtValue();<br>
+ GS.add(Op.getOperand(0), Elem);<br>
+ FoundOne = true;<br>
+ } else if (Op.getOpcode() == ISD::UNDEF) {<br>
+ GS.addUndef();<br>
+ } else {<br>
+ GS.add(SDValue(), ResidueOps.size());<br>
+ ResidueOps.push_back(Op);<br>
+ }<br>
+ }<br>
+<br>
+ // Nothing to do if there are no EXTRACT_VECTOR_ELTs.<br>
+ if (!FoundOne)<br>
+ return SDValue();<br>
+<br>
+ // Create the BUILD_VECTOR for the remaining elements, if any.<br>
+ if (!ResidueOps.empty()) {<br>
+ while (ResidueOps.size() < NumElements)<br>
+ ResidueOps.push_back(DAG.getUNDEF(VT.getVectorElementType()));<br>
+ for (auto &Op : GS.Ops) {<br>
+ if (!Op.getNode()) {<br>
+ Op = DAG.getNode(ISD::BUILD_VECTOR, SDLoc(BVN), VT, ResidueOps);<br>
+ break;<br>
+ }<br>
+ }<br>
+ }<br>
+ return GS.getNode(DAG, SDLoc(BVN));<br>
+}<br>
+<br>
+// Combine GPR scalar values Elems into a vector of type VT.<br>
+static SDValue buildVector(SelectionDAG &DAG, SDLoc DL, EVT VT,<br>
+ SmallVectorImpl<SDValue> &Elems) {<br>
+ // See whether there is a single replicated value.<br>
+ SDValue Single;<br>
+ unsigned int NumElements = Elems.size();<br>
+ unsigned int Count = 0;<br>
+ for (auto Elem : Elems) {<br>
+ if (Elem.getOpcode() != ISD::UNDEF) {<br>
+ if (!Single.getNode())<br>
+ Single = Elem;<br>
+ else if (Elem != Single) {<br>
+ Single = SDValue();<br>
+ break;<br>
+ }<br>
+ Count += 1;<br>
+ }<br>
+ }<br>
+ // There are three cases here:<br>
+ //<br>
+ // - if the only defined element is a loaded one, the best sequence<br>
+ // is a replicating load.<br>
+ //<br>
+ // - otherwise, if the only defined element is an i64 value, we will<br>
+ // end up with the same VLVGP sequence regardless of whether we short-cut<br>
+ // for replication or fall through to the later code.<br>
+ //<br>
+ // - otherwise, if the only defined element is an i32 or smaller value,<br>
+ // we would need 2 instructions to replicate it: VLVGP followed by VREPx.<br>
+ // This is only a win if the single defined element is used more than once.<br>
+ // In other cases we're better off using a single VLVGx.<br>
+ if (Single.getNode() && (Count > 1 || Single.getOpcode() == ISD::LOAD))<br>
+ return DAG.getNode(SystemZISD::REPLICATE, DL, VT, Single);<br>
+<br>
+ // The best way of building a v2i64 from two i64s is to use VLVGP.<br>
+ if (VT == MVT::v2i64)<br>
+ return joinDwords(DAG, DL, Elems[0], Elems[1]);<br>
+<br>
+ // Collect the constant terms.<br>
+ SmallVector<SDValue, SystemZ::VectorBytes> Constants(NumElements, SDValue());<br>
+ SmallVector<bool, SystemZ::VectorBytes> Done(NumElements, false);<br>
+<br>
+ unsigned NumConstants = 0;<br>
+ for (unsigned I = 0; I < NumElements; ++I) {<br>
+ SDValue Elem = Elems[I];<br>
+ if (Elem.getOpcode() == ISD::Constant ||<br>
+ Elem.getOpcode() == ISD::ConstantFP) {<br>
+ NumConstants += 1;<br>
+ Constants[I] = Elem;<br>
+ Done[I] = true;<br>
+ }<br>
+ }<br>
+ // If there was at least one constant, fill in the other elements of<br>
+ // Constants with undefs to get a full vector constant and use that<br>
+ // as the starting point.<br>
+ SDValue Result;<br>
+ if (NumConstants > 0) {<br>
+ for (unsigned I = 0; I < NumElements; ++I)<br>
+ if (!Constants[I].getNode())<br>
+ Constants[I] = DAG.getUNDEF(Elems[I].getValueType());<br>
+ Result = DAG.getNode(ISD::BUILD_VECTOR, DL, VT, Constants);<br>
+ } else {<br>
+ // Otherwise try to use VLVGP to start the sequence in order to<br>
+ // avoid a false dependency on any previous contents of the vector<br>
+ // register. This only makes sense if one of the associated elements<br>
+ // is defined.<br>
+ unsigned I1 = NumElements / 2 - 1;<br>
+ unsigned I2 = NumElements - 1;<br>
+ bool Def1 = (Elems[I1].getOpcode() != ISD::UNDEF);<br>
+ bool Def2 = (Elems[I2].getOpcode() != ISD::UNDEF);<br>
+ if (Def1 || Def2) {<br>
+ SDValue Elem1 = Elems[Def1 ? I1 : I2];<br>
+ SDValue Elem2 = Elems[Def2 ? I2 : I1];<br>
+ Result = DAG.getNode(ISD::BITCAST, DL, VT,<br>
+ joinDwords(DAG, DL, Elem1, Elem2));<br>
+ Done[I1] = true;<br>
+ Done[I2] = true;<br>
+ } else<br>
+ Result = DAG.getUNDEF(VT);<br>
+ }<br>
+<br>
+ // Use VLVGx to insert the other elements.<br>
+ for (unsigned I = 0; I < NumElements; ++I)<br>
+ if (!Done[I] && Elems[I].getOpcode() != ISD::UNDEF)<br>
+ Result = DAG.getNode(ISD::INSERT_VECTOR_ELT, DL, VT, Result, Elems[I],<br>
+ DAG.getConstant(I, DL, MVT::i32));<br>
+ return Result;<br>
+}<br>
+<br>
+SDValue SystemZTargetLowering::lowerBUILD_VECTOR(SDValue Op,<br>
+ SelectionDAG &DAG) const {<br>
+ const SystemZInstrInfo *TII =<br>
+ static_cast<const SystemZInstrInfo *>(Subtarget.getInstrInfo());<br>
+ auto *BVN = cast<BuildVectorSDNode>(Op.getNode());<br>
+ SDLoc DL(Op);<br>
+ EVT VT = Op.getValueType();<br>
+<br>
+ if (BVN->isConstant()) {<br>
+ // Try using VECTOR GENERATE BYTE MASK. This is the architecturally-<br>
+ // preferred way of creating all-zero and all-one vectors so give it<br>
+ // priority over other methods below.<br>
+ uint64_t Mask = 0;<br>
+ if (tryBuildVectorByteMask(BVN, Mask)) {<br>
+ SDValue Op = DAG.getNode(SystemZISD::BYTE_MASK, DL, MVT::v16i8,<br>
+ DAG.getConstant(Mask, DL, MVT::i32));<br>
+ return DAG.getNode(ISD::BITCAST, DL, VT, Op);<br>
+ }<br>
+<br>
+ // Try using some form of replication.<br>
+ APInt SplatBits, SplatUndef;<br>
+ unsigned SplatBitSize;<br>
+ bool HasAnyUndefs;<br>
+ if (BVN->isConstantSplat(SplatBits, SplatUndef, SplatBitSize, HasAnyUndefs,<br>
+ 8, true) &&<br>
+ SplatBitSize <= 64) {<br>
+ // First try assuming that any undefined bits above the highest set bit<br>
+ // and below the lowest set bit are 1s. This increases the likelihood of<br>
+ // being able to use a sign-extended element value in VECTOR REPLICATE<br>
+ // IMMEDIATE or a wraparound mask in VECTOR GENERATE MASK.<br>
+ uint64_t SplatBitsZ = SplatBits.getZExtValue();<br>
+ uint64_t SplatUndefZ = SplatUndef.getZExtValue();<br>
+ uint64_t Lower = (SplatUndefZ<br>
+ & ((uint64_t(1) << findFirstSet(SplatBitsZ)) - 1));<br>
+ uint64_t Upper = (SplatUndefZ<br>
+ & ~((uint64_t(1) << findLastSet(SplatBitsZ)) - 1));<br>
+ uint64_t Value = SplatBitsZ | Upper | Lower;<br>
+ SDValue Op = tryBuildVectorReplicate(DAG, TII, DL, VT, Value,<br>
+ SplatBitSize);<br>
+ if (Op.getNode())<br>
+ return Op;<br>
+<br>
+ // Now try assuming that any undefined bits between the first and<br>
+ // last defined set bits are set. This increases the chances of<br>
+ // using a non-wraparound mask.<br>
+ uint64_t Middle = SplatUndefZ & ~Upper & ~Lower;<br>
+ Value = SplatBitsZ | Middle;<br>
+ Op = tryBuildVectorReplicate(DAG, TII, DL, VT, Value, SplatBitSize);<br>
+ if (Op.getNode())<br>
+ return Op;<br>
+ }<br>
+<br>
+ // Fall back to loading it from memory.<br>
+ return SDValue();<br>
+ }<br>
+<br>
+ // See if we should use shuffles to construct the vector from other vectors.<br>
+ SDValue Res = tryBuildVectorShuffle(DAG, BVN);<br>
+ if (Res.getNode())<br>
+ return Res;<br>
+<br>
+ // Otherwise use buildVector to build the vector up from GPRs.<br>
+ unsigned NumElements = Op.getNumOperands();<br>
+ SmallVector<SDValue, SystemZ::VectorBytes> Ops(NumElements);<br>
+ for (unsigned I = 0; I < NumElements; ++I)<br>
+ Ops[I] = Op.getOperand(I);<br>
+ return buildVector(DAG, DL, VT, Ops);<br>
+}<br>
+<br>
+SDValue SystemZTargetLowering::lowerVECTOR_SHUFFLE(SDValue Op,<br>
+ SelectionDAG &DAG) const {<br>
+ auto *VSN = cast<ShuffleVectorSDNode>(Op.getNode());<br>
+ SDLoc DL(Op);<br>
+ EVT VT = Op.getValueType();<br>
+ unsigned NumElements = VT.getVectorNumElements();<br>
+<br>
+ if (VSN->isSplat()) {<br>
+ SDValue Op0 = Op.getOperand(0);<br>
+ unsigned Index = VSN->getSplatIndex();<br>
+ assert(Index < VT.getVectorNumElements() &&<br>
+ "Splat index should be defined and in first operand");<br>
+ // See whether the value we're splatting is directly available as a scalar.<br>
+ if ((Index == 0 && Op0.getOpcode() == ISD::SCALAR_TO_VECTOR) ||<br>
+ Op0.getOpcode() == ISD::BUILD_VECTOR)<br>
+ return DAG.getNode(SystemZISD::REPLICATE, DL, VT, Op0.getOperand(Index));<br>
+ // Otherwise keep it as a vector-to-vector operation.<br>
+ return DAG.getNode(SystemZISD::SPLAT, DL, VT, Op.getOperand(0),<br>
+ DAG.getConstant(Index, DL, MVT::i32));<br>
+ }<br>
+<br>
+ GeneralShuffle GS(VT);<br>
+ for (unsigned I = 0; I < NumElements; ++I) {<br>
+ int Elt = VSN->getMaskElt(I);<br>
+ if (Elt < 0)<br>
+ GS.addUndef();<br>
+ else<br>
+ GS.add(Op.getOperand(unsigned(Elt) / NumElements),<br>
+ unsigned(Elt) % NumElements);<br>
+ }<br>
+ return GS.getNode(DAG, SDLoc(VSN));<br>
+}<br>
+<br>
+SDValue SystemZTargetLowering::lowerSCALAR_TO_VECTOR(SDValue Op,<br>
+ SelectionDAG &DAG) const {<br>
+ SDLoc DL(Op);<br>
+ // Just insert the scalar into element 0 of an undefined vector.<br>
+ return DAG.getNode(ISD::INSERT_VECTOR_ELT, DL,<br>
+ Op.getValueType(), DAG.getUNDEF(Op.getValueType()),<br>
+ Op.getOperand(0), DAG.getConstant(0, DL, MVT::i32));<br>
+}<br>
+<br>
+SDValue SystemZTargetLowering::lowerShift(SDValue Op, SelectionDAG &DAG,<br>
+ unsigned ByScalar) const {<br>
+ // Look for cases where a vector shift can use the *_BY_SCALAR form.<br>
+ SDValue Op0 = Op.getOperand(0);<br>
+ SDValue Op1 = Op.getOperand(1);<br>
+ SDLoc DL(Op);<br>
+ EVT VT = Op.getValueType();<br>
+ unsigned ElemBitSize = VT.getVectorElementType().getSizeInBits();<br>
+<br>
+ // See whether the shift vector is a splat represented as BUILD_VECTOR.<br>
+ if (auto *BVN = dyn_cast<BuildVectorSDNode>(Op1)) {<br>
+ APInt SplatBits, SplatUndef;<br>
+ unsigned SplatBitSize;<br>
+ bool HasAnyUndefs;<br>
+ // Check for constant splats. Use ElemBitSize as the minimum element<br>
+ // width and reject splats that need wider elements.<br>
+ if (BVN->isConstantSplat(SplatBits, SplatUndef, SplatBitSize, HasAnyUndefs,<br>
+ ElemBitSize, true) &&<br>
+ SplatBitSize == ElemBitSize) {<br>
+ SDValue Shift = DAG.getConstant(SplatBits.getZExtValue() & 0xfff,<br>
+ DL, MVT::i32);<br>
+ return DAG.getNode(ByScalar, DL, VT, Op0, Shift);<br>
+ }<br>
+ // Check for variable splats.<br>
+ BitVector UndefElements;<br>
+ SDValue Splat = BVN->getSplatValue(&UndefElements);<br>
+ if (Splat) {<br>
+ // Since i32 is the smallest legal type, we either need a no-op<br>
+ // or a truncation.<br>
+ SDValue Shift = DAG.getNode(ISD::TRUNCATE, DL, MVT::i32, Splat);<br>
+ return DAG.getNode(ByScalar, DL, VT, Op0, Shift);<br>
+ }<br>
+ }<br>
+<br>
+ // See whether the shift vector is a splat represented as SHUFFLE_VECTOR,<br>
+ // and the shift amount is directly available in a GPR.<br>
+ if (auto *VSN = dyn_cast<ShuffleVectorSDNode>(Op1)) {<br>
+ if (VSN->isSplat()) {<br>
+ SDValue VSNOp0 = VSN->getOperand(0);<br>
+ unsigned Index = VSN->getSplatIndex();<br>
+ assert(Index < VT.getVectorNumElements() &&<br>
+ "Splat index should be defined and in first operand");<br>
+ if ((Index == 0 && VSNOp0.getOpcode() == ISD::SCALAR_TO_VECTOR) ||<br>
+ VSNOp0.getOpcode() == ISD::BUILD_VECTOR) {<br>
+ // Since i32 is the smallest legal type, we either need a no-op<br>
+ // or a truncation.<br>
+ SDValue Shift = DAG.getNode(ISD::TRUNCATE, DL, MVT::i32,<br>
+ VSNOp0.getOperand(Index));<br>
+ return DAG.getNode(ByScalar, DL, VT, Op0, Shift);<br>
+ }<br>
+ }<br>
+ }<br>
+<br>
+ // Otherwise just treat the current form as legal.<br>
+ return Op;<br>
+}<br>
+<br>
SDValue SystemZTargetLowering::LowerOperation(SDValue Op,<br>
SelectionDAG &DAG) const {<br>
switch (Op.getOpcode()) {<br>
@@ -2737,6 +3760,12 @@ SDValue SystemZTargetLowering::LowerOper<br>
return lowerOR(Op, DAG);<br>
case ISD::CTPOP:<br>
return lowerCTPOP(Op, DAG);<br>
+ case ISD::CTLZ_ZERO_UNDEF:<br>
+ return DAG.getNode(ISD::CTLZ, SDLoc(Op),<br>
+ Op.getValueType(), Op.getOperand(0));<br>
+ case ISD::CTTZ_ZERO_UNDEF:<br>
+ return DAG.getNode(ISD::CTTZ, SDLoc(Op),<br>
+ Op.getValueType(), Op.getOperand(0));<br>
case ISD::ATOMIC_SWAP:<br>
return lowerATOMIC_LOAD_OP(Op, DAG, SystemZISD::ATOMIC_SWAPW);<br>
case ISD::ATOMIC_STORE:<br>
@@ -2773,6 +3802,18 @@ SDValue SystemZTargetLowering::LowerOper<br>
return lowerPREFETCH(Op, DAG);<br>
case ISD::INTRINSIC_W_CHAIN:<br>
return lowerINTRINSIC_W_CHAIN(Op, DAG);<br>
+ case ISD::BUILD_VECTOR:<br>
+ return lowerBUILD_VECTOR(Op, DAG);<br>
+ case ISD::VECTOR_SHUFFLE:<br>
+ return lowerVECTOR_SHUFFLE(Op, DAG);<br>
+ case ISD::SCALAR_TO_VECTOR:<br>
+ return lowerSCALAR_TO_VECTOR(Op, DAG);<br>
+ case ISD::SHL:<br>
+ return lowerShift(Op, DAG, SystemZISD::VSHL_BY_SCALAR);<br>
+ case ISD::SRL:<br>
+ return lowerShift(Op, DAG, SystemZISD::VSRL_BY_SCALAR);<br>
+ case ISD::SRA:<br>
+ return lowerShift(Op, DAG, SystemZISD::VSRA_BY_SCALAR);<br>
default:<br>
llvm_unreachable("Unexpected node to lower");<br>
}<br>
@@ -2820,6 +3861,24 @@ const char *SystemZTargetLowering::getTa<br>
OPCODE(TBEGIN);<br>
OPCODE(TBEGIN_NOFLOAT);<br>
OPCODE(TEND);<br>
+ OPCODE(BYTE_MASK);<br>
+ OPCODE(ROTATE_MASK);<br>
+ OPCODE(REPLICATE);<br>
+ OPCODE(JOIN_DWORDS);<br>
+ OPCODE(SPLAT);<br>
+ OPCODE(MERGE_HIGH);<br>
+ OPCODE(MERGE_LOW);<br>
+ OPCODE(SHL_DOUBLE);<br>
+ OPCODE(PERMUTE_DWORDS);<br>
+ OPCODE(PERMUTE);<br>
+ OPCODE(PACK);<br>
+ OPCODE(VSHL_BY_SCALAR);<br>
+ OPCODE(VSRL_BY_SCALAR);<br>
+ OPCODE(VSRA_BY_SCALAR);<br>
+ OPCODE(VSUM);<br>
+ OPCODE(VICMPE);<br>
+ OPCODE(VICMPH);<br>
+ OPCODE(VICMPHL);<br>
OPCODE(ATOMIC_SWAPW);<br>
OPCODE(ATOMIC_LOADW_ADD);<br>
OPCODE(ATOMIC_LOADW_SUB);<br>
@@ -2838,6 +3897,157 @@ const char *SystemZTargetLowering::getTa<br>
#undef OPCODE<br>
}<br>
<br>
+// Return true if VT is a vector whose elements are a whole number of bytes<br>
+// in width.<br>
+static bool canTreatAsByteVector(EVT VT) {<br>
+ return VT.isVector() && VT.getVectorElementType().getSizeInBits() % 8 == 0;<br>
+}<br>
+<br>
+// Try to simplify an EXTRACT_VECTOR_ELT from a vector of type VecVT<br>
+// producing a result of type ResVT. Op is a possibly bitcast version<br>
+// of the input vector and Index is the index (based on type VecVT) that<br>
+// should be extracted. Return the new extraction if a simplification<br>
+// was possible or if Force is true.<br>
+SDValue SystemZTargetLowering::combineExtract(SDLoc DL, EVT ResVT, EVT VecVT,<br>
+ SDValue Op, unsigned Index,<br>
+ DAGCombinerInfo &DCI,<br>
+ bool Force) const {<br>
+ SelectionDAG &DAG = DCI.DAG;<br>
+<br>
+ // The number of bytes being extracted.<br>
+ unsigned BytesPerElement = VecVT.getVectorElementType().getStoreSize();<br>
+<br>
+ for (;;) {<br>
+ unsigned Opcode = Op.getOpcode();<br>
+ if (Opcode == ISD::BITCAST)<br>
+ // Look through bitcasts.<br>
+ Op = Op.getOperand(0);<br>
+ else if (Opcode == ISD::VECTOR_SHUFFLE &&<br>
+ canTreatAsByteVector(Op.getValueType())) {<br>
+ // Get a VPERM-like permute mask and see whether the bytes covered<br>
+ // by the extracted element are a contiguous sequence from one<br>
+ // source operand.<br>
+ SmallVector<int, SystemZ::VectorBytes> Bytes;<br>
+ getVPermMask(cast<ShuffleVectorSDNode>(Op), Bytes);<br>
+ int First;<br>
+ if (!getShuffleInput(Bytes, Index * BytesPerElement,<br>
+ BytesPerElement, First))<br>
+ break;<br>
+ if (First < 0)<br>
+ return DAG.getUNDEF(ResVT);<br>
+ // Make sure the contiguous sequence starts at a multiple of the<br>
+ // original element size.<br>
+ unsigned Byte = unsigned(First) % Bytes.size();<br>
+ if (Byte % BytesPerElement != 0)<br>
+ break;<br>
+ // We can get the extracted value directly from an input.<br>
+ Index = Byte / BytesPerElement;<br>
+ Op = Op.getOperand(unsigned(First) / Bytes.size());<br>
+ Force = true;<br>
+ } else if (Opcode == ISD::BUILD_VECTOR &&<br>
+ canTreatAsByteVector(Op.getValueType())) {<br>
+ // We can only optimize this case if the BUILD_VECTOR elements are<br>
+ // at least as wide as the extracted value.<br>
+ EVT OpVT = Op.getValueType();<br>
+ unsigned OpBytesPerElement = OpVT.getVectorElementType().getStoreSize();<br>
+ if (OpBytesPerElement < BytesPerElement)<br>
+ break;<br>
+ // Make sure that the least-significant bit of the extracted value<br>
+ // is the least-significant bit of an input.<br>
+ unsigned End = (Index + 1) * BytesPerElement;<br>
+ if (End % OpBytesPerElement != 0)<br>
+ break;<br>
+ // We're extracting the low part of one operand of the BUILD_VECTOR.<br>
+ Op = Op.getOperand(End / OpBytesPerElement - 1);<br>
+ if (!Op.getValueType().isInteger()) {<br>
+ EVT VT = MVT::getIntegerVT(Op.getValueType().getSizeInBits());<br>
+ Op = DAG.getNode(ISD::BITCAST, DL, VT, Op);<br>
+ DCI.AddToWorklist(Op.getNode());<br>
+ }<br>
+ EVT VT = MVT::getIntegerVT(ResVT.getSizeInBits());<br>
+ Op = DAG.getNode(ISD::TRUNCATE, DL, VT, Op);<br>
+ if (VT != ResVT) {<br>
+ DCI.AddToWorklist(Op.getNode());<br>
+ Op = DAG.getNode(ISD::BITCAST, DL, ResVT, Op);<br>
+ }<br>
+ return Op;<br>
+ } else if ((Opcode == ISD::SIGN_EXTEND_VECTOR_INREG ||<br>
+ Opcode == ISD::ZERO_EXTEND_VECTOR_INREG ||<br>
+ Opcode == ISD::ANY_EXTEND_VECTOR_INREG) &&<br>
+ canTreatAsByteVector(Op.getValueType()) &&<br>
+ canTreatAsByteVector(Op.getOperand(0).getValueType())) {<br>
+ // Make sure that only the unextended bits are significant.<br>
+ EVT ExtVT = Op.getValueType();<br>
+ EVT OpVT = Op.getOperand(0).getValueType();<br>
+ unsigned ExtBytesPerElement = ExtVT.getVectorElementType().getStoreSize();<br>
+ unsigned OpBytesPerElement = OpVT.getVectorElementType().getStoreSize();<br>
+ unsigned Byte = Index * BytesPerElement;<br>
+ unsigned SubByte = Byte % ExtBytesPerElement;<br>
+ unsigned MinSubByte = ExtBytesPerElement - OpBytesPerElement;<br>
+ if (SubByte < MinSubByte ||<br>
+ SubByte + BytesPerElement > ExtBytesPerElement)<br>
+ break;<br>
+ // Get the byte offset of the unextended element<br>
+ Byte = Byte / ExtBytesPerElement * OpBytesPerElement;<br>
+ // ...then add the byte offset relative to that element.<br>
+ Byte += SubByte - MinSubByte;<br>
+ if (Byte % BytesPerElement != 0)<br>
+ break;<br>
+ Op = Op.getOperand(0);<br>
+ Index = Byte / BytesPerElement;<br>
+ Force = true;<br>
+ } else<br>
+ break;<br>
+ }<br>
+ if (Force) {<br>
+ if (Op.getValueType() != VecVT) {<br>
+ Op = DAG.getNode(ISD::BITCAST, DL, VecVT, Op);<br>
+ DCI.AddToWorklist(Op.getNode());<br>
+ }<br>
+ return DAG.getNode(ISD::EXTRACT_VECTOR_ELT, DL, ResVT, Op,<br>
+ DAG.getConstant(Index, DL, MVT::i32));<br>
+ }<br>
+ return SDValue();<br>
+}<br>
+<br>
+// Optimize vector operations in scalar value Op on the basis that Op<br>
+// is truncated to TruncVT.<br>
+SDValue<br>
+SystemZTargetLowering::combineTruncateExtract(SDLoc DL, EVT TruncVT, SDValue Op,<br>
+ DAGCombinerInfo &DCI) const {<br>
+ // If we have (trunc (extract_vector_elt X, Y)), try to turn it into<br>
+ // (extract_vector_elt (bitcast X), Y'), where (bitcast X) has elements<br>
+ // of type TruncVT.<br>
+ if (Op.getOpcode() == ISD::EXTRACT_VECTOR_ELT &&<br>
+ TruncVT.getSizeInBits() % 8 == 0) {<br>
+ SDValue Vec = Op.getOperand(0);<br>
+ EVT VecVT = Vec.getValueType();<br>
+ if (canTreatAsByteVector(VecVT)) {<br>
+ if (auto *IndexN = dyn_cast<ConstantSDNode>(Op.getOperand(1))) {<br>
+ unsigned BytesPerElement = VecVT.getVectorElementType().getStoreSize();<br>
+ unsigned TruncBytes = TruncVT.getStoreSize();<br>
+ if (BytesPerElement % TruncBytes == 0) {<br>
+ // Calculate the value of Y' in the above description. We are<br>
+ // splitting the original elements into Scale equal-sized pieces<br>
+ // and for truncation purposes want the last (least-significant)<br>
+ // of these pieces for IndexN. This is easiest to do by calculating<br>
+ // the start index of the following element and then subtracting 1.<br>
+ unsigned Scale = BytesPerElement / TruncBytes;<br>
+ unsigned NewIndex = (IndexN->getZExtValue() + 1) * Scale - 1;<br>
+<br>
+ // Defer the creation of the bitcast from X to combineExtract,<br>
+ // which might be able to optimize the extraction.<br>
+ VecVT = MVT::getVectorVT(MVT::getIntegerVT(TruncBytes * 8),<br>
+ VecVT.getStoreSize() / TruncBytes);<br>
+ EVT ResVT = (TruncBytes < 4 ? MVT::i32 : TruncVT);<br>
+ return combineExtract(DL, ResVT, VecVT, Vec, NewIndex, DCI, true);<br>
+ }<br>
+ }<br>
+ }<br>
+ }<br>
+ return SDValue();<br>
+}<br>
+<br>
SDValue SystemZTargetLowering::PerformDAGCombine(SDNode *N,<br>
DAGCombinerInfo &DCI) const {<br>
SelectionDAG &DAG = DCI.DAG;<br>
@@ -2869,6 +4079,40 @@ SDValue SystemZTargetLowering::PerformDA<br>
}<br>
}<br>
}<br>
+ // If we have (truncstoreiN (extract_vector_elt X, Y), Z) then it is better<br>
+ // for the extraction to be done on a vMiN value, so that we can use VSTE.<br>
+ // If X has wider elements then convert it to:<br>
+ // (truncstoreiN (extract_vector_elt (bitcast X), Y2), Z).<br>
+ if (Opcode == ISD::STORE) {<br>
+ auto *SN = cast<StoreSDNode>(N);<br>
+ EVT MemVT = SN->getMemoryVT();<br>
+ if (MemVT.isInteger()) {<br>
+ SDValue Value = combineTruncateExtract(SDLoc(N), MemVT,<br>
+ SN->getValue(), DCI);<br>
+ if (Value.getNode()) {<br>
+ DCI.AddToWorklist(Value.getNode());<br>
+<br>
+ // Rewrite the store with the new form of stored value.<br>
+ return DAG.getTruncStore(SN->getChain(), SDLoc(SN), Value,<br>
+ SN->getBasePtr(), SN->getMemoryVT(),<br>
+ SN->getMemOperand());<br>
+ }<br>
+ }<br>
+ }<br>
+ // Try to simplify a vector extraction.<br>
+ if (Opcode == ISD::EXTRACT_VECTOR_ELT) {<br>
+ if (auto *IndexN = dyn_cast<ConstantSDNode>(N->getOperand(1))) {<br>
+ SDValue Op0 = N->getOperand(0);<br>
+ EVT VecVT = Op0.getValueType();<br>
+ return combineExtract(SDLoc(N), N->getValueType(0), VecVT, Op0,<br>
+ IndexN->getZExtValue(), DCI, false);<br>
+ }<br>
+ }<br>
+ // (join_dwords X, X) == (replicate X)<br>
+ if (Opcode == SystemZISD::JOIN_DWORDS &&<br>
+ N->getOperand(0) == N->getOperand(1))<br>
+ return DAG.getNode(SystemZISD::REPLICATE, SDLoc(N), N->getValueType(0),<br>
+ N->getOperand(0));<br>
return SDValue();<br>
}<br>
<br>
@@ -3681,11 +4925,18 @@ SystemZTargetLowering::emitTransactionBe<br>
}<br>
}<br>
<br>
- // Add FPR clobbers.<br>
+ // Add FPR/VR clobbers.<br>
if (!NoFloat && (Control & 4) != 0) {<br>
- for (int I = 0; I < 16; I++) {<br>
- unsigned Reg = SystemZMC::FP64Regs[I];<br>
- MI->addOperand(MachineOperand::CreateReg(Reg, true, true));<br>
+ if (Subtarget.hasVector()) {<br>
+ for (int I = 0; I < 32; I++) {<br>
+ unsigned Reg = SystemZMC::VR128Regs[I];<br>
+ MI->addOperand(MachineOperand::CreateReg(Reg, true, true));<br>
+ }<br>
+ } else {<br>
+ for (int I = 0; I < 16; I++) {<br>
+ unsigned Reg = SystemZMC::FP64Regs[I];<br>
+ MI->addOperand(MachineOperand::CreateReg(Reg, true, true));<br>
+ }<br>
}<br>
}<br>
<br>
<br>
Modified: llvm/trunk/lib/Target/SystemZ/SystemZISelLowering.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZISelLowering.h?rev=236521&r1=236520&r2=236521&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZISelLowering.h?rev=236521&r1=236520&r2=236521&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/SystemZ/SystemZISelLowering.h (original)<br>
+++ llvm/trunk/lib/Target/SystemZ/SystemZISelLowering.h Tue May 5 14:25:42 2015<br>
@@ -155,6 +155,70 @@ enum {<br>
// Transaction end. Just the chain operand. Returns chain and glue.<br>
TEND,<br>
<br>
+ // Create a vector constant by filling byte N of the result with bit<br>
+ // 15-N of the single operand.<br>
+ BYTE_MASK,<br>
+<br>
+ // Create a vector constant by replicating an element-sized RISBG-style mask.<br>
+ // The first operand specifies the starting set bit and the second operand<br>
+ // specifies the ending set bit. Both operands count from the MSB of the<br>
+ // element.<br>
+ ROTATE_MASK,<br>
+<br>
+ // Replicate a GPR scalar value into all elements of a vector.<br>
+ REPLICATE,<br>
+<br>
+ // Create a vector from two i64 GPRs.<br>
+ JOIN_DWORDS,<br>
+<br>
+ // Replicate one element of a vector into all elements. The first operand<br>
+ // is the vector and the second is the index of the element to replicate.<br>
+ SPLAT,<br>
+<br>
+ // Interleave elements from the high half of operand 0 and the high half<br>
+ // of operand 1.<br>
+ MERGE_HIGH,<br>
+<br>
+ // Likewise for the low halves.<br>
+ MERGE_LOW,<br>
+<br>
+ // Concatenate the vectors in the first two operands, shift them left<br>
+ // by the third operand, and take the first half of the result.<br>
+ SHL_DOUBLE,<br>
+<br>
+ // Take one element of the first v2i64 operand and one element of<br>
+ // the second v2i64 operand and concatenate them to form a v2i64 result.<br>
+ // The third operand is a 4-bit value of the form 0A0B, where A and B<br>
+ // are the element selectors for the first and second operands<br>
+ // respectively.<br>
+ PERMUTE_DWORDS,<br>
+<br>
+ // Perform a general vector permute on vector operands 0 and 1.<br>
+ // Each byte of operand 2 controls the corresponding byte of the result,<br>
+ // in the same way as a byte-level VECTOR_SHUFFLE mask.<br>
+ PERMUTE,<br>
+<br>
+ // Pack vector operands 0 and 1 into a single vector with half-sized elements.<br>
+ PACK,<br>
+<br>
+ // Shift each element of vector operand 0 by the number of bits specified<br>
+ // by scalar operand 1.<br>
+ VSHL_BY_SCALAR,<br>
+ VSRL_BY_SCALAR,<br>
+ VSRA_BY_SCALAR,<br>
+<br>
+ // For each element of the output type, sum across all sub-elements of<br>
+ // operand 0 belonging to the corresponding element, and add in the<br>
+ // rightmost sub-element of the corresponding element of operand 1.<br>
+ VSUM,<br>
+<br>
+ // Compare integer vector operands 0 and 1 to produce the usual 0/-1<br>
+ // vector result. VICMPE is for equality, VICMPH for "signed greater than"<br>
+ // and VICMPHL for "unsigned greater than".<br>
+ VICMPE,<br>
+ VICMPH,<br>
+ VICMPHL,<br>
+<br>
// Wrappers around the inner loop of an 8- or 16-bit ATOMIC_SWAP or<br>
// ATOMIC_LOAD_<op>.<br>
//<br>
@@ -222,6 +286,11 @@ public:<br>
MVT getScalarShiftAmountTy(EVT LHSTy) const override {<br>
return MVT::i32;<br>
}<br>
+ MVT getVectorIdxTy() const override {<br>
+ // Only the lower 12 bits of an element index are used, so we don't<br>
+ // want to clobber the upper 32 bits of a GPR unnecessarily.<br>
+ return MVT::i32;<br>
+ }<br>
EVT getSetCCResultType(LLVMContext &, EVT) const override;<br>
bool isFMAFasterThanFMulAndFAdd(EVT VT) const override;<br>
bool isFPImmLegal(const APFloat &Imm, EVT VT) const override;<br>
@@ -328,6 +397,16 @@ private:<br>
SDValue lowerSTACKRESTORE(SDValue Op, SelectionDAG &DAG) const;<br>
SDValue lowerPREFETCH(SDValue Op, SelectionDAG &DAG) const;<br>
SDValue lowerINTRINSIC_W_CHAIN(SDValue Op, SelectionDAG &DAG) const;<br>
+ SDValue lowerBUILD_VECTOR(SDValue Op, SelectionDAG &DAG) const;<br>
+ SDValue lowerVECTOR_SHUFFLE(SDValue Op, SelectionDAG &DAG) const;<br>
+ SDValue lowerSCALAR_TO_VECTOR(SDValue Op, SelectionDAG &DAG) const;<br>
+ SDValue lowerShift(SDValue Op, SelectionDAG &DAG, unsigned ByScalar) const;<br>
+<br>
+ SDValue combineExtract(SDLoc DL, EVT ElemVT, EVT VecVT, SDValue OrigOp,<br>
+ unsigned Index, DAGCombinerInfo &DCI,<br>
+ bool Force) const;<br>
+ SDValue combineTruncateExtract(SDLoc DL, EVT TruncVT, SDValue Op,<br>
+ DAGCombinerInfo &DCI) const;<br>
<br>
// If the last instruction before MBBI in MBB was some form of COMPARE,<br>
// try to replace it with a COMPARE AND BRANCH just before MBBI.<br>
<br>
Modified: llvm/trunk/lib/Target/SystemZ/SystemZInstrFormats.td<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZInstrFormats.td?rev=236521&r1=236520&r2=236521&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZInstrFormats.td?rev=236521&r1=236520&r2=236521&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/SystemZ/SystemZInstrFormats.td (original)<br>
+++ llvm/trunk/lib/Target/SystemZ/SystemZInstrFormats.td Tue May 5 14:25:42 2015<br>
@@ -2414,6 +2414,10 @@ class BinaryAliasRIL<SDPatternOperator o<br>
let Constraints = "$R1 = $R1src";<br>
}<br>
<br>
+// An alias of a BinaryVRRf, but with different register sizes.<br>
+class BinaryAliasVRRf<RegisterOperand cls><br>
+ : Alias<6, (outs VR128:$V1), (ins cls:$R2, cls:$R3), []>;<br>
+<br>
// An alias of a CompareRI, but with different register sizes.<br>
class CompareAliasRI<SDPatternOperator operator, RegisterOperand cls,<br>
Immediate imm><br>
<br>
Modified: llvm/trunk/lib/Target/SystemZ/SystemZInstrInfo.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZInstrInfo.cpp?rev=236521&r1=236520&r2=236521&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZInstrInfo.cpp?rev=236521&r1=236520&r2=236521&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/SystemZ/SystemZInstrInfo.cpp (original)<br>
+++ llvm/trunk/lib/Target/SystemZ/SystemZInstrInfo.cpp Tue May 5 14:25:42 2015<br>
@@ -578,6 +578,8 @@ SystemZInstrInfo::copyPhysReg(MachineBas<br>
Opcode = SystemZ::LDR;<br>
else if (SystemZ::FP128BitRegClass.contains(DestReg, SrcReg))<br>
Opcode = SystemZ::LXR;<br>
+ else if (SystemZ::VR128BitRegClass.contains(DestReg, SrcReg))<br>
+ Opcode = SystemZ::VLR;<br>
else<br>
llvm_unreachable("Impossible reg-to-reg copy");<br>
<br>
@@ -1116,6 +1118,10 @@ void SystemZInstrInfo::getLoadStoreOpcod<br>
} else if (RC == &SystemZ::FP128BitRegClass) {<br>
LoadOpcode = SystemZ::LX;<br>
StoreOpcode = SystemZ::STX;<br>
+ } else if (RC == &SystemZ::VF128BitRegClass ||<br>
+ RC == &SystemZ::VR128BitRegClass) {<br>
+ LoadOpcode = SystemZ::VL;<br>
+ StoreOpcode = SystemZ::VST;<br>
} else<br>
llvm_unreachable("Unsupported regclass to load or store");<br>
}<br>
@@ -1185,6 +1191,7 @@ static bool isStringOfOnes(uint64_t Mask<br>
bool SystemZInstrInfo::isRxSBGMask(uint64_t Mask, unsigned BitSize,<br>
unsigned &Start, unsigned &End) const {<br>
// Reject trivial all-zero masks.<br>
+ Mask &= allOnes(BitSize);<br>
if (Mask == 0)<br>
return false;<br>
<br>
<br>
Modified: llvm/trunk/lib/Target/SystemZ/SystemZInstrVector.td<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZInstrVector.td?rev=236521&r1=236520&r2=236521&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZInstrVector.td?rev=236521&r1=236520&r2=236521&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/SystemZ/SystemZInstrVector.td (original)<br>
+++ llvm/trunk/lib/Target/SystemZ/SystemZInstrVector.td Tue May 5 14:25:42 2015<br>
@@ -19,18 +19,34 @@ let Predicates = [FeatureVector] in {<br>
def VLGVB : BinaryVRSc<"vlgvb", 0xE721, null_frag, v128b, 0>;<br>
def VLGVH : BinaryVRSc<"vlgvh", 0xE721, null_frag, v128h, 1>;<br>
def VLGVF : BinaryVRSc<"vlgvf", 0xE721, null_frag, v128f, 2>;<br>
- def VLGVG : BinaryVRSc<"vlgvg", 0xE721, null_frag, v128g, 3>;<br>
+ def VLGVG : BinaryVRSc<"vlgvg", 0xE721, z_vector_extract, v128g, 3>;<br>
<br>
// Load VR element from GR.<br>
- def VLVGB : TernaryVRSb<"vlvgb", 0xE722, null_frag, v128b, v128b, GR32, 0>;<br>
- def VLVGH : TernaryVRSb<"vlvgh", 0xE722, null_frag, v128h, v128h, GR32, 1>;<br>
- def VLVGF : TernaryVRSb<"vlvgf", 0xE722, null_frag, v128f, v128f, GR32, 2>;<br>
- def VLVGG : TernaryVRSb<"vlvgg", 0xE722, null_frag, v128g, v128g, GR64, 3>;<br>
+ def VLVGB : TernaryVRSb<"vlvgb", 0xE722, z_vector_insert,<br>
+ v128b, v128b, GR32, 0>;<br>
+ def VLVGH : TernaryVRSb<"vlvgh", 0xE722, z_vector_insert,<br>
+ v128h, v128h, GR32, 1>;<br>
+ def VLVGF : TernaryVRSb<"vlvgf", 0xE722, z_vector_insert,<br>
+ v128f, v128f, GR32, 2>;<br>
+ def VLVGG : TernaryVRSb<"vlvgg", 0xE722, z_vector_insert,<br>
+ v128g, v128g, GR64, 3>;<br>
<br>
// Load VR from GRs disjoint.<br>
- def VLVGP : BinaryVRRf<"vlvgp", 0xE762, null_frag, v128g>;<br>
+ def VLVGP : BinaryVRRf<"vlvgp", 0xE762, z_join_dwords, v128g>;<br>
+ def VLVGP32 : BinaryAliasVRRf<GR32>;<br>
}<br>
<br>
+// Extractions always assign to the full GR64, even if the element would<br>
+// fit in the lower 32 bits. Sub-i64 extracts therefore need to take a<br>
+// subreg of the result.<br>
+class VectorExtractSubreg<ValueType type, Instruction insn><br>
+ : Pat<(i32 (z_vector_extract (type VR128:$vec), shift12only:$index)),<br>
+ (EXTRACT_SUBREG (insn VR128:$vec, shift12only:$index), subreg_l32)>;<br>
+<br>
+def : VectorExtractSubreg<v16i8, VLGVB>;<br>
+def : VectorExtractSubreg<v8i16, VLGVH>;<br>
+def : VectorExtractSubreg<v4i32, VLGVF>;<br>
+<br>
//===----------------------------------------------------------------------===//<br>
// Immediate instructions<br>
//===----------------------------------------------------------------------===//<br>
@@ -39,29 +55,38 @@ let Predicates = [FeatureVector] in {<br>
// Generate byte mask.<br>
def VZERO : InherentVRIa<"vzero", 0xE744, 0>;<br>
def VONE : InherentVRIa<"vone", 0xE744, 0xffff>;<br>
- def VGBM : UnaryVRIa<"vgbm", 0xE744, null_frag, v128b, imm32zx16>;<br>
+ def VGBM : UnaryVRIa<"vgbm", 0xE744, z_byte_mask, v128b, imm32zx16>;<br>
<br>
// Generate mask.<br>
- def VGMB : BinaryVRIb<"vgmb", 0xE746, null_frag, v128b, 0>;<br>
- def VGMH : BinaryVRIb<"vgmh", 0xE746, null_frag, v128h, 1>;<br>
- def VGMF : BinaryVRIb<"vgmf", 0xE746, null_frag, v128f, 2>;<br>
- def VGMG : BinaryVRIb<"vgmg", 0xE746, null_frag, v128g, 3>;<br>
+ def VGMB : BinaryVRIb<"vgmb", 0xE746, z_rotate_mask, v128b, 0>;<br>
+ def VGMH : BinaryVRIb<"vgmh", 0xE746, z_rotate_mask, v128h, 1>;<br>
+ def VGMF : BinaryVRIb<"vgmf", 0xE746, z_rotate_mask, v128f, 2>;<br>
+ def VGMG : BinaryVRIb<"vgmg", 0xE746, z_rotate_mask, v128g, 3>;<br>
<br>
// Load element immediate.<br>
- def VLEIB : TernaryVRIa<"vleib", 0xE740, null_frag,<br>
- v128b, v128b, imm32sx16trunc, imm32zx4>;<br>
- def VLEIH : TernaryVRIa<"vleih", 0xE741, null_frag,<br>
- v128h, v128h, imm32sx16trunc, imm32zx3>;<br>
- def VLEIF : TernaryVRIa<"vleif", 0xE743, null_frag,<br>
- v128f, v128f, imm32sx16, imm32zx2>;<br>
- def VLEIG : TernaryVRIa<"vleig", 0xE742, null_frag,<br>
- v128g, v128g, imm64sx16, imm32zx1>;<br>
+ //<br>
+ // We want these instructions to be used ahead of VLVG* where possible.<br>
+ // However, VLVG* takes a variable BD-format index whereas VLEI takes<br>
+ // a plain immediate index. This means that VLVG* has an extra "base"<br>
+ // register operand and is 3 units more complex. Bumping the complexity<br>
+ // of the VLEI* instructions by 4 means that they are strictly better<br>
+ // than VLVG* in cases where both forms match.<br>
+ let AddedComplexity = 4 in {<br>
+ def VLEIB : TernaryVRIa<"vleib", 0xE740, z_vector_insert,<br>
+ v128b, v128b, imm32sx16trunc, imm32zx4>;<br>
+ def VLEIH : TernaryVRIa<"vleih", 0xE741, z_vector_insert,<br>
+ v128h, v128h, imm32sx16trunc, imm32zx3>;<br>
+ def VLEIF : TernaryVRIa<"vleif", 0xE743, z_vector_insert,<br>
+ v128f, v128f, imm32sx16, imm32zx2>;<br>
+ def VLEIG : TernaryVRIa<"vleig", 0xE742, z_vector_insert,<br>
+ v128g, v128g, imm64sx16, imm32zx1>;<br>
+ }<br>
<br>
// Replicate immediate.<br>
- def VREPIB : UnaryVRIa<"vrepib", 0xE745, null_frag, v128b, imm32sx16, 0>;<br>
- def VREPIH : UnaryVRIa<"vrepih", 0xE745, null_frag, v128h, imm32sx16, 1>;<br>
- def VREPIF : UnaryVRIa<"vrepif", 0xE745, null_frag, v128f, imm32sx16, 2>;<br>
- def VREPIG : UnaryVRIa<"vrepig", 0xE745, null_frag, v128g, imm32sx16, 3>;<br>
+ def VREPIB : UnaryVRIa<"vrepib", 0xE745, z_replicate, v128b, imm32sx16, 0>;<br>
+ def VREPIH : UnaryVRIa<"vrepih", 0xE745, z_replicate, v128h, imm32sx16, 1>;<br>
+ def VREPIF : UnaryVRIa<"vrepif", 0xE745, z_replicate, v128f, imm32sx16, 2>;<br>
+ def VREPIG : UnaryVRIa<"vrepig", 0xE745, z_replicate, v128g, imm32sx16, 3>;<br>
}<br>
<br>
//===----------------------------------------------------------------------===//<br>
@@ -89,28 +114,45 @@ let Predicates = [FeatureVector] in {<br>
def VLM : LoadMultipleVRSa<"vlm", 0xE736>;<br>
<br>
// Load and replicate<br>
- def VLREPB : UnaryVRX<"vlrepb", 0xE705, null_frag, v128b, 1, 0>;<br>
- def VLREPH : UnaryVRX<"vlreph", 0xE705, null_frag, v128h, 2, 1>;<br>
- def VLREPF : UnaryVRX<"vlrepf", 0xE705, null_frag, v128f, 4, 2>;<br>
- def VLREPG : UnaryVRX<"vlrepg", 0xE705, null_frag, v128g, 8, 3>;<br>
+ def VLREPB : UnaryVRX<"vlrepb", 0xE705, z_replicate_loadi8, v128b, 1, 0>;<br>
+ def VLREPH : UnaryVRX<"vlreph", 0xE705, z_replicate_loadi16, v128h, 2, 1>;<br>
+ def VLREPF : UnaryVRX<"vlrepf", 0xE705, z_replicate_loadi32, v128f, 4, 2>;<br>
+ def VLREPG : UnaryVRX<"vlrepg", 0xE705, z_replicate_loadi64, v128g, 8, 3>;<br>
<br>
// Load logical element and zero.<br>
- def VLLEZB : UnaryVRX<"vllezb", 0xE704, null_frag, v128b, 1, 0>;<br>
- def VLLEZH : UnaryVRX<"vllezh", 0xE704, null_frag, v128h, 2, 1>;<br>
- def VLLEZF : UnaryVRX<"vllezf", 0xE704, null_frag, v128f, 4, 2>;<br>
- def VLLEZG : UnaryVRX<"vllezg", 0xE704, null_frag, v128g, 8, 3>;<br>
+ def VLLEZB : UnaryVRX<"vllezb", 0xE704, z_vllezi8, v128b, 1, 0>;<br>
+ def VLLEZH : UnaryVRX<"vllezh", 0xE704, z_vllezi16, v128h, 2, 1>;<br>
+ def VLLEZF : UnaryVRX<"vllezf", 0xE704, z_vllezi32, v128f, 4, 2>;<br>
+ def VLLEZG : UnaryVRX<"vllezg", 0xE704, z_vllezi64, v128g, 8, 3>;<br>
<br>
// Load element.<br>
- def VLEB : TernaryVRX<"vleb", 0xE700, null_frag, v128b, v128b, 1, imm32zx4>;<br>
- def VLEH : TernaryVRX<"vleh", 0xE701, null_frag, v128h, v128h, 2, imm32zx3>;<br>
- def VLEF : TernaryVRX<"vlef", 0xE703, null_frag, v128f, v128f, 4, imm32zx2>;<br>
- def VLEG : TernaryVRX<"vleg", 0xE702, null_frag, v128g, v128g, 8, imm32zx1>;<br>
+ def VLEB : TernaryVRX<"vleb", 0xE700, z_vlei8, v128b, v128b, 1, imm32zx4>;<br>
+ def VLEH : TernaryVRX<"vleh", 0xE701, z_vlei16, v128h, v128h, 2, imm32zx3>;<br>
+ def VLEF : TernaryVRX<"vlef", 0xE703, z_vlei32, v128f, v128f, 4, imm32zx2>;<br>
+ def VLEG : TernaryVRX<"vleg", 0xE702, z_vlei64, v128g, v128g, 8, imm32zx1>;<br>
<br>
// Gather element.<br>
def VGEF : TernaryVRV<"vgef", 0xE713, 4, imm32zx2>;<br>
def VGEG : TernaryVRV<"vgeg", 0xE712, 8, imm32zx1>;<br>
}<br>
<br>
+// Use replicating loads if we're inserting a single element into an<br>
+// undefined vector. This avoids a false dependency on the previous<br>
+// register contents.<br>
+multiclass ReplicatePeephole<Instruction vlrep, ValueType vectype,<br>
+ SDPatternOperator load, ValueType scalartype> {<br>
+ def : Pat<(vectype (z_vector_insert<br>
+ (undef), (scalartype (load bdxaddr12only:$addr)), 0)),<br>
+ (vlrep bdxaddr12only:$addr)>;<br>
+ def : Pat<(vectype (scalar_to_vector<br>
+ (scalartype (load bdxaddr12only:$addr)))),<br>
+ (vlrep bdxaddr12only:$addr)>;<br>
+}<br>
+defm : ReplicatePeephole<VLREPB, v16i8, anyextloadi8, i32>;<br>
+defm : ReplicatePeephole<VLREPH, v8i16, anyextloadi16, i32>;<br>
+defm : ReplicatePeephole<VLREPF, v4i32, load, i32>;<br>
+defm : ReplicatePeephole<VLREPG, v2i64, load, i64>;<br>
+<br>
//===----------------------------------------------------------------------===//<br>
// Stores<br>
//===----------------------------------------------------------------------===//<br>
@@ -126,10 +168,10 @@ let Predicates = [FeatureVector] in {<br>
def VSTM : StoreMultipleVRSa<"vstm", 0xE73E>;<br>
<br>
// Store element.<br>
- def VSTEB : StoreBinaryVRX<"vsteb", 0xE708, null_frag, v128b, 1, imm32zx4>;<br>
- def VSTEH : StoreBinaryVRX<"vsteh", 0xE709, null_frag, v128h, 2, imm32zx3>;<br>
- def VSTEF : StoreBinaryVRX<"vstef", 0xE70B, null_frag, v128f, 4, imm32zx2>;<br>
- def VSTEG : StoreBinaryVRX<"vsteg", 0xE70A, null_frag, v128g, 8, imm32zx1>;<br>
+ def VSTEB : StoreBinaryVRX<"vsteb", 0xE708, z_vstei8, v128b, 1, imm32zx4>;<br>
+ def VSTEH : StoreBinaryVRX<"vsteh", 0xE709, z_vstei16, v128h, 2, imm32zx3>;<br>
+ def VSTEF : StoreBinaryVRX<"vstef", 0xE70B, z_vstei32, v128f, 4, imm32zx2>;<br>
+ def VSTEG : StoreBinaryVRX<"vsteg", 0xE70A, z_vstei64, v128g, 8, imm32zx1>;<br>
<br>
// Scatter element.<br>
def VSCEF : StoreBinaryVRV<"vscef", 0xE71B, 4, imm32zx2>;<br>
@@ -142,28 +184,28 @@ let Predicates = [FeatureVector] in {<br>
<br>
let Predicates = [FeatureVector] in {<br>
// Merge high.<br>
- def VMRHB : BinaryVRRc<"vmrhb", 0xE761, null_frag, v128b, v128b, 0>;<br>
- def VMRHH : BinaryVRRc<"vmrhh", 0xE761, null_frag, v128h, v128h, 1>;<br>
- def VMRHF : BinaryVRRc<"vmrhf", 0xE761, null_frag, v128f, v128f, 2>;<br>
- def VMRHG : BinaryVRRc<"vmrhg", 0xE761, null_frag, v128g, v128g, 3>;<br>
+ def VMRHB : BinaryVRRc<"vmrhb", 0xE761, z_merge_high, v128b, v128b, 0>;<br>
+ def VMRHH : BinaryVRRc<"vmrhh", 0xE761, z_merge_high, v128h, v128h, 1>;<br>
+ def VMRHF : BinaryVRRc<"vmrhf", 0xE761, z_merge_high, v128f, v128f, 2>;<br>
+ def VMRHG : BinaryVRRc<"vmrhg", 0xE761, z_merge_high, v128g, v128g, 3>;<br>
<br>
// Merge low.<br>
- def VMRLB : BinaryVRRc<"vmrlb", 0xE760, null_frag, v128b, v128b, 0>;<br>
- def VMRLH : BinaryVRRc<"vmrlh", 0xE760, null_frag, v128h, v128h, 1>;<br>
- def VMRLF : BinaryVRRc<"vmrlf", 0xE760, null_frag, v128f, v128f, 2>;<br>
- def VMRLG : BinaryVRRc<"vmrlg", 0xE760, null_frag, v128g, v128g, 3>;<br>
+ def VMRLB : BinaryVRRc<"vmrlb", 0xE760, z_merge_low, v128b, v128b, 0>;<br>
+ def VMRLH : BinaryVRRc<"vmrlh", 0xE760, z_merge_low, v128h, v128h, 1>;<br>
+ def VMRLF : BinaryVRRc<"vmrlf", 0xE760, z_merge_low, v128f, v128f, 2>;<br>
+ def VMRLG : BinaryVRRc<"vmrlg", 0xE760, z_merge_low, v128g, v128g, 3>;<br>
<br>
// Permute.<br>
- def VPERM : TernaryVRRe<"vperm", 0xE78C, null_frag, v128b, v128b>;<br>
+ def VPERM : TernaryVRRe<"vperm", 0xE78C, z_permute, v128b, v128b>;<br>
<br>
// Permute doubleword immediate.<br>
- def VPDI : TernaryVRRc<"vpdi", 0xE784, null_frag, v128b, v128b>;<br>
+ def VPDI : TernaryVRRc<"vpdi", 0xE784, z_permute_dwords, v128g, v128g>;<br>
<br>
// Replicate.<br>
- def VREPB : BinaryVRIc<"vrepb", 0xE74D, null_frag, v128b, v128b, 0>;<br>
- def VREPH : BinaryVRIc<"vreph", 0xE74D, null_frag, v128h, v128h, 1>;<br>
- def VREPF : BinaryVRIc<"vrepf", 0xE74D, null_frag, v128f, v128f, 2>;<br>
- def VREPG : BinaryVRIc<"vrepg", 0xE74D, null_frag, v128g, v128g, 3>;<br>
+ def VREPB : BinaryVRIc<"vrepb", 0xE74D, z_splat, v128b, v128b, 0>;<br>
+ def VREPH : BinaryVRIc<"vreph", 0xE74D, z_splat, v128h, v128h, 1>;<br>
+ def VREPF : BinaryVRIc<"vrepf", 0xE74D, z_splat, v128f, v128f, 2>;<br>
+ def VREPG : BinaryVRIc<"vrepg", 0xE74D, z_splat, v128g, v128g, 3>;<br>
<br>
// Select.<br>
def VSEL : TernaryVRRe<"vsel", 0xE78D, null_frag, v128any, v128any>;<br>
@@ -175,9 +217,9 @@ let Predicates = [FeatureVector] in {<br>
<br>
let Predicates = [FeatureVector] in {<br>
// Pack<br>
- def VPKH : BinaryVRRc<"vpkh", 0xE794, null_frag, v128b, v128h, 1>;<br>
- def VPKF : BinaryVRRc<"vpkf", 0xE794, null_frag, v128h, v128f, 2>;<br>
- def VPKG : BinaryVRRc<"vpkg", 0xE794, null_frag, v128f, v128g, 3>;<br>
+ def VPKH : BinaryVRRc<"vpkh", 0xE794, z_pack, v128b, v128h, 1>;<br>
+ def VPKF : BinaryVRRc<"vpkf", 0xE794, z_pack, v128h, v128f, 2>;<br>
+ def VPKG : BinaryVRRc<"vpkg", 0xE794, z_pack, v128f, v128g, 3>;<br>
<br>
// Pack saturate.<br>
defm VPKSH : BinaryVRRbSPair<"vpksh", 0xE797, null_frag, null_frag,<br>
@@ -196,9 +238,12 @@ let Predicates = [FeatureVector] in {<br>
v128f, v128g, 3>;<br>
<br>
// Sign-extend to doubleword.<br>
- def VSEGB : UnaryVRRa<"vsegb", 0xE75F, null_frag, v128g, v128b, 0>;<br>
- def VSEGH : UnaryVRRa<"vsegh", 0xE75F, null_frag, v128g, v128h, 1>;<br>
- def VSEGF : UnaryVRRa<"vsegf", 0xE75F, null_frag, v128g, v128f, 2>;<br>
+ def VSEGB : UnaryVRRa<"vsegb", 0xE75F, z_vsei8, v128g, v128g, 0>;<br>
+ def VSEGH : UnaryVRRa<"vsegh", 0xE75F, z_vsei16, v128g, v128g, 1>;<br>
+ def VSEGF : UnaryVRRa<"vsegf", 0xE75F, z_vsei32, v128g, v128g, 2>;<br>
+ def : Pat<(z_vsei8_by_parts (v16i8 VR128:$src)), (VSEGB VR128:$src)>;<br>
+ def : Pat<(z_vsei16_by_parts (v8i16 VR128:$src)), (VSEGH VR128:$src)>;<br>
+ def : Pat<(z_vsei32_by_parts (v4i32 VR128:$src)), (VSEGF VR128:$src)>;<br>
<br>
// Unpack high.<br>
def VUPHB : UnaryVRRa<"vuphb", 0xE7D7, null_frag, v128h, v128b, 0>;<br>
@@ -222,15 +267,37 @@ let Predicates = [FeatureVector] in {<br>
}<br>
<br>
//===----------------------------------------------------------------------===//<br>
+// Instantiating generic operations for specific types.<br>
+//===----------------------------------------------------------------------===//<br>
+<br>
+multiclass GenericVectorOps<ValueType type, ValueType inttype> {<br>
+ let Predicates = [FeatureVector] in {<br>
+ def : Pat<(type (load bdxaddr12only:$addr)),<br>
+ (VL bdxaddr12only:$addr)>;<br>
+ def : Pat<(store (type VR128:$src), bdxaddr12only:$addr),<br>
+ (VST VR128:$src, bdxaddr12only:$addr)>;<br>
+ def : Pat<(type (vselect (inttype VR128:$x), VR128:$y, VR128:$z)),<br>
+ (VSEL VR128:$y, VR128:$z, VR128:$x)>;<br>
+ def : Pat<(type (vselect (inttype (z_vnot VR128:$x)), VR128:$y, VR128:$z)),<br>
+ (VSEL VR128:$z, VR128:$y, VR128:$x)>;<br>
+ }<br>
+}<br>
+<br>
+defm : GenericVectorOps<v16i8, v16i8>;<br>
+defm : GenericVectorOps<v8i16, v8i16>;<br>
+defm : GenericVectorOps<v4i32, v4i32>;<br>
+defm : GenericVectorOps<v2i64, v2i64>;<br>
+<br>
+//===----------------------------------------------------------------------===//<br>
// Integer arithmetic<br>
//===----------------------------------------------------------------------===//<br>
<br>
let Predicates = [FeatureVector] in {<br>
// Add.<br>
- def VAB : BinaryVRRc<"vab", 0xE7F3, null_frag, v128b, v128b, 0>;<br>
- def VAH : BinaryVRRc<"vah", 0xE7F3, null_frag, v128h, v128h, 1>;<br>
- def VAF : BinaryVRRc<"vaf", 0xE7F3, null_frag, v128f, v128f, 2>;<br>
- def VAG : BinaryVRRc<"vag", 0xE7F3, null_frag, v128g, v128g, 3>;<br>
+ def VAB : BinaryVRRc<"vab", 0xE7F3, add, v128b, v128b, 0>;<br>
+ def VAH : BinaryVRRc<"vah", 0xE7F3, add, v128h, v128h, 1>;<br>
+ def VAF : BinaryVRRc<"vaf", 0xE7F3, add, v128f, v128f, 2>;<br>
+ def VAG : BinaryVRRc<"vag", 0xE7F3, add, v128g, v128g, 3>;<br>
def VAQ : BinaryVRRc<"vaq", 0xE7F3, null_frag, v128q, v128q, 4>;<br>
<br>
// Add compute carry.<br>
@@ -268,16 +335,16 @@ let Predicates = [FeatureVector] in {<br>
def VCKSM : BinaryVRRc<"vcksm", 0xE766, null_frag, v128any, v128any>;<br>
<br>
// Count leading zeros.<br>
- def VCLZB : UnaryVRRa<"vclzb", 0xE753, null_frag, v128b, v128b, 0>;<br>
- def VCLZH : UnaryVRRa<"vclzh", 0xE753, null_frag, v128h, v128h, 1>;<br>
- def VCLZF : UnaryVRRa<"vclzf", 0xE753, null_frag, v128f, v128f, 2>;<br>
- def VCLZG : UnaryVRRa<"vclzg", 0xE753, null_frag, v128g, v128g, 3>;<br>
+ def VCLZB : UnaryVRRa<"vclzb", 0xE753, ctlz, v128b, v128b, 0>;<br>
+ def VCLZH : UnaryVRRa<"vclzh", 0xE753, ctlz, v128h, v128h, 1>;<br>
+ def VCLZF : UnaryVRRa<"vclzf", 0xE753, ctlz, v128f, v128f, 2>;<br>
+ def VCLZG : UnaryVRRa<"vclzg", 0xE753, ctlz, v128g, v128g, 3>;<br>
<br>
// Count trailing zeros.<br>
- def VCTZB : UnaryVRRa<"vctzb", 0xE752, null_frag, v128b, v128b, 0>;<br>
- def VCTZH : UnaryVRRa<"vctzh", 0xE752, null_frag, v128h, v128h, 1>;<br>
- def VCTZF : UnaryVRRa<"vctzf", 0xE752, null_frag, v128f, v128f, 2>;<br>
- def VCTZG : UnaryVRRa<"vctzg", 0xE752, null_frag, v128g, v128g, 3>;<br>
+ def VCTZB : UnaryVRRa<"vctzb", 0xE752, cttz, v128b, v128b, 0>;<br>
+ def VCTZH : UnaryVRRa<"vctzh", 0xE752, cttz, v128h, v128h, 1>;<br>
+ def VCTZF : UnaryVRRa<"vctzf", 0xE752, cttz, v128f, v128f, 2>;<br>
+ def VCTZG : UnaryVRRa<"vctzg", 0xE752, cttz, v128g, v128g, 3>;<br>
<br>
// Exclusive or.<br>
def VX : BinaryVRRc<"vx", 0xE76D, null_frag, v128any, v128any>;<br>
@@ -295,16 +362,16 @@ let Predicates = [FeatureVector] in {<br>
def VGFMAG : TernaryVRRd<"vgfmag", 0xE7BC, null_frag, v128g, v128g, 3>;<br>
<br>
// Load complement.<br>
- def VLCB : UnaryVRRa<"vlcb", 0xE7DE, null_frag, v128b, v128b, 0>;<br>
- def VLCH : UnaryVRRa<"vlch", 0xE7DE, null_frag, v128h, v128h, 1>;<br>
- def VLCF : UnaryVRRa<"vlcf", 0xE7DE, null_frag, v128f, v128f, 2>;<br>
- def VLCG : UnaryVRRa<"vlcg", 0xE7DE, null_frag, v128g, v128g, 3>;<br>
+ def VLCB : UnaryVRRa<"vlcb", 0xE7DE, z_vneg, v128b, v128b, 0>;<br>
+ def VLCH : UnaryVRRa<"vlch", 0xE7DE, z_vneg, v128h, v128h, 1>;<br>
+ def VLCF : UnaryVRRa<"vlcf", 0xE7DE, z_vneg, v128f, v128f, 2>;<br>
+ def VLCG : UnaryVRRa<"vlcg", 0xE7DE, z_vneg, v128g, v128g, 3>;<br>
<br>
// Load positive.<br>
- def VLPB : UnaryVRRa<"vlpb", 0xE7DF, null_frag, v128b, v128b, 0>;<br>
- def VLPH : UnaryVRRa<"vlph", 0xE7DF, null_frag, v128h, v128h, 1>;<br>
- def VLPF : UnaryVRRa<"vlpf", 0xE7DF, null_frag, v128f, v128f, 2>;<br>
- def VLPG : UnaryVRRa<"vlpg", 0xE7DF, null_frag, v128g, v128g, 3>;<br>
+ def VLPB : UnaryVRRa<"vlpb", 0xE7DF, z_viabs8, v128b, v128b, 0>;<br>
+ def VLPH : UnaryVRRa<"vlph", 0xE7DF, z_viabs16, v128h, v128h, 1>;<br>
+ def VLPF : UnaryVRRa<"vlpf", 0xE7DF, z_viabs32, v128f, v128f, 2>;<br>
+ def VLPG : UnaryVRRa<"vlpg", 0xE7DF, z_viabs64, v128g, v128g, 3>;<br>
<br>
// Maximum.<br>
def VMXB : BinaryVRRc<"vmxb", 0xE7FF, null_frag, v128b, v128b, 0>;<br>
@@ -331,9 +398,9 @@ let Predicates = [FeatureVector] in {<br>
def VMNLG : BinaryVRRc<"vmnlg", 0xE7FC, null_frag, v128g, v128g, 3>;<br>
<br>
// Multiply and add low.<br>
- def VMALB : TernaryVRRd<"vmalb", 0xE7AA, null_frag, v128b, v128b, 0>;<br>
- def VMALHW : TernaryVRRd<"vmalhw", 0xE7AA, null_frag, v128h, v128h, 1>;<br>
- def VMALF : TernaryVRRd<"vmalf", 0xE7AA, null_frag, v128f, v128f, 2>;<br>
+ def VMALB : TernaryVRRd<"vmalb", 0xE7AA, z_muladd, v128b, v128b, 0>;<br>
+ def VMALHW : TernaryVRRd<"vmalhw", 0xE7AA, z_muladd, v128h, v128h, 1>;<br>
+ def VMALF : TernaryVRRd<"vmalf", 0xE7AA, z_muladd, v128f, v128f, 2>;<br>
<br>
// Multiply and add high.<br>
def VMAHB : TernaryVRRd<"vmahb", 0xE7AB, null_frag, v128b, v128b, 0>;<br>
@@ -376,9 +443,9 @@ let Predicates = [FeatureVector] in {<br>
def VMLHF : BinaryVRRc<"vmlhf", 0xE7A1, null_frag, v128f, v128f, 2>;<br>
<br>
// Multiply low.<br>
- def VMLB : BinaryVRRc<"vmlb", 0xE7A2, null_frag, v128b, v128b, 0>;<br>
- def VMLHW : BinaryVRRc<"vmlhw", 0xE7A2, null_frag, v128h, v128h, 1>;<br>
- def VMLF : BinaryVRRc<"vmlf", 0xE7A2, null_frag, v128f, v128f, 2>;<br>
+ def VMLB : BinaryVRRc<"vmlb", 0xE7A2, mul, v128b, v128b, 0>;<br>
+ def VMLHW : BinaryVRRc<"vmlhw", 0xE7A2, mul, v128h, v128h, 1>;<br>
+ def VMLF : BinaryVRRc<"vmlf", 0xE7A2, mul, v128f, v128f, 2>;<br>
<br>
// Multiply even.<br>
def VMEB : BinaryVRRc<"vmeb", 0xE7A6, null_frag, v128h, v128b, 0>;<br>
@@ -408,6 +475,7 @@ let Predicates = [FeatureVector] in {<br>
<br>
// Population count.<br>
def VPOPCT : BinaryVRRa<"vpopct", 0xE750>;<br>
+ def : Pat<(v16i8 (z_popcnt VR128:$x)), (VPOPCT VR128:$x, 0)>;<br>
<br>
// Element rotate left logical (with vector shift amount).<br>
def VERLLVB : BinaryVRRc<"verllvb", 0xE773, null_frag, v128b, v128b, 0>;<br>
@@ -428,40 +496,40 @@ let Predicates = [FeatureVector] in {<br>
def VERIMG : QuaternaryVRId<"verimg", 0xE772, null_frag, v128g, v128g, 3>;<br>
<br>
// Element shift left (with vector shift amount).<br>
- def VESLVB : BinaryVRRc<"veslvb", 0xE770, null_frag, v128b, v128b, 0>;<br>
- def VESLVH : BinaryVRRc<"veslvh", 0xE770, null_frag, v128h, v128h, 1>;<br>
- def VESLVF : BinaryVRRc<"veslvf", 0xE770, null_frag, v128f, v128f, 2>;<br>
- def VESLVG : BinaryVRRc<"veslvg", 0xE770, null_frag, v128g, v128g, 3>;<br>
+ def VESLVB : BinaryVRRc<"veslvb", 0xE770, z_vshl, v128b, v128b, 0>;<br>
+ def VESLVH : BinaryVRRc<"veslvh", 0xE770, z_vshl, v128h, v128h, 1>;<br>
+ def VESLVF : BinaryVRRc<"veslvf", 0xE770, z_vshl, v128f, v128f, 2>;<br>
+ def VESLVG : BinaryVRRc<"veslvg", 0xE770, z_vshl, v128g, v128g, 3>;<br>
<br>
// Element shift left (with scalar shift amount).<br>
- def VESLB : BinaryVRSa<"veslb", 0xE730, null_frag, v128b, v128b, 0>;<br>
- def VESLH : BinaryVRSa<"veslh", 0xE730, null_frag, v128h, v128h, 1>;<br>
- def VESLF : BinaryVRSa<"veslf", 0xE730, null_frag, v128f, v128f, 2>;<br>
- def VESLG : BinaryVRSa<"veslg", 0xE730, null_frag, v128g, v128g, 3>;<br>
+ def VESLB : BinaryVRSa<"veslb", 0xE730, z_vshl_by_scalar, v128b, v128b, 0>;<br>
+ def VESLH : BinaryVRSa<"veslh", 0xE730, z_vshl_by_scalar, v128h, v128h, 1>;<br>
+ def VESLF : BinaryVRSa<"veslf", 0xE730, z_vshl_by_scalar, v128f, v128f, 2>;<br>
+ def VESLG : BinaryVRSa<"veslg", 0xE730, z_vshl_by_scalar, v128g, v128g, 3>;<br>
<br>
// Element shift right arithmetic (with vector shift amount).<br>
- def VESRAVB : BinaryVRRc<"vesravb", 0xE77A, null_frag, v128b, v128b, 0>;<br>
- def VESRAVH : BinaryVRRc<"vesravh", 0xE77A, null_frag, v128h, v128h, 1>;<br>
- def VESRAVF : BinaryVRRc<"vesravf", 0xE77A, null_frag, v128f, v128f, 2>;<br>
- def VESRAVG : BinaryVRRc<"vesravg", 0xE77A, null_frag, v128g, v128g, 3>;<br>
+ def VESRAVB : BinaryVRRc<"vesravb", 0xE77A, z_vsra, v128b, v128b, 0>;<br>
+ def VESRAVH : BinaryVRRc<"vesravh", 0xE77A, z_vsra, v128h, v128h, 1>;<br>
+ def VESRAVF : BinaryVRRc<"vesravf", 0xE77A, z_vsra, v128f, v128f, 2>;<br>
+ def VESRAVG : BinaryVRRc<"vesravg", 0xE77A, z_vsra, v128g, v128g, 3>;<br>
<br>
// Element shift right arithmetic (with scalar shift amount).<br>
- def VESRAB : BinaryVRSa<"vesrab", 0xE73A, null_frag, v128b, v128b, 0>;<br>
- def VESRAH : BinaryVRSa<"vesrah", 0xE73A, null_frag, v128h, v128h, 1>;<br>
- def VESRAF : BinaryVRSa<"vesraf", 0xE73A, null_frag, v128f, v128f, 2>;<br>
- def VESRAG : BinaryVRSa<"vesrag", 0xE73A, null_frag, v128g, v128g, 3>;<br>
+ def VESRAB : BinaryVRSa<"vesrab", 0xE73A, z_vsra_by_scalar, v128b, v128b, 0>;<br>
+ def VESRAH : BinaryVRSa<"vesrah", 0xE73A, z_vsra_by_scalar, v128h, v128h, 1>;<br>
+ def VESRAF : BinaryVRSa<"vesraf", 0xE73A, z_vsra_by_scalar, v128f, v128f, 2>;<br>
+ def VESRAG : BinaryVRSa<"vesrag", 0xE73A, z_vsra_by_scalar, v128g, v128g, 3>;<br>
<br>
// Element shift right logical (with vector shift amount).<br>
- def VESRLVB : BinaryVRRc<"vesrlvb", 0xE778, null_frag, v128b, v128b, 0>;<br>
- def VESRLVH : BinaryVRRc<"vesrlvh", 0xE778, null_frag, v128h, v128h, 1>;<br>
- def VESRLVF : BinaryVRRc<"vesrlvf", 0xE778, null_frag, v128f, v128f, 2>;<br>
- def VESRLVG : BinaryVRRc<"vesrlvg", 0xE778, null_frag, v128g, v128g, 3>;<br>
+ def VESRLVB : BinaryVRRc<"vesrlvb", 0xE778, z_vsrl, v128b, v128b, 0>;<br>
+ def VESRLVH : BinaryVRRc<"vesrlvh", 0xE778, z_vsrl, v128h, v128h, 1>;<br>
+ def VESRLVF : BinaryVRRc<"vesrlvf", 0xE778, z_vsrl, v128f, v128f, 2>;<br>
+ def VESRLVG : BinaryVRRc<"vesrlvg", 0xE778, z_vsrl, v128g, v128g, 3>;<br>
<br>
// Element shift right logical (with scalar shift amount).<br>
- def VESRLB : BinaryVRSa<"vesrlb", 0xE738, null_frag, v128b, v128b, 0>;<br>
- def VESRLH : BinaryVRSa<"vesrlh", 0xE738, null_frag, v128h, v128h, 1>;<br>
- def VESRLF : BinaryVRSa<"vesrlf", 0xE738, null_frag, v128f, v128f, 2>;<br>
- def VESRLG : BinaryVRSa<"vesrlg", 0xE738, null_frag, v128g, v128g, 3>;<br>
+ def VESRLB : BinaryVRSa<"vesrlb", 0xE738, z_vsrl_by_scalar, v128b, v128b, 0>;<br>
+ def VESRLH : BinaryVRSa<"vesrlh", 0xE738, z_vsrl_by_scalar, v128h, v128h, 1>;<br>
+ def VESRLF : BinaryVRSa<"vesrlf", 0xE738, z_vsrl_by_scalar, v128f, v128f, 2>;<br>
+ def VESRLG : BinaryVRSa<"vesrlg", 0xE738, z_vsrl_by_scalar, v128g, v128g, 3>;<br>
<br>
// Shift left.<br>
def VSL : BinaryVRRc<"vsl", 0xE774, null_frag, v128b, v128b>;<br>
@@ -470,7 +538,7 @@ let Predicates = [FeatureVector] in {<br>
def VSLB : BinaryVRRc<"vslb", 0xE775, null_frag, v128b, v128b>;<br>
<br>
// Shift left double by byte.<br>
- def VSLDB : TernaryVRId<"vsldb", 0xE777, null_frag, v128b, v128b, 0>;<br>
+ def VSLDB : TernaryVRId<"vsldb", 0xE777, z_shl_double, v128b, v128b, 0>;<br>
<br>
// Shift right arithmetic.<br>
def VSRA : BinaryVRRc<"vsra", 0xE77E, null_frag, v128b, v128b>;<br>
@@ -485,10 +553,10 @@ let Predicates = [FeatureVector] in {<br>
def VSRLB : BinaryVRRc<"vsrlb", 0xE77D, null_frag, v128b, v128b>;<br>
<br>
// Subtract.<br>
- def VSB : BinaryVRRc<"vsb", 0xE7F7, null_frag, v128b, v128b, 0>;<br>
- def VSH : BinaryVRRc<"vsh", 0xE7F7, null_frag, v128h, v128h, 1>;<br>
- def VSF : BinaryVRRc<"vsf", 0xE7F7, null_frag, v128f, v128f, 2>;<br>
- def VSG : BinaryVRRc<"vsg", 0xE7F7, null_frag, v128g, v128g, 3>;<br>
+ def VSB : BinaryVRRc<"vsb", 0xE7F7, sub, v128b, v128b, 0>;<br>
+ def VSH : BinaryVRRc<"vsh", 0xE7F7, sub, v128h, v128h, 1>;<br>
+ def VSF : BinaryVRRc<"vsf", 0xE7F7, sub, v128f, v128f, 2>;<br>
+ def VSG : BinaryVRRc<"vsg", 0xE7F7, sub, v128g, v128g, 3>;<br>
def VSQ : BinaryVRRc<"vsq", 0xE7F7, null_frag, v128q, v128q, 4>;<br>
<br>
// Subtract compute borrow indication.<br>
@@ -505,18 +573,107 @@ let Predicates = [FeatureVector] in {<br>
def VSBCBIQ : TernaryVRRd<"vsbcbiq", 0xE7BD, null_frag, v128q, v128q, 4>;<br>
<br>
// Sum across doubleword.<br>
- def VSUMGH : BinaryVRRc<"vsumgh", 0xE765, null_frag, v128g, v128h, 1>;<br>
- def VSUMGF : BinaryVRRc<"vsumgf", 0xE765, null_frag, v128g, v128f, 2>;<br>
+ def VSUMGH : BinaryVRRc<"vsumgh", 0xE765, z_vsum, v128g, v128h, 1>;<br>
+ def VSUMGF : BinaryVRRc<"vsumgf", 0xE765, z_vsum, v128g, v128f, 2>;<br>
<br>
// Sum across quadword.<br>
- def VSUMQF : BinaryVRRc<"vsumqf", 0xE767, null_frag, v128q, v128f, 2>;<br>
- def VSUMQG : BinaryVRRc<"vsumqg", 0xE767, null_frag, v128q, v128g, 3>;<br>
+ def VSUMQF : BinaryVRRc<"vsumqf", 0xE767, z_vsum, v128q, v128f, 2>;<br>
+ def VSUMQG : BinaryVRRc<"vsumqg", 0xE767, z_vsum, v128q, v128g, 3>;<br>
<br>
// Sum across word.<br>
- def VSUMB : BinaryVRRc<"vsumb", 0xE764, null_frag, v128f, v128b, 0>;<br>
- def VSUMH : BinaryVRRc<"vsumh", 0xE764, null_frag, v128f, v128h, 1>;<br>
+ def VSUMB : BinaryVRRc<"vsumb", 0xE764, z_vsum, v128f, v128b, 0>;<br>
+ def VSUMH : BinaryVRRc<"vsumh", 0xE764, z_vsum, v128f, v128h, 1>;<br>
+}<br>
+<br>
+// Instantiate the bitwise ops for type TYPE.<br>
+multiclass BitwiseVectorOps<ValueType type> {<br>
+ let Predicates = [FeatureVector] in {<br>
+ def : Pat<(type (and VR128:$x, VR128:$y)), (VN VR128:$x, VR128:$y)>;<br>
+ def : Pat<(type (and VR128:$x, (z_vnot VR128:$y))),<br>
+ (VNC VR128:$x, VR128:$y)>;<br>
+ def : Pat<(type (or VR128:$x, VR128:$y)), (VO VR128:$x, VR128:$y)>;<br>
+ def : Pat<(type (xor VR128:$x, VR128:$y)), (VX VR128:$x, VR128:$y)>;<br>
+ def : Pat<(type (or (and VR128:$x, VR128:$z),<br>
+ (and VR128:$y, (z_vnot VR128:$z)))),<br>
+ (VSEL VR128:$x, VR128:$y, VR128:$z)>;<br>
+ def : Pat<(type (z_vnot (or VR128:$x, VR128:$y))),<br>
+ (VNO VR128:$x, VR128:$y)>;<br>
+ def : Pat<(type (z_vnot VR128:$x)), (VNO VR128:$x, VR128:$x)>;<br>
+ }<br>
+}<br>
+<br>
+defm : BitwiseVectorOps<v16i8>;<br>
+defm : BitwiseVectorOps<v8i16>;<br>
+defm : BitwiseVectorOps<v4i32>;<br>
+defm : BitwiseVectorOps<v2i64>;<br>
+<br>
+// Instantiate additional patterns for absolute-related expressions on<br>
+// type TYPE. LC is the negate instruction for TYPE and LP is the absolute<br>
+// instruction.<br>
+multiclass IntegerAbsoluteVectorOps<ValueType type, Instruction lc,<br>
+ Instruction lp, int shift> {<br>
+ let Predicates = [FeatureVector] in {<br>
+ def : Pat<(type (vselect (type (z_vicmph_zero VR128:$x)),<br>
+ (z_vneg VR128:$x), VR128:$x)),<br>
+ (lc (lp VR128:$x))>;<br>
+ def : Pat<(type (vselect (type (z_vnot (z_vicmph_zero VR128:$x))),<br>
+ VR128:$x, (z_vneg VR128:$x))),<br>
+ (lc (lp VR128:$x))>;<br>
+ def : Pat<(type (vselect (type (z_vicmpl_zero VR128:$x)),<br>
+ VR128:$x, (z_vneg VR128:$x))),<br>
+ (lc (lp VR128:$x))>;<br>
+ def : Pat<(type (vselect (type (z_vnot (z_vicmpl_zero VR128:$x))),<br>
+ (z_vneg VR128:$x), VR128:$x)),<br>
+ (lc (lp VR128:$x))>;<br>
+ def : Pat<(type (or (and (z_vsra_by_scalar VR128:$x, (i32 shift)),<br>
+ (z_vneg VR128:$x)),<br>
+ (and (z_vnot (z_vsra_by_scalar VR128:$x, (i32 shift))),<br>
+ VR128:$x))),<br>
+ (lp VR128:$x)>;<br>
+ def : Pat<(type (or (and (z_vsra_by_scalar VR128:$x, (i32 shift)),<br>
+ VR128:$x),<br>
+ (and (z_vnot (z_vsra_by_scalar VR128:$x, (i32 shift))),<br>
+ (z_vneg VR128:$x)))),<br>
+ (lc (lp VR128:$x))>;<br>
+ }<br>
}<br>
<br>
+defm : IntegerAbsoluteVectorOps<v16i8, VLCB, VLPB, 7>;<br>
+defm : IntegerAbsoluteVectorOps<v8i16, VLCH, VLPH, 15>;<br>
+defm : IntegerAbsoluteVectorOps<v4i32, VLCF, VLPF, 31>;<br>
+defm : IntegerAbsoluteVectorOps<v2i64, VLCG, VLPG, 63>;<br>
+<br>
+// Instantiate minimum- and maximum-related patterns for TYPE. CMPH is the<br>
+// signed or unsigned "set if greater than" comparison instruction and<br>
+// MIN and MAX are the associated minimum and maximum instructions.<br>
+multiclass IntegerMinMaxVectorOps<ValueType type, SDPatternOperator cmph,<br>
+ Instruction min, Instruction max> {<br>
+ let Predicates = [FeatureVector] in {<br>
+ def : Pat<(type (vselect (cmph VR128:$x, VR128:$y), VR128:$x, VR128:$y)),<br>
+ (max VR128:$x, VR128:$y)>;<br>
+ def : Pat<(type (vselect (cmph VR128:$x, VR128:$y), VR128:$y, VR128:$x)),<br>
+ (min VR128:$x, VR128:$y)>;<br>
+ def : Pat<(type (vselect (z_vnot (cmph VR128:$x, VR128:$y)),<br>
+ VR128:$x, VR128:$y)),<br>
+ (min VR128:$x, VR128:$y)>;<br>
+ def : Pat<(type (vselect (z_vnot (cmph VR128:$x, VR128:$y)),<br>
+ VR128:$y, VR128:$x)),<br>
+ (max VR128:$x, VR128:$y)>;<br>
+ }<br>
+}<br>
+<br>
+// Signed min/max.<br>
+defm : IntegerMinMaxVectorOps<v16i8, z_vicmph, VMNB, VMXB>;<br>
+defm : IntegerMinMaxVectorOps<v8i16, z_vicmph, VMNH, VMXH>;<br>
+defm : IntegerMinMaxVectorOps<v4i32, z_vicmph, VMNF, VMXF>;<br>
+defm : IntegerMinMaxVectorOps<v2i64, z_vicmph, VMNG, VMXG>;<br>
+<br>
+// Unsigned min/max.<br>
+defm : IntegerMinMaxVectorOps<v16i8, z_vicmphl, VMNLB, VMXLB>;<br>
+defm : IntegerMinMaxVectorOps<v8i16, z_vicmphl, VMNLH, VMXLH>;<br>
+defm : IntegerMinMaxVectorOps<v4i32, z_vicmphl, VMNLF, VMXLF>;<br>
+defm : IntegerMinMaxVectorOps<v2i64, z_vicmphl, VMNLG, VMXLG>;<br>
+<br>
//===----------------------------------------------------------------------===//<br>
// Integer comparison<br>
//===----------------------------------------------------------------------===//<br>
@@ -539,33 +696,33 @@ let Predicates = [FeatureVector] in {<br>
}<br>
<br>
// Compare equal.<br>
- defm VCEQB : BinaryVRRbSPair<"vceqb", 0xE7F8, null_frag, null_frag,<br>
+ defm VCEQB : BinaryVRRbSPair<"vceqb", 0xE7F8, z_vicmpe, null_frag,<br>
v128b, v128b, 0>;<br>
- defm VCEQH : BinaryVRRbSPair<"vceqh", 0xE7F8, null_frag, null_frag,<br>
+ defm VCEQH : BinaryVRRbSPair<"vceqh", 0xE7F8, z_vicmpe, null_frag,<br>
v128h, v128h, 1>;<br>
- defm VCEQF : BinaryVRRbSPair<"vceqf", 0xE7F8, null_frag, null_frag,<br>
+ defm VCEQF : BinaryVRRbSPair<"vceqf", 0xE7F8, z_vicmpe, null_frag,<br>
v128f, v128f, 2>;<br>
- defm VCEQG : BinaryVRRbSPair<"vceqg", 0xE7F8, null_frag, null_frag,<br>
+ defm VCEQG : BinaryVRRbSPair<"vceqg", 0xE7F8, z_vicmpe, null_frag,<br>
v128g, v128g, 3>;<br>
<br>
// Compare high.<br>
- defm VCHB : BinaryVRRbSPair<"vchb", 0xE7FB, null_frag, null_frag,<br>
+ defm VCHB : BinaryVRRbSPair<"vchb", 0xE7FB, z_vicmph, null_frag,<br>
v128b, v128b, 0>;<br>
- defm VCHH : BinaryVRRbSPair<"vchh", 0xE7FB, null_frag, null_frag,<br>
+ defm VCHH : BinaryVRRbSPair<"vchh", 0xE7FB, z_vicmph, null_frag,<br>
v128h, v128h, 1>;<br>
- defm VCHF : BinaryVRRbSPair<"vchf", 0xE7FB, null_frag, null_frag,<br>
+ defm VCHF : BinaryVRRbSPair<"vchf", 0xE7FB, z_vicmph, null_frag,<br>
v128f, v128f, 2>;<br>
- defm VCHG : BinaryVRRbSPair<"vchg", 0xE7FB, null_frag, null_frag,<br>
+ defm VCHG : BinaryVRRbSPair<"vchg", 0xE7FB, z_vicmph, null_frag,<br>
v128g, v128g, 3>;<br>
<br>
// Compare high logical.<br>
- defm VCHLB : BinaryVRRbSPair<"vchlb", 0xE7F9, null_frag, null_frag,<br>
+ defm VCHLB : BinaryVRRbSPair<"vchlb", 0xE7F9, z_vicmphl, null_frag,<br>
v128b, v128b, 0>;<br>
- defm VCHLH : BinaryVRRbSPair<"vchlh", 0xE7F9, null_frag, null_frag,<br>
+ defm VCHLH : BinaryVRRbSPair<"vchlh", 0xE7F9, z_vicmphl, null_frag,<br>
v128h, v128h, 1>;<br>
- defm VCHLF : BinaryVRRbSPair<"vchlf", 0xE7F9, null_frag, null_frag,<br>
+ defm VCHLF : BinaryVRRbSPair<"vchlf", 0xE7F9, z_vicmphl, null_frag,<br>
v128f, v128f, 2>;<br>
- defm VCHLG : BinaryVRRbSPair<"vchlg", 0xE7F9, null_frag, null_frag,<br>
+ defm VCHLG : BinaryVRRbSPair<"vchlg", 0xE7F9, z_vicmphl, null_frag,<br>
v128g, v128g, 3>;<br>
<br>
// Test under mask.<br>
@@ -686,6 +843,44 @@ let Predicates = [FeatureVector] in {<br>
}<br>
<br>
//===----------------------------------------------------------------------===//<br>
+// Conversions<br>
+//===----------------------------------------------------------------------===//<br>
+<br>
+def : Pat<(v16i8 (bitconvert (v8i16 VR128:$src))), (v16i8 VR128:$src)>;<br>
+def : Pat<(v16i8 (bitconvert (v4i32 VR128:$src))), (v16i8 VR128:$src)>;<br>
+def : Pat<(v16i8 (bitconvert (v2i64 VR128:$src))), (v16i8 VR128:$src)>;<br>
+<br>
+def : Pat<(v8i16 (bitconvert (v16i8 VR128:$src))), (v8i16 VR128:$src)>;<br>
+def : Pat<(v8i16 (bitconvert (v4i32 VR128:$src))), (v8i16 VR128:$src)>;<br>
+def : Pat<(v8i16 (bitconvert (v2i64 VR128:$src))), (v8i16 VR128:$src)>;<br>
+<br>
+def : Pat<(v4i32 (bitconvert (v16i8 VR128:$src))), (v4i32 VR128:$src)>;<br>
+def : Pat<(v4i32 (bitconvert (v8i16 VR128:$src))), (v4i32 VR128:$src)>;<br>
+def : Pat<(v4i32 (bitconvert (v2i64 VR128:$src))), (v4i32 VR128:$src)>;<br>
+<br>
+def : Pat<(v2i64 (bitconvert (v16i8 VR128:$src))), (v2i64 VR128:$src)>;<br>
+def : Pat<(v2i64 (bitconvert (v8i16 VR128:$src))), (v2i64 VR128:$src)>;<br>
+def : Pat<(v2i64 (bitconvert (v4i32 VR128:$src))), (v2i64 VR128:$src)>;<br>
+<br>
+//===----------------------------------------------------------------------===//<br>
+// Replicating scalars<br>
+//===----------------------------------------------------------------------===//<br>
+<br>
+// Define patterns for replicating a scalar GR32 into a vector of type TYPE.<br>
+// INDEX is the number of elements per doubleword minus 1, i.e. 8 divided<br>
+// by the element size in bytes, minus 1.<br>
+class VectorReplicateScalar<ValueType type, Instruction insn, bits<16> index><br>
+ : Pat<(type (z_replicate GR32:$scalar)),<br>
+ (insn (VLVGP32 GR32:$scalar, GR32:$scalar), index)>;<br>
+<br>
+def : VectorReplicateScalar<v16i8, VREPB, 7>;<br>
+def : VectorReplicateScalar<v8i16, VREPH, 3>;<br>
+def : VectorReplicateScalar<v4i32, VREPF, 1>;<br>
+<br>
+// i64 replications are just a single instruction.<br>
+def : Pat<(v2i64 (z_replicate GR64:$scalar)),<br>
+ (VLVGP GR64:$scalar, GR64:$scalar)>;<br>
+<br>
+//===----------------------------------------------------------------------===//<br>
// String instructions<br>
//===----------------------------------------------------------------------===//<br>
<br>
<br>
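A note on the indices 7, 3 and 1 passed to VectorReplicateScalar in the hunk above. The following is only a minimal sketch, assuming that VLVGP32 leaves each 32-bit scalar in the low half of a big-endian doubleword (my reading of the pattern, not something the patch states). Under that assumption, the element holding the scalar's low bits is the last one in the first doubleword, i.e. 8 / element size - 1. The names Vec128, modelVLVGP32, element and replicateIndex are hypothetical and only model the byte layout:<br>
<br>
<pre>
```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of a 128-bit vector register as 16 big-endian bytes.
using Vec128 = std::array<std::uint8_t, 16>;

// Assumption: each 32-bit scalar ends up in the low half of one 64-bit
// doubleword. The high halves are zeroed here purely for the model.
Vec128 modelVLVGP32(std::uint32_t first, std::uint32_t second) {
  Vec128 v{};
  const std::uint32_t words[2] = {first, second};
  for (std::size_t d = 0; d < 2; ++d)
    for (std::size_t b = 0; b < 4; ++b)
      v[d * 8 + 4 + b] = static_cast<std::uint8_t>(words[d] >> (8 * (3 - b)));
  return v;
}

// Read the big-endian element number `index`, where each element is
// `size` bytes wide.
std::uint64_t element(const Vec128 &v, std::size_t size, std::size_t index) {
  std::uint64_t r = 0;
  for (std::size_t b = 0; b < size; ++b)
    r = (r << 8) | v[index * size + b];
  return r;
}

// Index of the element that ends at the last byte of the first doubleword.
constexpr std::size_t replicateIndex(std::size_t size) { return 8 / size - 1; }

static_assert(replicateIndex(1) == 7, "byte elements");
static_assert(replicateIndex(2) == 3, "halfword elements");
static_assert(replicateIndex(4) == 1, "word elements");
```
</pre>
In this model, reading element replicateIndex(size) always yields the low size bytes of the scalar, which is what a replicate of that element would then broadcast.<br>
<br>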
Modified: llvm/trunk/lib/Target/SystemZ/SystemZOperators.td<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZOperators.td?rev=236521&r1=236520&r2=236521&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZOperators.td?rev=236521&r1=236520&r2=236521&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/SystemZ/SystemZOperators.td (original)<br>
+++ llvm/trunk/lib/Target/SystemZ/SystemZOperators.td Tue May 5 14:25:42 2015<br>
@@ -82,6 +82,45 @@ def SDT_ZPrefetch : SDTypeProf<br>
def SDT_ZTBegin : SDTypeProfile<0, 2,<br>
[SDTCisPtrTy<0>,<br>
SDTCisVT<1, i32>]>;<br>
+def SDT_ZInsertVectorElt : SDTypeProfile<1, 3,<br>
+ [SDTCisVec<0>,<br>
+ SDTCisSameAs<0, 1>,<br>
+ SDTCisVT<3, i32>]>;<br>
+def SDT_ZExtractVectorElt : SDTypeProfile<1, 2,<br>
+ [SDTCisVec<1>,<br>
+ SDTCisVT<2, i32>]>;<br>
+def SDT_ZReplicate : SDTypeProfile<1, 1,<br>
+ [SDTCisVec<0>]>;<br>
+def SDT_ZVecBinary : SDTypeProfile<1, 2,<br>
+ [SDTCisVec<0>,<br>
+ SDTCisSameAs<0, 1>,<br>
+ SDTCisSameAs<0, 2>]>;<br>
+def SDT_ZVecBinaryInt : SDTypeProfile<1, 2,<br>
+ [SDTCisVec<0>,<br>
+ SDTCisSameAs<0, 1>,<br>
+ SDTCisVT<2, i32>]>;<br>
+def SDT_ZVecBinaryConv : SDTypeProfile<1, 2,<br>
+ [SDTCisVec<0>,<br>
+ SDTCisVec<1>,<br>
+ SDTCisSameAs<1, 2>]>;<br>
+def SDT_ZRotateMask : SDTypeProfile<1, 2,<br>
+ [SDTCisVec<0>,<br>
+ SDTCisVT<1, i32>,<br>
+ SDTCisVT<2, i32>]>;<br>
+def SDT_ZJoinDwords : SDTypeProfile<1, 2,<br>
+ [SDTCisVT<0, v2i64>,<br>
+ SDTCisVT<1, i64>,<br>
+ SDTCisVT<2, i64>]>;<br>
+def SDT_ZVecTernary : SDTypeProfile<1, 3,<br>
+ [SDTCisVec<0>,<br>
+ SDTCisSameAs<0, 1>,<br>
+ SDTCisSameAs<0, 2>,<br>
+ SDTCisSameAs<0, 3>]>;<br>
+def SDT_ZVecTernaryInt : SDTypeProfile<1, 3,<br>
+ [SDTCisVec<0>,<br>
+ SDTCisSameAs<0, 1>,<br>
+ SDTCisSameAs<0, 2>,<br>
+ SDTCisVT<3, i32>]>;<br>
<br>
//===----------------------------------------------------------------------===//<br>
// Node definitions<br>
@@ -134,6 +173,34 @@ def z_udivrem64 : SDNode<"System<br>
def z_serialize : SDNode<"SystemZISD::SERIALIZE", SDTNone,<br>
[SDNPHasChain, SDNPMayStore]>;<br>
<br>
+// Defined because the index is an i32 rather than a pointer.<br>
+def z_vector_insert : SDNode<"ISD::INSERT_VECTOR_ELT",<br>
+ SDT_ZInsertVectorElt>;<br>
+def z_vector_extract : SDNode<"ISD::EXTRACT_VECTOR_ELT",<br>
+ SDT_ZExtractVectorElt>;<br>
+def z_byte_mask : SDNode<"SystemZISD::BYTE_MASK", SDT_ZReplicate>;<br>
+def z_rotate_mask : SDNode<"SystemZISD::ROTATE_MASK", SDT_ZRotateMask>;<br>
+def z_replicate : SDNode<"SystemZISD::REPLICATE", SDT_ZReplicate>;<br>
+def z_join_dwords : SDNode<"SystemZISD::JOIN_DWORDS", SDT_ZJoinDwords>;<br>
+def z_splat : SDNode<"SystemZISD::SPLAT", SDT_ZVecBinaryInt>;<br>
+def z_merge_high : SDNode<"SystemZISD::MERGE_HIGH", SDT_ZVecBinary>;<br>
+def z_merge_low : SDNode<"SystemZISD::MERGE_LOW", SDT_ZVecBinary>;<br>
+def z_shl_double : SDNode<"SystemZISD::SHL_DOUBLE", SDT_ZVecTernaryInt>;<br>
+def z_permute_dwords : SDNode<"SystemZISD::PERMUTE_DWORDS",<br>
+ SDT_ZVecTernaryInt>;<br>
+def z_permute : SDNode<"SystemZISD::PERMUTE", SDT_ZVecTernary>;<br>
+def z_pack : SDNode<"SystemZISD::PACK", SDT_ZVecBinaryConv>;<br>
+def z_vshl_by_scalar : SDNode<"SystemZISD::VSHL_BY_SCALAR",<br>
+ SDT_ZVecBinaryInt>;<br>
+def z_vsrl_by_scalar : SDNode<"SystemZISD::VSRL_BY_SCALAR",<br>
+ SDT_ZVecBinaryInt>;<br>
+def z_vsra_by_scalar : SDNode<"SystemZISD::VSRA_BY_SCALAR",<br>
+ SDT_ZVecBinaryInt>;<br>
+def z_vsum : SDNode<"SystemZISD::VSUM", SDT_ZVecBinaryConv>;<br>
+def z_vicmpe : SDNode<"SystemZISD::VICMPE", SDT_ZVecBinary>;<br>
+def z_vicmph : SDNode<"SystemZISD::VICMPH", SDT_ZVecBinary>;<br>
+def z_vicmphl : SDNode<"SystemZISD::VICMPHL", SDT_ZVecBinary>;<br>
+<br>
class AtomicWOp<string name, SDTypeProfile profile = SDT_ZAtomicLoadBinaryW><br>
: SDNode<"SystemZISD::"##name, profile,<br>
[SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>;<br>
@@ -192,6 +259,10 @@ def z_tbegin_nofloat : SDNode<"System<br>
def z_tend : SDNode<"SystemZISD::TEND", SDTNone,<br>
[SDNPHasChain, SDNPOutGlue, SDNPSideEffect]>;<br>
<br>
+def z_vshl : SDNode<"ISD::SHL", SDT_ZVecBinary>;<br>
+def z_vsra : SDNode<"ISD::SRA", SDT_ZVecBinary>;<br>
+def z_vsrl : SDNode<"ISD::SRL", SDT_ZVecBinary>;<br>
+<br>
//===----------------------------------------------------------------------===//<br>
// Pattern fragments<br>
//===----------------------------------------------------------------------===//<br>
@@ -215,11 +286,21 @@ def sext8 : PatFrag<(ops node:$src), (s<br>
def sext16 : PatFrag<(ops node:$src), (sext_inreg node:$src, i16)>;<br>
def sext32 : PatFrag<(ops node:$src), (sext (i32 node:$src))>;<br>
<br>
+// Match extensions of an i32 to an i64, followed by an in-register sign<br>
+// extension from a sub-i32 value.<br>
+def sext8dbl : PatFrag<(ops node:$src), (sext8 (anyext node:$src))>;<br>
+def sext16dbl : PatFrag<(ops node:$src), (sext16 (anyext node:$src))>;<br>
+<br>
// Register zero-extend operations. Sub-32-bit values are represented as i32s.<br>
def zext8 : PatFrag<(ops node:$src), (and node:$src, 0xff)>;<br>
def zext16 : PatFrag<(ops node:$src), (and node:$src, 0xffff)>;<br>
def zext32 : PatFrag<(ops node:$src), (zext (i32 node:$src))>;<br>
<br>
+// Match extensions of an i32 to an i64, followed by an AND of the low<br>
+// i8 or i16 part.<br>
+def zext8dbl : PatFrag<(ops node:$src), (zext8 (anyext node:$src))>;<br>
+def zext16dbl : PatFrag<(ops node:$src), (zext16 (anyext node:$src))>;<br>
+<br>
// Typed floating-point loads.<br>
def loadf32 : PatFrag<(ops node:$src), (f32 (load node:$src))>;<br>
def loadf64 : PatFrag<(ops node:$src), (f64 (load node:$src))>;<br>
@@ -383,6 +464,10 @@ def z_iabs64 : PatFrag<(ops node:$src),<br>
def z_inegabs32 : PatFrag<(ops node:$src), (ineg (z_iabs32 node:$src))>;<br>
def z_inegabs64 : PatFrag<(ops node:$src), (ineg (z_iabs64 node:$src))>;<br>
<br>
+// Integer multiply-and-add<br>
+def z_muladd : PatFrag<(ops node:$src1, node:$src2, node:$src3),<br>
+ (add (mul node:$src1, node:$src2), node:$src3)>;<br>
+<br>
// Fused multiply-add and multiply-subtract, but with the order of the<br>
// operands matching SystemZ's MA and MS instructions.<br>
def z_fma : PatFrag<(ops node:$src1, node:$src2, node:$src3),<br>
@@ -403,3 +488,88 @@ class loadu<SDPatternOperator operator,<br>
class storeu<SDPatternOperator operator, SDPatternOperator store = store><br>
: PatFrag<(ops node:$value, node:$addr),<br>
(store (operator node:$value), node:$addr)>;<br>
+<br>
+// Vector representation of all-zeros and all-ones.<br>
+def z_vzero : PatFrag<(ops), (bitconvert (v16i8 (z_byte_mask (i32 0))))>;<br>
+def z_vones : PatFrag<(ops), (bitconvert (v16i8 (z_byte_mask (i32 65535))))>;<br>
+<br>
+// Load a scalar and replicate it in all elements of a vector.<br>
+class z_replicate_load<ValueType scalartype, SDPatternOperator load><br>
+ : PatFrag<(ops node:$addr),<br>
+ (z_replicate (scalartype (load node:$addr)))>;<br>
+def z_replicate_loadi8 : z_replicate_load<i32, anyextloadi8>;<br>
+def z_replicate_loadi16 : z_replicate_load<i32, anyextloadi16>;<br>
+def z_replicate_loadi32 : z_replicate_load<i32, load>;<br>
+def z_replicate_loadi64 : z_replicate_load<i64, load>;<br>
+<br>
+// Load a scalar and insert it into a single element of a vector.<br>
+class z_vle<ValueType scalartype, SDPatternOperator load><br>
+ : PatFrag<(ops node:$vec, node:$addr, node:$index),<br>
+ (z_vector_insert node:$vec, (scalartype (load node:$addr)),<br>
+ node:$index)>;<br>
+def z_vlei8 : z_vle<i32, anyextloadi8>;<br>
+def z_vlei16 : z_vle<i32, anyextloadi16>;<br>
+def z_vlei32 : z_vle<i32, load>;<br>
+def z_vlei64 : z_vle<i64, load>;<br>
+<br>
+// Load a scalar and insert it into the low element of the high i64 of a<br>
+// zeroed vector.<br>
+class z_vllez<ValueType scalartype, SDPatternOperator load, int index><br>
+ : PatFrag<(ops node:$addr),<br>
+ (z_vector_insert (z_vzero),<br>
+ (scalartype (load node:$addr)), (i32 index))>;<br>
+def z_vllezi8 : z_vllez<i32, anyextloadi8, 7>;<br>
+def z_vllezi16 : z_vllez<i32, anyextloadi16, 3>;<br>
+def z_vllezi32 : z_vllez<i32, load, 1>;<br>
+def z_vllezi64 : PatFrag<(ops node:$addr),<br>
+ (z_join_dwords (i64 (load node:$addr)), (i64 0))>;<br>
+<br>
+// Store one element of a vector.<br>
+class z_vste<ValueType scalartype, SDPatternOperator store><br>
+ : PatFrag<(ops node:$vec, node:$addr, node:$index),<br>
+ (store (scalartype (z_vector_extract node:$vec, node:$index)),<br>
+ node:$addr)>;<br>
+def z_vstei8 : z_vste<i32, truncstorei8>;<br>
+def z_vstei16 : z_vste<i32, truncstorei16>;<br>
+def z_vstei32 : z_vste<i32, store>;<br>
+def z_vstei64 : z_vste<i64, store>;<br>
+<br>
+// Arithmetic negation on vectors.<br>
+def z_vneg : PatFrag<(ops node:$x), (sub (z_vzero), node:$x)>;<br>
+<br>
+// Bitwise negation on vectors.<br>
+def z_vnot : PatFrag<(ops node:$x), (xor node:$x, (z_vones))>;<br>
+<br>
+// Signed "integer greater than zero" on vectors.<br>
+def z_vicmph_zero : PatFrag<(ops node:$x), (z_vicmph node:$x, (z_vzero))>;<br>
+<br>
+// Signed "integer less than zero" on vectors.<br>
+def z_vicmpl_zero : PatFrag<(ops node:$x), (z_vicmph (z_vzero), node:$x)>;<br>
+<br>
+// Integer absolute on vectors.<br>
+class z_viabs<int shift><br>
+ : PatFrag<(ops node:$src),<br>
+ (xor (add node:$src, (z_vsra_by_scalar node:$src, (i32 shift))),<br>
+ (z_vsra_by_scalar node:$src, (i32 shift)))>;<br>
+def z_viabs8 : z_viabs<7>;<br>
+def z_viabs16 : z_viabs<15>;<br>
+def z_viabs32 : z_viabs<31>;<br>
+def z_viabs64 : z_viabs<63>;<br>
+<br>
+// Sign-extend the i64 elements of a vector.<br>
+class z_vse<int shift><br>
+ : PatFrag<(ops node:$src),<br>
+ (z_vsra_by_scalar (z_vshl_by_scalar node:$src, shift), shift)>;<br>
+def z_vsei8 : z_vse<56>;<br>
+def z_vsei16 : z_vse<48>;<br>
+def z_vsei32 : z_vse<32>;<br>
+<br>
+// ...and again with the extensions being done on individual i64 scalars.<br>
+class z_vse_by_parts<SDPatternOperator operator, int index1, int index2><br>
+ : PatFrag<(ops node:$src),<br>
+ (z_join_dwords<br>
+ (operator (z_vector_extract node:$src, index1)),<br>
+ (operator (z_vector_extract node:$src, index2)))>;<br>
+def z_vsei8_by_parts : z_vse_by_parts<sext8dbl, 7, 15>;<br>
+def z_vsei16_by_parts : z_vse_by_parts<sext16dbl, 3, 7>;<br>
+def z_vsei32_by_parts : z_vse_by_parts<sext32, 1, 3>;<br>
<br>
Modified: llvm/trunk/lib/Target/SystemZ/SystemZTargetMachine.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZTargetMachine.cpp?rev=236521&r1=236520&r2=236521&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZTargetMachine.cpp?rev=236521&r1=236520&r2=236521&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/SystemZ/SystemZTargetMachine.cpp (original)<br>
+++ llvm/trunk/lib/Target/SystemZ/SystemZTargetMachine.cpp Tue May 5 14:25:42 2015<br>
@@ -21,15 +21,70 @@ extern "C" void LLVMInitializeSystemZTar<br>
RegisterTargetMachine<SystemZTargetMachine> X(TheSystemZTarget);<br>
}<br>
<br>
+// Determine whether we use the vector ABI.<br>
+static bool UsesVectorABI(StringRef CPU, StringRef FS) {<br>
+ // We use the vector ABI whenever the vector facility is available.<br>
+ // This is the case by default if CPU is z13 or later, and can be<br>
+ // overridden via "[+-]vector" feature string elements.<br>
+ bool VectorABI = true;<br>
+ if (CPU.empty() || CPU == "generic" ||<br>
+ CPU == "z10" || CPU == "z196" || CPU == "zEC12")<br>
+ VectorABI = false;<br>
+<br>
+ SmallVector<StringRef, 3> Features;<br>
+ FS.split(Features, ",", -1, false /* KeepEmpty */);<br>
+ for (auto &Feature : Features) {<br>
+ if (Feature == "vector" || Feature == "+vector")<br>
+ VectorABI = true;<br>
+ if (Feature == "-vector")<br>
+ VectorABI = false;<br>
+ }<br>
+<br>
+ return VectorABI;<br>
+}<br>
+<br>
+static std::string computeDataLayout(StringRef TT, StringRef CPU,<br>
+ StringRef FS) {<br>
+ const Triple Triple(TT);<br>
+ bool VectorABI = UsesVectorABI(CPU, FS);<br>
+ std::string Ret = "";<br>
+<br>
+ // Big endian.<br>
+ Ret += "E";<br>
+<br>
+ // Data mangling.<br>
+ Ret += DataLayout::getManglingComponent(Triple);<br>
+<br>
+ // Make sure that global data has at least 16 bits of alignment by<br>
+ // default, so that we can refer to it using LARL. We don't have any<br>
+ // special requirements for stack variables though.<br>
+ Ret += "-i1:8:16-i8:8:16";<br>
+<br>
+ // 64-bit integers are naturally aligned.<br>
+ Ret += "-i64:64";<br>
+<br>
+ // 128-bit floats are aligned only to 64 bits.<br>
+ Ret += "-f128:64";<br>
+<br>
+ // When using the vector ABI, 128-bit vectors are also aligned to 64 bits.<br>
+ if (VectorABI)<br>
+ Ret += "-v128:64";<br>
+<br>
+ // We prefer 16 bits of alignment for all globals; see above.<br>
+ Ret += "-a:8:16";<br>
+<br>
+ // Integer registers are 32 or 64 bits.<br>
+ Ret += "-n32:64";<br>
+<br>
+ return Ret;<br>
+}<br>
+<br>
SystemZTargetMachine::SystemZTargetMachine(const Target &T, StringRef TT,<br>
StringRef CPU, StringRef FS,<br>
const TargetOptions &Options,<br>
Reloc::Model RM, CodeModel::Model CM,<br>
CodeGenOpt::Level OL)<br>
- // Make sure that global data has at least 16 bits of alignment by<br>
- // default, so that we can refer to it using LARL. We don't have any<br>
- // special requirements for stack variables though.<br>
- : LLVMTargetMachine(T, "E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-a:8:16-n32:64",<br>
+ : LLVMTargetMachine(T, computeDataLayout(TT, CPU, FS),<br>
TT, CPU, FS, Options, RM, CM, OL),<br>
TLOF(make_unique<TargetLoweringObjectFileELF>()),<br>
Subtarget(TT, CPU, FS, *this) {<br>
<br>
Modified: llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp?rev=236521&r1=236520&r2=236521&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp?rev=236521&r1=236520&r2=236521&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp (original)<br>
+++ llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp Tue May 5 14:25:42 2015<br>
@@ -238,3 +238,21 @@ SystemZTTIImpl::getPopcntSupport(unsigne<br>
return TTI::PSK_Software;<br>
}<br>
<br>
+unsigned SystemZTTIImpl::getNumberOfRegisters(bool Vector) {<br>
+ if (!Vector)<br>
+ // Discount the stack pointer. Also leave out %r0, since it can't<br>
+ // be used in an address.<br>
+ return 14;<br>
+ if (ST->hasVector())<br>
+ return 32;<br>
+ return 0;<br>
+}<br>
+<br>
+unsigned SystemZTTIImpl::getRegisterBitWidth(bool Vector) {<br>
+ if (!Vector)<br>
+ return 64;<br>
+ if (ST->hasVector())<br>
+ return 128;<br>
+ return 0;<br>
+}<br>
+<br>
<br>
Modified: llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h?rev=236521&r1=236520&r2=236521&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h?rev=236521&r1=236520&r2=236521&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h (original)<br>
+++ llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h Tue May 5 14:25:42 2015<br>
@@ -63,6 +63,14 @@ public:<br>
TTI::PopcntSupportKind getPopcntSupport(unsigned TyWidth);<br>
<br>
/// @}<br>
+<br>
+ /// \name Vector TTI Implementations<br>
+ /// @{<br>
+<br>
+ unsigned getNumberOfRegisters(bool Vector);<br>
+ unsigned getRegisterBitWidth(bool Vector);<br>
+<br>
+ /// @}<br>
};<br>
<br>
} // end namespace llvm<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/frame-19.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/frame-19.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/frame-19.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/frame-19.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/frame-19.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,314 @@<br>
+; Test spilling of vector registers.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; We need to allocate a 16-byte spill slot and save the 8 call-saved FPRs.<br>
+; The frame size should be exactly 160 + 16 + 8 * 8 = 240.<br>
+define void @f1(<16 x i8> *%ptr) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: aghi %r15, -240<br>
+; CHECK-DAG: std %f8,<br>
+; CHECK-DAG: std %f9,<br>
+; CHECK-DAG: std %f10,<br>
+; CHECK-DAG: std %f11,<br>
+; CHECK-DAG: std %f12,<br>
+; CHECK-DAG: std %f13,<br>
+; CHECK-DAG: std %f14,<br>
+; CHECK-DAG: std %f15,<br>
+; CHECK: vst {{%v[0-9]+}}, 160(%r15)<br>
+; CHECK: vl {{%v[0-9]+}}, 160(%r15)<br>
+; CHECK-DAG: ld %f8,<br>
+; CHECK-DAG: ld %f9,<br>
+; CHECK-DAG: ld %f10,<br>
+; CHECK-DAG: ld %f11,<br>
+; CHECK-DAG: ld %f12,<br>
+; CHECK-DAG: ld %f13,<br>
+; CHECK-DAG: ld %f14,<br>
+; CHECK-DAG: ld %f15,<br>
+; CHECK: aghi %r15, 240<br>
+; CHECK: br %r14<br>
+ %v0 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v1 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v2 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v3 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v4 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v5 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v6 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v7 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v8 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v9 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v10 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v11 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v12 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v13 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v14 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v15 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v16 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v17 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v18 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v19 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v20 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v21 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v22 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v23 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v24 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v25 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v26 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v27 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v28 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v29 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v30 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v31 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %vx = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %vx, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v31, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v30, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v29, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v28, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v27, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v26, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v25, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v24, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v23, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v22, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v21, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v20, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v19, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v18, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v17, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v16, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v15, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v14, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v13, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v12, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v11, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v10, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v9, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v8, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v7, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v6, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v5, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v4, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v3, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v2, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v1, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v0, <16 x i8> *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Like f1, but no 16-byte slot should be needed.<br>
+define void @f2(<16 x i8> *%ptr) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: aghi %r15, -224<br>
+; CHECK-DAG: std %f8,<br>
+; CHECK-DAG: std %f9,<br>
+; CHECK-DAG: std %f10,<br>
+; CHECK-DAG: std %f11,<br>
+; CHECK-DAG: std %f12,<br>
+; CHECK-DAG: std %f13,<br>
+; CHECK-DAG: std %f14,<br>
+; CHECK-DAG: std %f15,<br>
+; CHECK-NOT: vst {{.*}}(%r15)<br>
+; CHECK-NOT: vl {{.*}}(%r15)<br>
+; CHECK-DAG: ld %f8,<br>
+; CHECK-DAG: ld %f9,<br>
+; CHECK-DAG: ld %f10,<br>
+; CHECK-DAG: ld %f11,<br>
+; CHECK-DAG: ld %f12,<br>
+; CHECK-DAG: ld %f13,<br>
+; CHECK-DAG: ld %f14,<br>
+; CHECK-DAG: ld %f15,<br>
+; CHECK: aghi %r15, 224<br>
+; CHECK: br %r14<br>
+ %v0 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v1 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v2 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v3 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v4 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v5 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v6 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v7 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v8 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v9 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v10 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v11 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v12 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v13 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v14 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v15 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v16 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v17 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v18 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v19 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v20 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v21 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v22 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v23 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v24 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v25 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v26 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v27 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v28 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v29 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v30 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v31 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v31, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v30, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v29, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v28, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v27, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v26, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v25, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v24, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v23, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v22, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v21, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v20, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v19, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v18, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v17, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v16, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v15, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v14, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v13, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v12, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v11, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v10, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v9, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v8, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v7, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v6, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v5, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v4, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v3, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v2, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v1, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v0, <16 x i8> *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Like f2, but only %f8 should be saved.<br>
+define void @f3(<16 x i8> *%ptr) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: aghi %r15, -168<br>
+; CHECK-DAG: std %f8,<br>
+; CHECK-NOT: vst {{.*}}(%r15)<br>
+; CHECK-NOT: vl {{.*}}(%r15)<br>
+; CHECK-NOT: %v9<br>
+; CHECK-NOT: %v10<br>
+; CHECK-NOT: %v11<br>
+; CHECK-NOT: %v12<br>
+; CHECK-NOT: %v13<br>
+; CHECK-NOT: %v14<br>
+; CHECK-NOT: %v15<br>
+; CHECK-DAG: ld %f8,<br>
+; CHECK: aghi %r15, 168<br>
+; CHECK: br %r14<br>
+ %v0 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v1 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v2 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v3 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v4 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v5 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v6 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v7 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v8 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v16 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v17 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v18 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v19 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v20 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v21 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v22 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v23 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v24 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v25 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v26 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v27 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v28 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v29 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v30 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v31 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v31, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v30, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v29, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v28, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v27, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v26, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v25, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v24, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v23, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v22, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v21, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v20, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v19, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v18, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v17, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v16, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v8, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v7, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v6, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v5, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v4, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v3, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v2, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v1, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v0, <16 x i8> *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Like f2, but no registers should be saved.<br>
+define void @f4(<16 x i8> *%ptr) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK-NOT: %r15<br>
+; CHECK: br %r14<br>
+ %v0 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v1 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v2 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v3 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v4 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v5 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v6 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v7 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v16 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v17 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v18 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v19 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v20 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v21 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v22 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v23 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v24 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v25 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v26 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v27 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v28 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v29 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v30 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ %v31 = load volatile <16 x i8>, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v31, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v30, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v29, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v28, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v27, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v26, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v25, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v24, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v23, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v22, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v21, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v20, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v19, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v18, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v17, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v16, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v7, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v6, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v5, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v4, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v3, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v2, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v1, <16 x i8> *%ptr<br>
+ store volatile <16 x i8> %v0, <16 x i8> *%ptr<br>
+ ret void<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-abi-align.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-abi-align.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-abi-align.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-abi-align.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-abi-align.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,49 @@<br>
+; Verify that we use the vector ABI datalayout if and only if<br>
+; the vector facility is present.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu | \<br>
+; RUN: FileCheck -check-prefix=CHECK-NOVECTOR %s<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=generic | \<br>
+; RUN: FileCheck -check-prefix=CHECK-NOVECTOR %s<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z10 | \<br>
+; RUN: FileCheck -check-prefix=CHECK-NOVECTOR %s<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z196 | \<br>
+; RUN: FileCheck -check-prefix=CHECK-NOVECTOR %s<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=zEC12 | \<br>
+; RUN: FileCheck -check-prefix=CHECK-NOVECTOR %s<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | \<br>
+; RUN: FileCheck -check-prefix=CHECK-VECTOR %s<br>
+<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mattr=vector | \<br>
+; RUN: FileCheck -check-prefix=CHECK-VECTOR %s<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mattr=+vector | \<br>
+; RUN: FileCheck -check-prefix=CHECK-VECTOR %s<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mattr=-vector,vector | \<br>
+; RUN: FileCheck -check-prefix=CHECK-VECTOR %s<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mattr=-vector,+vector | \<br>
+; RUN: FileCheck -check-prefix=CHECK-VECTOR %s<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mattr=-vector | \<br>
+; RUN: FileCheck -check-prefix=CHECK-NOVECTOR %s<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mattr=vector,-vector | \<br>
+; RUN: FileCheck -check-prefix=CHECK-NOVECTOR %s<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mattr=+vector,-vector | \<br>
+; RUN: FileCheck -check-prefix=CHECK-NOVECTOR %s<br>
+<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 -mattr=-vector | \<br>
+; RUN: FileCheck -check-prefix=CHECK-NOVECTOR %s<br>
+<br>
+%struct.S = type { i8, <2 x i64> }<br>
+<br>
+define void @test(%struct.S* %s) nounwind {<br>
+; CHECK-VECTOR-LABEL: @test<br>
+; CHECK-VECTOR: vl %v0, 8(%r2)<br>
+; CHECK-NOVECTOR-LABEL: @test<br>
+; CHECK-NOVECTOR-DAG: agsi 16(%r2), 1<br>
+; CHECK-NOVECTOR-DAG: agsi 24(%r2), 1<br>
+ %ptr = getelementptr %struct.S, %struct.S* %s, i64 0, i32 1<br>
+ %vec = load <2 x i64>, <2 x i64>* %ptr<br>
+ %add = add <2 x i64> %vec, <i64 1, i64 1><br>
+ store <2 x i64> %add, <2 x i64>* %ptr<br>
+ ret void<br>
+}<br>
+<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-abs-01.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-abs-01.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-abs-01.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-abs-01.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-abs-01.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,146 @@<br>
+; Test v16i8 absolute.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test with slt.<br>
+define <16 x i8> @f1(<16 x i8> %val) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vlpb %v24, %v24<br>
+; CHECK: br %r14<br>
+ %cmp = icmp slt <16 x i8> %val, zeroinitializer<br>
+ %neg = sub <16 x i8> zeroinitializer, %val<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %neg, <16 x i8> %val<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test with sle.<br>
+define <16 x i8> @f2(<16 x i8> %val) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vlpb %v24, %v24<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sle <16 x i8> %val, zeroinitializer<br>
+ %neg = sub <16 x i8> zeroinitializer, %val<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %neg, <16 x i8> %val<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test with sgt.<br>
+define <16 x i8> @f3(<16 x i8> %val) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vlpb %v24, %v24<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sgt <16 x i8> %val, zeroinitializer<br>
+ %neg = sub <16 x i8> zeroinitializer, %val<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val, <16 x i8> %neg<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test with sge.<br>
+define <16 x i8> @f4(<16 x i8> %val) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vlpb %v24, %v24<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sge <16 x i8> %val, zeroinitializer<br>
+ %neg = sub <16 x i8> zeroinitializer, %val<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val, <16 x i8> %neg<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test that negative absolute uses VLPB too. There is no vector equivalent<br>
+; of LOAD NEGATIVE.<br>
+define <16 x i8> @f5(<16 x i8> %val) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vlpb [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlcb %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %cmp = icmp slt <16 x i8> %val, zeroinitializer<br>
+ %neg = sub <16 x i8> zeroinitializer, %val<br>
+ %abs = select <16 x i1> %cmp, <16 x i8> %neg, <16 x i8> %val<br>
+ %ret = sub <16 x i8> zeroinitializer, %abs<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Try another form of negative absolute (slt version).<br>
+define <16 x i8> @f6(<16 x i8> %val) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vlpb [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlcb %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %cmp = icmp slt <16 x i8> %val, zeroinitializer<br>
+ %neg = sub <16 x i8> zeroinitializer, %val<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val, <16 x i8> %neg<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test with sle.<br>
+define <16 x i8> @f7(<16 x i8> %val) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vlpb [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlcb %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sle <16 x i8> %val, zeroinitializer<br>
+ %neg = sub <16 x i8> zeroinitializer, %val<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val, <16 x i8> %neg<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test with sgt.<br>
+define <16 x i8> @f8(<16 x i8> %val) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vlpb [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlcb %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sgt <16 x i8> %val, zeroinitializer<br>
+ %neg = sub <16 x i8> zeroinitializer, %val<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %neg, <16 x i8> %val<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test with sge.<br>
+define <16 x i8> @f9(<16 x i8> %val) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vlpb [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlcb %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sge <16 x i8> %val, zeroinitializer<br>
+ %neg = sub <16 x i8> zeroinitializer, %val<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %neg, <16 x i8> %val<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test with an SRA-based boolean vector.<br>
+define <16 x i8> @f10(<16 x i8> %val) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vlpb %v24, %v24<br>
+; CHECK: br %r14<br>
+ %shr = ashr <16 x i8> %val,<br>
+ <i8 7, i8 7, i8 7, i8 7, i8 7, i8 7, i8 7, i8 7,<br>
+ i8 7, i8 7, i8 7, i8 7, i8 7, i8 7, i8 7, i8 7><br>
+ %neg = sub <16 x i8> zeroinitializer, %val<br>
+ %and1 = and <16 x i8> %shr, %neg<br>
+ %not = xor <16 x i8> %shr,<br>
+ <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1,<br>
+ i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1><br>
+ %and2 = and <16 x i8> %not, %val<br>
+ %ret = or <16 x i8> %and1, %and2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; ...and again in reverse<br>
+define <16 x i8> @f11(<16 x i8> %val) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vlpb [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlcb %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %shr = ashr <16 x i8> %val,<br>
+ <i8 7, i8 7, i8 7, i8 7, i8 7, i8 7, i8 7, i8 7,<br>
+ i8 7, i8 7, i8 7, i8 7, i8 7, i8 7, i8 7, i8 7><br>
+ %and1 = and <16 x i8> %shr, %val<br>
+ %not = xor <16 x i8> %shr,<br>
+ <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1,<br>
+ i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1><br>
+ %neg = sub <16 x i8> zeroinitializer, %val<br>
+ %and2 = and <16 x i8> %not, %neg<br>
+ %ret = or <16 x i8> %and1, %and2<br>
+ ret <16 x i8> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-abs-02.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-abs-02.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-abs-02.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-abs-02.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-abs-02.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,142 @@<br>
+; Test v8i16 absolute.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test with slt.<br>
+define <8 x i16> @f1(<8 x i16> %val) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vlph %v24, %v24<br>
+; CHECK: br %r14<br>
+ %cmp = icmp slt <8 x i16> %val, zeroinitializer<br>
+ %neg = sub <8 x i16> zeroinitializer, %val<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %neg, <8 x i16> %val<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test with sle.<br>
+define <8 x i16> @f2(<8 x i16> %val) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vlph %v24, %v24<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sle <8 x i16> %val, zeroinitializer<br>
+ %neg = sub <8 x i16> zeroinitializer, %val<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %neg, <8 x i16> %val<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test with sgt.<br>
+define <8 x i16> @f3(<8 x i16> %val) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vlph %v24, %v24<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sgt <8 x i16> %val, zeroinitializer<br>
+ %neg = sub <8 x i16> zeroinitializer, %val<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val, <8 x i16> %neg<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test with sge.<br>
+define <8 x i16> @f4(<8 x i16> %val) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vlph %v24, %v24<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sge <8 x i16> %val, zeroinitializer<br>
+ %neg = sub <8 x i16> zeroinitializer, %val<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val, <8 x i16> %neg<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test that negative absolute uses VLPH too. There is no vector equivalent<br>
+; of LOAD NEGATIVE.<br>
+define <8 x i16> @f5(<8 x i16> %val) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vlph [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlch %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %cmp = icmp slt <8 x i16> %val, zeroinitializer<br>
+ %neg = sub <8 x i16> zeroinitializer, %val<br>
+ %abs = select <8 x i1> %cmp, <8 x i16> %neg, <8 x i16> %val<br>
+ %ret = sub <8 x i16> zeroinitializer, %abs<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Try another form of negative absolute (slt version).<br>
+define <8 x i16> @f6(<8 x i16> %val) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vlph [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlch %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %cmp = icmp slt <8 x i16> %val, zeroinitializer<br>
+ %neg = sub <8 x i16> zeroinitializer, %val<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val, <8 x i16> %neg<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test with sle.<br>
+define <8 x i16> @f7(<8 x i16> %val) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vlph [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlch %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sle <8 x i16> %val, zeroinitializer<br>
+ %neg = sub <8 x i16> zeroinitializer, %val<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val, <8 x i16> %neg<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test with sgt.<br>
+define <8 x i16> @f8(<8 x i16> %val) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vlph [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlch %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sgt <8 x i16> %val, zeroinitializer<br>
+ %neg = sub <8 x i16> zeroinitializer, %val<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %neg, <8 x i16> %val<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test with sge.<br>
+define <8 x i16> @f9(<8 x i16> %val) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vlph [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlch %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sge <8 x i16> %val, zeroinitializer<br>
+ %neg = sub <8 x i16> zeroinitializer, %val<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %neg, <8 x i16> %val<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test with an SRA-based boolean vector.<br>
+define <8 x i16> @f10(<8 x i16> %val) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vlph %v24, %v24<br>
+; CHECK: br %r14<br>
+ %shr = ashr <8 x i16> %val,<br>
+ <i16 15, i16 15, i16 15, i16 15, i16 15, i16 15, i16 15, i16 15><br>
+ %neg = sub <8 x i16> zeroinitializer, %val<br>
+ %and1 = and <8 x i16> %shr, %neg<br>
+ %not = xor <8 x i16> %shr,<br>
+ <i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1><br>
+ %and2 = and <8 x i16> %not, %val<br>
+ %ret = or <8 x i16> %and1, %and2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; ...and again in reverse<br>
+define <8 x i16> @f11(<8 x i16> %val) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vlph [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlch %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %shr = ashr <8 x i16> %val,<br>
+ <i16 15, i16 15, i16 15, i16 15, i16 15, i16 15, i16 15, i16 15><br>
+ %and1 = and <8 x i16> %shr, %val<br>
+ %not = xor <8 x i16> %shr,<br>
+ <i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1><br>
+ %neg = sub <8 x i16> zeroinitializer, %val<br>
+ %and2 = and <8 x i16> %not, %neg<br>
+ %ret = or <8 x i16> %and1, %and2<br>
+ ret <8 x i16> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-abs-03.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-abs-03.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-abs-03.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-abs-03.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-abs-03.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,138 @@<br>
+; Test v4i32 absolute.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test with slt.<br>
+define <4 x i32> @f1(<4 x i32> %val) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vlpf %v24, %v24<br>
+; CHECK: br %r14<br>
+ %cmp = icmp slt <4 x i32> %val, zeroinitializer<br>
+ %neg = sub <4 x i32> zeroinitializer, %val<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %neg, <4 x i32> %val<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test with sle.<br>
+define <4 x i32> @f2(<4 x i32> %val) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vlpf %v24, %v24<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sle <4 x i32> %val, zeroinitializer<br>
+ %neg = sub <4 x i32> zeroinitializer, %val<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %neg, <4 x i32> %val<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test with sgt.<br>
+define <4 x i32> @f3(<4 x i32> %val) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vlpf %v24, %v24<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sgt <4 x i32> %val, zeroinitializer<br>
+ %neg = sub <4 x i32> zeroinitializer, %val<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val, <4 x i32> %neg<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test with sge.<br>
+define <4 x i32> @f4(<4 x i32> %val) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vlpf %v24, %v24<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sge <4 x i32> %val, zeroinitializer<br>
+ %neg = sub <4 x i32> zeroinitializer, %val<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val, <4 x i32> %neg<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test that negative absolute uses VLPF too. There is no vector equivalent<br>
+; of LOAD NEGATIVE.<br>
+define <4 x i32> @f5(<4 x i32> %val) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vlpf [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlcf %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %cmp = icmp slt <4 x i32> %val, zeroinitializer<br>
+ %neg = sub <4 x i32> zeroinitializer, %val<br>
+ %abs = select <4 x i1> %cmp, <4 x i32> %neg, <4 x i32> %val<br>
+ %ret = sub <4 x i32> zeroinitializer, %abs<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Try another form of negative absolute (slt version).<br>
+define <4 x i32> @f6(<4 x i32> %val) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vlpf [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlcf %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %cmp = icmp slt <4 x i32> %val, zeroinitializer<br>
+ %neg = sub <4 x i32> zeroinitializer, %val<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val, <4 x i32> %neg<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test with sle.<br>
+define <4 x i32> @f7(<4 x i32> %val) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vlpf [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlcf %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sle <4 x i32> %val, zeroinitializer<br>
+ %neg = sub <4 x i32> zeroinitializer, %val<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val, <4 x i32> %neg<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test with sgt.<br>
+define <4 x i32> @f8(<4 x i32> %val) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vlpf [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlcf %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sgt <4 x i32> %val, zeroinitializer<br>
+ %neg = sub <4 x i32> zeroinitializer, %val<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %neg, <4 x i32> %val<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test with sge.<br>
+define <4 x i32> @f9(<4 x i32> %val) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vlpf [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlcf %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sge <4 x i32> %val, zeroinitializer<br>
+ %neg = sub <4 x i32> zeroinitializer, %val<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %neg, <4 x i32> %val<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test with an SRA-based boolean vector.<br>
+define <4 x i32> @f10(<4 x i32> %val) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vlpf %v24, %v24<br>
+; CHECK: br %r14<br>
+ %shr = ashr <4 x i32> %val, <i32 31, i32 31, i32 31, i32 31><br>
+ %neg = sub <4 x i32> zeroinitializer, %val<br>
+ %and1 = and <4 x i32> %shr, %neg<br>
+ %not = xor <4 x i32> %shr, <i32 -1, i32 -1, i32 -1, i32 -1><br>
+ %and2 = and <4 x i32> %not, %val<br>
+ %ret = or <4 x i32> %and1, %and2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; ...and again in reverse<br>
+define <4 x i32> @f11(<4 x i32> %val) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vlpf [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlcf %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %shr = ashr <4 x i32> %val, <i32 31, i32 31, i32 31, i32 31><br>
+ %and1 = and <4 x i32> %shr, %val<br>
+ %not = xor <4 x i32> %shr, <i32 -1, i32 -1, i32 -1, i32 -1><br>
+ %neg = sub <4 x i32> zeroinitializer, %val<br>
+ %and2 = and <4 x i32> %not, %neg<br>
+ %ret = or <4 x i32> %and1, %and2<br>
+ ret <4 x i32> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-abs-04.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-abs-04.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-abs-04.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-abs-04.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-abs-04.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,138 @@<br>
+; Test v2i64 absolute.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test with slt.<br>
+define <2 x i64> @f1(<2 x i64> %val) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vlpg %v24, %v24<br>
+; CHECK: br %r14<br>
+ %cmp = icmp slt <2 x i64> %val, zeroinitializer<br>
+ %neg = sub <2 x i64> zeroinitializer, %val<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %neg, <2 x i64> %val<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test with sle.<br>
+define <2 x i64> @f2(<2 x i64> %val) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vlpg %v24, %v24<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sle <2 x i64> %val, zeroinitializer<br>
+ %neg = sub <2 x i64> zeroinitializer, %val<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %neg, <2 x i64> %val<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test with sgt.<br>
+define <2 x i64> @f3(<2 x i64> %val) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vlpg %v24, %v24<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sgt <2 x i64> %val, zeroinitializer<br>
+ %neg = sub <2 x i64> zeroinitializer, %val<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val, <2 x i64> %neg<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test with sge.<br>
+define <2 x i64> @f4(<2 x i64> %val) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vlpg %v24, %v24<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sge <2 x i64> %val, zeroinitializer<br>
+ %neg = sub <2 x i64> zeroinitializer, %val<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val, <2 x i64> %neg<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test that negative absolute uses VLPG too. There is no vector equivalent<br>
+; of LOAD NEGATIVE.<br>
+define <2 x i64> @f5(<2 x i64> %val) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vlpg [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlcg %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %cmp = icmp slt <2 x i64> %val, zeroinitializer<br>
+ %neg = sub <2 x i64> zeroinitializer, %val<br>
+ %abs = select <2 x i1> %cmp, <2 x i64> %neg, <2 x i64> %val<br>
+ %ret = sub <2 x i64> zeroinitializer, %abs<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Try another form of negative absolute (slt version).<br>
+define <2 x i64> @f6(<2 x i64> %val) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vlpg [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlcg %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %cmp = icmp slt <2 x i64> %val, zeroinitializer<br>
+ %neg = sub <2 x i64> zeroinitializer, %val<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val, <2 x i64> %neg<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test with sle.<br>
+define <2 x i64> @f7(<2 x i64> %val) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vlpg [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlcg %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sle <2 x i64> %val, zeroinitializer<br>
+ %neg = sub <2 x i64> zeroinitializer, %val<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val, <2 x i64> %neg<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test with sgt.<br>
+define <2 x i64> @f8(<2 x i64> %val) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vlpg [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlcg %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sgt <2 x i64> %val, zeroinitializer<br>
+ %neg = sub <2 x i64> zeroinitializer, %val<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %neg, <2 x i64> %val<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test with sge.<br>
+define <2 x i64> @f9(<2 x i64> %val) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vlpg [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlcg %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sge <2 x i64> %val, zeroinitializer<br>
+ %neg = sub <2 x i64> zeroinitializer, %val<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %neg, <2 x i64> %val<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test with an SRA-based boolean vector.<br>
+define <2 x i64> @f10(<2 x i64> %val) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vlpg %v24, %v24<br>
+; CHECK: br %r14<br>
+ %shr = ashr <2 x i64> %val, <i64 63, i64 63><br>
+ %neg = sub <2 x i64> zeroinitializer, %val<br>
+ %and1 = and <2 x i64> %shr, %neg<br>
+ %not = xor <2 x i64> %shr, <i64 -1, i64 -1><br>
+ %and2 = and <2 x i64> %not, %val<br>
+ %ret = or <2 x i64> %and1, %and2<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; ...and again in reverse<br>
+define <2 x i64> @f11(<2 x i64> %val) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vlpg [[REG:%v[0-9]+]], %v24<br>
+; CHECK: vlcg %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %shr = ashr <2 x i64> %val, <i64 63, i64 63><br>
+ %and1 = and <2 x i64> %shr, %val<br>
+ %not = xor <2 x i64> %shr, <i64 -1, i64 -1><br>
+ %neg = sub <2 x i64> zeroinitializer, %val<br>
+ %and2 = and <2 x i64> %not, %neg<br>
+ %ret = or <2 x i64> %and1, %and2<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-add-01.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-add-01.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-add-01.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-add-01.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-add-01.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,39 @@<br>
+; Test vector addition.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a v16i8 addition.<br>
+define <16 x i8> @f1(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vab %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = add <16 x i8> %val1, %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v8i16 addition.<br>
+define <8 x i16> @f2(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vah %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = add <8 x i16> %val1, %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v4i32 addition.<br>
+define <4 x i32> @f3(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vaf %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = add <4 x i32> %val1, %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v2i64 addition.<br>
+define <2 x i64> @f4(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vag %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = add <2 x i64> %val1, %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-and-01.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-and-01.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-and-01.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-and-01.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-and-01.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,39 @@<br>
+; Test vector AND.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a v16i8 AND.<br>
+define <16 x i8> @f1(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vn %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = and <16 x i8> %val1, %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v8i16 AND.<br>
+define <8 x i16> @f2(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vn %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = and <8 x i16> %val1, %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v4i32 AND.<br>
+define <4 x i32> @f3(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vn %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = and <4 x i32> %val1, %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v2i64 AND.<br>
+define <2 x i64> @f4(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vn %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = and <2 x i64> %val1, %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-and-02.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-and-02.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-and-02.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-and-02.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-and-02.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,91 @@<br>
+; Test vector AND-NOT.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a v16i8 AND-NOT.<br>
+define <16 x i8> @f1(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vnc %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %not = xor <16 x i8> %val2, <i8 -1, i8 -1, i8 -1, i8 -1,<br>
+ i8 -1, i8 -1, i8 -1, i8 -1,<br>
+ i8 -1, i8 -1, i8 -1, i8 -1,<br>
+ i8 -1, i8 -1, i8 -1, i8 -1><br>
+ %ret = and <16 x i8> %val1, %not<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; ...and again with the reverse.<br>
+define <16 x i8> @f2(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vnc %v24, %v28, %v26<br>
+; CHECK: br %r14<br>
+ %not = xor <16 x i8> %val1, <i8 -1, i8 -1, i8 -1, i8 -1,<br>
+ i8 -1, i8 -1, i8 -1, i8 -1,<br>
+ i8 -1, i8 -1, i8 -1, i8 -1,<br>
+ i8 -1, i8 -1, i8 -1, i8 -1><br>
+ %ret = and <16 x i8> %not, %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v8i16 AND-NOT.<br>
+define <8 x i16> @f3(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vnc %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %not = xor <8 x i16> %val2, <i16 -1, i16 -1, i16 -1, i16 -1,<br>
+ i16 -1, i16 -1, i16 -1, i16 -1><br>
+ %ret = and <8 x i16> %val1, %not<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; ...and again with the reverse.<br>
+define <8 x i16> @f4(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vnc %v24, %v28, %v26<br>
+; CHECK: br %r14<br>
+ %not = xor <8 x i16> %val1, <i16 -1, i16 -1, i16 -1, i16 -1,<br>
+ i16 -1, i16 -1, i16 -1, i16 -1><br>
+ %ret = and <8 x i16> %not, %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v4i32 AND-NOT.<br>
+define <4 x i32> @f5(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vnc %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %not = xor <4 x i32> %val2, <i32 -1, i32 -1, i32 -1, i32 -1><br>
+ %ret = and <4 x i32> %val1, %not<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; ...and again with the reverse.<br>
+define <4 x i32> @f6(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vnc %v24, %v28, %v26<br>
+; CHECK: br %r14<br>
+ %not = xor <4 x i32> %val1, <i32 -1, i32 -1, i32 -1, i32 -1><br>
+ %ret = and <4 x i32> %not, %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v2i64 AND-NOT.<br>
+define <2 x i64> @f7(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vnc %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %not = xor <2 x i64> %val2, <i64 -1, i64 -1><br>
+ %ret = and <2 x i64> %val1, %not<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; ...and again with the reverse.<br>
+define <2 x i64> @f8(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vnc %v24, %v28, %v26<br>
+; CHECK: br %r14<br>
+ %not = xor <2 x i64> %val1, <i64 -1, i64 -1><br>
+ %ret = and <2 x i64> %not, %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-and-03.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-and-03.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-and-03.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-and-03.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-and-03.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,113 @@<br>
+; Test vector zero extensions, which need to be implemented as ANDs.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a v16i1->v16i8 extension.<br>
+define <16 x i8> @f1(<16 x i8> %val) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vrepib [[REG:%v[0-9]+]], 1<br>
+; CHECK: vn %v24, %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %trunc = trunc <16 x i8> %val to <16 x i1><br>
+ %ret = zext <16 x i1> %trunc to <16 x i8><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v8i1->v8i16 extension.<br>
+define <8 x i16> @f2(<8 x i16> %val) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vrepih [[REG:%v[0-9]+]], 1<br>
+; CHECK: vn %v24, %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %trunc = trunc <8 x i16> %val to <8 x i1><br>
+ %ret = zext <8 x i1> %trunc to <8 x i16><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v8i8->v8i16 extension.<br>
+define <8 x i16> @f3(<8 x i16> %val) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vgbm [[REG:%v[0-9]+]], 21845<br>
+; CHECK: vn %v24, %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %trunc = trunc <8 x i16> %val to <8 x i8><br>
+ %ret = zext <8 x i8> %trunc to <8 x i16><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v4i1->v4i32 extension.<br>
+define <4 x i32> @f4(<4 x i32> %val) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vrepif [[REG:%v[0-9]+]], 1<br>
+; CHECK: vn %v24, %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %trunc = trunc <4 x i32> %val to <4 x i1><br>
+ %ret = zext <4 x i1> %trunc to <4 x i32><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v4i8->v4i32 extension.<br>
+define <4 x i32> @f5(<4 x i32> %val) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vgbm [[REG:%v[0-9]+]], 4369<br>
+; CHECK: vn %v24, %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %trunc = trunc <4 x i32> %val to <4 x i8><br>
+ %ret = zext <4 x i8> %trunc to <4 x i32><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v4i16->v4i32 extension.<br>
+define <4 x i32> @f6(<4 x i32> %val) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vgbm [[REG:%v[0-9]+]], 13107<br>
+; CHECK: vn %v24, %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %trunc = trunc <4 x i32> %val to <4 x i16><br>
+ %ret = zext <4 x i16> %trunc to <4 x i32><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v2i1->v2i64 extension.<br>
+define <2 x i64> @f7(<2 x i64> %val) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vrepig [[REG:%v[0-9]+]], 1<br>
+; CHECK: vn %v24, %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %trunc = trunc <2 x i64> %val to <2 x i1><br>
+ %ret = zext <2 x i1> %trunc to <2 x i64><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test a v2i8->v2i64 extension.<br>
+define <2 x i64> @f8(<2 x i64> %val) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vgbm [[REG:%v[0-9]+]], 257<br>
+; CHECK: vn %v24, %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %trunc = trunc <2 x i64> %val to <2 x i8><br>
+ %ret = zext <2 x i8> %trunc to <2 x i64><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test a v2i16->v2i64 extension.<br>
+define <2 x i64> @f9(<2 x i64> %val) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vgbm [[REG:%v[0-9]+]], 771<br>
+; CHECK: vn %v24, %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %trunc = trunc <2 x i64> %val to <2 x i16><br>
+ %ret = zext <2 x i16> %trunc to <2 x i64><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test a v2i32->v2i64 extension.<br>
+define <2 x i64> @f10(<2 x i64> %val) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vgbm [[REG:%v[0-9]+]], 3855<br>
+; CHECK: vn %v24, %v24, [[REG]]<br>
+; CHECK: br %r14<br>
+ %trunc = trunc <2 x i64> %val to <2 x i32><br>
+ %ret = zext <2 x i32> %trunc to <2 x i64><br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-args-01.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-args-01.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-args-01.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-args-01.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-args-01.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,48 @@<br>
+; Test the handling of named vector arguments.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s -check-prefix=CHECK-VEC<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s -check-prefix=CHECK-STACK<br>
+<br>
+; This routine has 6 integer arguments, which fill up r2-r6 and<br>
+; the stack slot at offset 160, and 10 vector arguments, which<br>
+; fill up v24-v31 and the two double-wide stack slots at 168<br>
+; and 184.<br>
+declare void @bar(i64, i64, i64, i64, i64, i64,<br>
+ <4 x i32>, <4 x i32>, <4 x i32>, <4 x i32>,<br>
+ <4 x i32>, <4 x i32>, <4 x i32>, <4 x i32>,<br>
+ <4 x i32>, <4 x i32>)<br>
+<br>
+define void @foo() {<br>
+; CHECK-VEC-LABEL: foo:<br>
+; CHECK-VEC-DAG: vrepif %v24, 1<br>
+; CHECK-VEC-DAG: vrepif %v26, 2<br>
+; CHECK-VEC-DAG: vrepif %v28, 3<br>
+; CHECK-VEC-DAG: vrepif %v30, 4<br>
+; CHECK-VEC-DAG: vrepif %v25, 5<br>
+; CHECK-VEC-DAG: vrepif %v27, 6<br>
+; CHECK-VEC-DAG: vrepif %v29, 7<br>
+; CHECK-VEC-DAG: vrepif %v31, 8<br>
+; CHECK-VEC: brasl %r14, bar@PLT<br>
+;<br>
+; CHECK-STACK-LABEL: foo:<br>
+; CHECK-STACK: aghi %r15, -200<br>
+; CHECK-STACK-DAG: mvghi 160(%r15), 6<br>
+; CHECK-STACK-DAG: vrepif [[REG1:%v[0-9]+]], 9<br>
+; CHECK-STACK-DAG: vst [[REG1]], 168(%r15)<br>
+; CHECK-STACK-DAG: vrepif [[REG2:%v[0-9]+]], 10<br>
+; CHECK-STACK-DAG: vst [[REG2]], 184(%r15)<br>
+; CHECK-STACK: brasl %r14, bar@PLT<br>
+<br>
+ call void @bar (i64 1, i64 2, i64 3, i64 4, i64 5, i64 6,<br>
+ <4 x i32> <i32 1, i32 1, i32 1, i32 1>,<br>
+ <4 x i32> <i32 2, i32 2, i32 2, i32 2>,<br>
+ <4 x i32> <i32 3, i32 3, i32 3, i32 3>,<br>
+ <4 x i32> <i32 4, i32 4, i32 4, i32 4>,<br>
+ <4 x i32> <i32 5, i32 5, i32 5, i32 5>,<br>
+ <4 x i32> <i32 6, i32 6, i32 6, i32 6>,<br>
+ <4 x i32> <i32 7, i32 7, i32 7, i32 7>,<br>
+ <4 x i32> <i32 8, i32 8, i32 8, i32 8>,<br>
+ <4 x i32> <i32 9, i32 9, i32 9, i32 9>,<br>
+ <4 x i32> <i32 10, i32 10, i32 10, i32 10>)<br>
+ ret void<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-args-02.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-args-02.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-args-02.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-args-02.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-args-02.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,31 @@<br>
+; Test the handling of unnamed vector arguments.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s -check-prefix=CHECK-VEC<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s -check-prefix=CHECK-STACK<br>
+<br>
+; This routine is called with two named vector arguments (passed<br>
+; in %v24 and %v26) and two unnamed vector arguments (passed<br>
+; in the double-wide stack slots at 160 and 176).<br>
+declare void @bar(<4 x i32>, <4 x i32>, ...)<br>
+<br>
+define void @foo() {<br>
+; CHECK-VEC-LABEL: foo:<br>
+; CHECK-VEC-DAG: vrepif %v24, 1<br>
+; CHECK-VEC-DAG: vrepif %v26, 2<br>
+; CHECK-VEC: brasl %r14, bar@PLT<br>
+;<br>
+; CHECK-STACK-LABEL: foo:<br>
+; CHECK-STACK: aghi %r15, -192<br>
+; CHECK-STACK-DAG: vrepif [[REG1:%v[0-9]+]], 3<br>
+; CHECK-STACK-DAG: vst [[REG1]], 160(%r15)<br>
+; CHECK-STACK-DAG: vrepif [[REG2:%v[0-9]+]], 4<br>
+; CHECK-STACK-DAG: vst [[REG2]], 176(%r15)<br>
+; CHECK-STACK: brasl %r14, bar@PLT<br>
+<br>
+ call void (<4 x i32>, <4 x i32>, ...) @bar<br>
+ (<4 x i32> <i32 1, i32 1, i32 1, i32 1>,<br>
+ <4 x i32> <i32 2, i32 2, i32 2, i32 2>,<br>
+ <4 x i32> <i32 3, i32 3, i32 3, i32 3>,<br>
+ <4 x i32> <i32 4, i32 4, i32 4, i32 4>)<br>
+ ret void<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-args-03.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-args-03.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-args-03.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-args-03.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-args-03.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,16 @@<br>
+; Test the handling of incoming vector arguments.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; This routine has 10 vector arguments, which fill up %v24-%v31 and<br>
+; the two double-wide stack slots at 160 and 176.<br>
+define <4 x i32> @foo(<4 x i32> %v1, <4 x i32> %v2, <4 x i32> %v3, <4 x i32> %v4,<br>
+ <4 x i32> %v5, <4 x i32> %v6, <4 x i32> %v7, <4 x i32> %v8,<br>
+ <4 x i32> %v9, <4 x i32> %v10) {<br>
+; CHECK-LABEL: foo:<br>
+; CHECK: vl [[REG1:%v[0-9]+]], 176(%r15)<br>
+; CHECK: vsf %v24, %v26, [[REG1]]<br>
+; CHECK: br %r14<br>
+ %y = sub <4 x i32> %v2, %v10<br>
+ ret <4 x i32> %y<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-cmp-01.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-cmp-01.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-cmp-01.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-cmp-01.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-cmp-01.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,228 @@<br>
+; Test v16i8 comparisons.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test eq.<br>
+define <16 x i8> @f1(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vceqb %v24, %v26, %v28<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp eq <16 x i8> %val1, %val2<br>
+ %ret = sext <16 x i1> %cmp to <16 x i8><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test ne.<br>
+define <16 x i8> @f2(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vceqb [[REG:%v[0-9]+]], %v26, %v28<br>
+; CHECK-NEXT: vno %v24, [[REG]], [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ne <16 x i8> %val1, %val2<br>
+ %ret = sext <16 x i1> %cmp to <16 x i8><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test sgt.<br>
+define <16 x i8> @f3(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vchb %v24, %v26, %v28<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sgt <16 x i8> %val1, %val2<br>
+ %ret = sext <16 x i1> %cmp to <16 x i8><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test sge.<br>
+define <16 x i8> @f4(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vchb [[REG:%v[0-9]+]], %v28, %v26<br>
+; CHECK-NEXT: vno %v24, [[REG]], [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sge <16 x i8> %val1, %val2<br>
+ %ret = sext <16 x i1> %cmp to <16 x i8><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test sle.<br>
+define <16 x i8> @f5(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vchb [[REG:%v[0-9]+]], %v26, %v28<br>
+; CHECK-NEXT: vno %v24, [[REG]], [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sle <16 x i8> %val1, %val2<br>
+ %ret = sext <16 x i1> %cmp to <16 x i8><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test slt.<br>
+define <16 x i8> @f6(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vchb %v24, %v28, %v26<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp slt <16 x i8> %val1, %val2<br>
+ %ret = sext <16 x i1> %cmp to <16 x i8><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test ugt.<br>
+define <16 x i8> @f7(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vchlb %v24, %v26, %v28<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ugt <16 x i8> %val1, %val2<br>
+ %ret = sext <16 x i1> %cmp to <16 x i8><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test uge.<br>
+define <16 x i8> @f8(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vchlb [[REG:%v[0-9]+]], %v28, %v26<br>
+; CHECK-NEXT: vno %v24, [[REG]], [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp uge <16 x i8> %val1, %val2<br>
+ %ret = sext <16 x i1> %cmp to <16 x i8><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test ule.<br>
+define <16 x i8> @f9(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vchlb [[REG:%v[0-9]+]], %v26, %v28<br>
+; CHECK-NEXT: vno %v24, [[REG]], [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ule <16 x i8> %val1, %val2<br>
+ %ret = sext <16 x i1> %cmp to <16 x i8><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test ult.<br>
+define <16 x i8> @f10(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vchlb %v24, %v28, %v26<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ult <16 x i8> %val1, %val2<br>
+ %ret = sext <16 x i1> %cmp to <16 x i8><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test eq selects.<br>
+define <16 x i8> @f11(<16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i8> %val3, <16 x i8> %val4) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vceqb [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v28, %v30, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp eq <16 x i8> %val1, %val2<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val3, <16 x i8> %val4<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test ne selects.<br>
+define <16 x i8> @f12(<16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i8> %val3, <16 x i8> %val4) {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK: vceqb [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v30, %v28, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ne <16 x i8> %val1, %val2<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val3, <16 x i8> %val4<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test sgt selects.<br>
+define <16 x i8> @f13(<16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i8> %val3, <16 x i8> %val4) {<br>
+; CHECK-LABEL: f13:<br>
+; CHECK: vchb [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v28, %v30, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sgt <16 x i8> %val1, %val2<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val3, <16 x i8> %val4<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test sge selects.<br>
+define <16 x i8> @f14(<16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i8> %val3, <16 x i8> %val4) {<br>
+; CHECK-LABEL: f14:<br>
+; CHECK: vchb [[REG:%v[0-9]+]], %v26, %v24<br>
+; CHECK-NEXT: vsel %v24, %v30, %v28, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sge <16 x i8> %val1, %val2<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val3, <16 x i8> %val4<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test sle selects.<br>
+define <16 x i8> @f15(<16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i8> %val3, <16 x i8> %val4) {<br>
+; CHECK-LABEL: f15:<br>
+; CHECK: vchb [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v30, %v28, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sle <16 x i8> %val1, %val2<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val3, <16 x i8> %val4<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test slt selects.<br>
+define <16 x i8> @f16(<16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i8> %val3, <16 x i8> %val4) {<br>
+; CHECK-LABEL: f16:<br>
+; CHECK: vchb [[REG:%v[0-9]+]], %v26, %v24<br>
+; CHECK-NEXT: vsel %v24, %v28, %v30, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp slt <16 x i8> %val1, %val2<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val3, <16 x i8> %val4<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test ugt selects.<br>
+define <16 x i8> @f17(<16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i8> %val3, <16 x i8> %val4) {<br>
+; CHECK-LABEL: f17:<br>
+; CHECK: vchlb [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v28, %v30, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ugt <16 x i8> %val1, %val2<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val3, <16 x i8> %val4<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test uge selects.<br>
+define <16 x i8> @f18(<16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i8> %val3, <16 x i8> %val4) {<br>
+; CHECK-LABEL: f18:<br>
+; CHECK: vchlb [[REG:%v[0-9]+]], %v26, %v24<br>
+; CHECK-NEXT: vsel %v24, %v30, %v28, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp uge <16 x i8> %val1, %val2<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val3, <16 x i8> %val4<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test ule selects.<br>
+define <16 x i8> @f19(<16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i8> %val3, <16 x i8> %val4) {<br>
+; CHECK-LABEL: f19:<br>
+; CHECK: vchlb [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v30, %v28, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ule <16 x i8> %val1, %val2<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val3, <16 x i8> %val4<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test ult selects.<br>
+define <16 x i8> @f20(<16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i8> %val3, <16 x i8> %val4) {<br>
+; CHECK-LABEL: f20:<br>
+; CHECK: vchlb [[REG:%v[0-9]+]], %v26, %v24<br>
+; CHECK-NEXT: vsel %v24, %v28, %v30, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ult <16 x i8> %val1, %val2<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val3, <16 x i8> %val4<br>
+ ret <16 x i8> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-cmp-02.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-cmp-02.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-cmp-02.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-cmp-02.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-cmp-02.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,228 @@<br>
+; Test v8i16 comparisons.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test eq.<br>
+define <8 x i16> @f1(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vceqh %v24, %v26, %v28<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp eq <8 x i16> %val1, %val2<br>
+ %ret = sext <8 x i1> %cmp to <8 x i16><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test ne.<br>
+define <8 x i16> @f2(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vceqh [[REG:%v[0-9]+]], %v26, %v28<br>
+; CHECK-NEXT: vno %v24, [[REG]], [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ne <8 x i16> %val1, %val2<br>
+ %ret = sext <8 x i1> %cmp to <8 x i16><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test sgt.<br>
+define <8 x i16> @f3(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vchh %v24, %v26, %v28<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sgt <8 x i16> %val1, %val2<br>
+ %ret = sext <8 x i1> %cmp to <8 x i16><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test sge.<br>
+define <8 x i16> @f4(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vchh [[REG:%v[0-9]+]], %v28, %v26<br>
+; CHECK-NEXT: vno %v24, [[REG]], [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sge <8 x i16> %val1, %val2<br>
+ %ret = sext <8 x i1> %cmp to <8 x i16><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test sle.<br>
+define <8 x i16> @f5(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vchh [[REG:%v[0-9]+]], %v26, %v28<br>
+; CHECK-NEXT: vno %v24, [[REG]], [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sle <8 x i16> %val1, %val2<br>
+ %ret = sext <8 x i1> %cmp to <8 x i16><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test slt.<br>
+define <8 x i16> @f6(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vchh %v24, %v28, %v26<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp slt <8 x i16> %val1, %val2<br>
+ %ret = sext <8 x i1> %cmp to <8 x i16><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test ugt.<br>
+define <8 x i16> @f7(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vchlh %v24, %v26, %v28<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ugt <8 x i16> %val1, %val2<br>
+ %ret = sext <8 x i1> %cmp to <8 x i16><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test uge.<br>
+define <8 x i16> @f8(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vchlh [[REG:%v[0-9]+]], %v28, %v26<br>
+; CHECK-NEXT: vno %v24, [[REG]], [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp uge <8 x i16> %val1, %val2<br>
+ %ret = sext <8 x i1> %cmp to <8 x i16><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test ule.<br>
+define <8 x i16> @f9(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vchlh [[REG:%v[0-9]+]], %v26, %v28<br>
+; CHECK-NEXT: vno %v24, [[REG]], [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ule <8 x i16> %val1, %val2<br>
+ %ret = sext <8 x i1> %cmp to <8 x i16><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test ult.<br>
+define <8 x i16> @f10(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vchlh %v24, %v28, %v26<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ult <8 x i16> %val1, %val2<br>
+ %ret = sext <8 x i1> %cmp to <8 x i16><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test eq selects.<br>
+define <8 x i16> @f11(<8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i16> %val3, <8 x i16> %val4) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vceqh [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v28, %v30, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp eq <8 x i16> %val1, %val2<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val3, <8 x i16> %val4<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test ne selects.<br>
+define <8 x i16> @f12(<8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i16> %val3, <8 x i16> %val4) {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK: vceqh [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v30, %v28, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ne <8 x i16> %val1, %val2<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val3, <8 x i16> %val4<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test sgt selects.<br>
+define <8 x i16> @f13(<8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i16> %val3, <8 x i16> %val4) {<br>
+; CHECK-LABEL: f13:<br>
+; CHECK: vchh [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v28, %v30, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sgt <8 x i16> %val1, %val2<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val3, <8 x i16> %val4<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test sge selects.<br>
+define <8 x i16> @f14(<8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i16> %val3, <8 x i16> %val4) {<br>
+; CHECK-LABEL: f14:<br>
+; CHECK: vchh [[REG:%v[0-9]+]], %v26, %v24<br>
+; CHECK-NEXT: vsel %v24, %v30, %v28, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sge <8 x i16> %val1, %val2<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val3, <8 x i16> %val4<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test sle selects.<br>
+define <8 x i16> @f15(<8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i16> %val3, <8 x i16> %val4) {<br>
+; CHECK-LABEL: f15:<br>
+; CHECK: vchh [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v30, %v28, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sle <8 x i16> %val1, %val2<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val3, <8 x i16> %val4<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test slt selects.<br>
+define <8 x i16> @f16(<8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i16> %val3, <8 x i16> %val4) {<br>
+; CHECK-LABEL: f16:<br>
+; CHECK: vchh [[REG:%v[0-9]+]], %v26, %v24<br>
+; CHECK-NEXT: vsel %v24, %v28, %v30, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp slt <8 x i16> %val1, %val2<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val3, <8 x i16> %val4<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test ugt selects.<br>
+define <8 x i16> @f17(<8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i16> %val3, <8 x i16> %val4) {<br>
+; CHECK-LABEL: f17:<br>
+; CHECK: vchlh [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v28, %v30, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ugt <8 x i16> %val1, %val2<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val3, <8 x i16> %val4<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test uge selects.<br>
+define <8 x i16> @f18(<8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i16> %val3, <8 x i16> %val4) {<br>
+; CHECK-LABEL: f18:<br>
+; CHECK: vchlh [[REG:%v[0-9]+]], %v26, %v24<br>
+; CHECK-NEXT: vsel %v24, %v30, %v28, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp uge <8 x i16> %val1, %val2<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val3, <8 x i16> %val4<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test ule selects.<br>
+define <8 x i16> @f19(<8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i16> %val3, <8 x i16> %val4) {<br>
+; CHECK-LABEL: f19:<br>
+; CHECK: vchlh [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v30, %v28, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ule <8 x i16> %val1, %val2<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val3, <8 x i16> %val4<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test ult selects.<br>
+define <8 x i16> @f20(<8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i16> %val3, <8 x i16> %val4) {<br>
+; CHECK-LABEL: f20:<br>
+; CHECK: vchlh [[REG:%v[0-9]+]], %v26, %v24<br>
+; CHECK-NEXT: vsel %v24, %v28, %v30, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ult <8 x i16> %val1, %val2<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val3, <8 x i16> %val4<br>
+ ret <8 x i16> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-cmp-03.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-cmp-03.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-cmp-03.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-cmp-03.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-cmp-03.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,228 @@<br>
+; Test v4i32 comparisons.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test eq.<br>
+define <4 x i32> @f1(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vceqf %v24, %v26, %v28<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp eq <4 x i32> %val1, %val2<br>
+ %ret = sext <4 x i1> %cmp to <4 x i32><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test ne.<br>
+define <4 x i32> @f2(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vceqf [[REG:%v[0-9]+]], %v26, %v28<br>
+; CHECK-NEXT: vno %v24, [[REG]], [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ne <4 x i32> %val1, %val2<br>
+ %ret = sext <4 x i1> %cmp to <4 x i32><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test sgt.<br>
+define <4 x i32> @f3(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vchf %v24, %v26, %v28<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sgt <4 x i32> %val1, %val2<br>
+ %ret = sext <4 x i1> %cmp to <4 x i32><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test sge.<br>
+define <4 x i32> @f4(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vchf [[REG:%v[0-9]+]], %v28, %v26<br>
+; CHECK-NEXT: vno %v24, [[REG]], [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sge <4 x i32> %val1, %val2<br>
+ %ret = sext <4 x i1> %cmp to <4 x i32><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test sle.<br>
+define <4 x i32> @f5(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vchf [[REG:%v[0-9]+]], %v26, %v28<br>
+; CHECK-NEXT: vno %v24, [[REG]], [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sle <4 x i32> %val1, %val2<br>
+ %ret = sext <4 x i1> %cmp to <4 x i32><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test slt.<br>
+define <4 x i32> @f6(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vchf %v24, %v28, %v26<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp slt <4 x i32> %val1, %val2<br>
+ %ret = sext <4 x i1> %cmp to <4 x i32><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test ugt.<br>
+define <4 x i32> @f7(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vchlf %v24, %v26, %v28<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ugt <4 x i32> %val1, %val2<br>
+ %ret = sext <4 x i1> %cmp to <4 x i32><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test uge.<br>
+define <4 x i32> @f8(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vchlf [[REG:%v[0-9]+]], %v28, %v26<br>
+; CHECK-NEXT: vno %v24, [[REG]], [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp uge <4 x i32> %val1, %val2<br>
+ %ret = sext <4 x i1> %cmp to <4 x i32><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test ule.<br>
+define <4 x i32> @f9(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vchlf [[REG:%v[0-9]+]], %v26, %v28<br>
+; CHECK-NEXT: vno %v24, [[REG]], [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ule <4 x i32> %val1, %val2<br>
+ %ret = sext <4 x i1> %cmp to <4 x i32><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test ult.<br>
+define <4 x i32> @f10(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vchlf %v24, %v28, %v26<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ult <4 x i32> %val1, %val2<br>
+ %ret = sext <4 x i1> %cmp to <4 x i32><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test eq selects.<br>
+define <4 x i32> @f11(<4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> %val3, <4 x i32> %val4) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vceqf [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v28, %v30, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp eq <4 x i32> %val1, %val2<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val3, <4 x i32> %val4<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test ne selects.<br>
+define <4 x i32> @f12(<4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> %val3, <4 x i32> %val4) {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK: vceqf [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v30, %v28, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ne <4 x i32> %val1, %val2<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val3, <4 x i32> %val4<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test sgt selects.<br>
+define <4 x i32> @f13(<4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> %val3, <4 x i32> %val4) {<br>
+; CHECK-LABEL: f13:<br>
+; CHECK: vchf [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v28, %v30, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sgt <4 x i32> %val1, %val2<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val3, <4 x i32> %val4<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test sge selects.<br>
+define <4 x i32> @f14(<4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> %val3, <4 x i32> %val4) {<br>
+; CHECK-LABEL: f14:<br>
+; CHECK: vchf [[REG:%v[0-9]+]], %v26, %v24<br>
+; CHECK-NEXT: vsel %v24, %v30, %v28, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sge <4 x i32> %val1, %val2<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val3, <4 x i32> %val4<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test sle selects.<br>
+define <4 x i32> @f15(<4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> %val3, <4 x i32> %val4) {<br>
+; CHECK-LABEL: f15:<br>
+; CHECK: vchf [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v30, %v28, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sle <4 x i32> %val1, %val2<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val3, <4 x i32> %val4<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test slt selects.<br>
+define <4 x i32> @f16(<4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> %val3, <4 x i32> %val4) {<br>
+; CHECK-LABEL: f16:<br>
+; CHECK: vchf [[REG:%v[0-9]+]], %v26, %v24<br>
+; CHECK-NEXT: vsel %v24, %v28, %v30, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp slt <4 x i32> %val1, %val2<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val3, <4 x i32> %val4<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test ugt selects.<br>
+define <4 x i32> @f17(<4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> %val3, <4 x i32> %val4) {<br>
+; CHECK-LABEL: f17:<br>
+; CHECK: vchlf [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v28, %v30, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ugt <4 x i32> %val1, %val2<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val3, <4 x i32> %val4<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test uge selects.<br>
+define <4 x i32> @f18(<4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> %val3, <4 x i32> %val4) {<br>
+; CHECK-LABEL: f18:<br>
+; CHECK: vchlf [[REG:%v[0-9]+]], %v26, %v24<br>
+; CHECK-NEXT: vsel %v24, %v30, %v28, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp uge <4 x i32> %val1, %val2<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val3, <4 x i32> %val4<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test ule selects.<br>
+define <4 x i32> @f19(<4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> %val3, <4 x i32> %val4) {<br>
+; CHECK-LABEL: f19:<br>
+; CHECK: vchlf [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v30, %v28, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ule <4 x i32> %val1, %val2<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val3, <4 x i32> %val4<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test ult selects.<br>
+define <4 x i32> @f20(<4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> %val3, <4 x i32> %val4) {<br>
+; CHECK-LABEL: f20:<br>
+; CHECK: vchlf [[REG:%v[0-9]+]], %v26, %v24<br>
+; CHECK-NEXT: vsel %v24, %v28, %v30, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ult <4 x i32> %val1, %val2<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val3, <4 x i32> %val4<br>
+ ret <4 x i32> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-cmp-04.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-cmp-04.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-cmp-04.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-cmp-04.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-cmp-04.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,228 @@<br>
+; Test v2i64 comparisons.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test eq.<br>
+define <2 x i64> @f1(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vceqg %v24, %v26, %v28<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp eq <2 x i64> %val1, %val2<br>
+ %ret = sext <2 x i1> %cmp to <2 x i64><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test ne.<br>
+define <2 x i64> @f2(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vceqg [[REG:%v[0-9]+]], %v26, %v28<br>
+; CHECK-NEXT: vno %v24, [[REG]], [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ne <2 x i64> %val1, %val2<br>
+ %ret = sext <2 x i1> %cmp to <2 x i64><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test sgt.<br>
+define <2 x i64> @f3(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vchg %v24, %v26, %v28<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sgt <2 x i64> %val1, %val2<br>
+ %ret = sext <2 x i1> %cmp to <2 x i64><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test sge.<br>
+define <2 x i64> @f4(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vchg [[REG:%v[0-9]+]], %v28, %v26<br>
+; CHECK-NEXT: vno %v24, [[REG]], [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sge <2 x i64> %val1, %val2<br>
+ %ret = sext <2 x i1> %cmp to <2 x i64><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test sle.<br>
+define <2 x i64> @f5(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vchg [[REG:%v[0-9]+]], %v26, %v28<br>
+; CHECK-NEXT: vno %v24, [[REG]], [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sle <2 x i64> %val1, %val2<br>
+ %ret = sext <2 x i1> %cmp to <2 x i64><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test slt.<br>
+define <2 x i64> @f6(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vchg %v24, %v28, %v26<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp slt <2 x i64> %val1, %val2<br>
+ %ret = sext <2 x i1> %cmp to <2 x i64><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test ugt.<br>
+define <2 x i64> @f7(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vchlg %v24, %v26, %v28<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ugt <2 x i64> %val1, %val2<br>
+ %ret = sext <2 x i1> %cmp to <2 x i64><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test uge.<br>
+define <2 x i64> @f8(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vchlg [[REG:%v[0-9]+]], %v28, %v26<br>
+; CHECK-NEXT: vno %v24, [[REG]], [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp uge <2 x i64> %val1, %val2<br>
+ %ret = sext <2 x i1> %cmp to <2 x i64><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test ule.<br>
+define <2 x i64> @f9(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vchlg [[REG:%v[0-9]+]], %v26, %v28<br>
+; CHECK-NEXT: vno %v24, [[REG]], [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ule <2 x i64> %val1, %val2<br>
+ %ret = sext <2 x i1> %cmp to <2 x i64><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test ult.<br>
+define <2 x i64> @f10(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vchlg %v24, %v28, %v26<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ult <2 x i64> %val1, %val2<br>
+ %ret = sext <2 x i1> %cmp to <2 x i64><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test eq selects.<br>
+define <2 x i64> @f11(<2 x i64> %val1, <2 x i64> %val2,<br>
+ <2 x i64> %val3, <2 x i64> %val4) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vceqg [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v28, %v30, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp eq <2 x i64> %val1, %val2<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val3, <2 x i64> %val4<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test ne selects.<br>
+define <2 x i64> @f12(<2 x i64> %val1, <2 x i64> %val2,<br>
+ <2 x i64> %val3, <2 x i64> %val4) {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK: vceqg [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v30, %v28, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ne <2 x i64> %val1, %val2<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val3, <2 x i64> %val4<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test sgt selects.<br>
+define <2 x i64> @f13(<2 x i64> %val1, <2 x i64> %val2,<br>
+ <2 x i64> %val3, <2 x i64> %val4) {<br>
+; CHECK-LABEL: f13:<br>
+; CHECK: vchg [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v28, %v30, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sgt <2 x i64> %val1, %val2<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val3, <2 x i64> %val4<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test sge selects.<br>
+define <2 x i64> @f14(<2 x i64> %val1, <2 x i64> %val2,<br>
+ <2 x i64> %val3, <2 x i64> %val4) {<br>
+; CHECK-LABEL: f14:<br>
+; CHECK: vchg [[REG:%v[0-9]+]], %v26, %v24<br>
+; CHECK-NEXT: vsel %v24, %v30, %v28, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sge <2 x i64> %val1, %val2<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val3, <2 x i64> %val4<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test sle selects.<br>
+define <2 x i64> @f15(<2 x i64> %val1, <2 x i64> %val2,<br>
+ <2 x i64> %val3, <2 x i64> %val4) {<br>
+; CHECK-LABEL: f15:<br>
+; CHECK: vchg [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v30, %v28, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp sle <2 x i64> %val1, %val2<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val3, <2 x i64> %val4<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test slt selects.<br>
+define <2 x i64> @f16(<2 x i64> %val1, <2 x i64> %val2,<br>
+ <2 x i64> %val3, <2 x i64> %val4) {<br>
+; CHECK-LABEL: f16:<br>
+; CHECK: vchg [[REG:%v[0-9]+]], %v26, %v24<br>
+; CHECK-NEXT: vsel %v24, %v28, %v30, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp slt <2 x i64> %val1, %val2<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val3, <2 x i64> %val4<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test ugt selects.<br>
+define <2 x i64> @f17(<2 x i64> %val1, <2 x i64> %val2,<br>
+ <2 x i64> %val3, <2 x i64> %val4) {<br>
+; CHECK-LABEL: f17:<br>
+; CHECK: vchlg [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v28, %v30, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ugt <2 x i64> %val1, %val2<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val3, <2 x i64> %val4<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test uge selects.<br>
+define <2 x i64> @f18(<2 x i64> %val1, <2 x i64> %val2,<br>
+ <2 x i64> %val3, <2 x i64> %val4) {<br>
+; CHECK-LABEL: f18:<br>
+; CHECK: vchlg [[REG:%v[0-9]+]], %v26, %v24<br>
+; CHECK-NEXT: vsel %v24, %v30, %v28, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp uge <2 x i64> %val1, %val2<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val3, <2 x i64> %val4<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test ule selects.<br>
+define <2 x i64> @f19(<2 x i64> %val1, <2 x i64> %val2,<br>
+ <2 x i64> %val3, <2 x i64> %val4) {<br>
+; CHECK-LABEL: f19:<br>
+; CHECK: vchlg [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-NEXT: vsel %v24, %v30, %v28, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ule <2 x i64> %val1, %val2<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val3, <2 x i64> %val4<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test ult selects.<br>
+define <2 x i64> @f20(<2 x i64> %val1, <2 x i64> %val2,<br>
+ <2 x i64> %val3, <2 x i64> %val4) {<br>
+; CHECK-LABEL: f20:<br>
+; CHECK: vchlg [[REG:%v[0-9]+]], %v26, %v24<br>
+; CHECK-NEXT: vsel %v24, %v28, %v30, [[REG]]<br>
+; CHECK-NEXT: br %r14<br>
+ %cmp = icmp ult <2 x i64> %val1, %val2<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val3, <2 x i64> %val4<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-combine-01.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-combine-01.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-combine-01.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-combine-01.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-combine-01.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,107 @@<br>
+; Test various target-specific DAG combiner patterns.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Check that an extraction followed by a truncation is effectively treated<br>
+; as a bitcast.<br>
+define void @f1(<4 x i32> %v1, <4 x i32> %v2, i8 *%ptr1, i8 *%ptr2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vaf [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-DAG: vsteb [[REG]], 0(%r2), 3<br>
+; CHECK-DAG: vsteb [[REG]], 0(%r3), 15<br>
+; CHECK: br %r14<br>
+ %add = add <4 x i32> %v1, %v2<br>
+ %elem1 = extractelement <4 x i32> %add, i32 0<br>
+ %elem2 = extractelement <4 x i32> %add, i32 3<br>
+ %trunc1 = trunc i32 %elem1 to i8<br>
+ %trunc2 = trunc i32 %elem2 to i8<br>
+ store i8 %trunc1, i8 *%ptr1<br>
+ store i8 %trunc2, i8 *%ptr2<br>
+ ret void<br>
+}<br>
+<br>
+; Test a case where a pack-type shuffle can be eliminated.<br>
+define i16 @f2(<4 x i32> %v1, <4 x i32> %v2, <4 x i32> %v3) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK-NOT: vpk<br>
+; CHECK-DAG: vaf [[REG1:%v[0-9]+]], %v24, %v26<br>
+; CHECK-DAG: vaf [[REG2:%v[0-9]+]], %v26, %v28<br>
+; CHECK-DAG: vlgvh {{%r[0-5]}}, [[REG1]], 3<br>
+; CHECK-DAG: vlgvh {{%r[0-5]}}, [[REG2]], 7<br>
+; CHECK: br %r14<br>
+ %add1 = add <4 x i32> %v1, %v2<br>
+ %add2 = add <4 x i32> %v2, %v3<br>
+ %shuffle = shufflevector <4 x i32> %add1, <4 x i32> %add2,<br>
+ <4 x i32> <i32 1, i32 3, i32 5, i32 7><br>
+ %bitcast = bitcast <4 x i32> %shuffle to <8 x i16><br>
+ %elem1 = extractelement <8 x i16> %bitcast, i32 1<br>
+ %elem2 = extractelement <8 x i16> %bitcast, i32 7<br>
+ %res = add i16 %elem1, %elem2<br>
+ ret i16 %res<br>
+}<br>
+<br>
+; ...and again in a case where there's also a splat and a bitcast.<br>
+define i16 @f3(<4 x i32> %v1, <4 x i32> %v2, <2 x i64> %v3) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK-NOT: vrepg<br>
+; CHECK-NOT: vpk<br>
+; CHECK-DAG: vaf [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-DAG: vlgvh {{%r[0-5]}}, [[REG]], 6<br>
+; CHECK-DAG: vlgvh {{%r[0-5]}}, %v28, 3<br>
+; CHECK: br %r14<br>
+ %add = add <4 x i32> %v1, %v2<br>
+ %splat = shufflevector <2 x i64> %v3, <2 x i64> undef,<br>
+ <2 x i32> <i32 0, i32 0><br>
+ %splatcast = bitcast <2 x i64> %splat to <4 x i32><br>
+ %shuffle = shufflevector <4 x i32> %add, <4 x i32> %splatcast,<br>
+ <4 x i32> <i32 1, i32 3, i32 5, i32 7><br>
+ %bitcast = bitcast <4 x i32> %shuffle to <8 x i16><br>
+ %elem1 = extractelement <8 x i16> %bitcast, i32 2<br>
+ %elem2 = extractelement <8 x i16> %bitcast, i32 7<br>
+ %res = add i16 %elem1, %elem2<br>
+ ret i16 %res<br>
+}<br>
+<br>
+; ...and again with a merge low instead of a pack.<br>
+define i16 @f4(<4 x i32> %v1, <4 x i32> %v2, <2 x i64> %v3) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK-NOT: vrepg<br>
+; CHECK-NOT: vmr<br>
+; CHECK-DAG: vaf [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-DAG: vlgvh {{%r[0-5]}}, [[REG]], 6<br>
+; CHECK-DAG: vlgvh {{%r[0-5]}}, %v28, 3<br>
+; CHECK: br %r14<br>
+ %add = add <4 x i32> %v1, %v2<br>
+ %splat = shufflevector <2 x i64> %v3, <2 x i64> undef,<br>
+ <2 x i32> <i32 0, i32 0><br>
+ %splatcast = bitcast <2 x i64> %splat to <4 x i32><br>
+ %shuffle = shufflevector <4 x i32> %add, <4 x i32> %splatcast,<br>
+ <4 x i32> <i32 2, i32 6, i32 3, i32 7><br>
+ %bitcast = bitcast <4 x i32> %shuffle to <8 x i16><br>
+ %elem1 = extractelement <8 x i16> %bitcast, i32 4<br>
+ %elem2 = extractelement <8 x i16> %bitcast, i32 7<br>
+ %res = add i16 %elem1, %elem2<br>
+ ret i16 %res<br>
+}<br>
+<br>
+; ...and again with a merge high.<br>
+define i16 @f5(<4 x i32> %v1, <4 x i32> %v2, <2 x i64> %v3) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK-NOT: vrepg<br>
+; CHECK-NOT: vmr<br>
+; CHECK-DAG: vaf [[REG:%v[0-9]+]], %v24, %v26<br>
+; CHECK-DAG: vlgvh {{%r[0-5]}}, [[REG]], 2<br>
+; CHECK-DAG: vlgvh {{%r[0-5]}}, %v28, 3<br>
+; CHECK: br %r14<br>
+ %add = add <4 x i32> %v1, %v2<br>
+ %splat = shufflevector <2 x i64> %v3, <2 x i64> undef,<br>
+ <2 x i32> <i32 0, i32 0><br>
+ %splatcast = bitcast <2 x i64> %splat to <4 x i32><br>
+ %shuffle = shufflevector <4 x i32> %add, <4 x i32> %splatcast,<br>
+ <4 x i32> <i32 0, i32 4, i32 1, i32 5><br>
+ %bitcast = bitcast <4 x i32> %shuffle to <8 x i16><br>
+ %elem1 = extractelement <8 x i16> %bitcast, i32 4<br>
+ %elem2 = extractelement <8 x i16> %bitcast, i32 7<br>
+ %res = add i16 %elem1, %elem2<br>
+ ret i16 %res<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-const-01.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-01.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-01.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-const-01.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-const-01.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,55 @@<br>
+; Test vector byte masks, v16i8 version.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test an all-zeros vector.<br>
+define <16 x i8> @f1() {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vgbm %v24, 0<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> zeroinitializer<br>
+}<br>
+<br>
+; Test an all-ones vector.<br>
+define <16 x i8> @f2() {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vgbm %v24, 65535<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 -1, i8 -1, i8 -1, i8 -1,<br>
+ i8 -1, i8 -1, i8 -1, i8 -1,<br>
+ i8 -1, i8 -1, i8 -1, i8 -1,<br>
+ i8 -1, i8 -1, i8 -1, i8 -1><br>
+}<br>
+<br>
+; Test a mixed vector (mask 0x8c75).<br>
+define <16 x i8> @f3() {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vgbm %v24, 35957<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 -1, i8 0, i8 0, i8 0,<br>
+ i8 -1, i8 -1, i8 0, i8 0,<br>
+ i8 0, i8 -1, i8 -1, i8 -1,<br>
+ i8 0, i8 -1, i8 0, i8 -1><br>
+}<br>
+<br>
+; Test that undefs are treated as zero.<br>
+define <16 x i8> @f4() {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vgbm %v24, 35957<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 -1, i8 undef, i8 undef, i8 undef,<br>
+ i8 -1, i8 -1, i8 undef, i8 undef,<br>
+ i8 undef, i8 -1, i8 -1, i8 -1,<br>
+ i8 undef, i8 -1, i8 undef, i8 -1><br>
+}<br>
+<br>
+; Test that we don't use VGBM if one of the bytes is not 0 or 0xff.<br>
+define <16 x i8> @f5() {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK-NOT: vgbm<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 -1, i8 0, i8 0, i8 0,<br>
+ i8 -1, i8 -1, i8 0, i8 1,<br>
+ i8 0, i8 -1, i8 -1, i8 -1,<br>
+ i8 0, i8 -1, i8 0, i8 -1><br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-const-02.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-02.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-02.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-const-02.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-const-02.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,47 @@<br>
+; Test vector byte masks, v8i16 version.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test an all-zeros vector.<br>
+define <8 x i16> @f1() {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vgbm %v24, 0<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> zeroinitializer<br>
+}<br>
+<br>
+; Test an all-ones vector.<br>
+define <8 x i16> @f2() {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vgbm %v24, 65535<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 -1, i16 -1, i16 -1, i16 -1,<br>
+ i16 -1, i16 -1, i16 -1, i16 -1><br>
+}<br>
+<br>
+; Test a mixed vector (mask 0x8c76).<br>
+define <8 x i16> @f3() {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vgbm %v24, 35958<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 65280, i16 0, i16 65535, i16 0,<br>
+ i16 255, i16 65535, i16 255, i16 65280><br>
+}<br>
+<br>
+; Test that undefs are treated as zero.<br>
+define <8 x i16> @f4() {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vgbm %v24, 35958<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 65280, i16 undef, i16 65535, i16 undef,<br>
+ i16 255, i16 65535, i16 255, i16 65280><br>
+}<br>
+<br>
+; Test that we don't use VGBM if one of the bytes is not 0 or 0xff.<br>
+define <8 x i16> @f5() {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK-NOT: vgbm<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 65280, i16 0, i16 65535, i16 0,<br>
+ i16 255, i16 65535, i16 256, i16 65280><br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-const-03.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-03.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-03.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-const-03.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-const-03.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,43 @@<br>
+; Test vector byte masks, v4i32 version.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test an all-zeros vector.<br>
+define <4 x i32> @f1() {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vgbm %v24, 0<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> zeroinitializer<br>
+}<br>
+<br>
+; Test an all-ones vector.<br>
+define <4 x i32> @f2() {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vgbm %v24, 65535<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1><br>
+}<br>
+<br>
+; Test a mixed vector (mask 0x8c76).<br>
+define <4 x i32> @f3() {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vgbm %v24, 35958<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 4278190080, i32 4294901760, i32 16777215, i32 16776960><br>
+}<br>
+<br>
+; Test that undefs are treated as zero (mask 0x8076).<br>
+define <4 x i32> @f4() {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vgbm %v24, 32886<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 4278190080, i32 undef, i32 16777215, i32 16776960><br>
+}<br>
+<br>
+; Test that we don't use VGBM if one of the bytes is not 0 or 0xff.<br>
+define <4 x i32> @f5() {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK-NOT: vgbm<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 4278190080, i32 1, i32 16777215, i32 16776960><br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-const-04.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-04.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-04.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-const-04.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-const-04.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,43 @@<br>
+; Test vector byte masks, v2i64 version.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test an all-zeros vector.<br>
+define <2 x i64> @f1() {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vgbm %v24, 0<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> zeroinitializer<br>
+}<br>
+<br>
+; Test an all-ones vector.<br>
+define <2 x i64> @f2() {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vgbm %v24, 65535<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 -1, i64 -1><br>
+}<br>
+<br>
+; Test a mixed vector (mask 0x8c76).<br>
+define <2 x i64> @f3() {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vgbm %v24, 35958<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 18374686483966525440, i64 72057589759737600><br>
+}<br>
+<br>
+; Test that undefs are treated as zero (mask 0x8c00).<br>
+define <2 x i64> @f4() {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vgbm %v24, 35840<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 18374686483966525440, i64 undef><br>
+}<br>
+<br>
+; Test that we don't use VGBM if one of the bytes is not 0 or 0xff.<br>
+define <2 x i64> @f5() {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK-NOT: vgbm<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 18374686483966525441, i64 72057589759737600><br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-const-07.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-07.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-07.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-const-07.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-const-07.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,229 @@<br>
+; Test vector replicates, v16i8 version.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a byte-granularity replicate with the lowest useful value.<br>
+define <16 x i8> @f1() {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vrepib %v24, 1<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 1, i8 1, i8 1, i8 1,<br>
+ i8 1, i8 1, i8 1, i8 1,<br>
+ i8 1, i8 1, i8 1, i8 1,<br>
+ i8 1, i8 1, i8 1, i8 1><br>
+}<br>
+<br>
+; Test a byte-granularity replicate with an arbitrary value.<br>
+define <16 x i8> @f2() {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vrepib %v24, -55<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 201, i8 201, i8 201, i8 201,<br>
+ i8 201, i8 201, i8 201, i8 201,<br>
+ i8 201, i8 201, i8 201, i8 201,<br>
+ i8 201, i8 201, i8 201, i8 201><br>
+}<br>
+<br>
+; Test a byte-granularity replicate with the highest useful value.<br>
+define <16 x i8> @f3() {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vrepib %v24, -2<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 254, i8 254, i8 254, i8 254,<br>
+ i8 254, i8 254, i8 254, i8 254,<br>
+ i8 254, i8 254, i8 254, i8 254,<br>
+ i8 254, i8 254, i8 254, i8 254><br>
+}<br>
+<br>
+; Test a halfword-granularity replicate with the lowest useful value.<br>
+define <16 x i8> @f4() {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vrepih %v24, 1<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 0, i8 1, i8 0, i8 1,<br>
+ i8 0, i8 1, i8 0, i8 1,<br>
+ i8 0, i8 1, i8 0, i8 1,<br>
+ i8 0, i8 1, i8 0, i8 1><br>
+}<br>
+<br>
+; Test a halfword-granularity replicate with an arbitrary value.<br>
+define <16 x i8> @f5() {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vrepih %v24, 25650<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 100, i8 50, i8 100, i8 50,<br>
+ i8 100, i8 50, i8 100, i8 50,<br>
+ i8 100, i8 50, i8 100, i8 50,<br>
+ i8 100, i8 50, i8 100, i8 50><br>
+}<br>
+<br>
+; Test a halfword-granularity replicate with the highest useful value.<br>
+define <16 x i8> @f6() {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vrepih %v24, -2<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 255, i8 254, i8 255, i8 254,<br>
+ i8 255, i8 254, i8 255, i8 254,<br>
+ i8 255, i8 254, i8 255, i8 254,<br>
+ i8 255, i8 254, i8 255, i8 254><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the lowest useful positive value.<br>
+define <16 x i8> @f7() {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vrepif %v24, 1<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 0, i8 0, i8 0, i8 1,<br>
+ i8 0, i8 0, i8 0, i8 1,<br>
+ i8 0, i8 0, i8 0, i8 1,<br>
+ i8 0, i8 0, i8 0, i8 1><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the highest in-range value.<br>
+define <16 x i8> @f8() {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vrepif %v24, 32767<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 0, i8 0, i8 127, i8 255,<br>
+ i8 0, i8 0, i8 127, i8 255,<br>
+ i8 0, i8 0, i8 127, i8 255,<br>
+ i8 0, i8 0, i8 127, i8 255><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the next highest value.<br>
+; This cannot use VREPIF.<br>
+define <16 x i8> @f9() {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK-NOT: vrepif<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 0, i8 0, i8 128, i8 0,<br>
+ i8 0, i8 0, i8 128, i8 0,<br>
+ i8 0, i8 0, i8 128, i8 0,<br>
+ i8 0, i8 0, i8 128, i8 0><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the lowest in-range value.<br>
+define <16 x i8> @f10() {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vrepif %v24, -32768<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 255, i8 255, i8 128, i8 0,<br>
+ i8 255, i8 255, i8 128, i8 0,<br>
+ i8 255, i8 255, i8 128, i8 0,<br>
+ i8 255, i8 255, i8 128, i8 0><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the next lowest value.<br>
+; This cannot use VREPIF.<br>
+define <16 x i8> @f11() {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK-NOT: vrepif<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 255, i8 255, i8 127, i8 255,<br>
+ i8 255, i8 255, i8 127, i8 255,<br>
+ i8 255, i8 255, i8 127, i8 255,<br>
+ i8 255, i8 255, i8 127, i8 255><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the highest useful negative value.<br>
+define <16 x i8> @f12() {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK: vrepif %v24, -2<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 255, i8 255, i8 255, i8 254,<br>
+ i8 255, i8 255, i8 255, i8 254,<br>
+ i8 255, i8 255, i8 255, i8 254,<br>
+ i8 255, i8 255, i8 255, i8 254><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the lowest useful positive<br>
+; value.<br>
+define <16 x i8> @f13() {<br>
+; CHECK-LABEL: f13:<br>
+; CHECK: vrepig %v24, 1<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 0, i8 0, i8 0, i8 0,<br>
+ i8 0, i8 0, i8 0, i8 1,<br>
+ i8 0, i8 0, i8 0, i8 0,<br>
+ i8 0, i8 0, i8 0, i8 1><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the highest in-range value.<br>
+define <16 x i8> @f14() {<br>
+; CHECK-LABEL: f14:<br>
+; CHECK: vrepig %v24, 32767<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 0, i8 0, i8 0, i8 0,<br>
+ i8 0, i8 0, i8 127, i8 255,<br>
+ i8 0, i8 0, i8 0, i8 0,<br>
+ i8 0, i8 0, i8 127, i8 255><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the next highest value.<br>
+; This cannot use VREPIG.<br>
+define <16 x i8> @f15() {<br>
+; CHECK-LABEL: f15:<br>
+; CHECK-NOT: vrepig<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 0, i8 0, i8 0, i8 0,<br>
+ i8 0, i8 0, i8 128, i8 0,<br>
+ i8 0, i8 0, i8 0, i8 0,<br>
+ i8 0, i8 0, i8 128, i8 0><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the lowest in-range value.<br>
+define <16 x i8> @f16() {<br>
+; CHECK-LABEL: f16:<br>
+; CHECK: vrepig %v24, -32768<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 255, i8 255, i8 255, i8 255,<br>
+ i8 255, i8 255, i8 128, i8 0,<br>
+ i8 255, i8 255, i8 255, i8 255,<br>
+ i8 255, i8 255, i8 128, i8 0><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the next lowest value.<br>
+; This cannot use VREPIG.<br>
+define <16 x i8> @f17() {<br>
+; CHECK-LABEL: f17:<br>
+; CHECK-NOT: vrepig<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 255, i8 255, i8 255, i8 255,<br>
+ i8 255, i8 255, i8 127, i8 255,<br>
+ i8 255, i8 255, i8 255, i8 255,<br>
+ i8 255, i8 255, i8 127, i8 255><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the highest useful negative<br>
+; value.<br>
+define <16 x i8> @f18() {<br>
+; CHECK-LABEL: f18:<br>
+; CHECK: vrepig %v24, -2<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 255, i8 255, i8 255, i8 255,<br>
+ i8 255, i8 255, i8 255, i8 254,<br>
+ i8 255, i8 255, i8 255, i8 255,<br>
+ i8 255, i8 255, i8 255, i8 254><br>
+}<br>
+<br>
+; Repeat f14 with undefs optimistically treated as 0.<br>
+define <16 x i8> @f19() {<br>
+; CHECK-LABEL: f19:<br>
+; CHECK: vrepig %v24, 32767<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 0, i8 undef, i8 0, i8 0,<br>
+ i8 0, i8 0, i8 127, i8 255,<br>
+ i8 undef, i8 0, i8 undef, i8 0,<br>
+ i8 0, i8 0, i8 127, i8 255><br>
+}<br>
+<br>
+; Repeat f18 with undefs optimistically treated as -1.<br>
+define <16 x i8> @f20() {<br>
+; CHECK-LABEL: f20:<br>
+; CHECK: vrepig %v24, -2<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 undef, i8 255, i8 255, i8 255,<br>
+ i8 255, i8 255, i8 undef, i8 254,<br>
+ i8 255, i8 255, i8 255, i8 undef,<br>
+ i8 255, i8 undef, i8 255, i8 254><br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-const-08.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-08.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-08.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-const-08.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-const-08.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,189 @@<br>
+; Test vector replicates, v8i16 version.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a byte-granularity replicate with the lowest useful value.<br>
+define <8 x i16> @f1() {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vrepib %v24, 1<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 257, i16 257, i16 257, i16 257,<br>
+ i16 257, i16 257, i16 257, i16 257><br>
+}<br>
+<br>
+; Test a byte-granularity replicate with an arbitrary value.<br>
+define <8 x i16> @f2() {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vrepib %v24, -55<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 51657, i16 51657, i16 51657, i16 51657,<br>
+ i16 51657, i16 51657, i16 51657, i16 51657><br>
+}<br>
+<br>
+; Test a byte-granularity replicate with the highest useful value.<br>
+define <8 x i16> @f3() {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vrepib %v24, -2<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 -258, i16 -258, i16 -258, i16 -258,<br>
+ i16 -258, i16 -258, i16 -258, i16 -258><br>
+}<br>
+<br>
+; Test a halfword-granularity replicate with the lowest useful value.<br>
+define <8 x i16> @f4() {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vrepih %v24, 1<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 1, i16 1, i16 1, i16 1,<br>
+ i16 1, i16 1, i16 1, i16 1><br>
+}<br>
+<br>
+; Test a halfword-granularity replicate with an arbitrary value.<br>
+define <8 x i16> @f5() {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vrepih %v24, 25650<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 25650, i16 25650, i16 25650, i16 25650,<br>
+ i16 25650, i16 25650, i16 25650, i16 25650><br>
+}<br>
+<br>
+; Test a halfword-granularity replicate with the highest useful value.<br>
+define <8 x i16> @f6() {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vrepih %v24, -2<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 65534, i16 65534, i16 65534, i16 65534,<br>
+ i16 65534, i16 65534, i16 65534, i16 65534><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the lowest useful positive value.<br>
+define <8 x i16> @f7() {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vrepif %v24, 1<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 0, i16 1, i16 0, i16 1,<br>
+ i16 0, i16 1, i16 0, i16 1><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the highest in-range value.<br>
+define <8 x i16> @f8() {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vrepif %v24, 32767<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 0, i16 32767, i16 0, i16 32767,<br>
+ i16 0, i16 32767, i16 0, i16 32767><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the next highest value.<br>
+; This cannot use VREPIF.<br>
+define <8 x i16> @f9() {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK-NOT: vrepif<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 0, i16 32768, i16 0, i16 32768,<br>
+ i16 0, i16 32768, i16 0, i16 32768><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the lowest in-range value.<br>
+define <8 x i16> @f10() {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vrepif %v24, -32768<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 -1, i16 -32768, i16 -1, i16 -32768,<br>
+ i16 -1, i16 -32768, i16 -1, i16 -32768><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the next lowest value.<br>
+; This cannot use VREPIF.<br>
+define <8 x i16> @f11() {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK-NOT: vrepif<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 -1, i16 -32769, i16 -1, i16 -32769,<br>
+ i16 -1, i16 -32769, i16 -1, i16 -32769><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the highest useful negative value.<br>
+define <8 x i16> @f12() {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK: vrepif %v24, -2<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 -1, i16 -2, i16 -1, i16 -2,<br>
+ i16 -1, i16 -2, i16 -1, i16 -2><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the lowest useful positive<br>
+; value.<br>
+define <8 x i16> @f13() {<br>
+; CHECK-LABEL: f13:<br>
+; CHECK: vrepig %v24, 1<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 0, i16 0, i16 0, i16 1,<br>
+ i16 0, i16 0, i16 0, i16 1><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the highest in-range value.<br>
+define <8 x i16> @f14() {<br>
+; CHECK-LABEL: f14:<br>
+; CHECK: vrepig %v24, 32767<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 0, i16 0, i16 0, i16 32767,<br>
+ i16 0, i16 0, i16 0, i16 32767><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the next highest value.<br>
+; This cannot use VREPIG.<br>
+define <8 x i16> @f15() {<br>
+; CHECK-LABEL: f15:<br>
+; CHECK-NOT: vrepig<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 0, i16 0, i16 0, i16 32768,<br>
+ i16 0, i16 0, i16 0, i16 32768><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the lowest in-range value.<br>
+define <8 x i16> @f16() {<br>
+; CHECK-LABEL: f16:<br>
+; CHECK: vrepig %v24, -32768<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 -1, i16 -1, i16 -1, i16 -32768,<br>
+ i16 -1, i16 -1, i16 -1, i16 -32768><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the next lowest value.<br>
+; This cannot use VREPIG.<br>
+define <8 x i16> @f17() {<br>
+; CHECK-LABEL: f17:<br>
+; CHECK-NOT: vrepig<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 -1, i16 -1, i16 -1, i16 -32769,<br>
+ i16 -1, i16 -1, i16 -1, i16 -32769><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the highest useful negative<br>
+; value.<br>
+define <8 x i16> @f18() {<br>
+; CHECK-LABEL: f18:<br>
+; CHECK: vrepig %v24, -2<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 -1, i16 -1, i16 -1, i16 -2,<br>
+ i16 -1, i16 -1, i16 -1, i16 -2><br>
+}<br>
+<br>
+; Repeat f14 with undefs optimistically treated as 0.<br>
+define <8 x i16> @f19() {<br>
+; CHECK-LABEL: f19:<br>
+; CHECK: vrepig %v24, 32767<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 0, i16 undef, i16 0, i16 32767,<br>
+ i16 undef, i16 0, i16 undef, i16 32767><br>
+}<br>
+<br>
+; Repeat f18 with undefs optimistically treated as -1.<br>
+define <8 x i16> @f20() {<br>
+; CHECK-LABEL: f20:<br>
+; CHECK: vrepig %v24, -2<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 -1, i16 -1, i16 undef, i16 -2,<br>
+ i16 undef, i16 undef, i16 -1, i16 -2><br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-const-09.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-09.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-09.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-const-09.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-const-09.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,169 @@<br>
+; Test vector replicates, v4i32 version.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a byte-granularity replicate with the lowest useful value.<br>
+define <4 x i32> @f1() {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vrepib %v24, 1<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 16843009, i32 16843009, i32 16843009, i32 16843009><br>
+}<br>
+<br>
+; Test a byte-granularity replicate with an arbitrary value.<br>
+define <4 x i32> @f2() {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vrepib %v24, -55<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 3385444809, i32 3385444809, i32 3385444809, i32 3385444809><br>
+}<br>
+<br>
+; Test a byte-granularity replicate with the highest useful value.<br>
+define <4 x i32> @f3() {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vrepib %v24, -2<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 4278124286, i32 4278124286, i32 4278124286, i32 4278124286><br>
+}<br>
+<br>
+; Test a halfword-granularity replicate with the lowest useful value.<br>
+define <4 x i32> @f4() {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vrepih %v24, 1<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 65537, i32 65537, i32 65537, i32 65537><br>
+}<br>
+<br>
+; Test a halfword-granularity replicate with an arbitrary value.<br>
+define <4 x i32> @f5() {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vrepih %v24, 25650<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 1681024050, i32 1681024050, i32 1681024050, i32 1681024050><br>
+}<br>
+<br>
+; Test a halfword-granularity replicate with the highest useful value.<br>
+define <4 x i32> @f6() {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vrepih %v24, -2<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 -65538, i32 -65538, i32 -65538, i32 -65538><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the lowest useful positive value.<br>
+define <4 x i32> @f7() {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vrepif %v24, 1<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 1, i32 1, i32 1, i32 1><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the highest in-range value.<br>
+define <4 x i32> @f8() {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vrepif %v24, 32767<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 32767, i32 32767, i32 32767, i32 32767><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the next highest value.<br>
+; This cannot use VREPIF.<br>
+define <4 x i32> @f9() {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK-NOT: vrepif<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 32768, i32 32768, i32 32768, i32 32768><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the lowest in-range value.<br>
+define <4 x i32> @f10() {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vrepif %v24, -32768<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 -32768, i32 -32768, i32 -32768, i32 -32768><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the next lowest value.<br>
+; This cannot use VREPIF.<br>
+define <4 x i32> @f11() {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK-NOT: vrepif<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 -32769, i32 -32769, i32 -32769, i32 -32769><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the highest useful negative value.<br>
+define <4 x i32> @f12() {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK: vrepif %v24, -2<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 -2, i32 -2, i32 -2, i32 -2><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the lowest useful positive<br>
+; value.<br>
+define <4 x i32> @f13() {<br>
+; CHECK-LABEL: f13:<br>
+; CHECK: vrepig %v24, 1<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 0, i32 1, i32 0, i32 1><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the highest in-range value.<br>
+define <4 x i32> @f14() {<br>
+; CHECK-LABEL: f14:<br>
+; CHECK: vrepig %v24, 32767<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 0, i32 32767, i32 0, i32 32767><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the next highest value.<br>
+; This cannot use VREPIG.<br>
+define <4 x i32> @f15() {<br>
+; CHECK-LABEL: f15:<br>
+; CHECK-NOT: vrepig<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 0, i32 32768, i32 0, i32 32768><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the lowest in-range value.<br>
+define <4 x i32> @f16() {<br>
+; CHECK-LABEL: f16:<br>
+; CHECK: vrepig %v24, -32768<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 -1, i32 -32768, i32 -1, i32 -32768><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the next lowest value.<br>
+; This cannot use VREPIG.<br>
+define <4 x i32> @f17() {<br>
+; CHECK-LABEL: f17:<br>
+; CHECK-NOT: vrepig<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 -1, i32 -32769, i32 -1, i32 -32769><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the highest useful negative<br>
+; value.<br>
+define <4 x i32> @f18() {<br>
+; CHECK-LABEL: f18:<br>
+; CHECK: vrepig %v24, -2<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 -1, i32 -2, i32 -1, i32 -2><br>
+}<br>
+<br>
+; Repeat f14 with undefs optimistically treated as 0, 32767.<br>
+define <4 x i32> @f19() {<br>
+; CHECK-LABEL: f19:<br>
+; CHECK: vrepig %v24, 32767<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 undef, i32 undef, i32 0, i32 32767><br>
+}<br>
+<br>
+; Repeat f18 with undefs optimistically treated as -2, -1.<br>
+define <4 x i32> @f20() {<br>
+; CHECK-LABEL: f20:<br>
+; CHECK: vrepig %v24, -2<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 -1, i32 undef, i32 undef, i32 -2><br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-const-10.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-10.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-10.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-const-10.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-const-10.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,169 @@<br>
+; Test vector replicates, v2i64 version.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a byte-granularity replicate with the lowest useful value.<br>
+define <2 x i64> @f1() {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vrepib %v24, 1<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 72340172838076673, i64 72340172838076673><br>
+}<br>
+<br>
+; Test a byte-granularity replicate with an arbitrary value.<br>
+define <2 x i64> @f2() {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vrepib %v24, -55<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 -3906369333256140343, i64 -3906369333256140343><br>
+}<br>
+<br>
+; Test a byte-granularity replicate with the highest useful value.<br>
+define <2 x i64> @f3() {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vrepib %v24, -2<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 -72340172838076674, i64 -72340172838076674><br>
+}<br>
+<br>
+; Test a halfword-granularity replicate with the lowest useful value.<br>
+define <2 x i64> @f4() {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vrepih %v24, 1<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 281479271743489, i64 281479271743489><br>
+}<br>
+<br>
+; Test a halfword-granularity replicate with an arbitrary value.<br>
+define <2 x i64> @f5() {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vrepih %v24, 25650<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 7219943320220492850, i64 7219943320220492850><br>
+}<br>
+<br>
+; Test a halfword-granularity replicate with the highest useful value.<br>
+define <2 x i64> @f6() {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vrepih %v24, -2<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 -281479271743490, i64 -281479271743490><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the lowest useful positive value.<br>
+define <2 x i64> @f7() {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vrepif %v24, 1<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 4294967297, i64 4294967297><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the highest in-range value.<br>
+define <2 x i64> @f8() {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vrepif %v24, 32767<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 140733193420799, i64 140733193420799><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the next highest value.<br>
+; This cannot use VREPIF.<br>
+define <2 x i64> @f9() {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK-NOT: vrepif<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 140737488388096, i64 140737488388096><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the lowest in-range value.<br>
+define <2 x i64> @f10() {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vrepif %v24, -32768<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 -140733193420800, i64 -140733193420800><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the next lowest value.<br>
+; This cannot use VREPIF.<br>
+define <2 x i64> @f11() {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK-NOT: vrepif<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 -140737488388097, i64 -140737488388097><br>
+}<br>
+<br>
+; Test a word-granularity replicate with the highest useful negative value.<br>
+define <2 x i64> @f12() {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK: vrepif %v24, -2<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 -4294967298, i64 -4294967298><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the lowest useful positive<br>
+; value.<br>
+define <2 x i64> @f13() {<br>
+; CHECK-LABEL: f13:<br>
+; CHECK: vrepig %v24, 1<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 1, i64 1><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the highest in-range value.<br>
+define <2 x i64> @f14() {<br>
+; CHECK-LABEL: f14:<br>
+; CHECK: vrepig %v24, 32767<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 32767, i64 32767><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the next highest value.<br>
+; This cannot use VREPIG.<br>
+define <2 x i64> @f15() {<br>
+; CHECK-LABEL: f15:<br>
+; CHECK-NOT: vrepig<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 32768, i64 32768><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the lowest in-range value.<br>
+define <2 x i64> @f16() {<br>
+; CHECK-LABEL: f16:<br>
+; CHECK: vrepig %v24, -32768<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 -32768, i64 -32768><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the next lowest value.<br>
+; This cannot use VREPIG.<br>
+define <2 x i64> @f17() {<br>
+; CHECK-LABEL: f17:<br>
+; CHECK-NOT: vrepig<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 -32769, i64 -32769><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the highest useful negative<br>
+; value.<br>
+define <2 x i64> @f18() {<br>
+; CHECK-LABEL: f18:<br>
+; CHECK: vrepig %v24, -2<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 -2, i64 -2><br>
+}<br>
+<br>
+; Repeat f14 with undefs optimistically treated as 32767.<br>
+define <2 x i64> @f19() {<br>
+; CHECK-LABEL: f19:<br>
+; CHECK: vrepig %v24, 32767<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 undef, i64 32767><br>
+}<br>
+<br>
+; Repeat f18 with undefs optimistically treated as -2.<br>
+define <2 x i64> @f20() {<br>
+; CHECK-LABEL: f20:<br>
+; CHECK: vrepig %v24, -2<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 undef, i64 -2><br>
+}<br>
<br>
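A note on the vec-const-07 to vec-const-10 tests above: going only by their CHECK lines, a replicated constant gets a VREPI instruction when the element value at some granularity fits a signed 16-bit immediate. The sketch below models that rule for one 64-bit doubleword of the vector. It is not the code from SystemZISelLowering.cpp; `VrepiMatch` and `findVrepi` are hypothetical names made up for this illustration, and trying the narrowest granularity first is an assumption.<br>
<br>
```cpp
#include <cstdint>
#include <initializer_list>

// Hypothetical result type: which element width matched and which immediate.
struct VrepiMatch {
  bool found;
  unsigned bits; // 8, 16, 32 or 64
  int imm;       // signed immediate, as printed in the CHECK lines
};

// Look for the narrowest element width at which `dword` is a replicate of a
// single element whose sign-extended value fits in 16 bits.
VrepiMatch findVrepi(uint64_t dword) {
  for (unsigned bits : {8u, 16u, 32u, 64u}) {
    const uint64_t mask = bits == 64 ? ~0ULL : ((1ULL << bits) - 1);
    const uint64_t elem = dword & mask;

    // Every element of this width must equal the lowest one.
    bool replicated = true;
    for (unsigned shift = bits; shift < 64; shift += bits)
      if (((dword >> shift) & mask) != elem)
        replicated = false;
    if (!replicated)
      continue;

    // Sign-extend the element from `bits` to 64 bits.
    const uint64_t signBit = 1ULL << (bits - 1);
    const int64_t value =
        bits == 64 ? static_cast<int64_t>(elem)
                   : static_cast<int64_t>(elem ^ signBit) -
                         static_cast<int64_t>(signBit);

    if (value >= -32768 && value <= 32767)
      return {true, bits, static_cast<int>(value)};
  }
  return {false, 0, 0};
}
```

For example, 4294967297 (0x0000000100000001) is not a byte or halfword replicate but is a word replicate of 1, which lines up with `vrepif %v24, 1` in f7, while 140737488388096 replicates the word 32768 and matches at no width, as in f9.<br>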
Added: llvm/trunk/test/CodeGen/SystemZ/vec-const-13.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-13.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-13.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-const-13.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-const-13.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,193 @@<br>
+; Test vector replicates that use VECTOR GENERATE MASK, v16i8 version.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a word-granularity replicate with the lowest value that cannot use<br>
+; VREPIF.<br>
+define <16 x i8> @f1() {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vgmf %v24, 16, 16<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 0, i8 0, i8 128, i8 0,<br>
+ i8 0, i8 0, i8 128, i8 0,<br>
+ i8 0, i8 0, i8 128, i8 0,<br>
+ i8 0, i8 0, i8 128, i8 0><br>
+}<br>
+<br>
+; Test a word-granularity replicate that has the lower 17 bits set.<br>
+define <16 x i8> @f2() {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vgmf %v24, 15, 31<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 0, i8 1, i8 255, i8 255,<br>
+ i8 0, i8 1, i8 255, i8 255,<br>
+ i8 0, i8 1, i8 255, i8 255,<br>
+ i8 0, i8 1, i8 255, i8 255><br>
+}<br>
+<br>
+; Test a word-granularity replicate that has the upper 15 bits set.<br>
+define <16 x i8> @f3() {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vgmf %v24, 0, 14<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 255, i8 254, i8 0, i8 0,<br>
+ i8 255, i8 254, i8 0, i8 0,<br>
+ i8 255, i8 254, i8 0, i8 0,<br>
+ i8 255, i8 254, i8 0, i8 0><br>
+}<br>
+<br>
+; Test a word-granularity replicate that has middle bits set.<br>
+define <16 x i8> @f4() {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vgmf %v24, 12, 17<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 0, i8 15, i8 192, i8 0,<br>
+ i8 0, i8 15, i8 192, i8 0,<br>
+ i8 0, i8 15, i8 192, i8 0,<br>
+ i8 0, i8 15, i8 192, i8 0><br>
+}<br>
+<br>
+; Test a word-granularity replicate with a wrap-around mask.<br>
+define <16 x i8> @f5() {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vgmf %v24, 17, 15<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 255, i8 255, i8 127, i8 255,<br>
+ i8 255, i8 255, i8 127, i8 255,<br>
+ i8 255, i8 255, i8 127, i8 255,<br>
+ i8 255, i8 255, i8 127, i8 255><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the lowest value that cannot<br>
+; use VREPIG.<br>
+define <16 x i8> @f6() {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vgmg %v24, 48, 48<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 0, i8 0, i8 0, i8 0,<br>
+ i8 0, i8 0, i8 128, i8 0,<br>
+ i8 0, i8 0, i8 0, i8 0,<br>
+ i8 0, i8 0, i8 128, i8 0><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate that has the lower 22 bits set.<br>
+define <16 x i8> @f7() {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vgmg %v24, 42, 63<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 0, i8 0, i8 0, i8 0,<br>
+ i8 0, i8 63, i8 255, i8 255,<br>
+ i8 0, i8 0, i8 0, i8 0,<br>
+ i8 0, i8 63, i8 255, i8 255><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate that has the upper 45 bits set.<br>
+define <16 x i8> @f8() {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vgmg %v24, 0, 44<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 255, i8 255, i8 255, i8 255,<br>
+ i8 255, i8 248, i8 0, i8 0,<br>
+ i8 255, i8 255, i8 255, i8 255,<br>
+ i8 255, i8 248, i8 0, i8 0><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate that has middle bits set.<br>
+define <16 x i8> @f9() {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vgmg %v24, 31, 42<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 0, i8 0, i8 0, i8 1,<br>
+ i8 255, i8 224, i8 0, i8 0,<br>
+ i8 0, i8 0, i8 0, i8 1,<br>
+ i8 255, i8 224, i8 0, i8 0><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with a wrap-around mask.<br>
+define <16 x i8> @f10() {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vgmg %v24, 18, 0<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 128, i8 0, i8 63, i8 255,<br>
+ i8 255, i8 255, i8 255, i8 255,<br>
+ i8 128, i8 0, i8 63, i8 255,<br>
+ i8 255, i8 255, i8 255, i8 255><br>
+}<br>
+<br>
+; Retest f1 with arbitrary undefs instead of 0s.<br>
+define <16 x i8> @f11() {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vgmf %v24, 16, 16<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 0, i8 undef, i8 128, i8 0,<br>
+ i8 0, i8 0, i8 128, i8 undef,<br>
+ i8 undef, i8 0, i8 128, i8 0,<br>
+ i8 undef, i8 undef, i8 128, i8 0><br>
+}<br>
+<br>
+; Try a case where we want consistent undefs to be treated as 0.<br>
+define <16 x i8> @f12() {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK: vgmf %v24, 15, 23<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 undef, i8 1, i8 255, i8 0,<br>
+ i8 undef, i8 1, i8 255, i8 0,<br>
+ i8 undef, i8 1, i8 255, i8 0,<br>
+ i8 undef, i8 1, i8 255, i8 0><br>
+}<br>
+<br>
+; ...and again with the lower bits of the replicated constant.<br>
+define <16 x i8> @f13() {<br>
+; CHECK-LABEL: f13:<br>
+; CHECK: vgmf %v24, 15, 22<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 0, i8 1, i8 254, i8 undef,<br>
+ i8 0, i8 1, i8 254, i8 undef,<br>
+ i8 0, i8 1, i8 254, i8 undef,<br>
+ i8 0, i8 1, i8 254, i8 undef><br>
+}<br>
+<br>
+; Try a case where we want consistent undefs to be treated as -1.<br>
+define <16 x i8> @f14() {<br>
+; CHECK-LABEL: f14:<br>
+; CHECK: vgmf %v24, 28, 8<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 undef, i8 128, i8 0, i8 15,<br>
+ i8 undef, i8 128, i8 0, i8 15,<br>
+ i8 undef, i8 128, i8 0, i8 15,<br>
+ i8 undef, i8 128, i8 0, i8 15><br>
+}<br>
+<br>
+; ...and again with the lower bits of the replicated constant.<br>
+define <16 x i8> @f15() {<br>
+; CHECK-LABEL: f15:<br>
+; CHECK: vgmf %v24, 18, 3<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 240, i8 0, i8 63, i8 undef,<br>
+ i8 240, i8 0, i8 63, i8 undef,<br>
+ i8 240, i8 0, i8 63, i8 undef,<br>
+ i8 240, i8 0, i8 63, i8 undef><br>
+}<br>
+<br>
+; Repeat f9 with arbitrary undefs.<br>
+define <16 x i8> @f16() {<br>
+; CHECK-LABEL: f16:<br>
+; CHECK: vgmg %v24, 31, 42<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 undef, i8 0, i8 undef, i8 1,<br>
+ i8 255, i8 undef, i8 0, i8 0,<br>
+ i8 0, i8 0, i8 0, i8 1,<br>
+ i8 undef, i8 224, i8 undef, i8 undef><br>
+}<br>
+<br>
+; Try a case where we want some consistent undefs to be treated as 0<br>
+; and some to be treated as 255.<br>
+define <16 x i8> @f17() {<br>
+; CHECK-LABEL: f17:<br>
+; CHECK: vgmg %v24, 23, 35<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> <i8 0, i8 undef, i8 1, i8 undef,<br>
+ i8 240, i8 undef, i8 0, i8 0,<br>
+ i8 0, i8 undef, i8 1, i8 undef,<br>
+ i8 240, i8 undef, i8 0, i8 0><br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-const-14.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-14.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-14.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-const-14.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-const-14.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,113 @@<br>
+; Test vector replicates that use VECTOR GENERATE MASK, v8i16 version.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a word-granularity replicate with the lowest value that cannot use<br>
+; VREPIF.<br>
+define <8 x i16> @f1() {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vgmf %v24, 16, 16<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 0, i16 32768, i16 0, i16 32768,<br>
+ i16 0, i16 32768, i16 0, i16 32768><br>
+}<br>
+<br>
+; Test a word-granularity replicate that has the lower 17 bits set.<br>
+define <8 x i16> @f2() {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vgmf %v24, 15, 31<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 1, i16 -1, i16 1, i16 -1,<br>
+ i16 1, i16 -1, i16 1, i16 -1><br>
+}<br>
+<br>
+; Test a word-granularity replicate that has the upper 15 bits set.<br>
+define <8 x i16> @f3() {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vgmf %v24, 0, 14<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 -2, i16 0, i16 -2, i16 0,<br>
+ i16 -2, i16 0, i16 -2, i16 0><br>
+}<br>
+<br>
+; Test a word-granularity replicate that has middle bits set.<br>
+define <8 x i16> @f4() {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vgmf %v24, 12, 17<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 15, i16 49152, i16 15, i16 49152,<br>
+ i16 15, i16 49152, i16 15, i16 49152><br>
+}<br>
+<br>
+; Test a word-granularity replicate with a wrap-around mask.<br>
+define <8 x i16> @f5() {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vgmf %v24, 17, 15<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 -1, i16 32767, i16 -1, i16 32767,<br>
+ i16 -1, i16 32767, i16 -1, i16 32767><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the lowest value that cannot<br>
+; use VREPIG.<br>
+define <8 x i16> @f6() {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vgmg %v24, 48, 48<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 0, i16 0, i16 0, i16 32768,<br>
+ i16 0, i16 0, i16 0, i16 32768><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate that has the lower 22 bits set.<br>
+define <8 x i16> @f7() {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vgmg %v24, 42, 63<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 0, i16 0, i16 63, i16 -1,<br>
+ i16 0, i16 0, i16 63, i16 -1><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate that has the upper 45 bits set.<br>
+define <8 x i16> @f8() {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vgmg %v24, 0, 44<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 -1, i16 -1, i16 -8, i16 0,<br>
+ i16 -1, i16 -1, i16 -8, i16 0><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate that has middle bits set.<br>
+define <8 x i16> @f9() {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vgmg %v24, 31, 42<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 0, i16 1, i16 -32, i16 0,<br>
+ i16 0, i16 1, i16 -32, i16 0><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with a wrap-around mask.<br>
+define <8 x i16> @f10() {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vgmg %v24, 18, 0<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 32768, i16 16383, i16 -1, i16 -1,<br>
+ i16 32768, i16 16383, i16 -1, i16 -1><br>
+}<br>
+<br>
+; Retest f1 with arbitrary undefs instead of 0s.<br>
+define <8 x i16> @f11() {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vgmf %v24, 16, 16<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 undef, i16 32768, i16 0, i16 32768,<br>
+ i16 0, i16 32768, i16 undef, i16 32768><br>
+}<br>
+<br>
+; ...likewise f9.<br>
+define <8 x i16> @f12() {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK: vgmg %v24, 31, 42<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> <i16 undef, i16 1, i16 -32, i16 0,<br>
+ i16 0, i16 1, i16 -32, i16 undef><br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-const-15.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-15.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-15.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-const-15.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-const-15.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,85 @@<br>
+; Test vector replicates that use VECTOR GENERATE MASK, v4i32 version.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a word-granularity replicate with the lowest value that cannot use<br>
+; VREPIF.<br>
+define <4 x i32> @f1() {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vgmf %v24, 16, 16<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 32768, i32 32768, i32 32768, i32 32768><br>
+}<br>
+<br>
+; Test a word-granularity replicate that has the lower 17 bits set.<br>
+define <4 x i32> @f2() {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vgmf %v24, 15, 31<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 131071, i32 131071, i32 131071, i32 131071><br>
+}<br>
+<br>
+; Test a word-granularity replicate that has the upper 15 bits set.<br>
+define <4 x i32> @f3() {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vgmf %v24, 0, 14<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 -131072, i32 -131072, i32 -131072, i32 -131072><br>
+}<br>
+<br>
+; Test a word-granularity replicate that has middle bits set.<br>
+define <4 x i32> @f4() {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vgmf %v24, 12, 17<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 1032192, i32 1032192, i32 1032192, i32 1032192><br>
+}<br>
+<br>
+; Test a word-granularity replicate with a wrap-around mask.<br>
+define <4 x i32> @f5() {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vgmf %v24, 17, 15<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 -32769, i32 -32769, i32 -32769, i32 -32769><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the lowest value that cannot<br>
+; use VREPIG.<br>
+define <4 x i32> @f6() {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vgmg %v24, 48, 48<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 0, i32 32768, i32 0, i32 32768><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate that has the lower 22 bits set.<br>
+define <4 x i32> @f7() {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vgmg %v24, 42, 63<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 0, i32 4194303, i32 0, i32 4194303><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate that has the upper 45 bits set.<br>
+define <4 x i32> @f8() {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vgmg %v24, 0, 44<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 -1, i32 -524288, i32 -1, i32 -524288><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate that has middle bits set.<br>
+define <4 x i32> @f9() {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vgmg %v24, 31, 42<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 1, i32 -2097152, i32 1, i32 -2097152><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with a wrap-around mask.<br>
+define <4 x i32> @f10() {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vgmg %v24, 18, 0<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> <i32 -2147467265, i32 -1, i32 -2147467265, i32 -1><br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-const-16.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-16.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-const-16.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-const-16.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-const-16.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,85 @@<br>
+; Test vector replicates that use VECTOR GENERATE MASK, v2i64 version.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a word-granularity replicate with the lowest value that cannot use<br>
+; VREPIF.<br>
+define <2 x i64> @f1() {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vgmf %v24, 16, 16<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 140737488388096, i64 140737488388096><br>
+}<br>
+<br>
+; Test a word-granularity replicate that has the lower 17 bits set.<br>
+define <2 x i64> @f2() {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vgmf %v24, 15, 31<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 562945658585087, i64 562945658585087><br>
+}<br>
+<br>
+; Test a word-granularity replicate that has the upper 15 bits set.<br>
+define <2 x i64> @f3() {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vgmf %v24, 0, 14<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 -562945658585088, i64 -562945658585088><br>
+}<br>
+<br>
+; Test a word-granularity replicate that has middle bits set.<br>
+define <2 x i64> @f4() {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vgmf %v24, 12, 17<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 4433230884225024, i64 4433230884225024><br>
+}<br>
+<br>
+; Test a word-granularity replicate with a wrap-around mask.<br>
+define <2 x i64> @f5() {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vgmf %v24, 17, 15<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 -140737488388097, i64 -140737488388097><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with the lowest value that cannot<br>
+; use VREPIG.<br>
+define <2 x i64> @f6() {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vgmg %v24, 48, 48<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 32768, i64 32768><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate that has the lower 22 bits set.<br>
+define <2 x i64> @f7() {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vgmg %v24, 42, 63<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 4194303, i64 4194303><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate that has the upper 45 bits set.<br>
+define <2 x i64> @f8() {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vgmg %v24, 0, 44<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 -524288, i64 -524288><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate that has middle bits set.<br>
+define <2 x i64> @f9() {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vgmg %v24, 31, 42<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 8587837440, i64 8587837440><br>
+}<br>
+<br>
+; Test a doubleword-granularity replicate with a wrap-around mask.<br>
+define <2 x i64> @f10() {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vgmg %v24, 18, 0<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> <i64 -9223301668110598145, i64 -9223301668110598145><br>
+}<br>
<br>
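For readers checking the VECTOR GENERATE MASK tests above by hand: the operands in the CHECK lines are consistent with a start and end bit position counted from the most significant bit of each element, with the range wrapping around when start is greater than end. A minimal sketch of that reading follows. `vgmElement` is a hypothetical helper written for this note, not an LLVM function, and the semantics are inferred from the expected constants in the tests.<br>
<br>
```cpp
#include <cstdint>

// Compute one element of the mask for element width `bits` (32 or 64 here).
// `start` and `end` are bit positions with bit 0 being the most significant.
// Assumes start < bits and end < bits.
uint64_t vgmElement(unsigned bits, unsigned start, unsigned end) {
  const uint64_t all = bits == 64 ? ~0ULL : ((1ULL << bits) - 1);

  // Bits start..bits-1 (everything from `start` down to the LSB).
  const uint64_t fromStart = all >> start;

  // Bits 0..end. The shift is split in two so that it never reaches the
  // full width of uint64_t, which would be undefined behaviour.
  const uint64_t throughEnd = all & ~((all >> 1) >> end);

  // A normal range is the intersection; a wrap-around range is the union.
  return start <= end ? (fromStart & throughEnd) : (fromStart | throughEnd);
}
```

With this model, `vgmf 12, 17` gives 1032192 (f4) and the wrap-around `vgmg 18, 0` gives 0x80003FFFFFFFFFFF, which is the i64 value -9223301668110598145 used in f10.<br>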
Added: llvm/trunk/test/CodeGen/SystemZ/vec-ctlz-01.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-ctlz-01.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-ctlz-01.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-ctlz-01.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-ctlz-01.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,81 @@<br>
+; Test vector count leading zeros.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+declare <16 x i8> @llvm.ctlz.v16i8(<16 x i8> %src, i1 %is_zero_undef)<br>
+declare <8 x i16> @llvm.ctlz.v8i16(<8 x i16> %src, i1 %is_zero_undef)<br>
+declare <4 x i32> @llvm.ctlz.v4i32(<4 x i32> %src, i1 %is_zero_undef)<br>
+declare <2 x i64> @llvm.ctlz.v2i64(<2 x i64> %src, i1 %is_zero_undef)<br>
+<br>
+define <16 x i8> @f1(<16 x i8> %a) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vclzb %v24, %v24<br>
+; CHECK: br %r14<br>
+<br>
+ %res = call <16 x i8> @llvm.ctlz.v16i8(<16 x i8> %a, i1 false)<br>
+ ret <16 x i8> %res<br>
+}<br>
+<br>
+define <16 x i8> @f2(<16 x i8> %a) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vclzb %v24, %v24<br>
+; CHECK: br %r14<br>
+<br>
+ %res = call <16 x i8> @llvm.ctlz.v16i8(<16 x i8> %a, i1 true)<br>
+ ret <16 x i8> %res<br>
+}<br>
+<br>
+define <8 x i16> @f3(<8 x i16> %a) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vclzh %v24, %v24<br>
+; CHECK: br %r14<br>
+<br>
+ %res = call <8 x i16> @llvm.ctlz.v8i16(<8 x i16> %a, i1 false)<br>
+ ret <8 x i16> %res<br>
+}<br>
+<br>
+define <8 x i16> @f4(<8 x i16> %a) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vclzh %v24, %v24<br>
+; CHECK: br %r14<br>
+<br>
+ %res = call <8 x i16> @llvm.ctlz.v8i16(<8 x i16> %a, i1 true)<br>
+ ret <8 x i16> %res<br>
+}<br>
+<br>
+define <4 x i32> @f5(<4 x i32> %a) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vclzf %v24, %v24<br>
+; CHECK: br %r14<br>
+<br>
+ %res = call <4 x i32> @llvm.ctlz.v4i32(<4 x i32> %a, i1 false)<br>
+ ret <4 x i32> %res<br>
+}<br>
+<br>
+define <4 x i32> @f6(<4 x i32> %a) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vclzf %v24, %v24<br>
+; CHECK: br %r14<br>
+<br>
+ %res = call <4 x i32> @llvm.ctlz.v4i32(<4 x i32> %a, i1 true)<br>
+ ret <4 x i32> %res<br>
+}<br>
+<br>
+define <2 x i64> @f7(<2 x i64> %a) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vclzg %v24, %v24<br>
+; CHECK: br %r14<br>
+<br>
+ %res = call <2 x i64> @llvm.ctlz.v2i64(<2 x i64> %a, i1 false)<br>
+ ret <2 x i64> %res<br>
+}<br>
+<br>
+define <2 x i64> @f8(<2 x i64> %a) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vclzg %v24, %v24<br>
+; CHECK: br %r14<br>
+<br>
+ %res = call <2 x i64> @llvm.ctlz.v2i64(<2 x i64> %a, i1 true)<br>
+ ret <2 x i64> %res<br>
+}<br>
+<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-ctpop-01.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-ctpop-01.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-ctpop-01.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-ctpop-01.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-ctpop-01.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,53 @@<br>
+; Test the vector population-count instruction.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+declare <16 x i8> @llvm.ctpop.v16i8(<16 x i8> %a)<br>
+declare <8 x i16> @llvm.ctpop.v8i16(<8 x i16> %a)<br>
+declare <4 x i32> @llvm.ctpop.v4i32(<4 x i32> %a)<br>
+declare <2 x i64> @llvm.ctpop.v2i64(<2 x i64> %a)<br>
+<br>
+define <16 x i8> @f1(<16 x i8> %a) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vpopct %v24, %v24, 0<br>
+; CHECK: br %r14<br>
+<br>
+ %popcnt = call <16 x i8> @llvm.ctpop.v16i8(<16 x i8> %a)<br>
+ ret <16 x i8> %popcnt<br>
+}<br>
+<br>
+define <8 x i16> @f2(<8 x i16> %a) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vpopct [[T1:%v[0-9]+]], %v24, 0<br>
+; CHECK: veslh [[T2:%v[0-9]+]], [[T1]], 8<br>
+; CHECK: vah [[T3:%v[0-9]+]], [[T1]], [[T2]]<br>
+; CHECK: vesrlh %v24, [[T3]], 8<br>
+; CHECK: br %r14<br>
+<br>
+ %popcnt = call <8 x i16> @llvm.ctpop.v8i16(<8 x i16> %a)<br>
+ ret <8 x i16> %popcnt<br>
+}<br>
+<br>
+define <4 x i32> @f3(<4 x i32> %a) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vpopct [[T1:%v[0-9]+]], %v24, 0<br>
+; CHECK: vgbm [[T2:%v[0-9]+]], 0<br>
+; CHECK: vsumb %v24, [[T1]], [[T2]]<br>
+; CHECK: br %r14<br>
+<br>
+ %popcnt = call <4 x i32> @llvm.ctpop.v4i32(<4 x i32> %a)<br>
+ ret <4 x i32> %popcnt<br>
+}<br>
+<br>
+define <2 x i64> @f4(<2 x i64> %a) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vpopct [[T1:%v[0-9]+]], %v24, 0<br>
+; CHECK: vgbm [[T2:%v[0-9]+]], 0<br>
+; CHECK: vsumb [[T3:%v[0-9]+]], [[T1]], [[T2]]<br>
+; CHECK: vsumgf %v24, [[T3]], [[T2]]<br>
+; CHECK: br %r14<br>
+<br>
+ %popcnt = call <2 x i64> @llvm.ctpop.v2i64(<2 x i64> %a)<br>
+ ret <2 x i64> %popcnt<br>
+}<br>
+<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-cttz-01.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-cttz-01.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-cttz-01.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-cttz-01.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-cttz-01.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,81 @@<br>
+; Test vector count trailing zeros<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+declare <16 x i8> @llvm.cttz.v16i8(<16 x i8> %src, i1 %is_zero_undef)<br>
+declare <8 x i16> @llvm.cttz.v8i16(<8 x i16> %src, i1 %is_zero_undef)<br>
+declare <4 x i32> @llvm.cttz.v4i32(<4 x i32> %src, i1 %is_zero_undef)<br>
+declare <2 x i64> @llvm.cttz.v2i64(<2 x i64> %src, i1 %is_zero_undef)<br>
+<br>
+define <16 x i8> @f1(<16 x i8> %a) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vctzb %v24, %v24<br>
+; CHECK: br %r14<br>
+<br>
+ %res = call <16 x i8> @llvm.cttz.v16i8(<16 x i8> %a, i1 false)<br>
+ ret <16 x i8> %res<br>
+}<br>
+<br>
+define <16 x i8> @f2(<16 x i8> %a) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vctzb %v24, %v24<br>
+; CHECK: br %r14<br>
+<br>
+ %res = call <16 x i8> @llvm.cttz.v16i8(<16 x i8> %a, i1 true)<br>
+ ret <16 x i8> %res<br>
+}<br>
+<br>
+define <8 x i16> @f3(<8 x i16> %a) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vctzh %v24, %v24<br>
+; CHECK: br %r14<br>
+<br>
+ %res = call <8 x i16> @llvm.cttz.v8i16(<8 x i16> %a, i1 false)<br>
+ ret <8 x i16> %res<br>
+}<br>
+<br>
+define <8 x i16> @f4(<8 x i16> %a) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vctzh %v24, %v24<br>
+; CHECK: br %r14<br>
+<br>
+ %res = call <8 x i16> @llvm.cttz.v8i16(<8 x i16> %a, i1 true)<br>
+ ret <8 x i16> %res<br>
+}<br>
+<br>
+define <4 x i32> @f5(<4 x i32> %a) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vctzf %v24, %v24<br>
+; CHECK: br %r14<br>
+<br>
+ %res = call <4 x i32> @llvm.cttz.v4i32(<4 x i32> %a, i1 false)<br>
+ ret <4 x i32> %res<br>
+}<br>
+<br>
+define <4 x i32> @f6(<4 x i32> %a) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vctzf %v24, %v24<br>
+; CHECK: br %r14<br>
+<br>
+ %res = call <4 x i32> @llvm.cttz.v4i32(<4 x i32> %a, i1 true)<br>
+ ret <4 x i32> %res<br>
+}<br>
+<br>
+define <2 x i64> @f7(<2 x i64> %a) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vctzg %v24, %v24<br>
+; CHECK: br %r14<br>
+<br>
+ %res = call <2 x i64> @llvm.cttz.v2i64(<2 x i64> %a, i1 false)<br>
+ ret <2 x i64> %res<br>
+}<br>
+<br>
+define <2 x i64> @f8(<2 x i64> %a) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vctzg %v24, %v24<br>
+; CHECK: br %r14<br>
+<br>
+ %res = call <2 x i64> @llvm.cttz.v2i64(<2 x i64> %a, i1 true)<br>
+ ret <2 x i64> %res<br>
+}<br>
+<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-div-01.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-div-01.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-div-01.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-div-01.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-div-01.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,62 @@<br>
+; Test vector division. There is no native support for this, so it's really<br>
+; a test of the operation legalization code.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a v16i8 division.<br>
+define <16 x i8> @f1(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vlvgp [[REG:%v[0-9]+]],<br>
+; CHECK-DAG: vlvgb [[REG]], {{%r[0-5]}}, 0<br>
+; CHECK-DAG: vlvgb [[REG]], {{%r[0-5]}}, 1<br>
+; CHECK-DAG: vlvgb [[REG]], {{%r[0-5]}}, 2<br>
+; CHECK-DAG: vlvgb [[REG]], {{%r[0-5]}}, 3<br>
+; CHECK-DAG: vlvgb [[REG]], {{%r[0-5]}}, 4<br>
+; CHECK-DAG: vlvgb [[REG]], {{%r[0-5]}}, 5<br>
+; CHECK-DAG: vlvgb [[REG]], {{%r[0-5]}}, 6<br>
+; CHECK-DAG: vlvgb [[REG]], {{%r[0-5]}}, 8<br>
+; CHECK-DAG: vlvgb [[REG]], {{%r[0-5]}}, 9<br>
+; CHECK-DAG: vlvgb [[REG]], {{%r[0-5]}}, 10<br>
+; CHECK-DAG: vlvgb [[REG]], {{%r[0-5]}}, 11<br>
+; CHECK-DAG: vlvgb [[REG]], {{%r[0-5]}}, 12<br>
+; CHECK-DAG: vlvgb [[REG]], {{%r[0-5]}}, 13<br>
+; CHECK-DAG: vlvgb [[REG]], {{%r[0-5]}}, 14<br>
+; CHECK: br %r14<br>
+ %ret = sdiv <16 x i8> %val1, %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v8i16 division.<br>
+define <8 x i16> @f2(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vlvgp [[REG:%v[0-9]+]],<br>
+; CHECK-DAG: vlvgh [[REG]], {{%r[0-5]}}, 0<br>
+; CHECK-DAG: vlvgh [[REG]], {{%r[0-5]}}, 1<br>
+; CHECK-DAG: vlvgh [[REG]], {{%r[0-5]}}, 2<br>
+; CHECK-DAG: vlvgh [[REG]], {{%r[0-5]}}, 4<br>
+; CHECK-DAG: vlvgh [[REG]], {{%r[0-5]}}, 5<br>
+; CHECK-DAG: vlvgh [[REG]], {{%r[0-5]}}, 6<br>
+; CHECK: br %r14<br>
+ %ret = sdiv <8 x i16> %val1, %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v4i32 division.<br>
+define <4 x i32> @f3(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vlvgp [[REG:%v[0-9]+]],<br>
+; CHECK-DAG: vlvgf [[REG]], {{%r[0-5]}}, 0<br>
+; CHECK-DAG: vlvgf [[REG]], {{%r[0-5]}}, 2<br>
+; CHECK: br %r14<br>
+ %ret = sdiv <4 x i32> %val1, %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v2i64 division.<br>
+define <2 x i64> @f4(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vlvgp %v24,<br>
+; CHECK: br %r14<br>
+ %ret = sdiv <2 x i64> %val1, %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-max-01.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-max-01.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-max-01.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-max-01.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-max-01.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,83 @@<br>
+; Test v16i8 maximum.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test with slt.<br>
+define <16 x i8> @f1(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vmxb %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp slt <16 x i8> %val1, %val2<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val2, <16 x i8> %val1<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test with sle.<br>
+define <16 x i8> @f2(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vmxb %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sle <16 x i8> %val1, %val2<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val2, <16 x i8> %val1<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test with sgt.<br>
+define <16 x i8> @f3(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vmxb %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sgt <16 x i8> %val1, %val2<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val1, <16 x i8> %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test with sge.<br>
+define <16 x i8> @f4(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vmxb %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sge <16 x i8> %val1, %val2<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val1, <16 x i8> %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test with ult.<br>
+define <16 x i8> @f5(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vmxlb %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ult <16 x i8> %val1, %val2<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val2, <16 x i8> %val1<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test with ule.<br>
+define <16 x i8> @f6(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vmxlb %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ule <16 x i8> %val1, %val2<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val2, <16 x i8> %val1<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test with ugt.<br>
+define <16 x i8> @f7(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vmxlb %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ugt <16 x i8> %val1, %val2<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val1, <16 x i8> %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test with uge.<br>
+define <16 x i8> @f8(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vmxlb %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp uge <16 x i8> %val1, %val2<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val1, <16 x i8> %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-max-02.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-max-02.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-max-02.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-max-02.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-max-02.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,83 @@<br>
+; Test v8i16 maximum.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test with slt.<br>
+define <8 x i16> @f1(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vmxh %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp slt <8 x i16> %val1, %val2<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val2, <8 x i16> %val1<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test with sle.<br>
+define <8 x i16> @f2(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vmxh %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sle <8 x i16> %val1, %val2<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val2, <8 x i16> %val1<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test with sgt.<br>
+define <8 x i16> @f3(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vmxh %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sgt <8 x i16> %val1, %val2<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val1, <8 x i16> %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test with sge.<br>
+define <8 x i16> @f4(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vmxh %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sge <8 x i16> %val1, %val2<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val1, <8 x i16> %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test with ult.<br>
+define <8 x i16> @f5(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vmxlh %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ult <8 x i16> %val1, %val2<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val2, <8 x i16> %val1<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test with ule.<br>
+define <8 x i16> @f6(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vmxlh %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ule <8 x i16> %val1, %val2<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val2, <8 x i16> %val1<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test with ugt.<br>
+define <8 x i16> @f7(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vmxlh %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ugt <8 x i16> %val1, %val2<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val1, <8 x i16> %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test with uge.<br>
+define <8 x i16> @f8(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vmxlh %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp uge <8 x i16> %val1, %val2<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val1, <8 x i16> %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-max-03.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-max-03.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-max-03.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-max-03.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-max-03.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,83 @@<br>
+; Test v4i32 maximum.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test with slt.<br>
+define <4 x i32> @f1(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vmxf %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp slt <4 x i32> %val1, %val2<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val2, <4 x i32> %val1<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test with sle.<br>
+define <4 x i32> @f2(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vmxf %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sle <4 x i32> %val1, %val2<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val2, <4 x i32> %val1<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test with sgt.<br>
+define <4 x i32> @f3(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vmxf %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sgt <4 x i32> %val1, %val2<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val1, <4 x i32> %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test with sge.<br>
+define <4 x i32> @f4(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vmxf %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sge <4 x i32> %val1, %val2<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val1, <4 x i32> %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test with ult.<br>
+define <4 x i32> @f5(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vmxlf %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ult <4 x i32> %val1, %val2<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val2, <4 x i32> %val1<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test with ule.<br>
+define <4 x i32> @f6(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vmxlf %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ule <4 x i32> %val1, %val2<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val2, <4 x i32> %val1<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test with ugt.<br>
+define <4 x i32> @f7(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vmxlf %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ugt <4 x i32> %val1, %val2<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val1, <4 x i32> %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test with uge.<br>
+define <4 x i32> @f8(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vmxlf %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp uge <4 x i32> %val1, %val2<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val1, <4 x i32> %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-max-04.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-max-04.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-max-04.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-max-04.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-max-04.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,83 @@<br>
+; Test v2i64 maximum.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test with slt.<br>
+define <2 x i64> @f1(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vmxg %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp slt <2 x i64> %val1, %val2<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val2, <2 x i64> %val1<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test with sle.<br>
+define <2 x i64> @f2(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vmxg %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sle <2 x i64> %val1, %val2<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val2, <2 x i64> %val1<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test with sgt.<br>
+define <2 x i64> @f3(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vmxg %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sgt <2 x i64> %val1, %val2<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val1, <2 x i64> %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test with sge.<br>
+define <2 x i64> @f4(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vmxg %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sge <2 x i64> %val1, %val2<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val1, <2 x i64> %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test with ult.<br>
+define <2 x i64> @f5(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vmxlg %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ult <2 x i64> %val1, %val2<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val2, <2 x i64> %val1<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test with ule.<br>
+define <2 x i64> @f6(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vmxlg %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ule <2 x i64> %val1, %val2<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val2, <2 x i64> %val1<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test with ugt.<br>
+define <2 x i64> @f7(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vmxlg %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ugt <2 x i64> %val1, %val2<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val1, <2 x i64> %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test with uge.<br>
+define <2 x i64> @f8(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vmxlg %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp uge <2 x i64> %val1, %val2<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val1, <2 x i64> %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-min-01.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-min-01.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-min-01.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-min-01.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-min-01.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,83 @@<br>
+; Test v16i8 minimum.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test with slt.<br>
+define <16 x i8> @f1(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vmnb %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp slt <16 x i8> %val2, %val1<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val2, <16 x i8> %val1<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test with sle.<br>
+define <16 x i8> @f2(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vmnb %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sle <16 x i8> %val2, %val1<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val2, <16 x i8> %val1<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test with sgt.<br>
+define <16 x i8> @f3(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vmnb %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sgt <16 x i8> %val2, %val1<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val1, <16 x i8> %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test with sge.<br>
+define <16 x i8> @f4(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vmnb %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sge <16 x i8> %val2, %val1<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val1, <16 x i8> %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test with ult.<br>
+define <16 x i8> @f5(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vmnlb %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ult <16 x i8> %val2, %val1<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val2, <16 x i8> %val1<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test with ule.<br>
+define <16 x i8> @f6(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vmnlb %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ule <16 x i8> %val2, %val1<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val2, <16 x i8> %val1<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test with ugt.<br>
+define <16 x i8> @f7(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vmnlb %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ugt <16 x i8> %val2, %val1<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val1, <16 x i8> %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test with uge.<br>
+define <16 x i8> @f8(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vmnlb %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp uge <16 x i8> %val2, %val1<br>
+ %ret = select <16 x i1> %cmp, <16 x i8> %val1, <16 x i8> %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-min-02.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-min-02.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-min-02.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-min-02.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-min-02.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,83 @@<br>
+; Test v8i16 minimum.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test with slt.<br>
+define <8 x i16> @f1(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vmnh %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp slt <8 x i16> %val2, %val1<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val2, <8 x i16> %val1<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test with sle.<br>
+define <8 x i16> @f2(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vmnh %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sle <8 x i16> %val2, %val1<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val2, <8 x i16> %val1<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test with sgt.<br>
+define <8 x i16> @f3(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vmnh %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sgt <8 x i16> %val2, %val1<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val1, <8 x i16> %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test with sge.<br>
+define <8 x i16> @f4(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vmnh %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sge <8 x i16> %val2, %val1<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val1, <8 x i16> %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test with ult.<br>
+define <8 x i16> @f5(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vmnlh %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ult <8 x i16> %val2, %val1<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val2, <8 x i16> %val1<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test with ule.<br>
+define <8 x i16> @f6(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vmnlh %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ule <8 x i16> %val2, %val1<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val2, <8 x i16> %val1<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test with ugt.<br>
+define <8 x i16> @f7(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vmnlh %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ugt <8 x i16> %val2, %val1<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val1, <8 x i16> %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test with uge.<br>
+define <8 x i16> @f8(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vmnlh %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp uge <8 x i16> %val2, %val1<br>
+ %ret = select <8 x i1> %cmp, <8 x i16> %val1, <8 x i16> %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-min-03.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-min-03.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-min-03.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-min-03.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-min-03.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,83 @@<br>
+; Test v4i32 minimum.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test with slt.<br>
+define <4 x i32> @f1(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vmnf %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp slt <4 x i32> %val2, %val1<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val2, <4 x i32> %val1<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test with sle.<br>
+define <4 x i32> @f2(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vmnf %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sle <4 x i32> %val2, %val1<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val2, <4 x i32> %val1<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test with sgt.<br>
+define <4 x i32> @f3(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vmnf %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sgt <4 x i32> %val2, %val1<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val1, <4 x i32> %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test with sge.<br>
+define <4 x i32> @f4(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vmnf %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sge <4 x i32> %val2, %val1<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val1, <4 x i32> %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test with ult.<br>
+define <4 x i32> @f5(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vmnlf %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ult <4 x i32> %val2, %val1<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val2, <4 x i32> %val1<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test with ule.<br>
+define <4 x i32> @f6(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vmnlf %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ule <4 x i32> %val2, %val1<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val2, <4 x i32> %val1<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test with ugt.<br>
+define <4 x i32> @f7(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vmnlf %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ugt <4 x i32> %val2, %val1<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val1, <4 x i32> %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test with uge.<br>
+define <4 x i32> @f8(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vmnlf %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp uge <4 x i32> %val2, %val1<br>
+ %ret = select <4 x i1> %cmp, <4 x i32> %val1, <4 x i32> %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-min-04.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-min-04.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-min-04.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-min-04.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-min-04.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,83 @@<br>
+; Test v2i64 minimum.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test with slt.<br>
+define <2 x i64> @f1(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vmng %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp slt <2 x i64> %val2, %val1<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val2, <2 x i64> %val1<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test with sle.<br>
+define <2 x i64> @f2(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vmng %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sle <2 x i64> %val2, %val1<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val2, <2 x i64> %val1<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test with sgt.<br>
+define <2 x i64> @f3(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vmng %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sgt <2 x i64> %val2, %val1<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val1, <2 x i64> %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test with sge.<br>
+define <2 x i64> @f4(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vmng %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp sge <2 x i64> %val2, %val1<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val1, <2 x i64> %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test with ult.<br>
+define <2 x i64> @f5(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vmnlg %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ult <2 x i64> %val2, %val1<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val2, <2 x i64> %val1<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test with ule.<br>
+define <2 x i64> @f6(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vmnlg %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ule <2 x i64> %val2, %val1<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val2, <2 x i64> %val1<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test with ugt.<br>
+define <2 x i64> @f7(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vmnlg %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp ugt <2 x i64> %val2, %val1<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val1, <2 x i64> %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test with uge.<br>
+define <2 x i64> @f8(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vmnlg %v24, {{%v24, %v26|%v26, %v24}}<br>
+; CHECK: br %r14<br>
+ %cmp = icmp uge <2 x i64> %val2, %val1<br>
+ %ret = select <2 x i1> %cmp, <2 x i64> %val1, <2 x i64> %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-move-01.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-01.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-01.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-move-01.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-move-01.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,35 @@<br>
+; Test vector register moves.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test v16i8 moves.<br>
+define <16 x i8> @f1(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vlr %v24, %v26<br>
+; CHECK: br %r14<br>
+ ret <16 x i8> %val2<br>
+}<br>
+<br>
+; Test v8i16 moves.<br>
+define <8 x i16> @f2(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vlr %v24, %v26<br>
+; CHECK: br %r14<br>
+ ret <8 x i16> %val2<br>
+}<br>
+<br>
+; Test v4i32 moves.<br>
+define <4 x i32> @f3(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vlr %v24, %v26<br>
+; CHECK: br %r14<br>
+ ret <4 x i32> %val2<br>
+}<br>
+<br>
+; Test v2i64 moves.<br>
+define <2 x i64> @f4(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vlr %v24, %v26<br>
+; CHECK: br %r14<br>
+ ret <2 x i64> %val2<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-move-02.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-02.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-02.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-move-02.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-move-02.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,93 @@<br>
+; Test vector loads.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test v16i8 loads.<br>
+define <16 x i8> @f1(<16 x i8> *%ptr) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vl %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %ret = load <16 x i8>, <16 x i8> *%ptr<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v8i16 loads.<br>
+define <8 x i16> @f2(<8 x i16> *%ptr) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vl %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %ret = load <8 x i16>, <8 x i16> *%ptr<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v4i32 loads.<br>
+define <4 x i32> @f3(<4 x i32> *%ptr) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vl %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %ret = load <4 x i32>, <4 x i32> *%ptr<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v2i64 loads.<br>
+define <2 x i64> @f4(<2 x i64> *%ptr) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vl %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %ret = load <2 x i64>, <2 x i64> *%ptr<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test the highest aligned in-range offset.<br>
+define <16 x i8> @f7(<16 x i8> *%base) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vl %v24, 4080(%r2)<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr <16 x i8>, <16 x i8> *%base, i64 255<br>
+ %ret = load <16 x i8>, <16 x i8> *%ptr<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test the highest unaligned in-range offset.<br>
+define <16 x i8> @f8(i8 *%base) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vl %v24, 4095(%r2)<br>
+; CHECK: br %r14<br>
+ %addr = getelementptr i8, i8 *%base, i64 4095<br>
+ %ptr = bitcast i8 *%addr to <16 x i8> *<br>
+ %ret = load <16 x i8>, <16 x i8> *%ptr, align 1<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test the next offset up, which requires separate address logic.<br>
+define <16 x i8> @f9(<16 x i8> *%base) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: aghi %r2, 4096<br>
+; CHECK: vl %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr <16 x i8>, <16 x i8> *%base, i64 256<br>
+ %ret = load <16 x i8>, <16 x i8> *%ptr<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test negative offsets, which also require separate address logic.<br>
+define <16 x i8> @f10(<16 x i8> *%base) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: aghi %r2, -16<br>
+; CHECK: vl %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr <16 x i8>, <16 x i8> *%base, i64 -1<br>
+ %ret = load <16 x i8>, <16 x i8> *%ptr<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Check that indexes are allowed.<br>
+define <16 x i8> @f11(i8 *%base, i64 %index) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vl %v24, 0(%r3,%r2)<br>
+; CHECK: br %r14<br>
+ %addr = getelementptr i8, i8 *%base, i64 %index<br>
+ %ptr = bitcast i8 *%addr to <16 x i8> *<br>
+ %ret = load <16 x i8>, <16 x i8> *%ptr, align 1<br>
+ ret <16 x i8> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-move-03.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-03.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-03.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-move-03.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-move-03.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,93 @@<br>
+; Test vector stores.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test v16i8 stores.<br>
+define void @f1(<16 x i8> %val, <16 x i8> *%ptr) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vst %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ store <16 x i8> %val, <16 x i8> *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v8i16 stores.<br>
+define void @f2(<8 x i16> %val, <8 x i16> *%ptr) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vst %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ store <8 x i16> %val, <8 x i16> *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v4i32 stores.<br>
+define void @f3(<4 x i32> %val, <4 x i32> *%ptr) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vst %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ store <4 x i32> %val, <4 x i32> *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v2i64 stores.<br>
+define void @f4(<2 x i64> %val, <2 x i64> *%ptr) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vst %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ store <2 x i64> %val, <2 x i64> *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test the highest aligned in-range offset.<br>
+define void @f7(<16 x i8> %val, <16 x i8> *%base) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vst %v24, 4080(%r2)<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr <16 x i8>, <16 x i8> *%base, i64 255<br>
+ store <16 x i8> %val, <16 x i8> *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test the highest unaligned in-range offset.<br>
+define void @f8(<16 x i8> %val, i8 *%base) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vst %v24, 4095(%r2)<br>
+; CHECK: br %r14<br>
+ %addr = getelementptr i8, i8 *%base, i64 4095<br>
+ %ptr = bitcast i8 *%addr to <16 x i8> *<br>
+ store <16 x i8> %val, <16 x i8> *%ptr, align 1<br>
+ ret void<br>
+}<br>
+<br>
+; Test the next offset up, which requires separate address logic.<br>
+define void @f9(<16 x i8> %val, <16 x i8> *%base) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: aghi %r2, 4096<br>
+; CHECK: vst %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr <16 x i8>, <16 x i8> *%base, i64 256<br>
+ store <16 x i8> %val, <16 x i8> *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test negative offsets, which also require separate address logic.<br>
+define void @f10(<16 x i8> %val, <16 x i8> *%base) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: aghi %r2, -16<br>
+; CHECK: vst %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr <16 x i8>, <16 x i8> *%base, i64 -1<br>
+ store <16 x i8> %val, <16 x i8> *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Check that indexes are allowed.<br>
+define void @f11(<16 x i8> %val, i8 *%base, i64 %index) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vst %v24, 0(%r3,%r2)<br>
+; CHECK: br %r14<br>
+ %addr = getelementptr i8, i8 *%base, i64 %index<br>
+ %ptr = bitcast i8 *%addr to <16 x i8> *<br>
+ store <16 x i8> %val, <16 x i8> *%ptr, align 1<br>
+ ret void<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-move-04.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-04.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-04.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-move-04.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-move-04.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,121 @@<br>
+; Test vector insertion of register variables.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test v16i8 insertion into the first element.<br>
+define <16 x i8> @f1(<16 x i8> %val, i8 %element) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vlvgb %v24, %r2, 0<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <16 x i8> %val, i8 %element, i32 0<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v16i8 insertion into the last element.<br>
+define <16 x i8> @f2(<16 x i8> %val, i8 %element) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vlvgb %v24, %r2, 15<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <16 x i8> %val, i8 %element, i32 15<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v16i8 insertion into a variable element.<br>
+define <16 x i8> @f3(<16 x i8> %val, i8 %element, i32 %index) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vlvgb %v24, %r2, 0(%r3)<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <16 x i8> %val, i8 %element, i32 %index<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v8i16 insertion into the first element.<br>
+define <8 x i16> @f4(<8 x i16> %val, i16 %element) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vlvgh %v24, %r2, 0<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <8 x i16> %val, i16 %element, i32 0<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v8i16 insertion into the last element.<br>
+define <8 x i16> @f5(<8 x i16> %val, i16 %element) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vlvgh %v24, %r2, 7<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <8 x i16> %val, i16 %element, i32 7<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v8i16 insertion into a variable element.<br>
+define <8 x i16> @f6(<8 x i16> %val, i16 %element, i32 %index) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vlvgh %v24, %r2, 0(%r3)<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <8 x i16> %val, i16 %element, i32 %index<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v4i32 insertion into the first element.<br>
+define <4 x i32> @f7(<4 x i32> %val, i32 %element) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vlvgf %v24, %r2, 0<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <4 x i32> %val, i32 %element, i32 0<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v4i32 insertion into the last element.<br>
+define <4 x i32> @f8(<4 x i32> %val, i32 %element) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vlvgf %v24, %r2, 3<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <4 x i32> %val, i32 %element, i32 3<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v4i32 insertion into a variable element.<br>
+define <4 x i32> @f9(<4 x i32> %val, i32 %element, i32 %index) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vlvgf %v24, %r2, 0(%r3)<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <4 x i32> %val, i32 %element, i32 %index<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v2i64 insertion into the first element.<br>
+define <2 x i64> @f10(<2 x i64> %val, i64 %element) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vlvgg %v24, %r2, 0<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <2 x i64> %val, i64 %element, i32 0<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test v2i64 insertion into the last element.<br>
+define <2 x i64> @f11(<2 x i64> %val, i64 %element) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vlvgg %v24, %r2, 1<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <2 x i64> %val, i64 %element, i32 1<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test v2i64 insertion into a variable element.<br>
+define <2 x i64> @f12(<2 x i64> %val, i64 %element, i32 %index) {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK: vlvgg %v24, %r2, 0(%r3)<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <2 x i64> %val, i64 %element, i32 %index<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test v16i8 insertion into a variable element plus one.<br>
+define <16 x i8> @f19(<16 x i8> %val, i8 %element, i32 %index) {<br>
+; CHECK-LABEL: f19:<br>
+; CHECK: vlvgb %v24, %r2, 1(%r3)<br>
+; CHECK: br %r14<br>
+ %add = add i32 %index, 1<br>
+ %ret = insertelement <16 x i8> %val, i8 %element, i32 %add<br>
+ ret <16 x i8> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-move-05.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-05.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-05.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-move-05.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-move-05.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,161 @@<br>
+; Test vector extraction.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test v16i8 extraction of the first element.<br>
+define i8 @f1(<16 x i8> %val) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vlgvb %r2, %v24, 0<br>
+; CHECK: br %r14<br>
+ %ret = extractelement <16 x i8> %val, i32 0<br>
+ ret i8 %ret<br>
+}<br>
+<br>
+; Test v16i8 extraction of the last element.<br>
+define i8 @f2(<16 x i8> %val) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vlgvb %r2, %v24, 15<br>
+; CHECK: br %r14<br>
+ %ret = extractelement <16 x i8> %val, i32 15<br>
+ ret i8 %ret<br>
+}<br>
+<br>
+; Test v16i8 extractions of an absurd element number. This must compile<br>
+; but we don't care what it does.<br>
+define i8 @f3(<16 x i8> %val) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK-NOT: vlgvb %r2, %v24, 100000<br>
+; CHECK: br %r14<br>
+ %ret = extractelement <16 x i8> %val, i32 100000<br>
+ ret i8 %ret<br>
+}<br>
+<br>
+; Test v16i8 extraction of a variable element.<br>
+define i8 @f4(<16 x i8> %val, i32 %index) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vlgvb %r2, %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %ret = extractelement <16 x i8> %val, i32 %index<br>
+ ret i8 %ret<br>
+}<br>
+<br>
+; Test v8i16 extraction of the first element.<br>
+define i16 @f5(<8 x i16> %val) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vlgvh %r2, %v24, 0<br>
+; CHECK: br %r14<br>
+ %ret = extractelement <8 x i16> %val, i32 0<br>
+ ret i16 %ret<br>
+}<br>
+<br>
+; Test v8i16 extraction of the last element.<br>
+define i16 @f6(<8 x i16> %val) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vlgvh %r2, %v24, 7<br>
+; CHECK: br %r14<br>
+ %ret = extractelement <8 x i16> %val, i32 7<br>
+ ret i16 %ret<br>
+}<br>
+<br>
+; Test v8i16 extractions of an absurd element number. This must compile<br>
+; but we don't care what it does.<br>
+define i16 @f7(<8 x i16> %val) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK-NOT: vlgvh %r2, %v24, 100000<br>
+; CHECK: br %r14<br>
+ %ret = extractelement <8 x i16> %val, i32 100000<br>
+ ret i16 %ret<br>
+}<br>
+<br>
+; Test v8i16 extraction of a variable element.<br>
+define i16 @f8(<8 x i16> %val, i32 %index) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vlgvh %r2, %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %ret = extractelement <8 x i16> %val, i32 %index<br>
+ ret i16 %ret<br>
+}<br>
+<br>
+; Test v4i32 extraction of the first element.<br>
+define i32 @f9(<4 x i32> %val) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vlgvf %r2, %v24, 0<br>
+; CHECK: br %r14<br>
+ %ret = extractelement <4 x i32> %val, i32 0<br>
+ ret i32 %ret<br>
+}<br>
+<br>
+; Test v4i32 extraction of the last element.<br>
+define i32 @f10(<4 x i32> %val) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vlgvf %r2, %v24, 3<br>
+; CHECK: br %r14<br>
+ %ret = extractelement <4 x i32> %val, i32 3<br>
+ ret i32 %ret<br>
+}<br>
+<br>
+; Test v4i32 extractions of an absurd element number. This must compile<br>
+; but we don't care what it does.<br>
+define i32 @f11(<4 x i32> %val) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK-NOT: vlgvf %r2, %v24, 100000<br>
+; CHECK: br %r14<br>
+ %ret = extractelement <4 x i32> %val, i32 100000<br>
+ ret i32 %ret<br>
+}<br>
+<br>
+; Test v4i32 extraction of a variable element.<br>
+define i32 @f12(<4 x i32> %val, i32 %index) {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK: vlgvf %r2, %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %ret = extractelement <4 x i32> %val, i32 %index<br>
+ ret i32 %ret<br>
+}<br>
+<br>
+; Test v2i64 extraction of the first element.<br>
+define i64 @f13(<2 x i64> %val) {<br>
+; CHECK-LABEL: f13:<br>
+; CHECK: vlgvg %r2, %v24, 0<br>
+; CHECK: br %r14<br>
+ %ret = extractelement <2 x i64> %val, i32 0<br>
+ ret i64 %ret<br>
+}<br>
+<br>
+; Test v2i64 extraction of the last element.<br>
+define i64 @f14(<2 x i64> %val) {<br>
+; CHECK-LABEL: f14:<br>
+; CHECK: vlgvg %r2, %v24, 1<br>
+; CHECK: br %r14<br>
+ %ret = extractelement <2 x i64> %val, i32 1<br>
+ ret i64 %ret<br>
+}<br>
+<br>
+; Test v2i64 extractions of an absurd element number. This must compile<br>
+; but we don't care what it does.<br>
+define i64 @f15(<2 x i64> %val) {<br>
+; CHECK-LABEL: f15:<br>
+; CHECK-NOT: vlgvg %r2, %v24, 100000<br>
+; CHECK: br %r14<br>
+ %ret = extractelement <2 x i64> %val, i32 100000<br>
+ ret i64 %ret<br>
+}<br>
+<br>
+; Test v2i64 extraction of a variable element.<br>
+define i64 @f16(<2 x i64> %val, i32 %index) {<br>
+; CHECK-LABEL: f16:<br>
+; CHECK: vlgvg %r2, %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %ret = extractelement <2 x i64> %val, i32 %index<br>
+ ret i64 %ret<br>
+}<br>
+<br>
+; Test v16i8 extraction of a variable element with an offset.<br>
+define i8 @f27(<16 x i8> %val, i32 %index) {<br>
+; CHECK-LABEL: f27:<br>
+; CHECK: vlgvb %r2, %v24, 1(%r2)<br>
+; CHECK: br %r14<br>
+ %add = add i32 %index, 1<br>
+ %ret = extractelement <16 x i8> %val, i32 %add<br>
+ ret i8 %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-move-06.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-06.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-06.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-move-06.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-move-06.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,13 @@<br>
+; Test vector builds using VLVGP.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test the basic v2i64 usage.<br>
+define <2 x i64> @f1(i64 %a, i64 %b) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vlvgp %v24, %r2, %r3<br>
+; CHECK: br %r14<br>
+ %veca = insertelement <2 x i64> undef, i64 %a, i32 0<br>
+ %vecb = insertelement <2 x i64> %veca, i64 %b, i32 1<br>
+ ret <2 x i64> %vecb<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-move-07.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-07.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-07.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-move-07.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-move-07.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,39 @@<br>
+; Test scalar_to_vector expansion.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test v16i8.<br>
+define <16 x i8> @f1(i8 %val) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vlvgb %v24, %r2, 0<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <16 x i8> undef, i8 %val, i32 0<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v8i16.<br>
+define <8 x i16> @f2(i16 %val) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vlvgh %v24, %r2, 0<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <8 x i16> undef, i16 %val, i32 0<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v4i32.<br>
+define <4 x i32> @f3(i32 %val) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vlvgf %v24, %r2, 0<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <4 x i32> undef, i32 %val, i32 0<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v2i64. Here we load %val into both halves.<br>
+define <2 x i64> @f4(i64 %val) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vlvgp %v24, %r2, %r2<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <2 x i64> undef, i64 %val, i32 0<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-move-08.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-08.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-08.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-move-08.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-move-08.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,284 @@<br>
+; Test vector insertion of memory values.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test v16i8 insertion into the first element.<br>
+define <16 x i8> @f1(<16 x i8> %val, i8 *%ptr) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vleb %v24, 0(%r2), 0<br>
+; CHECK: br %r14<br>
+ %element = load i8, i8 *%ptr<br>
+ %ret = insertelement <16 x i8> %val, i8 %element, i32 0<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v16i8 insertion into the last element.<br>
+define <16 x i8> @f2(<16 x i8> %val, i8 *%ptr) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vleb %v24, 0(%r2), 15<br>
+; CHECK: br %r14<br>
+ %element = load i8, i8 *%ptr<br>
+ %ret = insertelement <16 x i8> %val, i8 %element, i32 15<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v16i8 insertion with the highest in-range offset.<br>
+define <16 x i8> @f3(<16 x i8> %val, i8 *%base) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vleb %v24, 4095(%r2), 10<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i8, i8 *%base, i32 4095<br>
+ %element = load i8, i8 *%ptr<br>
+ %ret = insertelement <16 x i8> %val, i8 %element, i32 10<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v16i8 insertion with the first out-of-range offset.<br>
+define <16 x i8> @f4(<16 x i8> %val, i8 *%base) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: aghi %r2, 4096<br>
+; CHECK: vleb %v24, 0(%r2), 5<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i8, i8 *%base, i32 4096<br>
+ %element = load i8, i8 *%ptr<br>
+ %ret = insertelement <16 x i8> %val, i8 %element, i32 5<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v16i8 insertion into a variable element.<br>
+define <16 x i8> @f5(<16 x i8> %val, i8 *%ptr, i32 %index) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK-NOT: vleb<br>
+; CHECK: br %r14<br>
+ %element = load i8, i8 *%ptr<br>
+ %ret = insertelement <16 x i8> %val, i8 %element, i32 %index<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v8i16 insertion into the first element.<br>
+define <8 x i16> @f6(<8 x i16> %val, i16 *%ptr) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vleh %v24, 0(%r2), 0<br>
+; CHECK: br %r14<br>
+ %element = load i16, i16 *%ptr<br>
+ %ret = insertelement <8 x i16> %val, i16 %element, i32 0<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v8i16 insertion into the last element.<br>
+define <8 x i16> @f7(<8 x i16> %val, i16 *%ptr) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vleh %v24, 0(%r2), 7<br>
+; CHECK: br %r14<br>
+ %element = load i16, i16 *%ptr<br>
+ %ret = insertelement <8 x i16> %val, i16 %element, i32 7<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v8i16 insertion with the highest in-range offset.<br>
+define <8 x i16> @f8(<8 x i16> %val, i16 *%base) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vleh %v24, 4094(%r2), 5<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i16, i16 *%base, i32 2047<br>
+ %element = load i16, i16 *%ptr<br>
+ %ret = insertelement <8 x i16> %val, i16 %element, i32 5<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v8i16 insertion with the first out-of-range offset.<br>
+define <8 x i16> @f9(<8 x i16> %val, i16 *%base) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: aghi %r2, 4096<br>
+; CHECK: vleh %v24, 0(%r2), 1<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i16, i16 *%base, i32 2048<br>
+ %element = load i16, i16 *%ptr<br>
+ %ret = insertelement <8 x i16> %val, i16 %element, i32 1<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v8i16 insertion into a variable element.<br>
+define <8 x i16> @f10(<8 x i16> %val, i16 *%ptr, i32 %index) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK-NOT: vleh<br>
+; CHECK: br %r14<br>
+ %element = load i16, i16 *%ptr<br>
+ %ret = insertelement <8 x i16> %val, i16 %element, i32 %index<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v4i32 insertion into the first element.<br>
+define <4 x i32> @f11(<4 x i32> %val, i32 *%ptr) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vlef %v24, 0(%r2), 0<br>
+; CHECK: br %r14<br>
+ %element = load i32, i32 *%ptr<br>
+ %ret = insertelement <4 x i32> %val, i32 %element, i32 0<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v4i32 insertion into the last element.<br>
+define <4 x i32> @f12(<4 x i32> %val, i32 *%ptr) {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK: vlef %v24, 0(%r2), 3<br>
+; CHECK: br %r14<br>
+ %element = load i32, i32 *%ptr<br>
+ %ret = insertelement <4 x i32> %val, i32 %element, i32 3<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v4i32 insertion with the highest in-range offset.<br>
+define <4 x i32> @f13(<4 x i32> %val, i32 *%base) {<br>
+; CHECK-LABEL: f13:<br>
+; CHECK: vlef %v24, 4092(%r2), 2<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i32, i32 *%base, i32 1023<br>
+ %element = load i32, i32 *%ptr<br>
+ %ret = insertelement <4 x i32> %val, i32 %element, i32 2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v4i32 insertion with the first out-of-range offset.<br>
+define <4 x i32> @f14(<4 x i32> %val, i32 *%base) {<br>
+; CHECK-LABEL: f14:<br>
+; CHECK: aghi %r2, 4096<br>
+; CHECK: vlef %v24, 0(%r2), 1<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i32, i32 *%base, i32 1024<br>
+ %element = load i32, i32 *%ptr<br>
+ %ret = insertelement <4 x i32> %val, i32 %element, i32 1<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v4i32 insertion into a variable element.<br>
+define <4 x i32> @f15(<4 x i32> %val, i32 *%ptr, i32 %index) {<br>
+; CHECK-LABEL: f15:<br>
+; CHECK-NOT: vlef<br>
+; CHECK: br %r14<br>
+ %element = load i32, i32 *%ptr<br>
+ %ret = insertelement <4 x i32> %val, i32 %element, i32 %index<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v2i64 insertion into the first element.<br>
+define <2 x i64> @f16(<2 x i64> %val, i64 *%ptr) {<br>
+; CHECK-LABEL: f16:<br>
+; CHECK: vleg %v24, 0(%r2), 0<br>
+; CHECK: br %r14<br>
+ %element = load i64, i64 *%ptr<br>
+ %ret = insertelement <2 x i64> %val, i64 %element, i32 0<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test v2i64 insertion into the last element.<br>
+define <2 x i64> @f17(<2 x i64> %val, i64 *%ptr) {<br>
+; CHECK-LABEL: f17:<br>
+; CHECK: vleg %v24, 0(%r2), 1<br>
+; CHECK: br %r14<br>
+ %element = load i64, i64 *%ptr<br>
+ %ret = insertelement <2 x i64> %val, i64 %element, i32 1<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test v2i64 insertion with the highest in-range offset.<br>
+define <2 x i64> @f18(<2 x i64> %val, i64 *%base) {<br>
+; CHECK-LABEL: f18:<br>
+; CHECK: vleg %v24, 4088(%r2), 1<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i64, i64 *%base, i32 511<br>
+ %element = load i64, i64 *%ptr<br>
+ %ret = insertelement <2 x i64> %val, i64 %element, i32 1<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test v2i64 insertion with the first out-of-range offset.<br>
+define <2 x i64> @f19(<2 x i64> %val, i64 *%base) {<br>
+; CHECK-LABEL: f19:<br>
+; CHECK: aghi %r2, 4096<br>
+; CHECK: vleg %v24, 0(%r2), 0<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i64, i64 *%base, i32 512<br>
+ %element = load i64, i64 *%ptr<br>
+ %ret = insertelement <2 x i64> %val, i64 %element, i32 0<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test v2i64 insertion into a variable element.<br>
+define <2 x i64> @f20(<2 x i64> %val, i64 *%ptr, i32 %index) {<br>
+; CHECK-LABEL: f20:<br>
+; CHECK-NOT: vleg<br>
+; CHECK: br %r14<br>
+ %element = load i64, i64 *%ptr<br>
+ %ret = insertelement <2 x i64> %val, i64 %element, i32 %index<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test a v4i32 gather of the first element.<br>
+define <4 x i32> @f31(<4 x i32> %val, <4 x i32> %index, i64 %base) {<br>
+; CHECK-LABEL: f31:<br>
+; CHECK: vgef %v24, 0(%v26,%r2), 0<br>
+; CHECK: br %r14<br>
+ %elem = extractelement <4 x i32> %index, i32 0<br>
+ %ext = zext i32 %elem to i64<br>
+ %add = add i64 %base, %ext<br>
+ %ptr = inttoptr i64 %add to i32 *<br>
+ %element = load i32, i32 *%ptr<br>
+ %ret = insertelement <4 x i32> %val, i32 %element, i32 0<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v4i32 gather of the last element.<br>
+define <4 x i32> @f32(<4 x i32> %val, <4 x i32> %index, i64 %base) {<br>
+; CHECK-LABEL: f32:<br>
+; CHECK: vgef %v24, 0(%v26,%r2), 3<br>
+; CHECK: br %r14<br>
+ %elem = extractelement <4 x i32> %index, i32 3<br>
+ %ext = zext i32 %elem to i64<br>
+ %add = add i64 %base, %ext<br>
+ %ptr = inttoptr i64 %add to i32 *<br>
+ %element = load i32, i32 *%ptr<br>
+ %ret = insertelement <4 x i32> %val, i32 %element, i32 3<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v4i32 gather with the highest in-range offset.<br>
+define <4 x i32> @f33(<4 x i32> %val, <4 x i32> %index, i64 %base) {<br>
+; CHECK-LABEL: f33:<br>
+; CHECK: vgef %v24, 4095(%v26,%r2), 1<br>
+; CHECK: br %r14<br>
+ %elem = extractelement <4 x i32> %index, i32 1<br>
+ %ext = zext i32 %elem to i64<br>
+ %add1 = add i64 %base, %ext<br>
+ %add2 = add i64 %add1, 4095<br>
+ %ptr = inttoptr i64 %add2 to i32 *<br>
+ %element = load i32, i32 *%ptr<br>
+ %ret = insertelement <4 x i32> %val, i32 %element, i32 1<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v2i64 gather of the first element.<br>
+define <2 x i64> @f34(<2 x i64> %val, <2 x i64> %index, i64 %base) {<br>
+; CHECK-LABEL: f34:<br>
+; CHECK: vgeg %v24, 0(%v26,%r2), 0<br>
+; CHECK: br %r14<br>
+ %elem = extractelement <2 x i64> %index, i32 0<br>
+ %add = add i64 %base, %elem<br>
+ %ptr = inttoptr i64 %add to i64 *<br>
+ %element = load i64, i64 *%ptr<br>
+ %ret = insertelement <2 x i64> %val, i64 %element, i32 0<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test a v2i64 gather of the last element.<br>
+define <2 x i64> @f35(<2 x i64> %val, <2 x i64> %index, i64 %base) {<br>
+; CHECK-LABEL: f35:<br>
+; CHECK: vgeg %v24, 0(%v26,%r2), 1<br>
+; CHECK: br %r14<br>
+ %elem = extractelement <2 x i64> %index, i32 1<br>
+ %add = add i64 %base, %elem<br>
+ %ptr = inttoptr i64 %add to i64 *<br>
+ %element = load i64, i64 *%ptr<br>
+ %ret = insertelement <2 x i64> %val, i64 %element, i32 1<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-move-09.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-09.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-09.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-move-09.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-move-09.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,237 @@<br>
+; Test vector insertion of constants.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test v16i8 insertion into the first element.<br>
+define <16 x i8> @f1(<16 x i8> %val) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vleib %v24, 0, 0<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <16 x i8> %val, i8 0, i32 0<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v16i8 insertion into the last element.<br>
+define <16 x i8> @f2(<16 x i8> %val) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vleib %v24, 100, 15<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <16 x i8> %val, i8 100, i32 15<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v16i8 insertion with the maximum signed value.<br>
+define <16 x i8> @f3(<16 x i8> %val) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vleib %v24, 127, 10<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <16 x i8> %val, i8 127, i32 10<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v16i8 insertion with the minimum signed value.<br>
+define <16 x i8> @f4(<16 x i8> %val) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vleib %v24, -128, 11<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <16 x i8> %val, i8 128, i32 11<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v16i8 insertion with the maximum unsigned value.<br>
+define <16 x i8> @f5(<16 x i8> %val) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vleib %v24, -1, 12<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <16 x i8> %val, i8 255, i32 12<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v16i8 insertion into a variable element.<br>
+define <16 x i8> @f6(<16 x i8> %val, i32 %index) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK-NOT: vleib<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <16 x i8> %val, i8 0, i32 %index<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v8i16 insertion into the first element.<br>
+define <8 x i16> @f7(<8 x i16> %val) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vleih %v24, 0, 0<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <8 x i16> %val, i16 0, i32 0<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v8i16 insertion into the last element.<br>
+define <8 x i16> @f8(<8 x i16> %val) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vleih %v24, 0, 7<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <8 x i16> %val, i16 0, i32 7<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v8i16 insertion with the maximum signed value.<br>
+define <8 x i16> @f9(<8 x i16> %val) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vleih %v24, 32767, 4<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <8 x i16> %val, i16 32767, i32 4<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v8i16 insertion with the minimum signed value.<br>
+define <8 x i16> @f10(<8 x i16> %val) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vleih %v24, -32768, 5<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <8 x i16> %val, i16 32768, i32 5<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v8i16 insertion with the maximum unsigned value.<br>
+define <8 x i16> @f11(<8 x i16> %val) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vleih %v24, -1, 6<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <8 x i16> %val, i16 65535, i32 6<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v8i16 insertion into a variable element.<br>
+define <8 x i16> @f12(<8 x i16> %val, i32 %index) {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK-NOT: vleih<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <8 x i16> %val, i16 0, i32 %index<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v4i32 insertion into the first element.<br>
+define <4 x i32> @f13(<4 x i32> %val) {<br>
+; CHECK-LABEL: f13:<br>
+; CHECK: vleif %v24, 0, 0<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <4 x i32> %val, i32 0, i32 0<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v4i32 insertion into the last element.<br>
+define <4 x i32> @f14(<4 x i32> %val) {<br>
+; CHECK-LABEL: f14:<br>
+; CHECK: vleif %v24, 0, 3<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <4 x i32> %val, i32 0, i32 3<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v4i32 insertion with the maximum value allowed by VLEIF.<br>
+define <4 x i32> @f15(<4 x i32> %val) {<br>
+; CHECK-LABEL: f15:<br>
+; CHECK: vleif %v24, 32767, 1<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <4 x i32> %val, i32 32767, i32 1<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v4i32 insertion with the next value up.<br>
+define <4 x i32> @f16(<4 x i32> %val) {<br>
+; CHECK-LABEL: f16:<br>
+; CHECK-NOT: vleif<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <4 x i32> %val, i32 32768, i32 1<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v4i32 insertion with the minimum value allowed by VLEIF.<br>
+define <4 x i32> @f17(<4 x i32> %val) {<br>
+; CHECK-LABEL: f17:<br>
+; CHECK: vleif %v24, -32768, 2<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <4 x i32> %val, i32 -32768, i32 2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v4i32 insertion with the next value down.<br>
+define <4 x i32> @f18(<4 x i32> %val) {<br>
+; CHECK-LABEL: f18:<br>
+; CHECK-NOT: vleif<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <4 x i32> %val, i32 -32769, i32 2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v4i32 insertion into a variable element.<br>
+define <4 x i32> @f19(<4 x i32> %val, i32 %index) {<br>
+; CHECK-LABEL: f19:<br>
+; CHECK-NOT: vleif<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <4 x i32> %val, i32 0, i32 %index<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v2i64 insertion into the first element.<br>
+define <2 x i64> @f20(<2 x i64> %val) {<br>
+; CHECK-LABEL: f20:<br>
+; CHECK: vleig %v24, 0, 0<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <2 x i64> %val, i64 0, i32 0<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test v2i64 insertion into the last element.<br>
+define <2 x i64> @f21(<2 x i64> %val) {<br>
+; CHECK-LABEL: f21:<br>
+; CHECK: vleig %v24, 0, 1<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <2 x i64> %val, i64 0, i32 1<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test v2i64 insertion with the maximum value allowed by VLEIG.<br>
+define <2 x i64> @f22(<2 x i64> %val) {<br>
+; CHECK-LABEL: f22:<br>
+; CHECK: vleig %v24, 32767, 1<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <2 x i64> %val, i64 32767, i32 1<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test v2i64 insertion with the next value up.<br>
+define <2 x i64> @f23(<2 x i64> %val) {<br>
+; CHECK-LABEL: f23:<br>
+; CHECK-NOT: vleig<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <2 x i64> %val, i64 32768, i32 1<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test v2i64 insertion with the minimum value allowed by VLEIG.<br>
+define <2 x i64> @f24(<2 x i64> %val) {<br>
+; CHECK-LABEL: f24:<br>
+; CHECK: vleig %v24, -32768, 0<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <2 x i64> %val, i64 -32768, i32 0<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test v2i64 insertion with the next value down.<br>
+define <2 x i64> @f25(<2 x i64> %val) {<br>
+; CHECK-LABEL: f25:<br>
+; CHECK-NOT: vleig<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <2 x i64> %val, i64 -32769, i32 0<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test v2i64 insertion into a variable element.<br>
+define <2 x i64> @f26(<2 x i64> %val, i32 %index) {<br>
+; CHECK-LABEL: f26:<br>
+; CHECK-NOT: vleig<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <2 x i64> %val, i64 0, i32 %index<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-move-10.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-10.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-10.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-move-10.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-move-10.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,328 @@<br>
+; Test vector extraction to memory.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test v16i8 extraction from the first element.<br>
+define void @f1(<16 x i8> %val, i8 *%ptr) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vsteb %v24, 0(%r2), 0<br>
+; CHECK: br %r14<br>
+ %element = extractelement <16 x i8> %val, i32 0<br>
+ store i8 %element, i8 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v16i8 extraction from the last element.<br>
+define void @f2(<16 x i8> %val, i8 *%ptr) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vsteb %v24, 0(%r2), 15<br>
+; CHECK: br %r14<br>
+ %element = extractelement <16 x i8> %val, i32 15<br>
+ store i8 %element, i8 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v16i8 extraction of an invalid element. This must compile,<br>
+; but we don't care what it does.<br>
+define void @f3(<16 x i8> %val, i8 *%ptr) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK-NOT: vsteb %v24, 0(%r2), 16<br>
+; CHECK: br %r14<br>
+ %element = extractelement <16 x i8> %val, i32 16<br>
+ store i8 %element, i8 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v16i8 extraction with the highest in-range offset.<br>
+define void @f4(<16 x i8> %val, i8 *%base) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vsteb %v24, 4095(%r2), 10<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i8, i8 *%base, i32 4095<br>
+ %element = extractelement <16 x i8> %val, i32 10<br>
+ store i8 %element, i8 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v16i8 extraction with the first out-of-range offset.<br>
+define void @f5(<16 x i8> %val, i8 *%base) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: aghi %r2, 4096<br>
+; CHECK: vsteb %v24, 0(%r2), 5<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i8, i8 *%base, i32 4096<br>
+ %element = extractelement <16 x i8> %val, i32 5<br>
+ store i8 %element, i8 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v16i8 extraction from a variable element.<br>
+define void @f6(<16 x i8> %val, i8 *%ptr, i32 %index) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK-NOT: vsteb<br>
+; CHECK: br %r14<br>
+ %element = extractelement <16 x i8> %val, i32 %index<br>
+ store i8 %element, i8 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v8i16 extraction from the first element.<br>
+define void @f7(<8 x i16> %val, i16 *%ptr) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vsteh %v24, 0(%r2), 0<br>
+; CHECK: br %r14<br>
+ %element = extractelement <8 x i16> %val, i32 0<br>
+ store i16 %element, i16 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v8i16 extraction from the last element.<br>
+define void @f8(<8 x i16> %val, i16 *%ptr) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vsteh %v24, 0(%r2), 7<br>
+; CHECK: br %r14<br>
+ %element = extractelement <8 x i16> %val, i32 7<br>
+ store i16 %element, i16 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v8i16 extraction of an invalid element. This must compile,<br>
+; but we don't care what it does.<br>
+define void @f9(<8 x i16> %val, i16 *%ptr) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK-NOT: vsteh %v24, 0(%r2), 8<br>
+; CHECK: br %r14<br>
+ %element = extractelement <8 x i16> %val, i32 8<br>
+ store i16 %element, i16 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v8i16 extraction with the highest in-range offset.<br>
+define void @f10(<8 x i16> %val, i16 *%base) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vsteh %v24, 4094(%r2), 5<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i16, i16 *%base, i32 2047<br>
+ %element = extractelement <8 x i16> %val, i32 5<br>
+ store i16 %element, i16 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v8i16 extraction with the first out-of-range offset.<br>
+define void @f11(<8 x i16> %val, i16 *%base) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: aghi %r2, 4096<br>
+; CHECK: vsteh %v24, 0(%r2), 1<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i16, i16 *%base, i32 2048<br>
+ %element = extractelement <8 x i16> %val, i32 1<br>
+ store i16 %element, i16 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v8i16 extraction from a variable element.<br>
+define void @f12(<8 x i16> %val, i16 *%ptr, i32 %index) {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK-NOT: vsteh<br>
+; CHECK: br %r14<br>
+ %element = extractelement <8 x i16> %val, i32 %index<br>
+ store i16 %element, i16 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v4i32 extraction from the first element.<br>
+define void @f13(<4 x i32> %val, i32 *%ptr) {<br>
+; CHECK-LABEL: f13:<br>
+; CHECK: vstef %v24, 0(%r2), 0<br>
+; CHECK: br %r14<br>
+ %element = extractelement <4 x i32> %val, i32 0<br>
+ store i32 %element, i32 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v4i32 extraction from the last element.<br>
+define void @f14(<4 x i32> %val, i32 *%ptr) {<br>
+; CHECK-LABEL: f14:<br>
+; CHECK: vstef %v24, 0(%r2), 3<br>
+; CHECK: br %r14<br>
+ %element = extractelement <4 x i32> %val, i32 3<br>
+ store i32 %element, i32 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v4i32 extraction of an invalid element. This must compile,<br>
+; but we don't care what it does.<br>
+define void @f15(<4 x i32> %val, i32 *%ptr) {<br>
+; CHECK-LABEL: f15:<br>
+; CHECK-NOT: vstef %v24, 0(%r2), 4<br>
+; CHECK: br %r14<br>
+ %element = extractelement <4 x i32> %val, i32 4<br>
+ store i32 %element, i32 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v4i32 extraction with the highest in-range offset.<br>
+define void @f16(<4 x i32> %val, i32 *%base) {<br>
+; CHECK-LABEL: f16:<br>
+; CHECK: vstef %v24, 4092(%r2), 2<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i32, i32 *%base, i32 1023<br>
+ %element = extractelement <4 x i32> %val, i32 2<br>
+ store i32 %element, i32 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v4i32 extraction with the first out-of-range offset.<br>
+define void @f17(<4 x i32> %val, i32 *%base) {<br>
+; CHECK-LABEL: f17:<br>
+; CHECK: aghi %r2, 4096<br>
+; CHECK: vstef %v24, 0(%r2), 1<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i32, i32 *%base, i32 1024<br>
+ %element = extractelement <4 x i32> %val, i32 1<br>
+ store i32 %element, i32 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v4i32 extraction from a variable element.<br>
+define void @f18(<4 x i32> %val, i32 *%ptr, i32 %index) {<br>
+; CHECK-LABEL: f18:<br>
+; CHECK-NOT: vstef<br>
+; CHECK: br %r14<br>
+ %element = extractelement <4 x i32> %val, i32 %index<br>
+ store i32 %element, i32 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v2i64 extraction from the first element.<br>
+define void @f19(<2 x i64> %val, i64 *%ptr) {<br>
+; CHECK-LABEL: f19:<br>
+; CHECK: vsteg %v24, 0(%r2), 0<br>
+; CHECK: br %r14<br>
+ %element = extractelement <2 x i64> %val, i32 0<br>
+ store i64 %element, i64 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v2i64 extraction from the last element.<br>
+define void @f20(<2 x i64> %val, i64 *%ptr) {<br>
+; CHECK-LABEL: f20:<br>
+; CHECK: vsteg %v24, 0(%r2), 1<br>
+; CHECK: br %r14<br>
+ %element = extractelement <2 x i64> %val, i32 1<br>
+ store i64 %element, i64 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v2i64 extraction of an invalid element. This must compile,<br>
+; but we don't care what it does.<br>
+define void @f21(<2 x i64> %val, i64 *%ptr) {<br>
+; CHECK-LABEL: f21:<br>
+; CHECK-NOT: vsteg %v24, 0(%r2), 2<br>
+; CHECK: br %r14<br>
+ %element = extractelement <2 x i64> %val, i32 2<br>
+ store i64 %element, i64 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v2i64 extraction with the highest in-range offset.<br>
+define void @f22(<2 x i64> %val, i64 *%base) {<br>
+; CHECK-LABEL: f22:<br>
+; CHECK: vsteg %v24, 4088(%r2), 1<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i64, i64 *%base, i32 511<br>
+ %element = extractelement <2 x i64> %val, i32 1<br>
+ store i64 %element, i64 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v2i64 extraction with the first out-of-range offset.<br>
+define void @f23(<2 x i64> %val, i64 *%base) {<br>
+; CHECK-LABEL: f23:<br>
+; CHECK: aghi %r2, 4096<br>
+; CHECK: vsteg %v24, 0(%r2), 0<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i64, i64 *%base, i32 512<br>
+ %element = extractelement <2 x i64> %val, i32 0<br>
+ store i64 %element, i64 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test v2i64 extraction from a variable element.<br>
+define void @f24(<2 x i64> %val, i64 *%ptr, i32 %index) {<br>
+; CHECK-LABEL: f24:<br>
+; CHECK-NOT: vsteg<br>
+; CHECK: br %r14<br>
+ %element = extractelement <2 x i64> %val, i32 %index<br>
+ store i64 %element, i64 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test a v4i32 scatter of the first element.<br>
+define void @f37(<4 x i32> %val, <4 x i32> %index, i64 %base) {<br>
+; CHECK-LABEL: f37:<br>
+; CHECK: vscef %v24, 0(%v26,%r2), 0<br>
+; CHECK: br %r14<br>
+ %elem = extractelement <4 x i32> %index, i32 0<br>
+ %ext = zext i32 %elem to i64<br>
+ %add = add i64 %base, %ext<br>
+ %ptr = inttoptr i64 %add to i32 *<br>
+ %element = extractelement <4 x i32> %val, i32 0<br>
+ store i32 %element, i32 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test a v4i32 scatter of the last element.<br>
+define void @f38(<4 x i32> %val, <4 x i32> %index, i64 %base) {<br>
+; CHECK-LABEL: f38:<br>
+; CHECK: vscef %v24, 0(%v26,%r2), 3<br>
+; CHECK: br %r14<br>
+ %elem = extractelement <4 x i32> %index, i32 3<br>
+ %ext = zext i32 %elem to i64<br>
+ %add = add i64 %base, %ext<br>
+ %ptr = inttoptr i64 %add to i32 *<br>
+ %element = extractelement <4 x i32> %val, i32 3<br>
+ store i32 %element, i32 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test a v4i32 scatter with the highest in-range offset.<br>
+define void @f39(<4 x i32> %val, <4 x i32> %index, i64 %base) {<br>
+; CHECK-LABEL: f39:<br>
+; CHECK: vscef %v24, 4095(%v26,%r2), 1<br>
+; CHECK: br %r14<br>
+ %elem = extractelement <4 x i32> %index, i32 1<br>
+ %ext = zext i32 %elem to i64<br>
+ %add1 = add i64 %base, %ext<br>
+ %add2 = add i64 %add1, 4095<br>
+ %ptr = inttoptr i64 %add2 to i32 *<br>
+ %element = extractelement <4 x i32> %val, i32 1<br>
+ store i32 %element, i32 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test a v2i64 scatter of the first element.<br>
+define void @f40(<2 x i64> %val, <2 x i64> %index, i64 %base) {<br>
+; CHECK-LABEL: f40:<br>
+; CHECK: vsceg %v24, 0(%v26,%r2), 0<br>
+; CHECK: br %r14<br>
+ %elem = extractelement <2 x i64> %index, i32 0<br>
+ %add = add i64 %base, %elem<br>
+ %ptr = inttoptr i64 %add to i64 *<br>
+ %element = extractelement <2 x i64> %val, i32 0<br>
+ store i64 %element, i64 *%ptr<br>
+ ret void<br>
+}<br>
+<br>
+; Test a v2i64 scatter of the last element.<br>
+define void @f41(<2 x i64> %val, <2 x i64> %index, i64 %base) {<br>
+; CHECK-LABEL: f41:<br>
+; CHECK: vsceg %v24, 0(%v26,%r2), 1<br>
+; CHECK: br %r14<br>
+ %elem = extractelement <2 x i64> %index, i32 1<br>
+ %add = add i64 %base, %elem<br>
+ %ptr = inttoptr i64 %add to i64 *<br>
+ %element = extractelement <2 x i64> %val, i32 1<br>
+ store i64 %element, i64 *%ptr<br>
+ ret void<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-move-11.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-11.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-11.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-move-11.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-move-11.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,93 @@<br>
+; Test insertions of register values into a nonzero index of an undef.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test v16i8 insertion into an undef, with an arbitrary index.<br>
+define <16 x i8> @f1(i8 %val) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vlvgb %v24, %r2, 12<br>
+; CHECK-NEXT: br %r14<br>
+ %ret = insertelement <16 x i8> undef, i8 %val, i32 12<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v16i8 insertion into an undef, with the first good index for VLVGP.<br>
+define <16 x i8> @f2(i8 %val) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vlvgp %v24, %r2, %r2<br>
+; CHECK-NEXT: br %r14<br>
+ %ret = insertelement <16 x i8> undef, i8 %val, i32 7<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v16i8 insertion into an undef, with the second good index for VLVGP.<br>
+define <16 x i8> @f3(i8 %val) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vlvgp %v24, %r2, %r2<br>
+; CHECK-NEXT: br %r14<br>
+ %ret = insertelement <16 x i8> undef, i8 %val, i32 15<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v8i16 insertion into an undef, with an arbitrary index.<br>
+define <8 x i16> @f4(i16 %val) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vlvgh %v24, %r2, 5<br>
+; CHECK-NEXT: br %r14<br>
+ %ret = insertelement <8 x i16> undef, i16 %val, i32 5<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v8i16 insertion into an undef, with the first good index for VLVGP.<br>
+define <8 x i16> @f5(i16 %val) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vlvgp %v24, %r2, %r2<br>
+; CHECK-NEXT: br %r14<br>
+ %ret = insertelement <8 x i16> undef, i16 %val, i32 3<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v8i16 insertion into an undef, with the second good index for VLVGP.<br>
+define <8 x i16> @f6(i16 %val) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vlvgp %v24, %r2, %r2<br>
+; CHECK-NEXT: br %r14<br>
+ %ret = insertelement <8 x i16> undef, i16 %val, i32 7<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v4i32 insertion into an undef, with an arbitrary index.<br>
+define <4 x i32> @f7(i32 %val) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vlvgf %v24, %r2, 2<br>
+; CHECK-NEXT: br %r14<br>
+ %ret = insertelement <4 x i32> undef, i32 %val, i32 2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v4i32 insertion into an undef, with the first good index for VLVGP.<br>
+define <4 x i32> @f8(i32 %val) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vlvgp %v24, %r2, %r2<br>
+; CHECK-NEXT: br %r14<br>
+ %ret = insertelement <4 x i32> undef, i32 %val, i32 1<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v4i32 insertion into an undef, with the second good index for VLVGP.<br>
+define <4 x i32> @f9(i32 %val) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vlvgp %v24, %r2, %r2<br>
+; CHECK-NEXT: br %r14<br>
+ %ret = insertelement <4 x i32> undef, i32 %val, i32 3<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v2i64 insertion into an undef.<br>
+define <2 x i64> @f10(i64 %val) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vlvgp %v24, %r2, %r2<br>
+; CHECK-NEXT: br %r14<br>
+ %ret = insertelement <2 x i64> undef, i64 %val, i32 1<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-move-12.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-12.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-12.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-move-12.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-move-12.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,103 @@<br>
+; Test insertions of memory values into a nonzero index of an undef.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test v16i8 insertion into an undef, with an arbitrary index.<br>
+define <16 x i8> @f1(i8 *%ptr) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vlrepb %v24, 0(%r2)<br>
+; CHECK-NEXT: br %r14<br>
+ %val = load i8, i8 *%ptr<br>
+ %ret = insertelement <16 x i8> undef, i8 %val, i32 12<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v16i8 insertion into an undef, with the first good index for VLVGP.<br>
+define <16 x i8> @f2(i8 *%ptr) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: {{vlrepb|vllezb}} %v24, 0(%r2)<br>
+; CHECK-NEXT: br %r14<br>
+ %val = load i8, i8 *%ptr<br>
+ %ret = insertelement <16 x i8> undef, i8 %val, i32 7<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v16i8 insertion into an undef, with the second good index for VLVGP.<br>
+define <16 x i8> @f3(i8 *%ptr) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vlrepb %v24, 0(%r2)<br>
+; CHECK-NEXT: br %r14<br>
+ %val = load i8, i8 *%ptr<br>
+ %ret = insertelement <16 x i8> undef, i8 %val, i32 15<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v8i16 insertion into an undef, with an arbitrary index.<br>
+define <8 x i16> @f4(i16 *%ptr) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vlreph %v24, 0(%r2)<br>
+; CHECK-NEXT: br %r14<br>
+ %val = load i16, i16 *%ptr<br>
+ %ret = insertelement <8 x i16> undef, i16 %val, i32 5<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v8i16 insertion into an undef, with the first good index for VLVGP.<br>
+define <8 x i16> @f5(i16 *%ptr) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: {{vlreph|vllezh}} %v24, 0(%r2)<br>
+; CHECK-NEXT: br %r14<br>
+ %val = load i16, i16 *%ptr<br>
+ %ret = insertelement <8 x i16> undef, i16 %val, i32 3<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v8i16 insertion into an undef, with the second good index for VLVGP.<br>
+define <8 x i16> @f6(i16 *%ptr) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vlreph %v24, 0(%r2)<br>
+; CHECK-NEXT: br %r14<br>
+ %val = load i16, i16 *%ptr<br>
+ %ret = insertelement <8 x i16> undef, i16 %val, i32 7<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v4i32 insertion into an undef, with an arbitrary index.<br>
+define <4 x i32> @f7(i32 *%ptr) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vlrepf %v24, 0(%r2)<br>
+; CHECK-NEXT: br %r14<br>
+ %val = load i32, i32 *%ptr<br>
+ %ret = insertelement <4 x i32> undef, i32 %val, i32 2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v4i32 insertion into an undef, with the first good index for VLVGP.<br>
+define <4 x i32> @f8(i32 *%ptr) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: {{vlrepf|vllezf}} %v24, 0(%r2)<br>
+; CHECK-NEXT: br %r14<br>
+ %val = load i32, i32 *%ptr<br>
+ %ret = insertelement <4 x i32> undef, i32 %val, i32 1<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v4i32 insertion into an undef, with the second good index for VLVGP.<br>
+define <4 x i32> @f9(i32 *%ptr) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vlrepf %v24, 0(%r2)<br>
+; CHECK-NEXT: br %r14<br>
+ %val = load i32, i32 *%ptr<br>
+ %ret = insertelement <4 x i32> undef, i32 %val, i32 3<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v2i64 insertion into an undef.<br>
+define <2 x i64> @f10(i64 *%ptr) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vlrepg %v24, 0(%r2)<br>
+; CHECK-NEXT: br %r14<br>
+ %val = load i64, i64 *%ptr<br>
+ %ret = insertelement <2 x i64> undef, i64 %val, i32 1<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-move-13.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-13.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-13.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-move-13.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-move-13.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,47 @@<br>
+; Test insertions of register values into 0.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test v16i8 insertion into 0.<br>
+define <16 x i8> @f1(i8 %val1, i8 %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vgbm %v24, 0<br>
+; CHECK-DAG: vlvgb %v24, %r2, 2<br>
+; CHECK-DAG: vlvgb %v24, %r3, 12<br>
+; CHECK: br %r14<br>
+ %vec1 = insertelement <16 x i8> zeroinitializer, i8 %val1, i32 2<br>
+ %vec2 = insertelement <16 x i8> %vec1, i8 %val2, i32 12<br>
+ ret <16 x i8> %vec2<br>
+}<br>
+<br>
+; Test v8i16 insertion into 0.<br>
+define <8 x i16> @f2(i16 %val1, i16 %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vgbm %v24, 0<br>
+; CHECK-DAG: vlvgh %v24, %r2, 3<br>
+; CHECK-DAG: vlvgh %v24, %r3, 5<br>
+; CHECK: br %r14<br>
+ %vec1 = insertelement <8 x i16> zeroinitializer, i16 %val1, i32 3<br>
+ %vec2 = insertelement <8 x i16> %vec1, i16 %val2, i32 5<br>
+ ret <8 x i16> %vec2<br>
+}<br>
+<br>
+; Test v4i32 insertion into 0.<br>
+define <4 x i32> @f3(i32 %val) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vgbm %v24, 0<br>
+; CHECK: vlvgf %v24, %r2, 3<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <4 x i32> zeroinitializer, i32 %val, i32 3<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v2i64 insertion into 0.<br>
+define <2 x i64> @f4(i64 %val) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: lghi [[REG:%r[0-5]]], 0<br>
+; CHECK: vlvgp %v24, [[REG]], %r2<br>
+; CHECK: br %r14<br>
+ %ret = insertelement <2 x i64> zeroinitializer, i64 %val, i32 1<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-move-14.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-14.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-move-14.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-move-14.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-move-14.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,76 @@<br>
+; Test insertions of memory values into 0.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test VLLEZB.<br>
+define <16 x i8> @f1(i8 *%ptr) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vllezb %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %val = load i8, i8 *%ptr<br>
+ %ret = insertelement <16 x i8> zeroinitializer, i8 %val, i32 7<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test VLLEZB with the highest in-range offset.<br>
+define <16 x i8> @f2(i8 *%base) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vllezb %v24, 4095(%r2)<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i8, i8 *%base, i64 4095<br>
+ %val = load i8, i8 *%ptr<br>
+ %ret = insertelement <16 x i8> zeroinitializer, i8 %val, i32 7<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test VLLEZB with the next highest offset.<br>
+define <16 x i8> @f3(i8 *%base) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK-NOT: vllezb %v24, 4096(%r2)<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i8, i8 *%base, i64 4096<br>
+ %val = load i8, i8 *%ptr<br>
+ %ret = insertelement <16 x i8> zeroinitializer, i8 %val, i32 7<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test that VLLEZB allows an index.<br>
+define <16 x i8> @f4(i8 *%base, i64 %index) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vllezb %v24, 0({{%r2,%r3|%r3,%r2}})<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i8, i8 *%base, i64 %index<br>
+ %val = load i8, i8 *%ptr<br>
+ %ret = insertelement <16 x i8> zeroinitializer, i8 %val, i32 7<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test VLLEZH.<br>
+define <8 x i16> @f5(i16 *%ptr) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vllezh %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %val = load i16, i16 *%ptr<br>
+ %ret = insertelement <8 x i16> zeroinitializer, i16 %val, i32 3<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test VLLEZF.<br>
+define <4 x i32> @f6(i32 *%ptr) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vllezf %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %val = load i32, i32 *%ptr<br>
+ %ret = insertelement <4 x i32> zeroinitializer, i32 %val, i32 1<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test VLLEZG.<br>
+define <2 x i64> @f7(i64 *%ptr) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vllezg %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %val = load i64, i64 *%ptr<br>
+ %ret = insertelement <2 x i64> zeroinitializer, i64 %val, i32 0<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
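[Editorial note, not part of the commit] The insertelement indices used in the tests above (7, 3, 1 and 0) are consistent with the loaded element landing in the rightmost element slot of the leftmost 8-byte doubleword, with the rest of the vector zeroed. A minimal Python sketch of that index arithmetic follows; the function name is hypothetical and exists only for illustration.<br>

```python
def vllez_element_index(elem_bytes: int) -> int:
    """Index of the rightmost element within the leftmost 8-byte doubleword.

    Hypothetical helper: it only models the index arithmetic that the
    tests above appear to rely on, not the instruction itself.
    """
    if elem_bytes not in (1, 2, 4, 8):
        raise ValueError("element size must be 1, 2, 4 or 8 bytes")
    return 8 // elem_bytes - 1


# Element sizes for i8, i16, i32 and i64 respectively.
indices = {size: vllez_element_index(size) for size in (1, 2, 4, 8)}
print(indices)
```

The resulting indices line up with the constants passed to insertelement in f1, f5, f6 and f7.<br>
<br>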
Added: llvm/trunk/test/CodeGen/SystemZ/vec-mul-01.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-mul-01.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-mul-01.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-mul-01.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-mul-01.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,39 @@<br>
+; Test vector multiplication.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a v16i8 multiplication.<br>
+define <16 x i8> @f1(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vmlb %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = mul <16 x i8> %val1, %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v8i16 multiplication.<br>
+define <8 x i16> @f2(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vmlhw %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = mul <8 x i16> %val1, %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v4i32 multiplication.<br>
+define <4 x i32> @f3(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vmlf %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = mul <4 x i32> %val1, %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v2i64 multiplication. There's no vector equivalent.<br>
+define <2 x i64> @f4(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK-NOT: vmlg<br>
+; CHECK: br %r14<br>
+ %ret = mul <2 x i64> %val1, %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-mul-02.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-mul-02.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-mul-02.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-mul-02.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-mul-02.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,36 @@<br>
+; Test vector multiply-and-add.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a v16i8 multiply-and-add.<br>
+define <16 x i8> @f1(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i8> %val3) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vmalb %v24, %v26, %v28, %v30<br>
+; CHECK: br %r14<br>
+ %mul = mul <16 x i8> %val1, %val2<br>
+ %ret = add <16 x i8> %mul, %val3<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v8i16 multiply-and-add.<br>
+define <8 x i16> @f2(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i16> %val3) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vmalhw %v24, %v26, %v28, %v30<br>
+; CHECK: br %r14<br>
+ %mul = mul <8 x i16> %val1, %val2<br>
+ %ret = add <8 x i16> %mul, %val3<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v4i32 multiply-and-add.<br>
+define <4 x i32> @f3(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> %val3) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vmalf %v24, %v26, %v28, %v30<br>
+; CHECK: br %r14<br>
+ %mul = mul <4 x i32> %val1, %val2<br>
+ %ret = add <4 x i32> %mul, %val3<br>
+ ret <4 x i32> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-neg-01.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-neg-01.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-neg-01.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-neg-01.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-neg-01.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,39 @@<br>
+; Test vector negation.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a v16i8 negation.<br>
+define <16 x i8> @f1(<16 x i8> %dummy, <16 x i8> %val) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vlcb %v24, %v26<br>
+; CHECK: br %r14<br>
+ %ret = sub <16 x i8> zeroinitializer, %val<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v8i16 negation.<br>
+define <8 x i16> @f2(<8 x i16> %dummy, <8 x i16> %val) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vlch %v24, %v26<br>
+; CHECK: br %r14<br>
+ %ret = sub <8 x i16> zeroinitializer, %val<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v4i32 negation.<br>
+define <4 x i32> @f3(<4 x i32> %dummy, <4 x i32> %val) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vlcf %v24, %v26<br>
+; CHECK: br %r14<br>
+ %ret = sub <4 x i32> zeroinitializer, %val<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v2i64 negation.<br>
+define <2 x i64> @f4(<2 x i64> %dummy, <2 x i64> %val) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vlcg %v24, %v26<br>
+; CHECK: br %r14<br>
+ %ret = sub <2 x i64> zeroinitializer, %val<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-or-01.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-or-01.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-or-01.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-or-01.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-or-01.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,39 @@<br>
+; Test vector OR.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a v16i8 OR.<br>
+define <16 x i8> @f1(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vo %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = or <16 x i8> %val1, %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v8i16 OR.<br>
+define <8 x i16> @f2(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vo %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = or <8 x i16> %val1, %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v4i32 OR.<br>
+define <4 x i32> @f3(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vo %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = or <4 x i32> %val1, %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v2i64 OR.<br>
+define <2 x i64> @f4(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vo %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = or <2 x i64> %val1, %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-or-02.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-or-02.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-or-02.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-or-02.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-or-02.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,107 @@<br>
+; Test vector (or (and X, Z), (and Y, (not Z))) patterns.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test v16i8.<br>
+define <16 x i8> @f1(<16 x i8> %val1, <16 x i8> %val2, <16 x i8> %val3) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vsel %v24, %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %not = xor <16 x i8> %val3, <i8 -1, i8 -1, i8 -1, i8 -1,<br>
+ i8 -1, i8 -1, i8 -1, i8 -1,<br>
+ i8 -1, i8 -1, i8 -1, i8 -1,<br>
+ i8 -1, i8 -1, i8 -1, i8 -1><br>
+ %and1 = and <16 x i8> %val1, %val3<br>
+ %and2 = and <16 x i8> %val2, %not<br>
+ %ret = or <16 x i8> %and1, %and2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; ...and again with the XOR applied to the other operand of the AND.<br>
+define <16 x i8> @f2(<16 x i8> %val1, <16 x i8> %val2, <16 x i8> %val3) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vsel %v24, %v26, %v24, %v28<br>
+; CHECK: br %r14<br>
+ %not = xor <16 x i8> %val3, <i8 -1, i8 -1, i8 -1, i8 -1,<br>
+ i8 -1, i8 -1, i8 -1, i8 -1,<br>
+ i8 -1, i8 -1, i8 -1, i8 -1,<br>
+ i8 -1, i8 -1, i8 -1, i8 -1><br>
+ %and1 = and <16 x i8> %val1, %not<br>
+ %and2 = and <16 x i8> %val2, %val3<br>
+ %ret = or <16 x i8> %and1, %and2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v8i16.<br>
+define <8 x i16> @f3(<8 x i16> %val1, <8 x i16> %val2, <8 x i16> %val3) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vsel %v24, %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %not = xor <8 x i16> %val3, <i16 -1, i16 -1, i16 -1, i16 -1,<br>
+ i16 -1, i16 -1, i16 -1, i16 -1><br>
+ %and1 = and <8 x i16> %val1, %val3<br>
+ %and2 = and <8 x i16> %val2, %not<br>
+ %ret = or <8 x i16> %and1, %and2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; ...and again with the XOR applied to the other operand of the AND.<br>
+define <8 x i16> @f4(<8 x i16> %val1, <8 x i16> %val2, <8 x i16> %val3) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vsel %v24, %v26, %v24, %v28<br>
+; CHECK: br %r14<br>
+ %not = xor <8 x i16> %val3, <i16 -1, i16 -1, i16 -1, i16 -1,<br>
+ i16 -1, i16 -1, i16 -1, i16 -1><br>
+ %and1 = and <8 x i16> %val1, %not<br>
+ %and2 = and <8 x i16> %val2, %val3<br>
+ %ret = or <8 x i16> %and1, %and2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v4i32.<br>
+define <4 x i32> @f5(<4 x i32> %val1, <4 x i32> %val2, <4 x i32> %val3) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vsel %v24, %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %not = xor <4 x i32> %val3, <i32 -1, i32 -1, i32 -1, i32 -1><br>
+ %and1 = and <4 x i32> %val1, %val3<br>
+ %and2 = and <4 x i32> %val2, %not<br>
+ %ret = or <4 x i32> %and1, %and2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; ...and again with the XOR applied to the other operand of the AND.<br>
+define <4 x i32> @f6(<4 x i32> %val1, <4 x i32> %val2, <4 x i32> %val3) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vsel %v24, %v26, %v24, %v28<br>
+; CHECK: br %r14<br>
+ %not = xor <4 x i32> %val3, <i32 -1, i32 -1, i32 -1, i32 -1><br>
+ %and1 = and <4 x i32> %val1, %not<br>
+ %and2 = and <4 x i32> %val2, %val3<br>
+ %ret = or <4 x i32> %and1, %and2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v2i64.<br>
+define <2 x i64> @f7(<2 x i64> %val1, <2 x i64> %val2, <2 x i64> %val3) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vsel %v24, %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %not = xor <2 x i64> %val3, <i64 -1, i64 -1><br>
+ %and1 = and <2 x i64> %val1, %val3<br>
+ %and2 = and <2 x i64> %val2, %not<br>
+ %ret = or <2 x i64> %and1, %and2<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; ...and again with the XOR applied to the other operand of the AND.<br>
+define <2 x i64> @f8(<2 x i64> %val1, <2 x i64> %val2, <2 x i64> %val3) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vsel %v24, %v26, %v24, %v28<br>
+; CHECK: br %r14<br>
+ %not = xor <2 x i64> %val3, <i64 -1, i64 -1><br>
+ %and1 = and <2 x i64> %val1, %not<br>
+ %and2 = and <2 x i64> %val2, %val3<br>
+ %ret = or <2 x i64> %and1, %and2<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
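[Editorial note, not part of the commit] The pattern (or (and X, Z), (and Y, (not Z))) tested above is a bitwise select: each result bit comes from X where the corresponding bit of Z is set and from Y where it is clear. The Python sketch below models that on plain integers; the function name and the default 8-bit width are assumptions made for illustration.<br>

```python
def bitselect(x: int, y: int, z: int, width: int = 8) -> int:
    """Take bits of x where z has a 1 and bits of y where z has a 0.

    Hypothetical scalar model of the (or (and X, Z), (and Y, (not Z)))
    pattern; 'width' limits the result to one element's bit width.
    """
    mask = (1 << width) - 1
    return ((x & z) | (y & ~z)) & mask


# Upper nibble comes from x, lower nibble from y.
print(hex(bitselect(0xAA, 0x55, 0xF0)))
```

Swapping which AND receives the inverted mask, as the "...and again" tests do, swaps the roles of the first two operands.<br>
<br>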
Added: llvm/trunk/test/CodeGen/SystemZ/vec-perm-01.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-01.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-01.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-perm-01.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-perm-01.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,124 @@<br>
+; Test vector splat.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test v16i8 splat of the first element.<br>
+define <16 x i8> @f1(<16 x i8> %val) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vrepb %v24, %v24, 0<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val, <16 x i8> undef,<br>
+ <16 x i32> zeroinitializer<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v16i8 splat of the last element.<br>
+define <16 x i8> @f2(<16 x i8> %val) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vrepb %v24, %v24, 15<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val, <16 x i8> undef,<br>
+ <16 x i32> <i32 15, i32 15, i32 15, i32 15,<br>
+ i32 15, i32 15, i32 15, i32 15,<br>
+ i32 15, i32 15, i32 15, i32 15,<br>
+ i32 15, i32 15, i32 15, i32 15><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v16i8 splat of an arbitrary element, using the second operand of<br>
+; the shufflevector.<br>
+define <16 x i8> @f3(<16 x i8> %val) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vrepb %v24, %v24, 4<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> undef, <16 x i8> %val,<br>
+ <16 x i32> <i32 20, i32 20, i32 20, i32 20,<br>
+ i32 20, i32 20, i32 20, i32 20,<br>
+ i32 20, i32 20, i32 20, i32 20,<br>
+ i32 20, i32 20, i32 20, i32 20><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v8i16 splat of the first element.<br>
+define <8 x i16> @f4(<8 x i16> %val) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vreph %v24, %v24, 0<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <8 x i16> %val, <8 x i16> undef,<br>
+ <8 x i32> zeroinitializer<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v8i16 splat of the last element.<br>
+define <8 x i16> @f5(<8 x i16> %val) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vreph %v24, %v24, 7<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <8 x i16> %val, <8 x i16> undef,<br>
+ <8 x i32> <i32 7, i32 7, i32 7, i32 7,<br>
+ i32 7, i32 7, i32 7, i32 7><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v8i16 splat of an arbitrary element, using the second operand of<br>
+; the shufflevector.<br>
+define <8 x i16> @f6(<8 x i16> %val) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vreph %v24, %v24, 2<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <8 x i16> undef, <8 x i16> %val,<br>
+ <8 x i32> <i32 10, i32 10, i32 10, i32 10,<br>
+ i32 10, i32 10, i32 10, i32 10><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v4i32 splat of the first element.<br>
+define <4 x i32> @f7(<4 x i32> %val) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vrepf %v24, %v24, 0<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <4 x i32> %val, <4 x i32> undef,<br>
+ <4 x i32> zeroinitializer<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v4i32 splat of the last element.<br>
+define <4 x i32> @f8(<4 x i32> %val) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vrepf %v24, %v24, 3<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <4 x i32> %val, <4 x i32> undef,<br>
+ <4 x i32> <i32 3, i32 3, i32 3, i32 3><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v4i32 splat of an arbitrary element, using the second operand of<br>
+; the shufflevector.<br>
+define <4 x i32> @f9(<4 x i32> %val) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vrepf %v24, %v24, 1<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <4 x i32> undef, <4 x i32> %val,<br>
+ <4 x i32> <i32 5, i32 5, i32 5, i32 5><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v2i64 splat of the first element.<br>
+define <2 x i64> @f10(<2 x i64> %val) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vrepg %v24, %v24, 0<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <2 x i64> %val, <2 x i64> undef,<br>
+ <2 x i32> zeroinitializer<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test v2i64 splat of the last element.<br>
+define <2 x i64> @f11(<2 x i64> %val) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vrepg %v24, %v24, 1<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <2 x i64> %val, <2 x i64> undef,<br>
+ <2 x i32> <i32 1, i32 1><br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
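[Editorial note, not part of the commit] In a shufflevector mask over two N-element operands, indices 0..N-1 select from the first operand and N..2N-1 from the second. That is why a splat of index 20 on v16i8 in f3 is checked as element 4, and index 10 on v8i16 in f6 as element 2. The Python sketch below models that mapping; the helper name is hypothetical and None stands in for an undef index.<br>

```python
from typing import Optional, Sequence, Tuple


def splat_source(mask: Sequence[Optional[int]],
                 num_elems: int) -> Optional[Tuple[int, int]]:
    """Return (operand, element) if all defined mask indices are equal.

    Hypothetical helper: None entries model undef indices, and indices
    are assumed to lie in the range 0 .. 2 * num_elems - 1.
    """
    defined = {idx for idx in mask if idx is not None}
    if len(defined) != 1:
        return None  # not a splat (or entirely undef)
    idx = defined.pop()
    # Quotient picks the operand (0 or 1), remainder the element in it.
    return divmod(idx, num_elems)


print(splat_source([20] * 16, 16))
print(splat_source([10] * 8, 8))
```

A mask with more than one distinct defined index is reported as not being a splat.<br>
<br>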
Added: llvm/trunk/test/CodeGen/SystemZ/vec-perm-02.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-02.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-02.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-perm-02.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-perm-02.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,144 @@<br>
+; Test replications of a scalar register value, represented as splats.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test v16i8 splat of the first element.<br>
+define <16 x i8> @f1(i8 %scalar) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vlvgp [[REG:%v[0-9]+]], %r2, %r2<br>
+; CHECK: vrepb %v24, [[REG]], 7<br>
+; CHECK: br %r14<br>
+ %val = insertelement <16 x i8> undef, i8 %scalar, i32 0<br>
+ %ret = shufflevector <16 x i8> %val, <16 x i8> undef,<br>
+ <16 x i32> zeroinitializer<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v16i8 splat of the last element.<br>
+define <16 x i8> @f2(i8 %scalar) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vlvgp [[REG:%v[0-9]+]], %r2, %r2<br>
+; CHECK: vrepb %v24, [[REG]], 7<br>
+; CHECK: br %r14<br>
+ %val = insertelement <16 x i8> undef, i8 %scalar, i32 15<br>
+ %ret = shufflevector <16 x i8> %val, <16 x i8> undef,<br>
+ <16 x i32> <i32 15, i32 15, i32 15, i32 15,<br>
+ i32 15, i32 15, i32 15, i32 15,<br>
+ i32 15, i32 15, i32 15, i32 15,<br>
+ i32 15, i32 15, i32 15, i32 15><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v16i8 splat of an arbitrary element, using the second operand of<br>
+; the shufflevector.<br>
+define <16 x i8> @f3(i8 %scalar) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vlvgp [[REG:%v[0-9]+]], %r2, %r2<br>
+; CHECK: vrepb %v24, [[REG]], 7<br>
+; CHECK: br %r14<br>
+ %val = insertelement <16 x i8> undef, i8 %scalar, i32 4<br>
+ %ret = shufflevector <16 x i8> undef, <16 x i8> %val,<br>
+ <16 x i32> <i32 20, i32 20, i32 20, i32 20,<br>
+ i32 20, i32 20, i32 20, i32 20,<br>
+ i32 20, i32 20, i32 20, i32 20,<br>
+ i32 20, i32 20, i32 20, i32 20><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test v8i16 splat of the first element.<br>
+define <8 x i16> @f4(i16 %scalar) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vlvgp [[REG:%v[0-9]+]], %r2, %r2<br>
+; CHECK: vreph %v24, [[REG]], 3<br>
+; CHECK: br %r14<br>
+ %val = insertelement <8 x i16> undef, i16 %scalar, i32 0<br>
+ %ret = shufflevector <8 x i16> %val, <8 x i16> undef,<br>
+ <8 x i32> zeroinitializer<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v8i16 splat of the last element.<br>
+define <8 x i16> @f5(i16 %scalar) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vlvgp [[REG:%v[0-9]+]], %r2, %r2<br>
+; CHECK: vreph %v24, [[REG]], 3<br>
+; CHECK: br %r14<br>
+ %val = insertelement <8 x i16> undef, i16 %scalar, i32 7<br>
+ %ret = shufflevector <8 x i16> %val, <8 x i16> undef,<br>
+ <8 x i32> <i32 7, i32 7, i32 7, i32 7,<br>
+ i32 7, i32 7, i32 7, i32 7><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v8i16 splat of an arbitrary element, using the second operand of<br>
+; the shufflevector.<br>
+define <8 x i16> @f6(i16 %scalar) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vlvgp [[REG:%v[0-9]+]], %r2, %r2<br>
+; CHECK: vreph %v24, [[REG]], 3<br>
+; CHECK: br %r14<br>
+ %val = insertelement <8 x i16> undef, i16 %scalar, i32 2<br>
+ %ret = shufflevector <8 x i16> undef, <8 x i16> %val,<br>
+ <8 x i32> <i32 10, i32 10, i32 10, i32 10,<br>
+ i32 10, i32 10, i32 10, i32 10><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test v4i32 splat of the first element.<br>
+define <4 x i32> @f7(i32 %scalar) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vlvgp [[REG:%v[0-9]+]], %r2, %r2<br>
+; CHECK: vrepf %v24, [[REG]], 1<br>
+; CHECK: br %r14<br>
+ %val = insertelement <4 x i32> undef, i32 %scalar, i32 0<br>
+ %ret = shufflevector <4 x i32> %val, <4 x i32> undef,<br>
+ <4 x i32> zeroinitializer<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v4i32 splat of the last element.<br>
+define <4 x i32> @f8(i32 %scalar) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vlvgp [[REG:%v[0-9]+]], %r2, %r2<br>
+; CHECK: vrepf %v24, [[REG]], 1<br>
+; CHECK: br %r14<br>
+ %val = insertelement <4 x i32> undef, i32 %scalar, i32 3<br>
+ %ret = shufflevector <4 x i32> %val, <4 x i32> undef,<br>
+ <4 x i32> <i32 3, i32 3, i32 3, i32 3><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v4i32 splat of an arbitrary element, using the second operand of<br>
+; the shufflevector.<br>
+define <4 x i32> @f9(i32 %scalar) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vlvgp [[REG:%v[0-9]+]], %r2, %r2<br>
+; CHECK: vrepf %v24, [[REG]], 1<br>
+; CHECK: br %r14<br>
+ %val = insertelement <4 x i32> undef, i32 %scalar, i32 1<br>
+ %ret = shufflevector <4 x i32> undef, <4 x i32> %val,<br>
+ <4 x i32> <i32 5, i32 5, i32 5, i32 5><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test v2i64 splat of the first element.<br>
+define <2 x i64> @f10(i64 %scalar) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vlvgp %v24, %r2, %r2<br>
+; CHECK: br %r14<br>
+ %val = insertelement <2 x i64> undef, i64 %scalar, i32 0<br>
+ %ret = shufflevector <2 x i64> %val, <2 x i64> undef,<br>
+ <2 x i32> zeroinitializer<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test v2i64 splat of the last element.<br>
+define <2 x i64> @f11(i64 %scalar) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vlvgp %v24, %r2, %r2<br>
+; CHECK: br %r14<br>
+ %val = insertelement <2 x i64> undef, i64 %scalar, i32 1<br>
+ %ret = shufflevector <2 x i64> %val, <2 x i64> undef,<br>
+ <2 x i32> <i32 1, i32 1><br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-perm-03.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-03.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-03.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-perm-03.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-perm-03.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,173 @@<br>
+; Test replications of a scalar memory value, represented as splats.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a v16i8 replicating load with no offset.<br>
+define <16 x i8> @f1(i8 *%ptr) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vlrepb %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %scalar = load i8, i8 *%ptr<br>
+ %val = insertelement <16 x i8> undef, i8 %scalar, i32 0<br>
+ %ret = shufflevector <16 x i8> %val, <16 x i8> undef,<br>
+ <16 x i32> zeroinitializer<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 replicating load with the maximum in-range offset.<br>
+define <16 x i8> @f2(i8 *%base) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vlrepb %v24, 4095(%r2)<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i8, i8 *%base, i64 4095<br>
+ %scalar = load i8, i8 *%ptr<br>
+ %val = insertelement <16 x i8> undef, i8 %scalar, i32 0<br>
+ %ret = shufflevector <16 x i8> %val, <16 x i8> undef,<br>
+ <16 x i32> zeroinitializer<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 replicating load with the first out-of-range offset.<br>
+define <16 x i8> @f3(i8 *%base) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: aghi %r2, 4096<br>
+; CHECK: vlrepb %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i8, i8 *%base, i64 4096<br>
+ %scalar = load i8, i8 *%ptr<br>
+ %val = insertelement <16 x i8> undef, i8 %scalar, i32 0<br>
+ %ret = shufflevector <16 x i8> %val, <16 x i8> undef,<br>
+ <16 x i32> zeroinitializer<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v8i16 replicating load with no offset.<br>
+define <8 x i16> @f4(i16 *%ptr) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vlreph %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %scalar = load i16, i16 *%ptr<br>
+ %val = insertelement <8 x i16> undef, i16 %scalar, i32 0<br>
+ %ret = shufflevector <8 x i16> %val, <8 x i16> undef,<br>
+ <8 x i32> zeroinitializer<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v8i16 replicating load with the maximum in-range offset.<br>
+define <8 x i16> @f5(i16 *%base) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vlreph %v24, 4094(%r2)<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i16, i16 *%base, i64 2047<br>
+ %scalar = load i16, i16 *%ptr<br>
+ %val = insertelement <8 x i16> undef, i16 %scalar, i32 0<br>
+ %ret = shufflevector <8 x i16> %val, <8 x i16> undef,<br>
+ <8 x i32> zeroinitializer<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v8i16 replicating load with the first out-of-range offset.<br>
+define <8 x i16> @f6(i16 *%base) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: aghi %r2, 4096<br>
+; CHECK: vlreph %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i16, i16 *%base, i64 2048<br>
+ %scalar = load i16, i16 *%ptr<br>
+ %val = insertelement <8 x i16> undef, i16 %scalar, i32 0<br>
+ %ret = shufflevector <8 x i16> %val, <8 x i16> undef,<br>
+ <8 x i32> zeroinitializer<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v4i32 replicating load with no offset.<br>
+define <4 x i32> @f7(i32 *%ptr) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vlrepf %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %scalar = load i32, i32 *%ptr<br>
+ %val = insertelement <4 x i32> undef, i32 %scalar, i32 0<br>
+ %ret = shufflevector <4 x i32> %val, <4 x i32> undef,<br>
+ <4 x i32> zeroinitializer<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v4i32 replicating load with the maximum in-range offset.<br>
+define <4 x i32> @f8(i32 *%base) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vlrepf %v24, 4092(%r2)<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i32, i32 *%base, i64 1023<br>
+ %scalar = load i32, i32 *%ptr<br>
+ %val = insertelement <4 x i32> undef, i32 %scalar, i32 0<br>
+ %ret = shufflevector <4 x i32> %val, <4 x i32> undef,<br>
+ <4 x i32> zeroinitializer<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v4i32 replicating load with the first out-of-range offset.<br>
+define <4 x i32> @f9(i32 *%base) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: aghi %r2, 4096<br>
+; CHECK: vlrepf %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i32, i32 *%base, i64 1024<br>
+ %scalar = load i32, i32 *%ptr<br>
+ %val = insertelement <4 x i32> undef, i32 %scalar, i32 0<br>
+ %ret = shufflevector <4 x i32> %val, <4 x i32> undef,<br>
+ <4 x i32> zeroinitializer<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v2i64 replicating load with no offset.<br>
+define <2 x i64> @f10(i64 *%ptr) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vlrepg %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %scalar = load i64, i64 *%ptr<br>
+ %val = insertelement <2 x i64> undef, i64 %scalar, i32 0<br>
+ %ret = shufflevector <2 x i64> %val, <2 x i64> undef,<br>
+ <2 x i32> zeroinitializer<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test a v2i64 replicating load with the maximum in-range offset.<br>
+define <2 x i64> @f11(i64 *%base) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vlrepg %v24, 4088(%r2)<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i64, i64 *%base, i32 511<br>
+ %scalar = load i64, i64 *%ptr<br>
+ %val = insertelement <2 x i64> undef, i64 %scalar, i32 0<br>
+ %ret = shufflevector <2 x i64> %val, <2 x i64> undef,<br>
+ <2 x i32> zeroinitializer<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test a v2i64 replicating load with the first out-of-range offset.<br>
+define <2 x i64> @f12(i64 *%base) {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK: aghi %r2, 4096<br>
+; CHECK: vlrepg %v24, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %ptr = getelementptr i64, i64 *%base, i32 512<br>
+ %scalar = load i64, i64 *%ptr<br>
+ %val = insertelement <2 x i64> undef, i64 %scalar, i32 0<br>
+ %ret = shufflevector <2 x i64> %val, <2 x i64> undef,<br>
+ <2 x i32> zeroinitializer<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test a v16i8 replicating load with an index.<br>
+define <16 x i8> @f19(i8 *%base, i64 %index) {<br>
+; CHECK-LABEL: f19:<br>
+; CHECK: vlrepb %v24, 1023(%r3,%r2)<br>
+; CHECK: br %r14<br>
+ %ptr1 = getelementptr i8, i8 *%base, i64 %index<br>
+ %ptr = getelementptr i8, i8 *%ptr1, i64 1023<br>
+ %scalar = load i8, i8 *%ptr<br>
+ %val = insertelement <16 x i8> undef, i8 %scalar, i32 0<br>
+ %ret = shufflevector <16 x i8> %val, <16 x i8> undef,<br>
+ <16 x i32> zeroinitializer<br>
+ ret <16 x i8> %ret<br>
+}<br>
<br>
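[Editorial note, not part of the commit] The offset tests above are consistent with a 12-bit unsigned displacement (0 to 4095 bytes): the byte offset is the getelementptr index times the element size, and once it reaches 4096 the tests expect a separate aghi to adjust the base. A Python sketch of that range check follows; the names are hypothetical.<br>

```python
MAX_DISP12 = 4095  # largest value of an unsigned 12-bit displacement


def fits_disp12(index: int, elem_bytes: int) -> bool:
    """True if index * elem_bytes fits an unsigned 12-bit displacement.

    Hypothetical helper that mirrors the in-range / out-of-range pairs
    used by the replicating-load tests above.
    """
    offset = index * elem_bytes
    return 0 <= offset <= MAX_DISP12


# (index, element size) pairs taken from f2/f3, f5/f6, f8/f9 and f11/f12.
for index, size in [(4095, 1), (4096, 1), (2047, 2), (2048, 2),
                    (1023, 4), (1024, 4), (511, 8), (512, 8)]:
    print(index * size, fits_disp12(index, size))
```

Each in-range case is the largest element-aligned offset for its type, which is why the expected displacements are 4095, 4094, 4092 and 4088.<br>
<br>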
Added: llvm/trunk/test/CodeGen/SystemZ/vec-perm-04.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-04.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-04.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-perm-04.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-perm-04.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,160 @@<br>
+; Test vector merge high.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a canonical v16i8 merge high.<br>
+define <16 x i8> @f1(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vmrhb %v24, %v24, %v26<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 0, i32 16, i32 1, i32 17,<br>
+ i32 2, i32 18, i32 3, i32 19,<br>
+ i32 4, i32 20, i32 5, i32 21,<br>
+ i32 6, i32 22, i32 7, i32 23><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a reversed v16i8 merge high.<br>
+define <16 x i8> @f2(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vmrhb %v24, %v26, %v24<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 16, i32 0, i32 17, i32 1,<br>
+ i32 18, i32 2, i32 19, i32 3,<br>
+ i32 20, i32 4, i32 21, i32 5,<br>
+ i32 22, i32 6, i32 23, i32 7><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 merge high with only the first operand being used.<br>
+define <16 x i8> @f3(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vmrhb %v24, %v24, %v24<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 0, i32 0, i32 1, i32 1,<br>
+ i32 2, i32 2, i32 3, i32 3,<br>
+ i32 4, i32 4, i32 5, i32 5,<br>
+ i32 6, i32 6, i32 7, i32 7><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 merge high with only the second operand being used.<br>
+; This is converted into @f3 by target-independent code.<br>
+define <16 x i8> @f4(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vmrhb %v24, %v26, %v26<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 16, i32 16, i32 17, i32 17,<br>
+ i32 18, i32 18, i32 19, i32 19,<br>
+ i32 20, i32 20, i32 21, i32 21,<br>
+ i32 22, i32 22, i32 23, i32 23><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 merge with both operands being the same. This too is<br>
+; converted into @f3 by target-independent code.<br>
+define <16 x i8> @f5(<16 x i8> %val) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vmrhb %v24, %v24, %v24<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val, <16 x i8> %val,<br>
+ <16 x i32> <i32 0, i32 16, i32 17, i32 17,<br>
+ i32 18, i32 2, i32 3, i32 3,<br>
+ i32 20, i32 20, i32 5, i32 5,<br>
+ i32 6, i32 22, i32 23, i32 7><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 merge in which some of the indices are "don't care".<br>
+define <16 x i8> @f6(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vmrhb %v24, %v24, %v26<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 0, i32 undef, i32 1, i32 17,<br>
+ i32 undef, i32 18, i32 undef, i32 undef,<br>
+ i32 undef, i32 20, i32 5, i32 21,<br>
+ i32 undef, i32 22, i32 7, i32 undef><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 merge in which one of the operands is undefined and where<br>
+; indices for that operand are "don't care". Target-independent code<br>
+; converts the indices themselves into "undef"s.<br>
+define <16 x i8> @f7(<16 x i8> %val) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vmrhb %v24, %v24, %v24<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> undef, <16 x i8> %val,<br>
+ <16 x i32> <i32 11, i32 16, i32 17, i32 5,<br>
+ i32 18, i32 10, i32 19, i32 19,<br>
+ i32 20, i32 20, i32 21, i32 3,<br>
+ i32 2, i32 22, i32 9, i32 23><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a canonical v8i16 merge high.<br>
+define <8 x i16> @f8(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vmrhh %v24, %v24, %v26<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i32> <i32 0, i32 8, i32 1, i32 9,<br>
+ i32 2, i32 10, i32 3, i32 11><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a reversed v8i16 merge high.<br>
+define <8 x i16> @f9(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vmrhh %v24, %v26, %v24<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i32> <i32 8, i32 0, i32 9, i32 1,<br>
+ i32 10, i32 2, i32 11, i32 3><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a canonical v4i32 merge high.<br>
+define <4 x i32> @f10(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vmrhf %v24, %v24, %v26<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> <i32 0, i32 4, i32 1, i32 5><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a reversed v4i32 merge high.<br>
+define <4 x i32> @f11(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vmrhf %v24, %v26, %v24<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> <i32 4, i32 0, i32 5, i32 1><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a canonical v2i64 merge high.<br>
+define <2 x i64> @f12(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK: vmrhg %v24, %v24, %v26<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <2 x i64> %val1, <2 x i64> %val2,<br>
+ <2 x i32> <i32 0, i32 2><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test a reversed v2i64 merge high.<br>
+define <2 x i64> @f13(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f13:<br>
+; CHECK: vmrhg %v24, %v26, %v24<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <2 x i64> %val1, <2 x i64> %val2,<br>
+ <2 x i32> <i32 2, i32 0><br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-perm-05.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-05.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-05.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-perm-05.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-perm-05.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,160 @@<br>
+; Test vector merge low.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a canonical v16i8 merge low.<br>
+define <16 x i8> @f1(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vmrlb %v24, %v24, %v26<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 8, i32 24, i32 9, i32 25,<br>
+ i32 10, i32 26, i32 11, i32 27,<br>
+ i32 12, i32 28, i32 13, i32 29,<br>
+ i32 14, i32 30, i32 15, i32 31><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a reversed v16i8 merge low.<br>
+define <16 x i8> @f2(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vmrlb %v24, %v26, %v24<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 24, i32 8, i32 25, i32 9,<br>
+ i32 26, i32 10, i32 27, i32 11,<br>
+ i32 28, i32 12, i32 29, i32 13,<br>
+ i32 30, i32 14, i32 31, i32 15><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 merge low with only the first operand being used.<br>
+define <16 x i8> @f3(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vmrlb %v24, %v24, %v24<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 8, i32 8, i32 9, i32 9,<br>
+ i32 10, i32 10, i32 11, i32 11,<br>
+ i32 12, i32 12, i32 13, i32 13,<br>
+ i32 14, i32 14, i32 15, i32 15><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 merge low with only the second operand being used.<br>
+; This is converted into @f3 by target-independent code.<br>
+define <16 x i8> @f4(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vmrlb %v24, %v26, %v26<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 24, i32 24, i32 25, i32 25,<br>
+ i32 26, i32 26, i32 27, i32 27,<br>
+ i32 28, i32 28, i32 29, i32 29,<br>
+ i32 30, i32 30, i32 31, i32 31><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 merge low with both operands being the same. This too is<br>
+; converted into @f3 by target-independent code.<br>
+define <16 x i8> @f5(<16 x i8> %val) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vmrlb %v24, %v24, %v24<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val, <16 x i8> %val,<br>
+ <16 x i32> <i32 8, i32 24, i32 25, i32 25,<br>
+ i32 26, i32 10, i32 11, i32 11,<br>
+ i32 28, i32 28, i32 13, i32 13,<br>
+ i32 14, i32 30, i32 31, i32 15><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 merge low in which some of the indices are "don't care".<br>
+define <16 x i8> @f6(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vmrlb %v24, %v24, %v26<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 8, i32 undef, i32 9, i32 25,<br>
+ i32 undef, i32 26, i32 undef, i32 undef,<br>
+ i32 undef, i32 28, i32 13, i32 29,<br>
+ i32 undef, i32 30, i32 15, i32 undef><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 merge in which one of the operands is undefined and where<br>
+; indices for that operand are "don't care". Target-independent code<br>
+; converts the indices themselves into "undef"s.<br>
+define <16 x i8> @f7(<16 x i8> %val) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vmrlb %v24, %v24, %v24<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> undef, <16 x i8> %val,<br>
+ <16 x i32> <i32 11, i32 24, i32 25, i32 5,<br>
+ i32 26, i32 10, i32 27, i32 27,<br>
+ i32 28, i32 28, i32 29, i32 3,<br>
+ i32 2, i32 30, i32 9, i32 31><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a canonical v8i16 merge low.<br>
+define <8 x i16> @f8(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vmrlh %v24, %v24, %v26<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i32> <i32 4, i32 12, i32 5, i32 13,<br>
+ i32 6, i32 14, i32 7, i32 15><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a reversed v8i16 merge low.<br>
+define <8 x i16> @f9(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vmrlh %v24, %v26, %v24<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i32> <i32 12, i32 4, i32 13, i32 5,<br>
+ i32 14, i32 6, i32 15, i32 7><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a canonical v4i32 merge low.<br>
+define <4 x i32> @f10(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vmrlf %v24, %v24, %v26<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> <i32 2, i32 6, i32 3, i32 7><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a reversed v4i32 merge low.<br>
+define <4 x i32> @f11(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vmrlf %v24, %v26, %v24<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> <i32 6, i32 2, i32 7, i32 3><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a canonical v2i64 merge low.<br>
+define <2 x i64> @f12(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK: vmrlg %v24, %v24, %v26<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <2 x i64> %val1, <2 x i64> %val2,<br>
+ <2 x i32> <i32 1, i32 3><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test a reversed v2i64 merge low.<br>
+define <2 x i64> @f13(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f13:<br>
+; CHECK: vmrlg %v24, %v26, %v24<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <2 x i64> %val1, <2 x i64> %val2,<br>
+ <2 x i32> <i32 3, i32 1><br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-perm-06.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-06.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-06.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-perm-06.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-perm-06.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,140 @@<br>
+; Test vector pack.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a canonical v16i8 pack.<br>
+define <16 x i8> @f1(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vpkh %v24, %v24, %v26<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 1, i32 3, i32 5, i32 7,<br>
+ i32 9, i32 11, i32 13, i32 15,<br>
+ i32 17, i32 19, i32 21, i32 23,<br>
+ i32 25, i32 27, i32 29, i32 31><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a reversed v16i8 pack.<br>
+define <16 x i8> @f2(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vpkh %v24, %v26, %v24<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 17, i32 19, i32 21, i32 23,<br>
+ i32 25, i32 27, i32 29, i32 31,<br>
+ i32 1, i32 3, i32 5, i32 7,<br>
+ i32 9, i32 11, i32 13, i32 15><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 pack with only the first operand being used.<br>
+define <16 x i8> @f3(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vpkh %v24, %v24, %v24<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 1, i32 3, i32 5, i32 7,<br>
+ i32 9, i32 11, i32 13, i32 15,<br>
+ i32 1, i32 3, i32 5, i32 7,<br>
+ i32 9, i32 11, i32 13, i32 15><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 pack with only the second operand being used.<br>
+; This is converted into @f3 by target-independent code.<br>
+define <16 x i8> @f4(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vpkh %v24, %v26, %v26<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 17, i32 19, i32 21, i32 23,<br>
+ i32 25, i32 27, i32 29, i32 31,<br>
+ i32 17, i32 19, i32 21, i32 23,<br>
+ i32 25, i32 27, i32 29, i32 31><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 pack with both operands being the same. This too is<br>
+; converted into @f3 by target-independent code.<br>
+define <16 x i8> @f5(<16 x i8> %val) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vpkh %v24, %v24, %v24<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val, <16 x i8> %val,<br>
+ <16 x i32> <i32 1, i32 3, i32 5, i32 7,<br>
+ i32 9, i32 11, i32 13, i32 15,<br>
+ i32 17, i32 19, i32 21, i32 23,<br>
+ i32 25, i32 27, i32 29, i32 31><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 pack in which some of the indices are "don't care".<br>
+define <16 x i8> @f6(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vpkh %v24, %v24, %v26<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 1, i32 undef, i32 5, i32 7,<br>
+ i32 undef, i32 11, i32 undef, i32 undef,<br>
+ i32 undef, i32 19, i32 21, i32 23,<br>
+ i32 undef, i32 27, i32 29, i32 undef><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 pack in which one of the operands is undefined and where<br>
+; indices for that operand are "don't care". Target-independent code<br>
+; converts the indices themselves into "undef"s.<br>
+define <16 x i8> @f7(<16 x i8> %val) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vpkh %v24, %v24, %v24<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> undef, <16 x i8> %val,<br>
+ <16 x i32> <i32 7, i32 1, i32 9, i32 15,<br>
+ i32 15, i32 3, i32 5, i32 1,<br>
+ i32 17, i32 19, i32 21, i32 23,<br>
+ i32 25, i32 27, i32 29, i32 31><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a canonical v8i16 pack.<br>
+define <8 x i16> @f8(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vpkf %v24, %v24, %v26<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i32> <i32 1, i32 3, i32 5, i32 7,<br>
+ i32 9, i32 11, i32 13, i32 15><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a reversed v8i16 pack.<br>
+define <8 x i16> @f9(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vpkf %v24, %v26, %v24<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i32> <i32 9, i32 11, i32 13, i32 15,<br>
+ i32 1, i32 3, i32 5, i32 7><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a canonical v4i32 pack.<br>
+define <4 x i32> @f10(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vpkg %v24, %v24, %v26<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> <i32 1, i32 3, i32 5, i32 7><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a reversed v4i32 pack.<br>
+define <4 x i32> @f11(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vpkg %v24, %v26, %v24<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> <i32 5, i32 7, i32 1, i32 3><br>
+ ret <4 x i32> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-perm-07.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-07.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-07.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-perm-07.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-perm-07.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,125 @@<br>
+; Test vector shift left double immediate.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a v16i8 shift with the lowest useful shift amount.<br>
+define <16 x i8> @f1(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vsldb %v24, %v24, %v26, 1<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 1, i32 2, i32 3, i32 4,<br>
+ i32 5, i32 6, i32 7, i32 8,<br>
+ i32 9, i32 10, i32 11, i32 12,<br>
+ i32 13, i32 14, i32 15, i32 16><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 shift with the highest shift amount.<br>
+define <16 x i8> @f2(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vsldb %v24, %v24, %v26, 15<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 15, i32 16, i32 17, i32 18,<br>
+ i32 19, i32 20, i32 21, i32 22,<br>
+ i32 23, i32 24, i32 25, i32 26,<br>
+ i32 27, i32 28, i32 29, i32 30><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 shift in which the operands need to be reversed.<br>
+define <16 x i8> @f3(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vsldb %v24, %v26, %v24, 4<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 20, i32 21, i32 22, i32 23,<br>
+ i32 24, i32 25, i32 26, i32 27,<br>
+ i32 28, i32 29, i32 30, i32 31,<br>
+ i32 0, i32 1, i32 2, i32 3><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 shift in which the operands need to be duplicated.<br>
+define <16 x i8> @f4(<16 x i8> %val) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vsldb %v24, %v24, %v24, 7<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val, <16 x i8> undef,<br>
+ <16 x i32> <i32 7, i32 8, i32 9, i32 10,<br>
+ i32 11, i32 12, i32 13, i32 14,<br>
+ i32 15, i32 0, i32 1, i32 2,<br>
+ i32 3, i32 4, i32 5, i32 6><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 shift in which some of the indices are undefs.<br>
+define <16 x i8> @f5(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vsldb %v24, %v24, %v26, 11<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 undef, i32 undef, i32 undef, i32 undef,<br>
+ i32 15, i32 16, i32 undef, i32 18,<br>
+ i32 19, i32 20, i32 21, i32 22,<br>
+ i32 23, i32 24, i32 25, i32 26><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; ...and again with reversed operands.<br>
+define <16 x i8> @f6(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vsldb %v24, %v26, %v24, 13<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 undef, i32 undef, i32 31, i32 0,<br>
+ i32 1, i32 2, i32 3, i32 4,<br>
+ i32 5, i32 6, i32 7, i32 8,<br>
+ i32 9, i32 10, i32 11, i32 12><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v8i16 shift with the lowest useful shift amount.<br>
+define <8 x i16> @f7(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vsldb %v24, %v24, %v26, 2<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i32> <i32 1, i32 2, i32 3, i32 4,<br>
+ i32 5, i32 6, i32 7, i32 8><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v8i16 shift with the highest useful shift amount.<br>
+define <8 x i16> @f8(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vsldb %v24, %v24, %v26, 14<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i32> <i32 7, i32 8, i32 9, i32 10,<br>
+ i32 11, i32 12, i32 13, i32 14><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v4i32 shift with the lowest useful shift amount.<br>
+define <4 x i32> @f9(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vsldb %v24, %v24, %v26, 4<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> <i32 1, i32 2, i32 3, i32 4><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v4i32 shift with the highest useful shift amount.<br>
+define <4 x i32> @f10(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vsldb %v24, %v24, %v26, 12<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> <i32 3, i32 4, i32 5, i32 6><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; We use VPDI for v2i64 shuffles.<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-perm-08.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-08.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-08.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-perm-08.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-perm-08.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,130 @@<br>
+; Test vector permutes using VPDI.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a high1/low2 permute for v16i8.<br>
+define <16 x i8> @f1(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vpdi %v24, %v24, %v26, 1<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 0, i32 1, i32 2, i32 3,<br>
+ i32 4, i32 5, i32 6, i32 7,<br>
+ i32 24, i32 25, i32 26, i32 27,<br>
+ i32 28, i32 29, i32 30, i32 31><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a low2/high1 permute for v16i8.<br>
+define <16 x i8> @f2(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vpdi %v24, %v26, %v24, 4<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 24, i32 25, i32 26, i32 27,<br>
+ i32 28, i32 29, i32 30, i32 31,<br>
+ i32 0, i32 1, i32 2, i32 3,<br>
+ i32 4, i32 5, i32 6, i32 7><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a low1/high2 permute for v16i8.<br>
+define <16 x i8> @f3(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vpdi %v24, %v24, %v26, 4<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 8, i32 9, i32 10, i32 undef,<br>
+ i32 12, i32 undef, i32 14, i32 15,<br>
+ i32 16, i32 17, i32 undef, i32 19,<br>
+ i32 20, i32 21, i32 22, i32 undef><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a high2/low1 permute for v16i8.<br>
+define <16 x i8> @f4(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vpdi %v24, %v26, %v24, 1<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 16, i32 17, i32 18, i32 19,<br>
+ i32 20, i32 21, i32 22, i32 23,<br>
+ i32 8, i32 9, i32 10, i32 11,<br>
+ i32 12, i32 13, i32 14, i32 15><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test reversing two doublewords in a v16i8.<br>
+define <16 x i8> @f5(<16 x i8> %val) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vpdi %v24, %v24, %v24, 4<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <16 x i8> %val, <16 x i8> undef,<br>
+ <16 x i32> <i32 8, i32 9, i32 10, i32 11,<br>
+ i32 12, i32 13, i32 14, i32 15,<br>
+ i32 0, i32 1, i32 2, i32 3,<br>
+ i32 4, i32 5, i32 6, i32 7><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a high1/low2 permute for v8i16.<br>
+define <8 x i16> @f6(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vpdi %v24, %v24, %v26, 1<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i32> <i32 0, i32 1, i32 2, i32 3,<br>
+ i32 12, i32 13, i32 14, i32 15><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a low2/high1 permute for v8i16.<br>
+define <8 x i16> @f7(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vpdi %v24, %v26, %v24, 4<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i32> <i32 12, i32 13, i32 14, i32 15,<br>
+ i32 0, i32 1, i32 2, i32 3><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a high1/low2 permute for v4i32.<br>
+define <4 x i32> @f8(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vpdi %v24, %v24, %v26, 1<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> <i32 0, i32 1, i32 6, i32 7><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a low2/high1 permute for v4i32.<br>
+define <4 x i32> @f9(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vpdi %v24, %v26, %v24, 4<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> <i32 6, i32 7, i32 0, i32 1><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a high1/low2 permute for v2i64.<br>
+define <2 x i64> @f10(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vpdi %v24, %v24, %v26, 1<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <2 x i64> %val1, <2 x i64> %val2,<br>
+ <2 x i32> <i32 0, i32 3><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test a low2/high1 permute for v2i64.<br>
+define <2 x i64> @f11(<2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vpdi %v24, %v26, %v24, 4<br>
+; CHECK: br %r14<br>
+ %ret = shufflevector <2 x i64> %val1, <2 x i64> %val2,<br>
+ <2 x i32> <i32 3, i32 0><br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-perm-09.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-09.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-09.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-perm-09.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-perm-09.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,38 @@<br>
+; Test general vector permute of a v16i8.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | \<br>
+; RUN: FileCheck -check-prefix=CHECK-CODE %s<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | \<br>
+; RUN: FileCheck -check-prefix=CHECK-VECTOR %s<br>
+<br>
+define <16 x i8> @f1(<16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-CODE-LABEL: f1:<br>
+; CHECK-CODE: larl [[REG:%r[0-5]]],<br>
+; CHECK-CODE: vl [[MASK:%v[0-9]+]], 0([[REG]])<br>
+; CHECK-CODE: vperm %v24, %v24, %v26, [[MASK]]<br>
+; CHECK-CODE: br %r14<br>
+;<br>
+; CHECK-VECTOR: .byte 1<br>
+; CHECK-VECTOR-NEXT: .byte 19<br>
+; CHECK-VECTOR-NEXT: .byte 6<br>
+; CHECK-VECTOR-NEXT: .byte 5<br>
+; CHECK-VECTOR-NEXT: .byte 20<br>
+; CHECK-VECTOR-NEXT: .byte 22<br>
+; CHECK-VECTOR-NEXT: .byte 1<br>
+; CHECK-VECTOR-NEXT: .byte 1<br>
+; CHECK-VECTOR-NEXT: .byte 25<br>
+; CHECK-VECTOR-NEXT: .byte 29<br>
+; CHECK-VECTOR-NEXT: .byte 11<br>
+; Any byte would be OK here<br>
+; CHECK-VECTOR-NEXT: .space 1<br>
+; CHECK-VECTOR-NEXT: .byte 31<br>
+; CHECK-VECTOR-NEXT: .byte 4<br>
+; CHECK-VECTOR-NEXT: .byte 15<br>
+; CHECK-VECTOR-NEXT: .byte 19<br>
+ %ret = shufflevector <16 x i8> %val1, <16 x i8> %val2,<br>
+ <16 x i32> <i32 1, i32 19, i32 6, i32 5,<br>
+ i32 20, i32 22, i32 1, i32 1,<br>
+ i32 25, i32 29, i32 11, i32 undef,<br>
+ i32 31, i32 4, i32 15, i32 19><br>
+ ret <16 x i8> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-perm-10.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-10.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-10.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-perm-10.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-perm-10.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,36 @@<br>
+; Test general vector permute of a v8i16.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | \<br>
+; RUN: FileCheck -check-prefix=CHECK-CODE %s<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | \<br>
+; RUN: FileCheck -check-prefix=CHECK-VECTOR %s<br>
+<br>
+define <8 x i16> @f1(<8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-CODE-LABEL: f1:<br>
+; CHECK-CODE: larl [[REG:%r[0-5]]],<br>
+; CHECK-CODE: vl [[MASK:%v[0-9]+]], 0([[REG]])<br>
+; CHECK-CODE: vperm %v24, %v26, %v24, [[MASK]]<br>
+; CHECK-CODE: br %r14<br>
+;<br>
+; CHECK-VECTOR: .byte 0<br>
+; CHECK-VECTOR-NEXT: .byte 1<br>
+; CHECK-VECTOR-NEXT: .byte 26<br>
+; CHECK-VECTOR-NEXT: .byte 27<br>
+; Any 2 bytes would be OK here<br>
+; CHECK-VECTOR-NEXT: .space 1<br>
+; CHECK-VECTOR-NEXT: .space 1<br>
+; CHECK-VECTOR-NEXT: .byte 28<br>
+; CHECK-VECTOR-NEXT: .byte 29<br>
+; CHECK-VECTOR-NEXT: .byte 6<br>
+; CHECK-VECTOR-NEXT: .byte 7<br>
+; CHECK-VECTOR-NEXT: .byte 14<br>
+; CHECK-VECTOR-NEXT: .byte 15<br>
+; CHECK-VECTOR-NEXT: .byte 8<br>
+; CHECK-VECTOR-NEXT: .byte 9<br>
+; CHECK-VECTOR-NEXT: .byte 16<br>
+; CHECK-VECTOR-NEXT: .byte 17<br>
+ %ret = shufflevector <8 x i16> %val1, <8 x i16> %val2,<br>
+ <8 x i32> <i32 8, i32 5, i32 undef, i32 6,<br>
+ i32 11, i32 15, i32 12, i32 0><br>
+ ret <8 x i16> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-perm-11.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-11.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-perm-11.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-perm-11.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-perm-11.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,35 @@<br>
+; Test general vector permute of a v4i32.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | \<br>
+; RUN: FileCheck -check-prefix=CHECK-CODE %s<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | \<br>
+; RUN: FileCheck -check-prefix=CHECK-VECTOR %s<br>
+<br>
+define <4 x i32> @f1(<4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-CODE-LABEL: f1:<br>
+; CHECK-CODE: larl [[REG:%r[0-5]]],<br>
+; CHECK-CODE: vl [[MASK:%v[0-9]+]], 0([[REG]])<br>
+; CHECK-CODE: vperm %v24, %v26, %v24, [[MASK]]<br>
+; CHECK-CODE: br %r14<br>
+;<br>
+; CHECK-VECTOR: .byte 4<br>
+; CHECK-VECTOR-NEXT: .byte 5<br>
+; CHECK-VECTOR-NEXT: .byte 6<br>
+; CHECK-VECTOR-NEXT: .byte 7<br>
+; CHECK-VECTOR-NEXT: .byte 20<br>
+; CHECK-VECTOR-NEXT: .byte 21<br>
+; CHECK-VECTOR-NEXT: .byte 22<br>
+; CHECK-VECTOR-NEXT: .byte 23<br>
+; Any 4 bytes would be OK here<br>
+; CHECK-VECTOR-NEXT: .space 1<br>
+; CHECK-VECTOR-NEXT: .space 1<br>
+; CHECK-VECTOR-NEXT: .space 1<br>
+; CHECK-VECTOR-NEXT: .space 1<br>
+; CHECK-VECTOR-NEXT: .byte 12<br>
+; CHECK-VECTOR-NEXT: .byte 13<br>
+; CHECK-VECTOR-NEXT: .byte 14<br>
+; CHECK-VECTOR-NEXT: .byte 15<br>
+ %ret = shufflevector <4 x i32> %val1, <4 x i32> %val2,<br>
+ <4 x i32> <i32 5, i32 1, i32 undef, i32 7><br>
+ ret <4 x i32> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-shift-01.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-shift-01.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-shift-01.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-shift-01.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-shift-01.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,39 @@<br>
+; Test vector shift left with vector shift amount.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a v16i8 shift.<br>
+define <16 x i8> @f1(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: veslvb %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = shl <16 x i8> %val1, %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v8i16 shift.<br>
+define <8 x i16> @f2(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: veslvh %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = shl <8 x i16> %val1, %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v4i32 shift.<br>
+define <4 x i32> @f3(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: veslvf %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = shl <4 x i32> %val1, %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v2i64 shift.<br>
+define <2 x i64> @f4(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: veslvg %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = shl <2 x i64> %val1, %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-shift-02.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-shift-02.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-shift-02.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-shift-02.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-shift-02.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,39 @@<br>
+; Test vector arithmetic shift right with vector shift amount.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a v16i8 shift.<br>
+define <16 x i8> @f1(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vesravb %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = ashr <16 x i8> %val1, %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v8i16 shift.<br>
+define <8 x i16> @f2(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vesravh %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = ashr <8 x i16> %val1, %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v4i32 shift.<br>
+define <4 x i32> @f3(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vesravf %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = ashr <4 x i32> %val1, %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v2i64 shift.<br>
+define <2 x i64> @f4(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vesravg %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = ashr <2 x i64> %val1, %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-shift-03.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-shift-03.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-shift-03.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-shift-03.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-shift-03.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,39 @@<br>
+; Test vector logical shift right with vector shift amount.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a v16i8 shift.<br>
+define <16 x i8> @f1(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vesrlvb %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = lshr <16 x i8> %val1, %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v8i16 shift.<br>
+define <8 x i16> @f2(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vesrlvh %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = lshr <8 x i16> %val1, %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v4i32 shift.<br>
+define <4 x i32> @f3(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vesrlvf %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = lshr <4 x i32> %val1, %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v2i64 shift.<br>
+define <2 x i64> @f4(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vesrlvg %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = lshr <2 x i64> %val1, %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-shift-04.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-shift-04.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-shift-04.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-shift-04.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-shift-04.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,134 @@<br>
+; Test vector shift left with scalar shift amount.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a v16i8 shift by a variable.<br>
+define <16 x i8> @f1(<16 x i8> %dummy, <16 x i8> %val1, i32 %shift) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: veslb %v24, %v26, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %truncshift = trunc i32 %shift to i8<br>
+ %shiftvec = insertelement <16 x i8> undef, i8 %truncshift, i32 0<br>
+ %val2 = shufflevector <16 x i8> %shiftvec, <16 x i8> undef,<br>
+ <16 x i32> zeroinitializer<br>
+ %ret = shl <16 x i8> %val1, %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 shift by the lowest useful constant.<br>
+define <16 x i8> @f2(<16 x i8> %dummy, <16 x i8> %val) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: veslb %v24, %v26, 1<br>
+; CHECK: br %r14<br>
+ %ret = shl <16 x i8> %val, <i8 1, i8 1, i8 1, i8 1,<br>
+ i8 1, i8 1, i8 1, i8 1,<br>
+ i8 1, i8 1, i8 1, i8 1,<br>
+ i8 1, i8 1, i8 1, i8 1><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 shift by the highest useful constant.<br>
+define <16 x i8> @f3(<16 x i8> %dummy, <16 x i8> %val) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: veslb %v24, %v26, 7<br>
+; CHECK: br %r14<br>
+ %ret = shl <16 x i8> %val, <i8 7, i8 7, i8 7, i8 7,<br>
+ i8 7, i8 7, i8 7, i8 7,<br>
+ i8 7, i8 7, i8 7, i8 7,<br>
+ i8 7, i8 7, i8 7, i8 7><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v8i16 shift by a variable.<br>
+define <8 x i16> @f4(<8 x i16> %dummy, <8 x i16> %val1, i32 %shift) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: veslh %v24, %v26, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %truncshift = trunc i32 %shift to i16<br>
+ %shiftvec = insertelement <8 x i16> undef, i16 %truncshift, i32 0<br>
+ %val2 = shufflevector <8 x i16> %shiftvec, <8 x i16> undef,<br>
+ <8 x i32> zeroinitializer<br>
+ %ret = shl <8 x i16> %val1, %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v8i16 shift by the lowest useful constant.<br>
+define <8 x i16> @f5(<8 x i16> %dummy, <8 x i16> %val) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: veslh %v24, %v26, 1<br>
+; CHECK: br %r14<br>
+ %ret = shl <8 x i16> %val, <i16 1, i16 1, i16 1, i16 1,<br>
+ i16 1, i16 1, i16 1, i16 1><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v8i16 shift by the highest useful constant.<br>
+define <8 x i16> @f6(<8 x i16> %dummy, <8 x i16> %val) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: veslh %v24, %v26, 15<br>
+; CHECK: br %r14<br>
+ %ret = shl <8 x i16> %val, <i16 15, i16 15, i16 15, i16 15,<br>
+ i16 15, i16 15, i16 15, i16 15><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v4i32 shift by a variable.<br>
+define <4 x i32> @f7(<4 x i32> %dummy, <4 x i32> %val1, i32 %shift) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: veslf %v24, %v26, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %shiftvec = insertelement <4 x i32> undef, i32 %shift, i32 0<br>
+ %val2 = shufflevector <4 x i32> %shiftvec, <4 x i32> undef,<br>
+ <4 x i32> zeroinitializer<br>
+ %ret = shl <4 x i32> %val1, %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v4i32 shift by the lowest useful constant.<br>
+define <4 x i32> @f8(<4 x i32> %dummy, <4 x i32> %val) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: veslf %v24, %v26, 1<br>
+; CHECK: br %r14<br>
+ %ret = shl <4 x i32> %val, <i32 1, i32 1, i32 1, i32 1><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v4i32 shift by the highest useful constant.<br>
+define <4 x i32> @f9(<4 x i32> %dummy, <4 x i32> %val) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: veslf %v24, %v26, 31<br>
+; CHECK: br %r14<br>
+ %ret = shl <4 x i32> %val, <i32 31, i32 31, i32 31, i32 31><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v2i64 shift by a variable.<br>
+define <2 x i64> @f10(<2 x i64> %dummy, <2 x i64> %val1, i32 %shift) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: veslg %v24, %v26, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %extshift = sext i32 %shift to i64<br>
+ %shiftvec = insertelement <2 x i64> undef, i64 %extshift, i32 0<br>
+ %val2 = shufflevector <2 x i64> %shiftvec, <2 x i64> undef,<br>
+ <2 x i32> zeroinitializer<br>
+ %ret = shl <2 x i64> %val1, %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test a v2i64 shift by the lowest useful constant.<br>
+define <2 x i64> @f11(<2 x i64> %dummy, <2 x i64> %val) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: veslg %v24, %v26, 1<br>
+; CHECK: br %r14<br>
+ %ret = shl <2 x i64> %val, <i64 1, i64 1><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test a v2i64 shift by the highest useful constant.<br>
+define <2 x i64> @f12(<2 x i64> %dummy, <2 x i64> %val) {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK: veslg %v24, %v26, 63<br>
+; CHECK: br %r14<br>
+ %ret = shl <2 x i64> %val, <i64 63, i64 63><br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-shift-05.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-shift-05.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-shift-05.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-shift-05.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-shift-05.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,134 @@<br>
+; Test vector arithmetic shift right with scalar shift amount.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a v16i8 shift by a variable.<br>
+define <16 x i8> @f1(<16 x i8> %dummy, <16 x i8> %val1, i32 %shift) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vesrab %v24, %v26, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %truncshift = trunc i32 %shift to i8<br>
+ %shiftvec = insertelement <16 x i8> undef, i8 %truncshift, i32 0<br>
+ %val2 = shufflevector <16 x i8> %shiftvec, <16 x i8> undef,<br>
+ <16 x i32> zeroinitializer<br>
+ %ret = ashr <16 x i8> %val1, %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 shift by the lowest useful constant.<br>
+define <16 x i8> @f2(<16 x i8> %dummy, <16 x i8> %val) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vesrab %v24, %v26, 1<br>
+; CHECK: br %r14<br>
+ %ret = ashr <16 x i8> %val, <i8 1, i8 1, i8 1, i8 1,<br>
+ i8 1, i8 1, i8 1, i8 1,<br>
+ i8 1, i8 1, i8 1, i8 1,<br>
+ i8 1, i8 1, i8 1, i8 1><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 shift by the highest useful constant.<br>
+define <16 x i8> @f3(<16 x i8> %dummy, <16 x i8> %val) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vesrab %v24, %v26, 7<br>
+; CHECK: br %r14<br>
+ %ret = ashr <16 x i8> %val, <i8 7, i8 7, i8 7, i8 7,<br>
+ i8 7, i8 7, i8 7, i8 7,<br>
+ i8 7, i8 7, i8 7, i8 7,<br>
+ i8 7, i8 7, i8 7, i8 7><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v8i16 shift by a variable.<br>
+define <8 x i16> @f4(<8 x i16> %dummy, <8 x i16> %val1, i32 %shift) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vesrah %v24, %v26, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %truncshift = trunc i32 %shift to i16<br>
+ %shiftvec = insertelement <8 x i16> undef, i16 %truncshift, i32 0<br>
+ %val2 = shufflevector <8 x i16> %shiftvec, <8 x i16> undef,<br>
+ <8 x i32> zeroinitializer<br>
+ %ret = ashr <8 x i16> %val1, %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v8i16 shift by the lowest useful constant.<br>
+define <8 x i16> @f5(<8 x i16> %dummy, <8 x i16> %val) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vesrah %v24, %v26, 1<br>
+; CHECK: br %r14<br>
+ %ret = ashr <8 x i16> %val, <i16 1, i16 1, i16 1, i16 1,<br>
+ i16 1, i16 1, i16 1, i16 1><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v8i16 shift by the highest useful constant.<br>
+define <8 x i16> @f6(<8 x i16> %dummy, <8 x i16> %val) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vesrah %v24, %v26, 15<br>
+; CHECK: br %r14<br>
+ %ret = ashr <8 x i16> %val, <i16 15, i16 15, i16 15, i16 15,<br>
+ i16 15, i16 15, i16 15, i16 15><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v4i32 shift by a variable.<br>
+define <4 x i32> @f7(<4 x i32> %dummy, <4 x i32> %val1, i32 %shift) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vesraf %v24, %v26, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %shiftvec = insertelement <4 x i32> undef, i32 %shift, i32 0<br>
+ %val2 = shufflevector <4 x i32> %shiftvec, <4 x i32> undef,<br>
+ <4 x i32> zeroinitializer<br>
+ %ret = ashr <4 x i32> %val1, %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v4i32 shift by the lowest useful constant.<br>
+define <4 x i32> @f8(<4 x i32> %dummy, <4 x i32> %val) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vesraf %v24, %v26, 1<br>
+; CHECK: br %r14<br>
+ %ret = ashr <4 x i32> %val, <i32 1, i32 1, i32 1, i32 1><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v4i32 shift by the highest useful constant.<br>
+define <4 x i32> @f9(<4 x i32> %dummy, <4 x i32> %val) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vesraf %v24, %v26, 31<br>
+; CHECK: br %r14<br>
+ %ret = ashr <4 x i32> %val, <i32 31, i32 31, i32 31, i32 31><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v2i64 shift by a variable.<br>
+define <2 x i64> @f10(<2 x i64> %dummy, <2 x i64> %val1, i32 %shift) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vesrag %v24, %v26, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %extshift = sext i32 %shift to i64<br>
+ %shiftvec = insertelement <2 x i64> undef, i64 %extshift, i32 0<br>
+ %val2 = shufflevector <2 x i64> %shiftvec, <2 x i64> undef,<br>
+ <2 x i32> zeroinitializer<br>
+ %ret = ashr <2 x i64> %val1, %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test a v2i64 shift by the lowest useful constant.<br>
+define <2 x i64> @f11(<2 x i64> %dummy, <2 x i64> %val) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vesrag %v24, %v26, 1<br>
+; CHECK: br %r14<br>
+ %ret = ashr <2 x i64> %val, <i64 1, i64 1><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test a v2i64 shift by the highest useful constant.<br>
+define <2 x i64> @f12(<2 x i64> %dummy, <2 x i64> %val) {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK: vesrag %v24, %v26, 63<br>
+; CHECK: br %r14<br>
+ %ret = ashr <2 x i64> %val, <i64 63, i64 63><br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-shift-06.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-shift-06.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-shift-06.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-shift-06.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-shift-06.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,134 @@<br>
+; Test vector logical shift right with scalar shift amount.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a v16i8 shift by a variable.<br>
+define <16 x i8> @f1(<16 x i8> %dummy, <16 x i8> %val1, i32 %shift) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vesrlb %v24, %v26, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %truncshift = trunc i32 %shift to i8<br>
+ %shiftvec = insertelement <16 x i8> undef, i8 %truncshift, i32 0<br>
+ %val2 = shufflevector <16 x i8> %shiftvec, <16 x i8> undef,<br>
+ <16 x i32> zeroinitializer<br>
+ %ret = lshr <16 x i8> %val1, %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 shift by the lowest useful constant.<br>
+define <16 x i8> @f2(<16 x i8> %dummy, <16 x i8> %val) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vesrlb %v24, %v26, 1<br>
+; CHECK: br %r14<br>
+ %ret = lshr <16 x i8> %val, <i8 1, i8 1, i8 1, i8 1,<br>
+ i8 1, i8 1, i8 1, i8 1,<br>
+ i8 1, i8 1, i8 1, i8 1,<br>
+ i8 1, i8 1, i8 1, i8 1><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v16i8 shift by the highest useful constant.<br>
+define <16 x i8> @f3(<16 x i8> %dummy, <16 x i8> %val) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vesrlb %v24, %v26, 7<br>
+; CHECK: br %r14<br>
+ %ret = lshr <16 x i8> %val, <i8 7, i8 7, i8 7, i8 7,<br>
+ i8 7, i8 7, i8 7, i8 7,<br>
+ i8 7, i8 7, i8 7, i8 7,<br>
+ i8 7, i8 7, i8 7, i8 7><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v8i16 shift by a variable.<br>
+define <8 x i16> @f4(<8 x i16> %dummy, <8 x i16> %val1, i32 %shift) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vesrlh %v24, %v26, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %truncshift = trunc i32 %shift to i16<br>
+ %shiftvec = insertelement <8 x i16> undef, i16 %truncshift, i32 0<br>
+ %val2 = shufflevector <8 x i16> %shiftvec, <8 x i16> undef,<br>
+ <8 x i32> zeroinitializer<br>
+ %ret = lshr <8 x i16> %val1, %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v8i16 shift by the lowest useful constant.<br>
+define <8 x i16> @f5(<8 x i16> %dummy, <8 x i16> %val) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: vesrlh %v24, %v26, 1<br>
+; CHECK: br %r14<br>
+ %ret = lshr <8 x i16> %val, <i16 1, i16 1, i16 1, i16 1,<br>
+ i16 1, i16 1, i16 1, i16 1><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v8i16 shift by the highest useful constant.<br>
+define <8 x i16> @f6(<8 x i16> %dummy, <8 x i16> %val) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: vesrlh %v24, %v26, 15<br>
+; CHECK: br %r14<br>
+ %ret = lshr <8 x i16> %val, <i16 15, i16 15, i16 15, i16 15,<br>
+ i16 15, i16 15, i16 15, i16 15><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v4i32 shift by a variable.<br>
+define <4 x i32> @f7(<4 x i32> %dummy, <4 x i32> %val1, i32 %shift) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: vesrlf %v24, %v26, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %shiftvec = insertelement <4 x i32> undef, i32 %shift, i32 0<br>
+ %val2 = shufflevector <4 x i32> %shiftvec, <4 x i32> undef,<br>
+ <4 x i32> zeroinitializer<br>
+ %ret = lshr <4 x i32> %val1, %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v4i32 shift by the lowest useful constant.<br>
+define <4 x i32> @f8(<4 x i32> %dummy, <4 x i32> %val) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vesrlf %v24, %v26, 1<br>
+; CHECK: br %r14<br>
+ %ret = lshr <4 x i32> %val, <i32 1, i32 1, i32 1, i32 1><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v4i32 shift by the highest useful constant.<br>
+define <4 x i32> @f9(<4 x i32> %dummy, <4 x i32> %val) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vesrlf %v24, %v26, 31<br>
+; CHECK: br %r14<br>
+ %ret = lshr <4 x i32> %val, <i32 31, i32 31, i32 31, i32 31><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v2i64 shift by a variable.<br>
+define <2 x i64> @f10(<2 x i64> %dummy, <2 x i64> %val1, i32 %shift) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vesrlg %v24, %v26, 0(%r2)<br>
+; CHECK: br %r14<br>
+ %extshift = sext i32 %shift to i64<br>
+ %shiftvec = insertelement <2 x i64> undef, i64 %extshift, i32 0<br>
+ %val2 = shufflevector <2 x i64> %shiftvec, <2 x i64> undef,<br>
+ <2 x i32> zeroinitializer<br>
+ %ret = lshr <2 x i64> %val1, %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test a v2i64 shift by the lowest useful constant.<br>
+define <2 x i64> @f11(<2 x i64> %dummy, <2 x i64> %val) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vesrlg %v24, %v26, 1<br>
+; CHECK: br %r14<br>
+ %ret = lshr <2 x i64> %val, <i64 1, i64 1><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test a v2i64 shift by the highest useful constant.<br>
+define <2 x i64> @f12(<2 x i64> %dummy, <2 x i64> %val) {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK: vesrlg %v24, %v26, 63<br>
+; CHECK: br %r14<br>
+ %ret = lshr <2 x i64> %val, <i64 63, i64 63><br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-shift-07.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-shift-07.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-shift-07.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-shift-07.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-shift-07.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,182 @@<br>
+; Test vector sign extensions.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a v16i1->v16i8 extension.<br>
+define <16 x i8> @f1(<16 x i8> %val) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: veslb [[REG:%v[0-9]+]], %v24, 7<br>
+; CHECK: vesrab %v24, [[REG]], 7<br>
+; CHECK: br %r14<br>
+ %trunc = trunc <16 x i8> %val to <16 x i1><br>
+ %ret = sext <16 x i1> %trunc to <16 x i8><br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v8i1->v8i16 extension.<br>
+define <8 x i16> @f2(<8 x i16> %val) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: veslh [[REG:%v[0-9]+]], %v24, 15<br>
+; CHECK: vesrah %v24, [[REG]], 15<br>
+; CHECK: br %r14<br>
+ %trunc = trunc <8 x i16> %val to <8 x i1><br>
+ %ret = sext <8 x i1> %trunc to <8 x i16><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v8i8->v8i16 extension.<br>
+define <8 x i16> @f3(<8 x i16> %val) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: veslh [[REG:%v[0-9]+]], %v24, 8<br>
+; CHECK: vesrah %v24, [[REG]], 8<br>
+; CHECK: br %r14<br>
+ %trunc = trunc <8 x i16> %val to <8 x i8><br>
+ %ret = sext <8 x i8> %trunc to <8 x i16><br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v4i1->v4i32 extension.<br>
+define <4 x i32> @f4(<4 x i32> %val) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: veslf [[REG:%v[0-9]+]], %v24, 31<br>
+; CHECK: vesraf %v24, [[REG]], 31<br>
+; CHECK: br %r14<br>
+ %trunc = trunc <4 x i32> %val to <4 x i1><br>
+ %ret = sext <4 x i1> %trunc to <4 x i32><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v4i8->v4i32 extension.<br>
+define <4 x i32> @f5(<4 x i32> %val) {<br>
+; CHECK-LABEL: f5:<br>
+; CHECK: veslf [[REG:%v[0-9]+]], %v24, 24<br>
+; CHECK: vesraf %v24, [[REG]], 24<br>
+; CHECK: br %r14<br>
+ %trunc = trunc <4 x i32> %val to <4 x i8><br>
+ %ret = sext <4 x i8> %trunc to <4 x i32><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v4i16->v4i32 extension.<br>
+define <4 x i32> @f6(<4 x i32> %val) {<br>
+; CHECK-LABEL: f6:<br>
+; CHECK: veslf [[REG:%v[0-9]+]], %v24, 16<br>
+; CHECK: vesraf %v24, [[REG]], 16<br>
+; CHECK: br %r14<br>
+ %trunc = trunc <4 x i32> %val to <4 x i16><br>
+ %ret = sext <4 x i16> %trunc to <4 x i32><br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v2i1->v2i64 extension.<br>
+define <2 x i64> @f7(<2 x i64> %val) {<br>
+; CHECK-LABEL: f7:<br>
+; CHECK: veslg [[REG:%v[0-9]+]], %v24, 63<br>
+; CHECK: vesrag %v24, [[REG]], 63<br>
+; CHECK: br %r14<br>
+ %trunc = trunc <2 x i64> %val to <2 x i1><br>
+ %ret = sext <2 x i1> %trunc to <2 x i64><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test a v2i8->v2i64 extension.<br>
+define <2 x i64> @f8(<2 x i64> %val) {<br>
+; CHECK-LABEL: f8:<br>
+; CHECK: vsegb %v24, %v24<br>
+; CHECK: br %r14<br>
+ %trunc = trunc <2 x i64> %val to <2 x i8><br>
+ %ret = sext <2 x i8> %trunc to <2 x i64><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test a v2i16->v2i64 extension.<br>
+define <2 x i64> @f9(<2 x i64> %val) {<br>
+; CHECK-LABEL: f9:<br>
+; CHECK: vsegh %v24, %v24<br>
+; CHECK: br %r14<br>
+ %trunc = trunc <2 x i64> %val to <2 x i16><br>
+ %ret = sext <2 x i16> %trunc to <2 x i64><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test a v2i32->v2i64 extension.<br>
+define <2 x i64> @f10(<2 x i64> %val) {<br>
+; CHECK-LABEL: f10:<br>
+; CHECK: vsegf %v24, %v24<br>
+; CHECK: br %r14<br>
+ %trunc = trunc <2 x i64> %val to <2 x i32><br>
+ %ret = sext <2 x i32> %trunc to <2 x i64><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test an alternative v2i8->v2i64 extension.<br>
+define <2 x i64> @f11(<2 x i64> %val) {<br>
+; CHECK-LABEL: f11:<br>
+; CHECK: vsegb %v24, %v24<br>
+; CHECK: br %r14<br>
+ %shl = shl <2 x i64> %val, <i64 56, i64 56><br>
+ %ret = ashr <2 x i64> %shl, <i64 56, i64 56><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test an alternative v2i16->v2i64 extension.<br>
+define <2 x i64> @f12(<2 x i64> %val) {<br>
+; CHECK-LABEL: f12:<br>
+; CHECK: vsegh %v24, %v24<br>
+; CHECK: br %r14<br>
+ %shl = shl <2 x i64> %val, <i64 48, i64 48><br>
+ %ret = ashr <2 x i64> %shl, <i64 48, i64 48><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test an alternative v2i32->v2i64 extension.<br>
+define <2 x i64> @f13(<2 x i64> %val) {<br>
+; CHECK-LABEL: f13:<br>
+; CHECK: vsegf %v24, %v24<br>
+; CHECK: br %r14<br>
+ %shl = shl <2 x i64> %val, <i64 32, i64 32><br>
+ %ret = ashr <2 x i64> %shl, <i64 32, i64 32><br>
+ ret <2 x i64> %ret<br>
+}<br>
+<br>
+; Test an extraction-based v2i8->v2i64 extension.<br>
+define <2 x i64> @f14(<16 x i8> %val) {<br>
+; CHECK-LABEL: f14:<br>
+; CHECK: vsegb %v24, %v24<br>
+; CHECK: br %r14<br>
+ %elt0 = extractelement <16 x i8> %val, i32 7<br>
+ %elt1 = extractelement <16 x i8> %val, i32 15<br>
+ %ext0 = sext i8 %elt0 to i64<br>
+ %ext1 = sext i8 %elt1 to i64<br>
+ %vec0 = insertelement <2 x i64> undef, i64 %ext0, i32 0<br>
+ %vec1 = insertelement <2 x i64> %vec0, i64 %ext1, i32 1<br>
+ ret <2 x i64> %vec1<br>
+}<br>
+<br>
+; Test an extraction-based v2i16->v2i64 extension.<br>
+define <2 x i64> @f15(<16 x i16> %val) {<br>
+; CHECK-LABEL: f15:<br>
+; CHECK: vsegh %v24, %v24<br>
+; CHECK: br %r14<br>
+ %elt0 = extractelement <16 x i16> %val, i32 3<br>
+ %elt1 = extractelement <16 x i16> %val, i32 7<br>
+ %ext0 = sext i16 %elt0 to i64<br>
+ %ext1 = sext i16 %elt1 to i64<br>
+ %vec0 = insertelement <2 x i64> undef, i64 %ext0, i32 0<br>
+ %vec1 = insertelement <2 x i64> %vec0, i64 %ext1, i32 1<br>
+ ret <2 x i64> %vec1<br>
+}<br>
+<br>
+; Test an extraction-based v2i32->v2i64 extension.<br>
+define <2 x i64> @f16(<16 x i32> %val) {<br>
+; CHECK-LABEL: f16:<br>
+; CHECK: vsegf %v24, %v24<br>
+; CHECK: br %r14<br>
+ %elt0 = extractelement <16 x i32> %val, i32 1<br>
+ %elt1 = extractelement <16 x i32> %val, i32 3<br>
+ %ext0 = sext i32 %elt0 to i64<br>
+ %ext1 = sext i32 %elt1 to i64<br>
+ %vec0 = insertelement <2 x i64> undef, i64 %ext0, i32 0<br>
+ %vec1 = insertelement <2 x i64> %vec0, i64 %ext1, i32 1<br>
+ ret <2 x i64> %vec1<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-sub-01.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-sub-01.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-sub-01.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-sub-01.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-sub-01.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,39 @@<br>
+; Test vector subtraction.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a v16i8 subtraction.<br>
+define <16 x i8> @f1(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vsb %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = sub <16 x i8> %val1, %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v8i16 subtraction.<br>
+define <8 x i16> @f2(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vsh %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = sub <8 x i16> %val1, %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v4i32 subtraction.<br>
+define <4 x i32> @f3(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vsf %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = sub <4 x i32> %val1, %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v2i64 subtraction.<br>
+define <2 x i64> @f4(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vsg %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = sub <2 x i64> %val1, %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
Added: llvm/trunk/test/CodeGen/SystemZ/vec-xor-01.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-xor-01.ll?rev=236521&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/SystemZ/vec-xor-01.ll?rev=236521&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/SystemZ/vec-xor-01.ll (added)<br>
+++ llvm/trunk/test/CodeGen/SystemZ/vec-xor-01.ll Tue May 5 14:25:42 2015<br>
@@ -0,0 +1,39 @@<br>
+; Test vector XOR.<br>
+;<br>
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z13 | FileCheck %s<br>
+<br>
+; Test a v16i8 XOR.<br>
+define <16 x i8> @f1(<16 x i8> %dummy, <16 x i8> %val1, <16 x i8> %val2) {<br>
+; CHECK-LABEL: f1:<br>
+; CHECK: vx %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = xor <16 x i8> %val1, %val2<br>
+ ret <16 x i8> %ret<br>
+}<br>
+<br>
+; Test a v8i16 XOR.<br>
+define <8 x i16> @f2(<8 x i16> %dummy, <8 x i16> %val1, <8 x i16> %val2) {<br>
+; CHECK-LABEL: f2:<br>
+; CHECK: vx %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = xor <8 x i16> %val1, %val2<br>
+ ret <8 x i16> %ret<br>
+}<br>
+<br>
+; Test a v4i32 XOR.<br>
+define <4 x i32> @f3(<4 x i32> %dummy, <4 x i32> %val1, <4 x i32> %val2) {<br>
+; CHECK-LABEL: f3:<br>
+; CHECK: vx %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = xor <4 x i32> %val1, %val2<br>
+ ret <4 x i32> %ret<br>
+}<br>
+<br>
+; Test a v2i64 XOR.<br>
+define <2 x i64> @f4(<2 x i64> %dummy, <2 x i64> %val1, <2 x i64> %val2) {<br>
+; CHECK-LABEL: f4:<br>
+; CHECK: vx %v24, %v26, %v28<br>
+; CHECK: br %r14<br>
+ %ret = xor <2 x i64> %val1, %val2<br>
+ ret <2 x i64> %ret<br>
+}<br>
<br>
<br>
</blockquote></div><br></div>