[llvm] r319901 - [Hexagon] Generate HVX code for vector construction and access
Eric Christopher via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 8 18:40:29 PST 2018
One more :)
clang warned that HvxSelector::zerous was unused, so I removed it here:
Committing to https://llvm.org/svn/llvm-project/llvm/trunk ...
M lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp
Committed r322053
If you still need it you might want to add a use. :)
-eric
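
For context on the linker error quoted further down in this thread: a debug-only dump method such as SDNode::dumpr is apparently not compiled into builds that define NDEBUG, so an unguarded call from always-compiled code fails at link time, and the #ifndef NDEBUG guard in Davide's patch avoids that. Below is a minimal sketch of the pattern. The Node type and printStack function are hypothetical stand-ins, not LLVM code, and unlike LLVM the sketch drops the declaration as well as the definition in NDEBUG builds.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for a node type with a debug-only dump method.
struct Node {
  std::string Name;
#ifndef NDEBUG
  // Exists only in builds where assertions are enabled.
  void dumpr(std::ostream &OS) const { OS << "node " << Name << '\n'; }
#endif
};

// Hypothetical printer that is compiled in every build configuration.
std::string printStack(const Node &N) {
  assert(!N.Name.empty() && "node must be named");
  std::ostringstream OS;
  OS << "Input node:\n";
#ifndef NDEBUG
  // Guarded call: dumpr is not available when NDEBUG is defined.
  N.dumpr(OS);
#endif
  OS << "Result templates:\n";
  return OS.str();
}
```

Calling printStack(Node{"t0"}) in an assertions-enabled build includes the node dump between the two headings; with -DNDEBUG the dump line is simply omitted and the code still links.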
On Wed, Dec 6, 2017 at 11:44 AM Krzysztof Parzyszek via llvm-commits <
llvm-commits at lists.llvm.org> wrote:
> Sorry about that, and thanks.
>
> -Krzysztof
>
> On 12/6/2017 12:52 PM, Davide Italiano wrote:
> > I'll go ahead and commit the following to unblock our work, but feel
> > free to follow up accordingly if you don't like it.
> >
> > $ git diff
> > diff --git a/lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp b/lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp
> > index a636e4e1557..5dc5e764f67 100644
> > --- a/lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp
> > +++ b/lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp
> > @@ -729,7 +729,9 @@ void NodeTemplate::print(raw_ostream &OS, const SelectionDAG &G) const {
> >
> > void ResultStack::print(raw_ostream &OS, const SelectionDAG &G) const {
> > OS << "Input node:\n";
> > +#ifndef NDEBUG
> > InpNode->dumpr(&G);
> > +#endif
> > OS << "Result templates:\n";
> > for (unsigned I = 0, E = List.size(); I != E; ++I) {
> > OS << '[' << I << "] ";
> >
> >
> > On Wed, Dec 6, 2017 at 10:48 AM, Davide Italiano <davide at freebsd.org> wrote:
> >> The build is failing on macOS for me. I think this commit might be
> >> responsible, taking into consideration the range (yesterday night/this
> >> morning).
> >>
> >> Undefined symbols for architecture x86_64:
> >> "llvm::SDNode::dumpr(llvm::SelectionDAG const*) const", referenced from:
> >> ResultStack::print(llvm::raw_ostream&, llvm::SelectionDAG const&) const in libLLVMHexagonCodeGen.a(HexagonISelDAGToDAGHVX.cpp.o)
> >> ld: symbol(s) not found for architecture x86_64
> >>
> >> On Wed, Dec 6, 2017 at 8:40 AM, Krzysztof Parzyszek via llvm-commits
> >> <llvm-commits at lists.llvm.org> wrote:
> >>> Author: kparzysz
> >>> Date: Wed Dec 6 08:40:37 2017
> >>> New Revision: 319901
> >>>
> >>> URL: http://llvm.org/viewvc/llvm-project?rev=319901&view=rev
> >>> Log:
> >>> [Hexagon] Generate HVX code for vector construction and access
> >>>
> >>> Support for:
> >>> - build vector,
> >>> - extract vector element, subvector,
> >>> - insert vector element, subvector,
> >>> - shuffle.
> >>>
> >>> Added:
> >>> llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp
> >>> llvm/trunk/lib/Target/Hexagon/HexagonISelLoweringHVX.cpp
> >>> llvm/trunk/test/CodeGen/Hexagon/autohvx/
> >>> llvm/trunk/test/CodeGen/Hexagon/autohvx/align-128b.ll
> >>> llvm/trunk/test/CodeGen/Hexagon/autohvx/align-64b.ll
> >>> llvm/trunk/test/CodeGen/Hexagon/autohvx/contract-128b.ll
> >>> llvm/trunk/test/CodeGen/Hexagon/autohvx/contract-64b.ll
> >>> llvm/trunk/test/CodeGen/Hexagon/autohvx/deal-128b.ll
> >>> llvm/trunk/test/CodeGen/Hexagon/autohvx/deal-64b.ll
> >>> llvm/trunk/test/CodeGen/Hexagon/autohvx/delta-128b.ll
> >>> llvm/trunk/test/CodeGen/Hexagon/autohvx/delta-64b.ll
> >>> llvm/trunk/test/CodeGen/Hexagon/autohvx/delta2-64b.ll
> >>> llvm/trunk/test/CodeGen/Hexagon/autohvx/extract-element.ll
> >>> llvm/trunk/test/CodeGen/Hexagon/autohvx/reg-sequence.ll
> >>> llvm/trunk/test/CodeGen/Hexagon/autohvx/shuff-128b.ll
> >>> llvm/trunk/test/CodeGen/Hexagon/autohvx/shuff-64b.ll
> >>> llvm/trunk/test/CodeGen/Hexagon/autohvx/shuff-combos-128b.ll
> >>> llvm/trunk/test/CodeGen/Hexagon/autohvx/shuff-combos-64b.ll
> >>> Modified:
> >>> llvm/trunk/lib/Target/Hexagon/CMakeLists.txt
> >>> llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.cpp
> >>> llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.h
> >>> llvm/trunk/lib/Target/Hexagon/HexagonISelLowering.cpp
> >>> llvm/trunk/lib/Target/Hexagon/HexagonISelLowering.h
> >>> llvm/trunk/lib/Target/Hexagon/HexagonPatterns.td
> >>>
> >>> Modified: llvm/trunk/lib/Target/Hexagon/CMakeLists.txt
> >>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/CMakeLists.txt?rev=319901&r1=319900&r2=319901&view=diff
> >>> ==============================================================================
> >>> --- llvm/trunk/lib/Target/Hexagon/CMakeLists.txt (original)
> >>> +++ llvm/trunk/lib/Target/Hexagon/CMakeLists.txt Wed Dec 6 08:40:37 2017
> >>> @@ -35,7 +35,9 @@ add_llvm_target(HexagonCodeGen
> >>> HexagonHazardRecognizer.cpp
> >>> HexagonInstrInfo.cpp
> >>> HexagonISelDAGToDAG.cpp
> >>> + HexagonISelDAGToDAGHVX.cpp
> >>> HexagonISelLowering.cpp
> >>> + HexagonISelLoweringHVX.cpp
> >>> HexagonLoopIdiomRecognition.cpp
> >>> HexagonMachineFunctionInfo.cpp
> >>> HexagonMachineScheduler.cpp
> >>>
> >>> Modified: llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.cpp
> >>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.cpp?rev=319901&r1=319900&r2=319901&view=diff
> >>> ==============================================================================
> >>> --- llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.cpp (original)
> >>> +++ llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.cpp Wed Dec 6 08:40:37 2017
> >>> @@ -754,7 +754,6 @@ void HexagonDAGToDAGISel::SelectBitcast(
> >>> CurDAG->RemoveDeadNode(N);
> >>> }
> >>>
> >>> -
> >>> void HexagonDAGToDAGISel::Select(SDNode *N) {
> >>> if (N->isMachineOpcode())
> >>> return N->setNodeId(-1); // Already selected.
> >>> @@ -772,6 +771,13 @@ void HexagonDAGToDAGISel::Select(SDNode
> >>> case ISD::INTRINSIC_WO_CHAIN: return SelectIntrinsicWOChain(N);
> >>> }
> >>>
> >>> + if (HST->useHVXOps()) {
> >>> + switch (N->getOpcode()) {
> >>> + case ISD::VECTOR_SHUFFLE: return SelectHvxShuffle(N);
> >>> + case HexagonISD::VROR: return SelectHvxRor(N);
> >>> + }
> >>> + }
> >>> +
> >>> SelectCode(N);
> >>> }
> >>>
> >>>
> >>> Modified: llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.h
> >>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.h?rev=319901&r1=319900&r2=319901&view=diff
> >>> ==============================================================================
> >>> --- llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.h (original)
> >>> +++ llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.h Wed Dec 6 08:40:37 2017
> >>> @@ -26,6 +26,7 @@ namespace llvm {
> >>> class MachineFunction;
> >>> class HexagonInstrInfo;
> >>> class HexagonRegisterInfo;
> >>> +class HexagonTargetLowering;
> >>>
> >>> class HexagonDAGToDAGISel : public SelectionDAGISel {
> >>> const HexagonSubtarget *HST;
> >>> @@ -100,13 +101,25 @@ public:
> >>> void SelectConstant(SDNode *N);
> >>> void SelectConstantFP(SDNode *N);
> >>> void SelectBitcast(SDNode *N);
> >>> - void SelectVectorShuffle(SDNode *N);
> >>>
> >>> - // Include the pieces autogenerated from the target description.
> >>> + // Include the declarations autogenerated from the selection patterns.
> >>> #define GET_DAGISEL_DECL
> >>> #include "HexagonGenDAGISel.inc"
> >>>
> >>> private:
> >>> + // This is really only to get access to ReplaceNode (which is a protected
> >>> + // member). Any other members used by HvxSelector can be moved around to
> >>> + // make them accessible.
> >>> + friend struct HvxSelector;
> >>> +
> >>> + SDValue selectUndef(const SDLoc &dl, MVT ResTy) {
> >>> + SDNode *U = CurDAG->getMachineNode(TargetOpcode::IMPLICIT_DEF, dl, ResTy);
> >>> + return SDValue(U, 0);
> >>> + }
> >>> +
> >>> + void SelectHvxShuffle(SDNode *N);
> >>> + void SelectHvxRor(SDNode *N);
> >>> +
> >>> bool keepsLowBits(const SDValue &Val, unsigned NumBits, SDValue &Src);
> >>> bool isOrEquivalentToAdd(const SDNode *N) const;
> >>> bool isAlignedMemNode(const MemSDNode *N) const;
> >>>
> >>> Added: llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp
> >>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp?rev=319901&view=auto
> >>> ==============================================================================
> >>> --- llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp (added)
> >>> +++ llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp Wed Dec 6 08:40:37 2017
> >>> @@ -0,0 +1,1924 @@
> >>> +//===-- HexagonISelDAGToDAGHVX.cpp ----------------------------------------===//
> >>> +//
> >>> +// The LLVM Compiler Infrastructure
> >>> +//
> >>> +// This file is distributed under the University of Illinois Open Source
> >>> +// License. See LICENSE.TXT for details.
> >>> +//
> >>> +//===----------------------------------------------------------------------===//
> >>> +
> >>> +#include "Hexagon.h"
> >>> +#include "HexagonISelDAGToDAG.h"
> >>> +#include "HexagonISelLowering.h"
> >>> +#include "HexagonTargetMachine.h"
> >>> +#include "llvm/CodeGen/MachineInstrBuilder.h"
> >>> +#include "llvm/CodeGen/SelectionDAGISel.h"
> >>> +#include "llvm/IR/Intrinsics.h"
> >>> +#include "llvm/Support/CommandLine.h"
> >>> +#include "llvm/Support/Debug.h"
> >>> +
> >>> +#include <deque>
> >>> +#include <map>
> >>> +#include <set>
> >>> +#include <utility>
> >>> +#include <vector>
> >>> +
> >>> +#define DEBUG_TYPE "hexagon-isel"
> >>> +
> >>> +using namespace llvm;
> >>> +
> >>> +// --------------------------------------------------------------------
> >>> +// Implementation of permutation networks.
> >>> +
> >>> +// Implementation of the node routing through butterfly networks:
> >>> +// - Forward delta.
> >>> +// - Reverse delta.
> >>> +// - Benes.
> >>> +//
> >>> +//
> >>> +// Forward delta network consists of log(N) steps, where N is the number
> >>> +// of inputs. In each step, an input can stay in place, or it can get
> >>> +// routed to another position[1]. The step after that consists of two
> >>> +// networks, each half in size in terms of the number of nodes. In those
> >>> +// terms, in the given step, an input can go to either the upper or the
> >>> +// lower network in the next step.
> >>> +//
> >>> +// [1] Hexagon's vdelta/vrdelta allow an element to be routed to both
> >>> +// positions as long as there is no conflict.
> >>> +
> >>> +// Here's a delta network for 8 inputs, only the switching routes are
> >>> +// shown:
> >>> +//
> >>> +// Steps:
> >>> +// |- 1 ---------------|- 2 -----|- 3 -|
> >>> +//
> >>> +// Inp[0] *** *** *** *** Out[0]
> >>> +// \ / \ / \ /
> >>> +// \ / \ / X
> >>> +// \ / \ / / \
> >>> +// Inp[1] *** \ / *** X *** *** Out[1]
> >>> +// \ \ / / \ / \ /
> >>> +// \ \ / / X X
> >>> +// \ \ / / / \ / \
> >>> +// Inp[2] *** \ \ / / *** X *** *** Out[2]
> >>> +// \ \ X / / / \ \ /
> >>> +// \ \ / \ / / / \ X
> >>> +// \ X X / / \ / \
> >>> +// Inp[3] *** \ / \ / \ / *** *** *** Out[3]
> >>> +// \ X X X /
> >>> +// \ / \ / \ / \ /
> >>> +// X X X X
> >>> +// / \ / \ / \ / \
> >>> +// / X X X \
> >>> +// Inp[4] *** / \ / \ / \ *** *** *** Out[4]
> >>> +// / X X \ \ / \ /
> >>> +// / / \ / \ \ \ / X
> >>> +// / / X \ \ \ / / \
> >>> +// Inp[5] *** / / \ \ *** X *** *** Out[5]
> >>> +// / / \ \ \ / \ /
> >>> +// / / \ \ X X
> >>> +// / / \ \ / \ / \
> >>> +// Inp[6] *** / \ *** X *** *** Out[6]
> >>> +// / \ / \ \ /
> >>> +// / \ / \ X
> >>> +// / \ / \ / \
> >>> +// Inp[7] *** *** *** *** Out[7]
> >>> +//
> >>> +//
> >>> +// Reverse delta network is same as delta network, with the steps in
> >>> +// the opposite order.
> >>> +//
> >>> +//
> >>> +// Benes network is a forward delta network immediately followed by
> >>> +// a reverse delta network.
> >>> +
> >>> +
> >>> +// Graph coloring utility used to partition nodes into two groups:
> >>> +// they will correspond to nodes routed to the upper and lower networks.
> >>> +struct Coloring {
> >>> + enum : uint8_t {
> >>> + None = 0,
> >>> + Red,
> >>> + Black
> >>> + };
> >>> +
> >>> + using Node = int;
> >>> + using MapType = std::map<Node,uint8_t>;
> >>> + static constexpr Node Ignore = Node(-1);
> >>> +
> >>> + Coloring(ArrayRef<Node> Ord) : Order(Ord) {
> >>> + build();
> >>> + if (!color())
> >>> + Colors.clear();
> >>> + }
> >>> +
> >>> + const MapType &colors() const {
> >>> + return Colors;
> >>> + }
> >>> +
> >>> + uint8_t other(uint8_t Color) {
> >>> + if (Color == None)
> >>> + return Red;
> >>> + return Color == Red ? Black : Red;
> >>> + }
> >>> +
> >>> + void dump() const;
> >>> +
> >>> +private:
> >>> + ArrayRef<Node> Order;
> >>> + MapType Colors;
> >>> + std::set<Node> Needed;
> >>> +
> >>> + using NodeSet = std::set<Node>;
> >>> + std::map<Node,NodeSet> Edges;
> >>> +
> >>> + Node conj(Node Pos) {
> >>> + Node Num = Order.size();
> >>> + return (Pos < Num/2) ? Pos + Num/2 : Pos - Num/2;
> >>> + }
> >>> +
> >>> + uint8_t getColor(Node N) {
> >>> + auto F = Colors.find(N);
> >>> + return F != Colors.end() ? F->second : None;
> >>> + }
> >>> +
> >>> + std::pair<bool,uint8_t> getUniqueColor(const NodeSet &Nodes);
> >>> +
> >>> + void build();
> >>> + bool color();
> >>> +};
> >>> +
> >>> +std::pair<bool,uint8_t> Coloring::getUniqueColor(const NodeSet &Nodes) {
> >>> + uint8_t Color = None;
> >>> + for (Node N : Nodes) {
> >>> + uint8_t ColorN = getColor(N);
> >>> + if (ColorN == None)
> >>> + continue;
> >>> + if (Color == None)
> >>> + Color = ColorN;
> >>> + else if (Color != None && Color != ColorN)
> >>> + return { false, None };
> >>> + }
> >>> + return { true, Color };
> >>> +}
> >>> +
> >>> +void Coloring::build() {
> >>> + // Add Order[P] and Order[conj(P)] to Edges.
> >>> + for (unsigned P = 0; P != Order.size(); ++P) {
> >>> + Node I = Order[P];
> >>> + if (I != Ignore) {
> >>> + Needed.insert(I);
> >>> + Node PC = Order[conj(P)];
> >>> + if (PC != Ignore && PC != I)
> >>> + Edges[I].insert(PC);
> >>> + }
> >>> + }
> >>> + // Add I and conj(I) to Edges.
> >>> + for (unsigned I = 0; I != Order.size(); ++I) {
> >>> + if (!Needed.count(I))
> >>> + continue;
> >>> + Node C = conj(I);
> >>> + // This will create an entry in the edge table, even if I is not
> >>> + // connected to any other node. This is necessary, because it still
> >>> + // needs to be colored.
> >>> + NodeSet &Is = Edges[I];
> >>> + if (Needed.count(C))
> >>> + Is.insert(C);
> >>> + }
> >>> +}
> >>> +
> >>> +bool Coloring::color() {
> >>> + SetVector<Node> FirstQ;
> >>> + auto Enqueue = [this,&FirstQ] (Node N) {
> >>> + SetVector<Node> Q;
> >>> + Q.insert(N);
> >>> + for (unsigned I = 0; I != Q.size(); ++I) {
> >>> + NodeSet &Ns = Edges[Q[I]];
> >>> + Q.insert(Ns.begin(), Ns.end());
> >>> + }
> >>> + FirstQ.insert(Q.begin(), Q.end());
> >>> + };
> >>> + for (Node N : Needed)
> >>> + Enqueue(N);
> >>> +
> >>> + for (Node N : FirstQ) {
> >>> + if (Colors.count(N))
> >>> + continue;
> >>> + NodeSet &Ns = Edges[N];
> >>> + auto P = getUniqueColor(Ns);
> >>> + if (!P.first)
> >>> + return false;
> >>> + Colors[N] = other(P.second);
> >>> + }
> >>> +
> >>> + // First, color nodes that don't have any dups.
> >>> + for (auto E : Edges) {
> >>> + Node N = E.first;
> >>> + if (!Needed.count(conj(N)) || Colors.count(N))
> >>> + continue;
> >>> + auto P = getUniqueColor(E.second);
> >>> + if (!P.first)
> >>> + return false;
> >>> + Colors[N] = other(P.second);
> >>> + }
> >>> +
> >>> + // Now, nodes that are still uncolored. Since the graph can be modified
> >>> + // in this step, create a work queue.
> >>> + std::vector<Node> WorkQ;
> >>> + for (auto E : Edges) {
> >>> + Node N = E.first;
> >>> + if (!Colors.count(N))
> >>> + WorkQ.push_back(N);
> >>> + }
> >>> +
> >>> + for (unsigned I = 0; I < WorkQ.size(); ++I) {
> >>> + Node N = WorkQ[I];
> >>> + NodeSet &Ns = Edges[N];
> >>> + auto P = getUniqueColor(Ns);
> >>> + if (P.first) {
> >>> + Colors[N] = other(P.second);
> >>> + continue;
> >>> + }
> >>> +
> >>> + // Coloring failed. Split this node.
> >>> + Node C = conj(N);
> >>> + uint8_t ColorN = other(None);
> >>> + uint8_t ColorC = other(ColorN);
> >>> + NodeSet &Cs = Edges[C];
> >>> + NodeSet CopyNs = Ns;
> >>> + for (Node M : CopyNs) {
> >>> + uint8_t ColorM = getColor(M);
> >>> + if (ColorM == ColorC) {
> >>> + // Connect M with C, disconnect M from N.
> >>> + Cs.insert(M);
> >>> + Edges[M].insert(C);
> >>> + Ns.erase(M);
> >>> + Edges[M].erase(N);
> >>> + }
> >>> + }
> >>> + Colors[N] = ColorN;
> >>> + Colors[C] = ColorC;
> >>> + }
> >>> +
> >>> + // Explicitly assign "None" to all uncolored nodes.
> >>> + for (unsigned I = 0; I != Order.size(); ++I)
> >>> + if (Colors.count(I) == 0)
> >>> + Colors[I] = None;
> >>> +
> >>> + return true;
> >>> +}
> >>> +
> >>> +LLVM_DUMP_METHOD
> >>> +void Coloring::dump() const {
> >>> + dbgs() << "{ Order: {";
> >>> + for (unsigned I = 0; I != Order.size(); ++I) {
> >>> + Node P = Order[I];
> >>> + if (P != Ignore)
> >>> + dbgs() << ' ' << P;
> >>> + else
> >>> + dbgs() << " -";
> >>> + }
> >>> + dbgs() << " }\n";
> >>> + dbgs() << " Needed: {";
> >>> + for (Node N : Needed)
> >>> + dbgs() << ' ' << N;
> >>> + dbgs() << " }\n";
> >>> +
> >>> + dbgs() << " Edges: {\n";
> >>> + for (auto E : Edges) {
> >>> + dbgs() << " " << E.first << " -> {";
> >>> + for (auto N : E.second)
> >>> + dbgs() << ' ' << N;
> >>> + dbgs() << " }\n";
> >>> + }
> >>> + dbgs() << " }\n";
> >>> +
> >>> + static const char *const Names[] = { "None", "Red", "Black" };
> >>> + dbgs() << " Colors: {\n";
> >>> + for (auto C : Colors)
> >>> + dbgs() << " " << C.first << " -> " << Names[C.second] << "\n";
> >>> + dbgs() << " }\n}\n";
> >>> +}
> >>> +
> >>> +// Base class for reordering networks. They don't strictly need to be
> >>> +// permutations, as outputs with repeated occurrences of an input element
> >>> +// are allowed.
> >>> +struct PermNetwork {
> >>> + using Controls = std::vector<uint8_t>;
> >>> + using ElemType = int;
> >>> + static constexpr ElemType Ignore = ElemType(-1);
> >>> +
> >>> + enum : uint8_t {
> >>> + None,
> >>> + Pass,
> >>> + Switch
> >>> + };
> >>> + enum : uint8_t {
> >>> + Forward,
> >>> + Reverse
> >>> + };
> >>> +
> >>> + PermNetwork(ArrayRef<ElemType> Ord, unsigned Mult = 1) {
> >>> + Order.assign(Ord.data(), Ord.data()+Ord.size());
> >>> + Log = 0;
> >>> +
> >>> + unsigned S = Order.size();
> >>> + while (S >>= 1)
> >>> + ++Log;
> >>> +
> >>> + Table.resize(Order.size());
> >>> + for (RowType &Row : Table)
> >>> + Row.resize(Mult*Log, None);
> >>> + }
> >>> +
> >>> + void getControls(Controls &V, unsigned StartAt, uint8_t Dir) const {
> >>> + unsigned Size = Order.size();
> >>> + V.resize(Size);
> >>> + for (unsigned I = 0; I != Size; ++I) {
> >>> + unsigned W = 0;
> >>> + for (unsigned L = 0; L != Log; ++L) {
> >>> + unsigned C = ctl(I, StartAt+L) == Switch;
> >>> + if (Dir == Forward)
> >>> + W |= C << (Log-1-L);
> >>> + else
> >>> + W |= C << L;
> >>> + }
> >>> + assert(isUInt<8>(W));
> >>> + V[I] = uint8_t(W);
> >>> + }
> >>> + }
> >>> +
> >>> + uint8_t ctl(ElemType Pos, unsigned Step) const {
> >>> + return Table[Pos][Step];
> >>> + }
> >>> + unsigned size() const {
> >>> + return Order.size();
> >>> + }
> >>> + unsigned steps() const {
> >>> + return Log;
> >>> + }
> >>> +
> >>> +protected:
> >>> + unsigned Log;
> >>> + std::vector<ElemType> Order;
> >>> + using RowType = std::vector<uint8_t>;
> >>> + std::vector<RowType> Table;
> >>> +};
> >>> +
> >>> +struct ForwardDeltaNetwork : public PermNetwork {
> >>> + ForwardDeltaNetwork(ArrayRef<ElemType> Ord) : PermNetwork(Ord) {}
> >>> +
> >>> + bool run(Controls &V) {
> >>> + if (!route(Order.data(), Table.data(), size(), 0))
> >>> + return false;
> >>> + getControls(V, 0, Forward);
> >>> + return true;
> >>> + }
> >>> +
> >>> +private:
> >>> + bool route(ElemType *P, RowType *T, unsigned Size, unsigned Step);
> >>> +};
> >>> +
> >>> +struct ReverseDeltaNetwork : public PermNetwork {
> >>> + ReverseDeltaNetwork(ArrayRef<ElemType> Ord) : PermNetwork(Ord) {}
> >>> +
> >>> + bool run(Controls &V) {
> >>> + if (!route(Order.data(), Table.data(), size(), 0))
> >>> + return false;
> >>> + getControls(V, 0, Reverse);
> >>> + return true;
> >>> + }
> >>> +
> >>> +private:
> >>> + bool route(ElemType *P, RowType *T, unsigned Size, unsigned Step);
> >>> +};
> >>> +
> >>> +struct BenesNetwork : public PermNetwork {
> >>> + BenesNetwork(ArrayRef<ElemType> Ord) : PermNetwork(Ord, 2) {}
> >>> +
> >>> + bool run(Controls &F, Controls &R) {
> >>> + if (!route(Order.data(), Table.data(), size(), 0))
> >>> + return false;
> >>> +
> >>> + getControls(F, 0, Forward);
> >>> + getControls(R, Log, Reverse);
> >>> + return true;
> >>> + }
> >>> +
> >>> +private:
> >>> + bool route(ElemType *P, RowType *T, unsigned Size, unsigned Step);
> >>> +};
> >>> +
> >>> +
> >>> +bool ForwardDeltaNetwork::route(ElemType *P, RowType *T, unsigned Size,
> >>> + unsigned Step) {
> >>> + bool UseUp = false, UseDown = false;
> >>> + ElemType Num = Size;
> >>> +
> >>> + // Cannot use coloring here, because coloring is used to determine
> >>> + // the "big" switch, i.e. the one that changes halves, and in a forward
> >>> + // network, a color can be simultaneously routed to both halves in the
> >>> + // step we're working on.
> >>> + for (ElemType J = 0; J != Num; ++J) {
> >>> + ElemType I = P[J];
> >>> + // I is the position in the input,
> >>> + // J is the position in the output.
> >>> + if (I == Ignore)
> >>> + continue;
> >>> + uint8_t S;
> >>> + if (I < Num/2)
> >>> + S = (J < Num/2) ? Pass : Switch;
> >>> + else
> >>> + S = (J < Num/2) ? Switch : Pass;
> >>> +
> >>> + // U is the element in the table that needs to be updated.
> >>> + ElemType U = (S == Pass) ? I : (I < Num/2 ? I+Num/2 : I-Num/2);
> >>> + if (U < Num/2)
> >>> + UseUp = true;
> >>> + else
> >>> + UseDown = true;
> >>> + if (T[U][Step] != S && T[U][Step] != None)
> >>> + return false;
> >>> + T[U][Step] = S;
> >>> + }
> >>> +
> >>> + for (ElemType J = 0; J != Num; ++J)
> >>> + if (P[J] != Ignore && P[J] >= Num/2)
> >>> + P[J] -= Num/2;
> >>> +
> >>> + if (Step+1 < Log) {
> >>> + if (UseUp && !route(P, T, Size/2, Step+1))
> >>> + return false;
> >>> + if (UseDown && !route(P+Size/2, T+Size/2, Size/2, Step+1))
> >>> + return false;
> >>> + }
> >>> + return true;
> >>> +}
> >>> +
> >>> +bool ReverseDeltaNetwork::route(ElemType *P, RowType *T, unsigned Size,
> >>> + unsigned Step) {
> >>> + unsigned Pets = Log-1 - Step;
> >>> + bool UseUp = false, UseDown = false;
> >>> + ElemType Num = Size;
> >>> +
> >>> + // In this step half-switching occurs, so coloring can be used.
> >>> + Coloring G({P,Size});
> >>> + const Coloring::MapType &M = G.colors();
> >>> + if (M.empty())
> >>> + return false;
> >>> +
> >>> + uint8_t ColorUp = Coloring::None;
> >>> + for (ElemType J = 0; J != Num; ++J) {
> >>> + ElemType I = P[J];
> >>> + // I is the position in the input,
> >>> + // J is the position in the output.
> >>> + if (I == Ignore)
> >>> + continue;
> >>> + uint8_t C = M.at(I);
> >>> + if (C == Coloring::None)
> >>> + continue;
> >>> + // During "Step", inputs cannot switch halves, so if the "up" color
> >>> + // is still unknown, make sure that it is selected in such a way that
> >>> + // "I" will stay in the same half.
> >>> + bool InpUp = I < Num/2;
> >>> + if (ColorUp == Coloring::None)
> >>> + ColorUp = InpUp ? C : G.other(C);
> >>> + if ((C == ColorUp) != InpUp) {
> >>> + // If I should go to a different half than where it is now, give up.
> >>> + return false;
> >>> + }
> >>> +
> >>> + uint8_t S;
> >>> + if (InpUp) {
> >>> + S = (J < Num/2) ? Pass : Switch;
> >>> + UseUp = true;
> >>> + } else {
> >>> + S = (J < Num/2) ? Switch : Pass;
> >>> + UseDown = true;
> >>> + }
> >>> + T[J][Pets] = S;
> >>> + }
> >>> +
> >>> + // Reorder the working permutation according to the computed switch table
> >>> + // for the last step (i.e. Pets).
> >>> + for (ElemType J = 0; J != Size/2; ++J) {
> >>> + ElemType PJ = P[J]; // Current values of P[J]
> >>> + ElemType PC = P[J+Size/2]; // and P[conj(J)]
> >>> + ElemType QJ = PJ; // New values of P[J]
> >>> + ElemType QC = PC; // and P[conj(J)]
> >>> + if (T[J][Pets] == Switch)
> >>> + QC = PJ;
> >>> + if (T[J+Size/2][Pets] == Switch)
> >>> + QJ = PC;
> >>> + P[J] = QJ;
> >>> + P[J+Size/2] = QC;
> >>> + }
> >>> +
> >>> + for (ElemType J = 0; J != Num; ++J)
> >>> + if (P[J] != Ignore && P[J] >= Num/2)
> >>> + P[J] -= Num/2;
> >>> +
> >>> + if (Step+1 < Log) {
> >>> + if (UseUp && !route(P, T, Size/2, Step+1))
> >>> + return false;
> >>> + if (UseDown && !route(P+Size/2, T+Size/2, Size/2, Step+1))
> >>> + return false;
> >>> + }
> >>> + return true;
> >>> +}
> >>> +
> >>> +bool BenesNetwork::route(ElemType *P, RowType *T, unsigned Size,
> >>> + unsigned Step) {
> >>> + Coloring G({P,Size});
> >>> + const Coloring::MapType &M = G.colors();
> >>> + if (M.empty())
> >>> + return false;
> >>> + ElemType Num = Size;
> >>> +
> >>> + unsigned Pets = 2*Log-1 - Step;
> >>> + bool UseUp = false, UseDown = false;
> >>> +
> >>> + // Both assignments, i.e. Red->Up and Red->Down are valid, but they will
> >>> + // result in different controls. Let's pick the one where the first
> >>> + // control will be "Pass".
> >>> + uint8_t ColorUp = Coloring::None;
> >>> + for (ElemType J = 0; J != Num; ++J) {
> >>> + ElemType I = P[J];
> >>> + if (I == Ignore)
> >>> + continue;
> >>> + uint8_t C = M.at(I);
> >>> + if (C == Coloring::None)
> >>> + continue;
> >>> + if (ColorUp == Coloring::None) {
> >>> + ColorUp = (I < Num/2) ? Coloring::Red : Coloring::Black;
> >>> + }
> >>> + unsigned CI = (I < Num/2) ? I+Num/2 : I-Num/2;
> >>> + if (C == ColorUp) {
> >>> + if (I < Num/2)
> >>> + T[I][Step] = Pass;
> >>> + else
> >>> + T[CI][Step] = Switch;
> >>> + T[J][Pets] = (J < Num/2) ? Pass : Switch;
> >>> + UseUp = true;
> >>> + } else { // Down
> >>> + if (I < Num/2)
> >>> + T[CI][Step] = Switch;
> >>> + else
> >>> + T[I][Step] = Pass;
> >>> + T[J][Pets] = (J < Num/2) ? Switch : Pass;
> >>> + UseDown = true;
> >>> + }
> >>> + }
> >>> +
> >>> + // Reorder the working permutation according to the computed switch table
> >>> + // for the last step (i.e. Pets).
> >>> + for (ElemType J = 0; J != Num/2; ++J) {
> >>> + ElemType PJ = P[J]; // Current values of P[J]
> >>> + ElemType PC = P[J+Num/2]; // and P[conj(J)]
> >>> + ElemType QJ = PJ; // New values of P[J]
> >>> + ElemType QC = PC; // and P[conj(J)]
> >>> + if (T[J][Pets] == Switch)
> >>> + QC = PJ;
> >>> + if (T[J+Num/2][Pets] == Switch)
> >>> + QJ = PC;
> >>> + P[J] = QJ;
> >>> + P[J+Num/2] = QC;
> >>> + }
> >>> +
> >>> + for (ElemType J = 0; J != Num; ++J)
> >>> + if (P[J] != Ignore && P[J] >= Num/2)
> >>> + P[J] -= Num/2;
> >>> +
> >>> + if (Step+1 < Log) {
> >>> + if (UseUp && !route(P, T, Size/2, Step+1))
> >>> + return false;
> >>> + if (UseDown && !route(P+Size/2, T+Size/2, Size/2, Step+1))
> >>> + return false;
> >>> + }
> >>> + return true;
> >>> +}
> >>> +
> >>> +// --------------------------------------------------------------------
> >>> +// Support for building selection results (output instructions that are
> >>> +// parts of the final selection).
> >>> +
> >>> +struct OpRef {
> >>> + OpRef(SDValue V) : OpV(V) {}
> >>> + bool isValue() const { return OpV.getNode() != nullptr; }
> >>> + bool isValid() const { return isValue() || !(OpN & Invalid); }
> >>> + static OpRef res(int N) { return OpRef(Whole | (N & Index)); }
> >>> + static OpRef fail() { return OpRef(Invalid); }
> >>> +
> >>> + static OpRef lo(const OpRef &R) {
> >>> + assert(!R.isValue());
> >>> + return OpRef(R.OpN & (Undef | Index | LoHalf));
> >>> + }
> >>> + static OpRef hi(const OpRef &R) {
> >>> + assert(!R.isValue());
> >>> + return OpRef(R.OpN & (Undef | Index | HiHalf));
> >>> + }
> >>> + static OpRef undef(MVT Ty) { return OpRef(Undef | Ty.SimpleTy); }
> >>> +
> >>> + // Direct value.
> >>> + SDValue OpV = SDValue();
> >>> +
> >>> + // Reference to the operand of the input node:
> >>> + // If the 31st bit is 1, it's undef, otherwise, bits 27..0 are the
> >>> + // operand index:
> >>> + // If bit 30 is set, it's the high half of the operand.
> >>> + // If bit 29 is set, it's the low half of the operand.
> >>> + unsigned OpN = 0;
> >>> +
> >>> + enum : unsigned {
> >>> + Invalid = 0x10000000,
> >>> + LoHalf = 0x20000000,
> >>> + HiHalf = 0x40000000,
> >>> + Whole = LoHalf | HiHalf,
> >>> + Undef = 0x80000000,
> >>> + Index = 0x0FFFFFFF, // Mask of the index value.
> >>> + IndexBits = 28,
> >>> + };
> >>> +
> >>> + void print(raw_ostream &OS, const SelectionDAG &G) const;
> >>> +
> >>> +private:
> >>> + OpRef(unsigned N) : OpN(N) {}
> >>> +};
> >>> +
> >>> +struct NodeTemplate {
> >>> + NodeTemplate() = default;
> >>> + unsigned Opc = 0;
> >>> + MVT Ty = MVT::Other;
> >>> + std::vector<OpRef> Ops;
> >>> +
> >>> + void print(raw_ostream &OS, const SelectionDAG &G) const;
> >>> +};
> >>> +
> >>> +struct ResultStack {
> >>> + ResultStack(SDNode *Inp)
> >>> + : InpNode(Inp), InpTy(Inp->getValueType(0).getSimpleVT()) {}
> >>> + SDNode *InpNode;
> >>> + MVT InpTy;
> >>> + unsigned push(const NodeTemplate &Res) {
> >>> + List.push_back(Res);
> >>> + return List.size()-1;
> >>> + }
> >>> + unsigned push(unsigned Opc, MVT Ty, std::vector<OpRef> &&Ops) {
> >>> + NodeTemplate Res;
> >>> + Res.Opc = Opc;
> >>> + Res.Ty = Ty;
> >>> + Res.Ops = Ops;
> >>> + return push(Res);
> >>> + }
> >>> + bool empty() const { return List.empty(); }
> >>> + unsigned size() const { return List.size(); }
> >>> + unsigned top() const { return size()-1; }
> >>> + const NodeTemplate &operator[](unsigned I) const { return List[I]; }
> >>> + unsigned reset(unsigned NewTop) {
> >>> + List.resize(NewTop+1);
> >>> + return NewTop;
> >>> + }
> >>> +
> >>> + using BaseType = std::vector<NodeTemplate>;
> >>> + BaseType::iterator begin() { return List.begin(); }
> >>> + BaseType::iterator end() { return List.end(); }
> >>> + BaseType::const_iterator begin() const { return List.begin(); }
> >>> + BaseType::const_iterator end() const { return List.end(); }
> >>> +
> >>> + BaseType List;
> >>> +
> >>> + void print(raw_ostream &OS, const SelectionDAG &G) const;
> >>> +};
> >>> +
> >>> +void OpRef::print(raw_ostream &OS, const SelectionDAG &G) const {
> >>> + if (isValue()) {
> >>> + OpV.getNode()->print(OS, &G);
> >>> + return;
> >>> + }
> >>> + if (OpN & Invalid) {
> >>> + OS << "invalid";
> >>> + return;
> >>> + }
> >>> + if (OpN & Undef) {
> >>> + OS << "undef";
> >>> + return;
> >>> + }
> >>> + if ((OpN & Whole) != Whole) {
> >>> + assert((OpN & Whole) == LoHalf || (OpN & Whole) == HiHalf);
> >>> + if (OpN & LoHalf)
> >>> + OS << "lo ";
> >>> + else
> >>> + OS << "hi ";
> >>> + }
> >>> + OS << '#' << SignExtend32(OpN & Index, IndexBits);
> >>> +}
> >>> +
> >>> +void NodeTemplate::print(raw_ostream &OS, const SelectionDAG &G) const {
> >>> + const TargetInstrInfo &TII = *G.getSubtarget().getInstrInfo();
> >>> + OS << format("%8s", EVT(Ty).getEVTString().c_str()) << " "
> >>> + << TII.getName(Opc);
> >>> + bool Comma = false;
> >>> + for (const auto &R : Ops) {
> >>> + if (Comma)
> >>> + OS << ',';
> >>> + Comma = true;
> >>> + OS << ' ';
> >>> + R.print(OS, G);
> >>> + }
> >>> +}
> >>> +
> >>> +void ResultStack::print(raw_ostream &OS, const SelectionDAG &G) const {
> >>> + OS << "Input node:\n";
> >>> + InpNode->dumpr(&G);
> >>> + OS << "Result templates:\n";
> >>> + for (unsigned I = 0, E = List.size(); I != E; ++I) {
> >>> + OS << '[' << I << "] ";
> >>> + List[I].print(OS, G);
> >>> + OS << '\n';
> >>> + }
> >>> +}
> >>> +
> >>> +struct ShuffleMask {
> >>> + ShuffleMask(ArrayRef<int> M) : Mask(M) {
> >>> + for (unsigned I = 0, E = Mask.size(); I != E; ++I) {
> >>> + int M = Mask[I];
> >>> + if (M == -1)
> >>> + continue;
> >>> + MinSrc = (MinSrc == -1) ? M : std::min(MinSrc, M);
> >>> + MaxSrc = (MaxSrc == -1) ? M : std::max(MaxSrc, M);
> >>> + }
> >>> + }
> >>> +
> >>> + ArrayRef<int> Mask;
> >>> + int MinSrc = -1, MaxSrc = -1;
> >>> +
> >>> + ShuffleMask lo() const {
> >>> + size_t H = Mask.size()/2;
> >>> + return ShuffleMask({Mask.data(), H});
> >>> + }
> >>> + ShuffleMask hi() const {
> >>> + size_t H = Mask.size()/2;
> >>> + return ShuffleMask({Mask.data()+H, H});
> >>> + }
> >>> +};
> >>> +
> >>> +// --------------------------------------------------------------------
> >>> +// The HvxSelector class.
> >>> +
> >>> +static const HexagonTargetLowering &getHexagonLowering(SelectionDAG &G) {
> >>> + return static_cast<const HexagonTargetLowering&>(G.getTargetLoweringInfo());
> >>> +}
> >>> +static const HexagonSubtarget &getHexagonSubtarget(SelectionDAG &G) {
> >>> + return static_cast<const HexagonSubtarget&>(G.getSubtarget());
> >>> +}
> >>> +
> >>> +namespace llvm {
> >>> + struct HvxSelector {
> >>> + const HexagonTargetLowering &Lower;
> >>> + HexagonDAGToDAGISel &ISel;
> >>> + SelectionDAG &DAG;
> >>> + const HexagonSubtarget &HST;
> >>> + const unsigned HwLen;
> >>> +
> >>> + HvxSelector(HexagonDAGToDAGISel &HS, SelectionDAG &G)
> >>> + : Lower(getHexagonLowering(G)), ISel(HS), DAG(G),
> >>> + HST(getHexagonSubtarget(G)), HwLen(HST.getVectorLength()) {}
> >>> +
> >>> + MVT getSingleVT(MVT ElemTy) const {
> >>> + unsigned NumElems = HwLen / (ElemTy.getSizeInBits()/8);
> >>> + return MVT::getVectorVT(ElemTy, NumElems);
> >>> + }
> >>> +
> >>> + MVT getPairVT(MVT ElemTy) const {
> >>> + unsigned NumElems = (2*HwLen) / (ElemTy.getSizeInBits()/8);
> >>> + return MVT::getVectorVT(ElemTy, NumElems);
> >>> + }
> >>> +
> >>> + void selectShuffle(SDNode *N);
> >>> + void selectRor(SDNode *N);
> >>> +
> >>> + private:
> >>> + void materialize(const ResultStack &Results);
> >>> +
> >>> + SDValue getVectorConstant(ArrayRef<uint8_t> Data, const SDLoc &dl);
> >>> +
> >>> + enum : unsigned {
> >>> + None,
> >>> + PackMux,
> >>> + };
> >>> + OpRef concat(OpRef Va, OpRef Vb, ResultStack &Results);
> >>> + OpRef packs(ShuffleMask SM, OpRef Va, OpRef Vb, ResultStack &Results,
> >>> + MutableArrayRef<int> NewMask, unsigned Options = None);
> >>> + OpRef packp(ShuffleMask SM, OpRef Va, OpRef Vb, ResultStack &Results,
> >>> + MutableArrayRef<int> NewMask);
> >>> + OpRef zerous(ShuffleMask SM, OpRef Va, ResultStack &Results);
> >>> + OpRef vmuxs(ArrayRef<uint8_t> Bytes, OpRef Va, OpRef Vb,
> >>> + ResultStack &Results);
> >>> + OpRef vmuxp(ArrayRef<uint8_t> Bytes, OpRef Va, OpRef Vb,
> >>> + ResultStack &Results);
> >>> +
> >>> + OpRef shuffs1(ShuffleMask SM, OpRef Va, ResultStack &Results);
> >>> + OpRef shuffs2(ShuffleMask SM, OpRef Va, OpRef Vb, ResultStack &Results);
> >>> + OpRef shuffp1(ShuffleMask SM, OpRef Va, ResultStack &Results);
> >>> + OpRef shuffp2(ShuffleMask SM, OpRef Va, OpRef Vb, ResultStack &Results);
> >>> +
> >>> + OpRef butterfly(ShuffleMask SM, OpRef Va, ResultStack &Results);
> >>> + OpRef contracting(ShuffleMask SM, OpRef Va, OpRef Vb, ResultStack &Results);
> >>> + OpRef expanding(ShuffleMask SM, OpRef Va, ResultStack &Results);
> >>> + OpRef perfect(ShuffleMask SM, OpRef Va, ResultStack &Results);
> >>> +
> >>> + bool selectVectorConstants(SDNode *N);
> >>> + bool scalarizeShuffle(ArrayRef<int> Mask, const SDLoc &dl, MVT ResTy,
> >>> + SDValue Va, SDValue Vb, SDNode *N);
> >>> +
> >>> + };
> >>> +}
> >>> +
> >>> +// Return a submask of A that is shorter than A by |C| elements:
> >>> +// - if C > 0, return a submask of A that starts at position C,
> >>> +// - if C <= 0, return a submask of A that starts at 0 (reduce A by |C|).
> >>> +static ArrayRef<int> subm(ArrayRef<int> A, int C) {
> >>> + if (C > 0)
> >>> + return { A.data()+C, A.size()-C };
> >>> + return { A.data(), A.size()+C };
> >>> +}
> >>> +
> >>> +static void splitMask(ArrayRef<int> Mask, MutableArrayRef<int> MaskL,
> >>> + MutableArrayRef<int> MaskR) {
> >>> + unsigned VecLen = Mask.size();
> >>> + assert(MaskL.size() == VecLen && MaskR.size() == VecLen);
> >>> + for (unsigned I = 0; I != VecLen; ++I) {
> >>> + int M = Mask[I];
> >>> + if (M < 0) {
> >>> + MaskL[I] = MaskR[I] = -1;
> >>> + } else if (unsigned(M) < VecLen) {
> >>> + MaskL[I] = M;
> >>> + MaskR[I] = -1;
> >>> + } else {
> >>> + MaskL[I] = -1;
> >>> + MaskR[I] = M-VecLen;
> >>> + }
> >>> + }
> >>> +}
> >>> +
> >>> +static std::pair<int,unsigned> findStrip(ArrayRef<int> A, int Inc,
> >>> + unsigned MaxLen) {
> >>> + assert(A.size() > 0 && A.size() >= MaxLen);
> >>> + int F = A[0];
> >>> + int E = F;
> >>> + for (unsigned I = 1; I != MaxLen; ++I) {
> >>> + if (A[I] - E != Inc)
> >>> + return { F, I };
> >>> + E = A[I];
> >>> + }
> >>> + return { F, MaxLen };
> >>> +}
> >>> +
> >>> +static bool isUndef(ArrayRef<int> Mask) {
> >>> + for (int Idx : Mask)
> >>> + if (Idx != -1)
> >>> + return false;
> >>> + return true;
> >>> +}
> >>> +
> >>> +static bool isIdentity(ArrayRef<int> Mask) {
> >>> + unsigned Size = Mask.size();
> >>> + return findStrip(Mask, 1, Size) == std::make_pair(0, Size);
> >>> +}
> >>> +
> >>> +static bool isPermutation(ArrayRef<int> Mask) {
> >>> + // Check by adding all numbers only works if there is no overflow.
> >>> + assert(Mask.size() < 0x00007FFF && "Sanity failure");
> >>> + int Sum = 0;
> >>> + for (int Idx : Mask) {
> >>> + if (Idx == -1)
> >>> + return false;
> >>> + Sum += Idx;
> >>> + }
> >>> + int N = Mask.size();
> >>> + return 2*Sum == N*(N-1);
> >>> +}
> >>> +
> >>> +bool HvxSelector::selectVectorConstants(SDNode *N) {
> >>> + // Constant vectors are generated as loads from constant pools.
> >>> + // Since they are generated during the selection process, the main
> >>> + // selection algorithm is not aware of them. Select them directly
> >>> + // here.
> >>> + if (!N->isMachineOpcode() && N->getOpcode() == ISD::LOAD) {
> >>> + SDValue Addr = cast<LoadSDNode>(N)->getBasePtr();
> >>> + unsigned AddrOpc = Addr.getOpcode();
> >>> + if (AddrOpc == HexagonISD::AT_PCREL || AddrOpc == HexagonISD::CP) {
> >>> + if (Addr.getOperand(0).getOpcode() == ISD::TargetConstantPool) {
> >>> + ISel.Select(N);
> >>> + return true;
> >>> + }
> >>> + }
> >>> + }
> >>> +
> >>> + bool Selected = false;
> >>> + for (unsigned I = 0, E = N->getNumOperands(); I != E; ++I)
> >>> + Selected = selectVectorConstants(N->getOperand(I).getNode()) || Selected;
> >>> + return Selected;
> >>> +}
> >>> +
> >>> +void HvxSelector::materialize(const ResultStack &Results) {
> >>> + DEBUG_WITH_TYPE("isel", {
> >>> + dbgs() << "Materializing\n";
> >>> + Results.print(dbgs(), DAG);
> >>> + });
> >>> + if (Results.empty())
> >>> + return;
> >>> + const SDLoc &dl(Results.InpNode);
> >>> + std::vector<SDValue> Output;
> >>> +
> >>> + for (unsigned I = 0, E = Results.size(); I != E; ++I) {
> >>> + const NodeTemplate &Node = Results[I];
> >>> + std::vector<SDValue> Ops;
> >>> + for (const OpRef &R : Node.Ops) {
> >>> + assert(R.isValid());
> >>> + if (R.isValue()) {
> >>> + Ops.push_back(R.OpV);
> >>> + continue;
> >>> + }
> >>> + if (R.OpN & OpRef::Undef) {
> >>> + MVT::SimpleValueType SVT = MVT::SimpleValueType(R.OpN & OpRef::Index);
> >>> + Ops.push_back(ISel.selectUndef(dl, MVT(SVT)));
> >>> + continue;
> >>> + }
> >>> + // R is an index of a result.
> >>> + unsigned Part = R.OpN & OpRef::Whole;
> >>> + int Idx = SignExtend32(R.OpN & OpRef::Index, OpRef::IndexBits);
> >>> + if (Idx < 0)
> >>> + Idx += I;
> >>> + assert(Idx >= 0 && unsigned(Idx) < Output.size());
> >>> + SDValue Op = Output[Idx];
> >>> + MVT OpTy = Op.getValueType().getSimpleVT();
> >>> + if (Part != OpRef::Whole) {
> >>> + assert(Part == OpRef::LoHalf || Part == OpRef::HiHalf);
> >>> + if (Op.getOpcode() == HexagonISD::VCOMBINE) {
> >>> + Op = (Part == OpRef::HiHalf) ? Op.getOperand(0) : Op.getOperand(1);
> >>> + } else {
> >>> + MVT HalfTy = MVT::getVectorVT(OpTy.getVectorElementType(),
> >>> + OpTy.getVectorNumElements()/2);
> >>> + unsigned Sub = (Part == OpRef::LoHalf) ? Hexagon::vsub_lo
> >>> + : Hexagon::vsub_hi;
> >>> + Op = DAG.getTargetExtractSubreg(Sub, dl, HalfTy, Op);
> >>> + }
> >>> + }
> >>> + Ops.push_back(Op);
> >>> + } // for (Node : Results)
> >>> +
> >>> + assert(Node.Ty != MVT::Other);
> >>> + SDNode *ResN = (Node.Opc == TargetOpcode::COPY)
> >>> + ? Ops.front().getNode()
> >>> + : DAG.getMachineNode(Node.Opc, dl, Node.Ty, Ops);
> >>> + Output.push_back(SDValue(ResN, 0));
> >>> + }
> >>> +
> >>> + SDNode *OutN = Output.back().getNode();
> >>> + SDNode *InpN = Results.InpNode;
> >>> + DEBUG_WITH_TYPE("isel", {
> >>> + dbgs() << "Generated node:\n";
> >>> + OutN->dumpr(&DAG);
> >>> + });
> >>> +
> >>> + ISel.ReplaceNode(InpN, OutN);
> >>> + selectVectorConstants(OutN);
> >>> + DAG.RemoveDeadNodes();
> >>> +}
> >>> +
> >>> +OpRef HvxSelector::concat(OpRef Lo, OpRef Hi, ResultStack &Results) {
> >>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});
> >>> + const SDLoc &dl(Results.InpNode);
> >>> + Results.push(TargetOpcode::REG_SEQUENCE, getPairVT(MVT::i8), {
> >>> + DAG.getTargetConstant(Hexagon::HvxWRRegClassID, dl, MVT::i32),
> >>> + Lo, DAG.getTargetConstant(Hexagon::vsub_lo, dl, MVT::i32),
> >>> + Hi, DAG.getTargetConstant(Hexagon::vsub_hi, dl, MVT::i32),
> >>> + });
> >>> + return OpRef::res(Results.top());
> >>> +}
> >>> +
> >>> +// Va, Vb are single vectors, SM can be arbitrarily long.
> >>> +OpRef HvxSelector::packs(ShuffleMask SM, OpRef Va, OpRef Vb,
> >>> + ResultStack &Results, MutableArrayRef<int> NewMask,
> >>> + unsigned Options) {
> >>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});
> >>> + if (!Va.isValid() || !Vb.isValid())
> >>> + return OpRef::fail();
> >>> +
> >>> + int VecLen = SM.Mask.size();
> >>> + MVT Ty = getSingleVT(MVT::i8);
> >>> +
> >>> + if (SM.MaxSrc - SM.MinSrc < int(HwLen)) {
> >>> + if (SM.MaxSrc < int(HwLen)) {
> >>> + memcpy(NewMask.data(), SM.Mask.data(), sizeof(int)*VecLen);
> >>> + return Va;
> >>> + }
> >>> + if (SM.MinSrc >= int(HwLen)) {
> >>> + for (int I = 0; I != VecLen; ++I) {
> >>> + int M = SM.Mask[I];
> >>> + if (M != -1)
> >>> + M -= HwLen;
> >>> + NewMask[I] = M;
> >>> + }
> >>> + return Vb;
> >>> + }
> >>> + const SDLoc &dl(Results.InpNode);
> >>> + SDValue S = DAG.getTargetConstant(SM.MinSrc, dl, MVT::i32);
> >>> + if (isUInt<3>(SM.MinSrc)) {
> >>> + Results.push(Hexagon::V6_valignbi, Ty, {Vb, Va, S});
> >>> + } else {
> >>> + Results.push(Hexagon::A2_tfrsi, MVT::i32, {S});
> >>> + unsigned Top = Results.top();
> >>> + Results.push(Hexagon::V6_valignb, Ty, {Vb, Va, OpRef::res(Top)});
> >>> + }
> >>> + for (int I = 0; I != VecLen; ++I) {
> >>> + int M = SM.Mask[I];
> >>> + if (M != -1)
> >>> + M -= SM.MinSrc;
> >>> + NewMask[I] = M;
> >>> + }
> >>> + return OpRef::res(Results.top());
> >>> + }
> >>> +
> >>> + if (Options & PackMux) {
> >>> + // If elements picked from Va and Vb have all different (source) indexes
> >>> + // (relative to the start of the argument), do a mux, and update the mask.
> >>> + BitVector Picked(HwLen);
> >>> + SmallVector<uint8_t,128> MuxBytes(HwLen);
> >>> + bool CanMux = true;
> >>> + for (int I = 0; I != VecLen; ++I) {
> >>> + int M = SM.Mask[I];
> >>> + if (M == -1)
> >>> + continue;
> >>> + if (M >= int(HwLen))
> >>> + M -= HwLen;
> >>> + else
> >>> + MuxBytes[M] = 0xFF;
> >>> + if (Picked[M]) {
> >>> + CanMux = false;
> >>> + break;
> >>> + }
> >>> + NewMask[I] = M;
> >>> + }
> >>> + if (CanMux)
> >>> + return vmuxs(MuxBytes, Va, Vb, Results);
> >>> + }
> >>> +
> >>> + return OpRef::fail();
> >>> +}
> >>> +
> >>> +OpRef HvxSelector::packp(ShuffleMask SM, OpRef Va, OpRef Vb,
> >>> + ResultStack &Results, MutableArrayRef<int> NewMask) {
> >>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});
> >>> + unsigned HalfMask = 0;
> >>> + unsigned LogHw = Log2_32(HwLen);
> >>> + for (int M : SM.Mask) {
> >>> + if (M == -1)
> >>> + continue;
> >>> + HalfMask |= (1u << (M >> LogHw));
> >>> + }
> >>> +
> >>> + if (HalfMask == 0)
> >>> + return OpRef::undef(getPairVT(MVT::i8));
> >>> +
> >>> + // If more than two halves are used, bail.
> >>> + // TODO: be more aggressive here?
> >>> + if (countPopulation(HalfMask) > 2)
> >>> + return OpRef::fail();
> >>> +
> >>> + MVT HalfTy = getSingleVT(MVT::i8);
> >>> +
> >>> + OpRef Inp[2] = { Va, Vb };
> >>> + OpRef Out[2] = { OpRef::undef(HalfTy), OpRef::undef(HalfTy) };
> >>> +
> >>> + uint8_t HalfIdx[4] = { 0xFF, 0xFF, 0xFF, 0xFF };
> >>> + unsigned Idx = 0;
> >>> + for (unsigned I = 0; I != 4; ++I) {
> >>> + if ((HalfMask & (1u << I)) == 0)
> >>> + continue;
> >>> + assert(Idx < 2);
> >>> + OpRef Op = Inp[I/2];
> >>> + Out[Idx] = (I & 1) ? OpRef::hi(Op) : OpRef::lo(Op);
> >>> + HalfIdx[I] = Idx++;
> >>> + }
> >>> +
> >>> + int VecLen = SM.Mask.size();
> >>> + for (int I = 0; I != VecLen; ++I) {
> >>> + int M = SM.Mask[I];
> >>> + if (M >= 0) {
> >>> + uint8_t Idx = HalfIdx[M >> LogHw];
> >>> + assert(Idx == 0 || Idx == 1);
> >>> + M = (M & (HwLen-1)) + HwLen*Idx;
> >>> + }
> >>> + NewMask[I] = M;
> >>> + }
> >>> +
> >>> + return concat(Out[0], Out[1], Results);
> >>> +}
> >>> +
> >>> +OpRef HvxSelector::zerous(ShuffleMask SM, OpRef Va, ResultStack &Results) {
> >>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});
> >>> +
> >>> + int VecLen = SM.Mask.size();
> >>> + SmallVector<uint8_t,128> UsedBytes(VecLen);
> >>> + bool HasUnused = false;
> >>> + for (int I = 0; I != VecLen; ++I) {
> >>> + if (SM.Mask[I] != -1)
> >>> + UsedBytes[I] = 0xFF;
> >>> + else
> >>> + HasUnused = true;
> >>> + }
> >>> + if (!HasUnused)
> >>> + return Va;
> >>> + SDValue B = getVectorConstant(UsedBytes, SDLoc(Results.InpNode));
> >>> + Results.push(Hexagon::V6_vand, getSingleVT(MVT::i8), {Va, OpRef(B)});
> >>> + return OpRef::res(Results.top());
> >>> +}
> >>> +
> >>> +OpRef HvxSelector::vmuxs(ArrayRef<uint8_t> Bytes, OpRef Va, OpRef Vb,
> >>> + ResultStack &Results) {
> >>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});
> >>> + MVT ByteTy = getSingleVT(MVT::i8);
> >>> + MVT BoolTy = MVT::getVectorVT(MVT::i1, 8*HwLen); // XXX
> >>> + const SDLoc &dl(Results.InpNode);
> >>> + SDValue B = getVectorConstant(Bytes, dl);
> >>> + Results.push(Hexagon::V6_vd0, ByteTy, {});
> >>> + Results.push(Hexagon::V6_veqb, BoolTy, {OpRef(B), OpRef::res(-1)});
> >>> + Results.push(Hexagon::V6_vmux, ByteTy, {OpRef::res(-1), Va, Vb});
> >>> + return OpRef::res(Results.top());
> >>> +}
> >>> +
> >>> +OpRef HvxSelector::vmuxp(ArrayRef<uint8_t> Bytes, OpRef Va, OpRef Vb,
> >>> + ResultStack &Results) {
> >>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});
> >>> + size_t S = Bytes.size() / 2;
> >>> + OpRef L = vmuxs({Bytes.data(), S}, OpRef::lo(Va), OpRef::lo(Vb), Results);
> >>> + OpRef H = vmuxs({Bytes.data()+S, S}, OpRef::hi(Va), OpRef::hi(Vb), Results);
> >>> + return concat(L, H, Results);
> >>> +}
> >>> +
> >>> +OpRef HvxSelector::shuffs1(ShuffleMask SM, OpRef Va, ResultStack &Results) {
> >>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});
> >>> + unsigned VecLen = SM.Mask.size();
> >>> + assert(HwLen == VecLen);
> >>> + assert(all_of(SM.Mask, [this](int M) { return M == -1 || M < int(HwLen); }));
> >>> +
> >>> + if (isIdentity(SM.Mask))
> >>> + return Va;
> >>> + if (isUndef(SM.Mask))
> >>> + return OpRef::undef(getSingleVT(MVT::i8));
> >>> +
> >>> + return butterfly(SM, Va, Results);
> >>> +}
> >>> +
> >>> +OpRef HvxSelector::shuffs2(ShuffleMask SM, OpRef Va, OpRef Vb,
> >>> + ResultStack &Results) {
> >>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});
> >>> + OpRef C = contracting(SM, Va, Vb, Results);
> >>> + if (C.isValid())
> >>> + return C;
> >>> +
> >>> + int VecLen = SM.Mask.size();
> >>> + SmallVector<int,128> NewMask(VecLen);
> >>> + OpRef P = packs(SM, Va, Vb, Results, NewMask);
> >>> + if (P.isValid())
> >>> + return shuffs1(ShuffleMask(NewMask), P, Results);
> >>> +
> >>> + SmallVector<int,128> MaskL(VecLen), MaskR(VecLen);
> >>> + splitMask(SM.Mask, MaskL, MaskR);
> >>> +
> >>> + OpRef L = shuffs1(ShuffleMask(MaskL), Va, Results);
> >>> + OpRef R = shuffs1(ShuffleMask(MaskR), Vb, Results);
> >>> + if (!L.isValid() || !R.isValid())
> >>> + return OpRef::fail();
> >>> +
> >>> + SmallVector<uint8_t,128> Bytes(VecLen);
> >>> + for (int I = 0; I != VecLen; ++I) {
> >>> + if (MaskL[I] != -1)
> >>> + Bytes[I] = 0xFF;
> >>> + }
> >>> + return vmuxs(Bytes, L, R, Results);
> >>> +}
> >>> +
> >>> +OpRef HvxSelector::shuffp1(ShuffleMask SM, OpRef Va, ResultStack &Results) {
> >>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});
> >>> + int VecLen = SM.Mask.size();
> >>> +
> >>> + SmallVector<int,128> PackedMask(VecLen);
> >>> + OpRef P = packs(SM, OpRef::lo(Va), OpRef::hi(Va), Results, PackedMask);
> >>> + if (P.isValid()) {
> >>> + ShuffleMask PM(PackedMask);
> >>> + OpRef E = expanding(PM, P, Results);
> >>> + if (E.isValid())
> >>> + return E;
> >>> +
> >>> + OpRef L = shuffs1(PM.lo(), P, Results);
> >>> + OpRef H = shuffs1(PM.hi(), P, Results);
> >>> + if (L.isValid() && H.isValid())
> >>> + return concat(L, H, Results);
> >>> + }
> >>> +
> >>> + OpRef R = perfect(SM, Va, Results);
> >>> + if (R.isValid())
> >>> + return R;
> >>> + // TODO commute the mask and try the opposite order of the halves.
> >>> +
> >>> + OpRef L = shuffs2(SM.lo(), OpRef::lo(Va), OpRef::hi(Va), Results);
> >>> + OpRef H = shuffs2(SM.hi(), OpRef::lo(Va), OpRef::hi(Va), Results);
> >>> + if (L.isValid() && H.isValid())
> >>> + return concat(L, H, Results);
> >>> +
> >>> + return OpRef::fail();
> >>> +}
> >>> +
> >>> +OpRef HvxSelector::shuffp2(ShuffleMask SM, OpRef Va, OpRef Vb,
> >>> + ResultStack &Results) {
> >>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});
> >>> + int VecLen = SM.Mask.size();
> >>> +
> >>> + SmallVector<int,256> PackedMask(VecLen);
> >>> + OpRef P = packp(SM, Va, Vb, Results, PackedMask);
> >>> + if (P.isValid())
> >>> + return shuffp1(ShuffleMask(PackedMask), P, Results);
> >>> +
> >>> + SmallVector<int,256> MaskL(VecLen), MaskR(VecLen);
> >>> + OpRef L = shuffp1(ShuffleMask(MaskL), Va, Results);
> >>> + OpRef R = shuffp1(ShuffleMask(MaskR), Vb, Results);
> >>> + if (!L.isValid() || !R.isValid())
> >>> + return OpRef::fail();
> >>> +
> >>> + // Mux the results.
> >>> + SmallVector<uint8_t,256> Bytes(VecLen);
> >>> + for (int I = 0; I != VecLen; ++I) {
> >>> + if (MaskL[I] != -1)
> >>> + Bytes[I] = 0xFF;
> >>> + }
> >>> + return vmuxp(Bytes, L, R, Results);
> >>> +}
> >>> +
> >>> +bool HvxSelector::scalarizeShuffle(ArrayRef<int> Mask, const SDLoc &dl,
> >>> + MVT ResTy, SDValue Va, SDValue Vb,
> >>> + SDNode *N) {
> >>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});
> >>> + MVT ElemTy = ResTy.getVectorElementType();
> >>> + assert(ElemTy == MVT::i8);
> >>> + unsigned VecLen = Mask.size();
> >>> + bool HavePairs = (2*HwLen == VecLen);
> >>> + MVT SingleTy = getSingleVT(MVT::i8);
> >>> +
> >>> + SmallVector<SDValue,128> Ops;
> >>> + for (int I : Mask) {
> >>> + if (I < 0) {
> >>> + Ops.push_back(ISel.selectUndef(dl, ElemTy));
> >>> + continue;
> >>> + }
> >>> + SDValue Vec;
> >>> + unsigned M = I;
> >>> + if (M < VecLen) {
> >>> + Vec = Va;
> >>> + } else {
> >>> + Vec = Vb;
> >>> + M -= VecLen;
> >>> + }
> >>> + if (HavePairs) {
> >>> + if (M < HwLen) {
> >>> + Vec = DAG.getTargetExtractSubreg(Hexagon::vsub_lo, dl, SingleTy, Vec);
> >>> + } else {
> >>> + Vec = DAG.getTargetExtractSubreg(Hexagon::vsub_hi, dl, SingleTy, Vec);
> >>> + M -= HwLen;
> >>> + }
> >>> + }
> >>> + SDValue Idx = DAG.getConstant(M, dl, MVT::i32);
> >>> + SDValue Ex = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, ElemTy, {Vec, Idx});
> >>> + SDValue L = Lower.LowerOperation(Ex, DAG);
> >>> + assert(L.getNode());
> >>> + Ops.push_back(L);
> >>> + }
> >>> +
> >>> + SDValue LV;
> >>> + if (2*HwLen == VecLen) {
> >>> + SDValue B0 = DAG.getBuildVector(SingleTy, dl, {Ops.data(), HwLen});
> >>> + SDValue L0 = Lower.LowerOperation(B0, DAG);
> >>> + SDValue B1 = DAG.getBuildVector(SingleTy, dl, {Ops.data()+HwLen, HwLen});
> >>> + SDValue L1 = Lower.LowerOperation(B1, DAG);
> >>> + // XXX CONCAT_VECTORS is legal for HVX vectors. Legalizing (lowering)
> >>> + // functions may expect to be called only for illegal operations, so
> >>> + // make sure that they are not called for legal ones. Develop a better
> >>> + // mechanism for dealing with this.
> >>> + LV = DAG.getNode(ISD::CONCAT_VECTORS, dl, ResTy, {L0, L1});
> >>> + } else {
> >>> + SDValue BV = DAG.getBuildVector(ResTy, dl, Ops);
> >>> + LV = Lower.LowerOperation(BV, DAG);
> >>> + }
> >>> +
> >>> + assert(!N->use_empty());
> >>> + ISel.ReplaceNode(N, LV.getNode());
> >>> + DAG.RemoveDeadNodes();
> >>> +
> >>> + std::deque<SDNode*> SubNodes;
> >>> + SubNodes.push_back(LV.getNode());
> >>> + for (unsigned I = 0; I != SubNodes.size(); ++I) {
> >>> + for (SDValue Op : SubNodes[I]->ops())
> >>> + SubNodes.push_back(Op.getNode());
> >>> + }
> >>> + while (!SubNodes.empty()) {
> >>> + SDNode *S = SubNodes.front();
> >>> + SubNodes.pop_front();
> >>> + if (S->use_empty())
> >>> + continue;
> >>> + // This isn't great, but users need to be selected before any nodes that
> >>> + // they use. (The reason is to match larger patterns, and avoid nodes that
> >>> + // cannot be matched on their own, e.g. ValueType, TokenFactor, etc.).
> >>> + bool PendingUser = llvm::any_of(S->uses(), [&SubNodes](const SDNode *U) {
> >>> + return llvm::any_of(SubNodes, [U](const SDNode *T) {
> >>> + return T == U;
> >>> + });
> >>> + });
> >>> + if (PendingUser)
> >>> + SubNodes.push_back(S);
> >>> + else
> >>> + ISel.Select(S);
> >>> + }
> >>> +
> >>> + DAG.RemoveDeadNodes();
> >>> + return true;
> >>> +}
> >>> +
> >>> +OpRef HvxSelector::contracting(ShuffleMask SM, OpRef Va, OpRef Vb,
> >>> + ResultStack &Results) {
> >>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});
> >>> + if (!Va.isValid() || !Vb.isValid())
> >>> + return OpRef::fail();
> >>> +
> >>> + // Contracting shuffles, i.e. instructions that always discard some bytes
> >>> + // from the operand vectors.
> >>> + //
> >>> + // V6_vshuff{e,o}b
> >>> + // V6_vdealb4w
> >>> + // V6_vpack{e,o}{b,h}
> >>> +
> >>> + int VecLen = SM.Mask.size();
> >>> + std::pair<int,unsigned> Strip = findStrip(SM.Mask, 1, VecLen);
> >>> + MVT ResTy = getSingleVT(MVT::i8);
> >>> +
> >>> + // The following shuffles only work for bytes and halfwords. This requires
> >>> + // the strip length to be 1 or 2.
> >>> + if (Strip.second != 1 && Strip.second != 2)
> >>> + return OpRef::fail();
> >>> +
> >>> + // The patterns for the shuffles, in terms of the starting offsets of the
> >>> + // consecutive strips (L = length of the strip, N = VecLen):
> >>> + //
> >>> + // vpacke: 0, 2L, 4L ... N+0, N+2L, N+4L ... L = 1 or 2
> >>> + // vpacko: L, 3L, 5L ... N+L, N+3L, N+5L ... L = 1 or 2
> >>> + //
> >>> + // vshuffe: 0, N+0, 2L, N+2L, 4L ... L = 1 or 2
> >>> + // vshuffo: L, N+L, 3L, N+3L, 5L ... L = 1 or 2
> >>> + //
> >>> + // vdealb4w: 0, 4, 8 ... 2, 6, 10 ... N+0, N+4, N+8 ... N+2, N+6, N+10 ...
> >>> +
> >>> + // The value of the element in the mask following the strip will decide
> >>> + // what kind of a shuffle this can be.
> >>> + int NextInMask = SM.Mask[Strip.second];
> >>> +
> >>> + // Check if NextInMask could be 2L, 3L or 4, i.e. if it could be a mask
> >>> + // for vpack or vdealb4w. VecLen > 4, so NextInMask for vdealb4w would
> >>> + // satisfy this.
> >>> + if (NextInMask < VecLen) {
> >>> + // vpack{e,o} or vdealb4w
> >>> + if (Strip.first == 0 && Strip.second == 1 && NextInMask == 4) {
> >>> + int N = VecLen;
> >>> + // Check if this is vdealb4w (L=1).
> >>> + for (int I = 0; I != N/4; ++I)
> >>> + if (SM.Mask[I] != 4*I)
> >>> + return OpRef::fail();
> >>> + for (int I = 0; I != N/4; ++I)
> >>> + if (SM.Mask[I+N/4] != 2 + 4*I)
> >>> + return OpRef::fail();
> >>> + for (int I = 0; I != N/4; ++I)
> >>> + if (SM.Mask[I+N/2] != N + 4*I)
> >>> + return OpRef::fail();
> >>> + for (int I = 0; I != N/4; ++I)
> >>> + if (SM.Mask[I+3*N/4] != N+2 + 4*I)
> >>> + return OpRef::fail();
> >>> + // Matched mask for vdealb4w.
> >>> + Results.push(Hexagon::V6_vdealb4w, ResTy, {Vb, Va});
> >>> + return OpRef::res(Results.top());
> >>> + }
> >>> +
> >>> + // Check if this is vpack{e,o}.
> >>> + int N = VecLen;
> >>> + int L = Strip.second;
> >>> + // Check if the first strip starts at 0 or at L.
> >>> + if (Strip.first != 0 && Strip.first != L)
> >>> + return OpRef::fail();
> >>> + // Examine the rest of the mask.
> >>> + for (int I = L; I < N/2; I += L) {
> >>> + auto S = findStrip(subm(SM.Mask,I), 1, N-I);
> >>> + // Check whether the mask element at the beginning of each strip
> >>> + // increases by 2L each time.
> >>> + if (S.first - Strip.first != 2*I)
> >>> + return OpRef::fail();
> >>> + // Check whether each strip is of the same length.
> >>> + if (S.second != unsigned(L))
> >>> + return OpRef::fail();
> >>> + }
> >>> +
> >>> + // Strip.first == 0 => vpacke
> >>> + // Strip.first == L => vpacko
> >>> + assert(Strip.first == 0 || Strip.first == L);
> >>> + using namespace Hexagon;
> >>> + NodeTemplate Res;
> >>> + Res.Opc = Strip.second == 1 // Number of bytes.
> >>> + ? (Strip.first == 0 ? V6_vpackeb : V6_vpackob)
> >>> + : (Strip.first == 0 ? V6_vpackeh : V6_vpackoh);
> >>> + Res.Ty = ResTy;
> >>> + Res.Ops = { Vb, Va };
> >>> + Results.push(Res);
> >>> + r