<div dir="ltr">One more :)<div><br></div><div>clang noticed that HvxSelector::zerous was unused and was warning on me so I removed it here:</div><div><br></div><div>Committing to <a href="https://llvm.org/svn/llvm-project/llvm/trunk">https://llvm.org/svn/llvm-project/llvm/trunk</a> ...</div><div><span style="white-space:pre"> </span>M<span style="white-space:pre"> </span>lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp</div><div>Committed r322053</div><div><br></div><div>If you still need it you might want to add a use. :)</div><div><br></div><div>-eric</div></div><br><div class="gmail_quote"><div dir="ltr">On Wed, Dec 6, 2017 at 11:44 AM Krzysztof Parzyszek via llvm-commits <<a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Sorry about that, and thanks.<br>
<br>
-Krzysztof<br>
<br>
On 12/6/2017 12:52 PM, Davide Italiano wrote:<br>
> I'll go ahead and commit the following to unblock our work, but feel<br>
> free to follow up accordingly if you don't like it.<br>
><br>
> $ git diff<br>
> diff --git a/lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp<br>
> b/lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp<br>
> index a636e4e1557..5dc5e764f67 100644<br>
> --- a/lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp<br>
> +++ b/lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp<br>
> @@ -729,7 +729,9 @@ void NodeTemplate::print(raw_ostream &OS, const<br>
> SelectionDAG &G) const {<br>
><br>
> void ResultStack::print(raw_ostream &OS, const SelectionDAG &G) const {<br>
> OS << "Input node:\n";<br>
> +#ifndef NDEBUG<br>
> InpNode->dumpr(&G);<br>
> +#endif<br>
> OS << "Result templates:\n";<br>
> for (unsigned I = 0, E = List.size(); I != E; ++I) {<br>
> OS << '[' << I << "] ";<br>
><br>
><br>
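Note on the guard above: the linker error quoted below suggests that SDNode::dumpr is not compiled into this (presumably NDEBUG) build, so any call to it has to sit behind the same kind of guard. A minimal, self-contained sketch of the pattern follows. DemoNode, printStack and render are hypothetical names used only for illustration; they are not part of LLVM.<br>

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a class whose dump helper only exists in
// builds where NDEBUG is not defined.
struct DemoNode {
  int Id = 0;
#ifndef NDEBUG
  void dumpr(std::ostream &OS) const { OS << "node #" << Id << '\n'; }
#endif
};

// The caller must use the same guard as the declaration; otherwise a
// release build refers to a symbol that was never emitted.
void printStack(std::ostream &OS, const DemoNode &N) {
  OS << "Input node:\n";
#ifndef NDEBUG
  N.dumpr(OS);
#else
  (void)N; // Nothing to dump in this configuration.
#endif
  OS << "Result templates:\n";
}

// Convenience wrapper that returns the printed text as a string.
std::string render(const DemoNode &N) {
  std::ostringstream OS;
  printStack(OS, N);
  return OS.str();
}
```

With NDEBUG defined, render() returns only the two header lines; otherwise the node line appears between them.<br>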
> On Wed, Dec 6, 2017 at 10:48 AM, Davide Italiano <<a href="mailto:davide@freebsd.org" target="_blank">davide@freebsd.org</a>> wrote:<br>
>> The build is failing on macOS for me. I think this commit might be<br>
>> responsible, taking into consideration the range (yesterday night/this<br>
>> morning).<br>
>><br>
>> Undefined symbols for architecture x86_64:<br>
>> "llvm::SDNode::dumpr(llvm::SelectionDAG const*) const", referenced from:<br>
>> ResultStack::print(llvm::raw_ostream&, llvm::SelectionDAG<br>
>> const&) const in libLLVMHexagonCodeGen.a(HexagonISelDAGToDAGHVX.cpp.o)<br>
>> ld: symbol(s) not found for architecture x86_64<br>
>><br>
>> On Wed, Dec 6, 2017 at 8:40 AM, Krzysztof Parzyszek via llvm-commits<br>
>> <<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a>> wrote:<br>
>>> Author: kparzysz<br>
>>> Date: Wed Dec 6 08:40:37 2017<br>
>>> New Revision: 319901<br>
>>><br>
>>> URL: <a href="http://llvm.org/viewvc/llvm-project?rev=319901&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=319901&view=rev</a><br>
>>> Log:<br>
>>> [Hexagon] Generate HVX code for vector construction and access<br>
>>><br>
>>> Support for:<br>
>>> - build vector,<br>
>>> - extract vector element, subvector,<br>
>>> - insert vector element, subvector,<br>
>>> - shuffle.<br>
>>><br>
>>> Added:<br>
>>> llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp<br>
>>> llvm/trunk/lib/Target/Hexagon/HexagonISelLoweringHVX.cpp<br>
>>> llvm/trunk/test/CodeGen/Hexagon/autohvx/<br>
>>> llvm/trunk/test/CodeGen/Hexagon/autohvx/align-128b.ll<br>
>>> llvm/trunk/test/CodeGen/Hexagon/autohvx/align-64b.ll<br>
>>> llvm/trunk/test/CodeGen/Hexagon/autohvx/contract-128b.ll<br>
>>> llvm/trunk/test/CodeGen/Hexagon/autohvx/contract-64b.ll<br>
>>> llvm/trunk/test/CodeGen/Hexagon/autohvx/deal-128b.ll<br>
>>> llvm/trunk/test/CodeGen/Hexagon/autohvx/deal-64b.ll<br>
>>> llvm/trunk/test/CodeGen/Hexagon/autohvx/delta-128b.ll<br>
>>> llvm/trunk/test/CodeGen/Hexagon/autohvx/delta-64b.ll<br>
>>> llvm/trunk/test/CodeGen/Hexagon/autohvx/delta2-64b.ll<br>
>>> llvm/trunk/test/CodeGen/Hexagon/autohvx/extract-element.ll<br>
>>> llvm/trunk/test/CodeGen/Hexagon/autohvx/reg-sequence.ll<br>
>>> llvm/trunk/test/CodeGen/Hexagon/autohvx/shuff-128b.ll<br>
>>> llvm/trunk/test/CodeGen/Hexagon/autohvx/shuff-64b.ll<br>
>>> llvm/trunk/test/CodeGen/Hexagon/autohvx/shuff-combos-128b.ll<br>
>>> llvm/trunk/test/CodeGen/Hexagon/autohvx/shuff-combos-64b.ll<br>
>>> Modified:<br>
>>> llvm/trunk/lib/Target/Hexagon/CMakeLists.txt<br>
>>> llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.cpp<br>
>>> llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.h<br>
>>> llvm/trunk/lib/Target/Hexagon/HexagonISelLowering.cpp<br>
>>> llvm/trunk/lib/Target/Hexagon/HexagonISelLowering.h<br>
>>> llvm/trunk/lib/Target/Hexagon/HexagonPatterns.td<br>
>>><br>
>>> Modified: llvm/trunk/lib/Target/Hexagon/CMakeLists.txt<br>
>>> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/CMakeLists.txt?rev=319901&r1=319900&r2=319901&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/CMakeLists.txt?rev=319901&r1=319900&r2=319901&view=diff</a><br>
>>> ==============================================================================<br>
>>> --- llvm/trunk/lib/Target/Hexagon/CMakeLists.txt (original)<br>
>>> +++ llvm/trunk/lib/Target/Hexagon/CMakeLists.txt Wed Dec 6 08:40:37 2017<br>
>>> @@ -35,7 +35,9 @@ add_llvm_target(HexagonCodeGen<br>
>>> HexagonHazardRecognizer.cpp<br>
>>> HexagonInstrInfo.cpp<br>
>>> HexagonISelDAGToDAG.cpp<br>
>>> + HexagonISelDAGToDAGHVX.cpp<br>
>>> HexagonISelLowering.cpp<br>
>>> + HexagonISelLoweringHVX.cpp<br>
>>> HexagonLoopIdiomRecognition.cpp<br>
>>> HexagonMachineFunctionInfo.cpp<br>
>>> HexagonMachineScheduler.cpp<br>
>>><br>
>>> Modified: llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.cpp<br>
>>> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.cpp?rev=319901&r1=319900&r2=319901&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.cpp?rev=319901&r1=319900&r2=319901&view=diff</a><br>
>>> ==============================================================================<br>
>>> --- llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.cpp (original)<br>
>>> +++ llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.cpp Wed Dec 6 08:40:37 2017<br>
>>> @@ -754,7 +754,6 @@ void HexagonDAGToDAGISel::SelectBitcast(<br>
>>> CurDAG->RemoveDeadNode(N);<br>
>>> }<br>
>>><br>
>>> -<br>
>>> void HexagonDAGToDAGISel::Select(SDNode *N) {<br>
>>> if (N->isMachineOpcode())<br>
>>> return N->setNodeId(-1); // Already selected.<br>
>>> @@ -772,6 +771,13 @@ void HexagonDAGToDAGISel::Select(SDNode<br>
>>> case ISD::INTRINSIC_WO_CHAIN: return SelectIntrinsicWOChain(N);<br>
>>> }<br>
>>><br>
>>> + if (HST->useHVXOps()) {<br>
>>> + switch (N->getOpcode()) {<br>
>>> + case ISD::VECTOR_SHUFFLE: return SelectHvxShuffle(N);<br>
>>> + case HexagonISD::VROR: return SelectHvxRor(N);<br>
>>> + }<br>
>>> + }<br>
>>> +<br>
>>> SelectCode(N);<br>
>>> }<br>
>>><br>
>>><br>
>>> Modified: llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.h<br>
>>> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.h?rev=319901&r1=319900&r2=319901&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.h?rev=319901&r1=319900&r2=319901&view=diff</a><br>
>>> ==============================================================================<br>
>>> --- llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.h (original)<br>
>>> +++ llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAG.h Wed Dec 6 08:40:37 2017<br>
>>> @@ -26,6 +26,7 @@ namespace llvm {<br>
>>> class MachineFunction;<br>
>>> class HexagonInstrInfo;<br>
>>> class HexagonRegisterInfo;<br>
>>> +class HexagonTargetLowering;<br>
>>><br>
>>> class HexagonDAGToDAGISel : public SelectionDAGISel {<br>
>>> const HexagonSubtarget *HST;<br>
>>> @@ -100,13 +101,25 @@ public:<br>
>>> void SelectConstant(SDNode *N);<br>
>>> void SelectConstantFP(SDNode *N);<br>
>>> void SelectBitcast(SDNode *N);<br>
>>> - void SelectVectorShuffle(SDNode *N);<br>
>>><br>
>>> - // Include the pieces autogenerated from the target description.<br>
>>> + // Include the declarations autogenerated from the selection patterns.<br>
>>> #define GET_DAGISEL_DECL<br>
>>> #include "HexagonGenDAGISel.inc"<br>
>>><br>
>>> private:<br>
>>> + // This is really only to get access to ReplaceNode (which is a protected<br>
>>> + // member). Any other members used by HvxSelector can be moved around to<br>
>>> + // make them accessible.<br>
>>> + friend struct HvxSelector;<br>
>>> +<br>
>>> + SDValue selectUndef(const SDLoc &dl, MVT ResTy) {<br>
>>> + SDNode *U = CurDAG->getMachineNode(TargetOpcode::IMPLICIT_DEF, dl, ResTy);<br>
>>> + return SDValue(U, 0);<br>
>>> + }<br>
>>> +<br>
>>> + void SelectHvxShuffle(SDNode *N);<br>
>>> + void SelectHvxRor(SDNode *N);<br>
>>> +<br>
>>> bool keepsLowBits(const SDValue &Val, unsigned NumBits, SDValue &Src);<br>
>>> bool isOrEquivalentToAdd(const SDNode *N) const;<br>
>>> bool isAlignedMemNode(const MemSDNode *N) const;<br>
>>><br>
>>> Added: llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp<br>
>>> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp?rev=319901&view=auto" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp?rev=319901&view=auto</a><br>
>>> ==============================================================================<br>
>>> --- llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp (added)<br>
>>> +++ llvm/trunk/lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp Wed Dec 6 08:40:37 2017<br>
>>> @@ -0,0 +1,1924 @@<br>
>>> +//===-- HexagonISelDAGToDAGHVX.cpp ----------------------------------------===//<br>
>>> +//<br>
>>> +// The LLVM Compiler Infrastructure<br>
>>> +//<br>
>>> +// This file is distributed under the University of Illinois Open Source<br>
>>> +// License. See LICENSE.TXT for details.<br>
>>> +//<br>
>>> +//===----------------------------------------------------------------------===//<br>
>>> +<br>
>>> +#include "Hexagon.h"<br>
>>> +#include "HexagonISelDAGToDAG.h"<br>
>>> +#include "HexagonISelLowering.h"<br>
>>> +#include "HexagonTargetMachine.h"<br>
>>> +#include "llvm/CodeGen/MachineInstrBuilder.h"<br>
>>> +#include "llvm/CodeGen/SelectionDAGISel.h"<br>
>>> +#include "llvm/IR/Intrinsics.h"<br>
>>> +#include "llvm/Support/CommandLine.h"<br>
>>> +#include "llvm/Support/Debug.h"<br>
>>> +<br>
>>> +#include <deque><br>
>>> +#include <map><br>
>>> +#include <set><br>
>>> +#include <utility><br>
>>> +#include <vector><br>
>>> +<br>
>>> +#define DEBUG_TYPE "hexagon-isel"<br>
>>> +<br>
>>> +using namespace llvm;<br>
>>> +<br>
>>> +// --------------------------------------------------------------------<br>
>>> +// Implementation of permutation networks.<br>
>>> +<br>
>>> +// Implementation of the node routing through butterfly networks:<br>
>>> +// - Forward delta.<br>
>>> +// - Reverse delta.<br>
>>> +// - Benes.<br>
>>> +//<br>
>>> +//<br>
>>> +// Forward delta network consists of log(N) steps, where N is the number<br>
>>> +// of inputs. In each step, an input can stay in place, or it can get<br>
>>> +// routed to another position[1]. The step after that consists of two<br>
>>> +// networks, each half in size in terms of the number of nodes. In those<br>
>>> +// terms, in the given step, an input can go to either the upper or the<br>
>>> +// lower network in the next step.<br>
>>> +//<br>
>>> +// [1] Hexagon's vdelta/vrdelta allow an element to be routed to both<br>
>>> +// positions as long as there is no conflict.<br>
>>> +<br>
>>> +// Here's a delta network for 8 inputs, only the switching routes are<br>
>>> +// shown:<br>
>>> +//<br>
>>> +// Steps:<br>
>>> +// |- 1 ---------------|- 2 -----|- 3 -|<br>
>>> +//<br>
>>> +// Inp[0] *** *** *** *** Out[0]<br>
>>> +// \ / \ / \ /<br>
>>> +// \ / \ / X<br>
>>> +// \ / \ / / \<br>
>>> +// Inp[1] *** \ / *** X *** *** Out[1]<br>
>>> +// \ \ / / \ / \ /<br>
>>> +// \ \ / / X X<br>
>>> +// \ \ / / / \ / \<br>
>>> +// Inp[2] *** \ \ / / *** X *** *** Out[2]<br>
>>> +// \ \ X / / / \ \ /<br>
>>> +// \ \ / \ / / / \ X<br>
>>> +// \ X X / / \ / \<br>
>>> +// Inp[3] *** \ / \ / \ / *** *** *** Out[3]<br>
>>> +// \ X X X /<br>
>>> +// \ / \ / \ / \ /<br>
>>> +// X X X X<br>
>>> +// / \ / \ / \ / \<br>
>>> +// / X X X \<br>
>>> +// Inp[4] *** / \ / \ / \ *** *** *** Out[4]<br>
>>> +// / X X \ \ / \ /<br>
>>> +// / / \ / \ \ \ / X<br>
>>> +// / / X \ \ \ / / \<br>
>>> +// Inp[5] *** / / \ \ *** X *** *** Out[5]<br>
>>> +// / / \ \ \ / \ /<br>
>>> +// / / \ \ X X<br>
>>> +// / / \ \ / \ / \<br>
>>> +// Inp[6] *** / \ *** X *** *** Out[6]<br>
>>> +// / \ / \ \ /<br>
>>> +// / \ / \ X<br>
>>> +// / \ / \ / \<br>
>>> +// Inp[7] *** *** *** *** Out[7]<br>
>>> +//<br>
>>> +//<br>
>>> +// Reverse delta network is the same as the delta network, with the steps in<br>
>>> +// the opposite order.<br>
>>> +//<br>
>>> +//<br>
>>> +// Benes network is a forward delta network immediately followed by<br>
>>> +// a reverse delta network.<br>
>>> +<br>
>>> +<br>
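To make the diagram above concrete: assuming the butterfly layout it depicts, an input at position P in step S (counting from 0) can either stay in place or move to the position that differs from P by N >> (S+1). The sketch below computes that partner position and the number of steps. partner and numSteps are illustrative helpers, not functions from this patch.<br>

```cpp
#include <cassert>

// Number of steps in a delta network with N inputs (N a power of two),
// computed the same way the patch derives Log from the order size.
unsigned numSteps(unsigned N) {
  unsigned Log = 0;
  while (N >>= 1)
    ++Log;
  return Log;
}

// Position that Pos can be switched with in 0-based step Step of a
// forward delta network with N inputs. Step 0 crosses the two halves,
// step 1 crosses quarters within each half, and so on.
unsigned partner(unsigned Pos, unsigned Step, unsigned N) {
  assert(N != 0 && (N & (N - 1)) == 0 && "N must be a power of two");
  unsigned Half = N >> (Step + 1);
  assert(Half != 0 && "Step out of range");
  return Pos ^ Half;
}

// In a reverse delta network the same steps are applied in the
// opposite order.
unsigned reversePartner(unsigned Pos, unsigned Step, unsigned N) {
  return partner(Pos, numSteps(N) - 1 - Step, N);
}
```

For the 8-input diagram this gives partner(0, 0, 8) == 4 for the first step and partner(6, 2, 8) == 7 for the last one.<br>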
>>> +// Graph coloring utility used to partition nodes into two groups:<br>
>>> +// they will correspond to nodes routed to the upper and lower networks.<br>
>>> +struct Coloring {<br>
>>> + enum : uint8_t {<br>
>>> + None = 0,<br>
>>> + Red,<br>
>>> + Black<br>
>>> + };<br>
>>> +<br>
>>> + using Node = int;<br>
>>> + using MapType = std::map<Node,uint8_t>;<br>
>>> + static constexpr Node Ignore = Node(-1);<br>
>>> +<br>
>>> + Coloring(ArrayRef<Node> Ord) : Order(Ord) {<br>
>>> + build();<br>
>>> + if (!color())<br>
>>> + Colors.clear();<br>
>>> + }<br>
>>> +<br>
>>> + const MapType &colors() const {<br>
>>> + return Colors;<br>
>>> + }<br>
>>> +<br>
>>> + uint8_t other(uint8_t Color) {<br>
>>> + if (Color == None)<br>
>>> + return Red;<br>
>>> + return Color == Red ? Black : Red;<br>
>>> + }<br>
>>> +<br>
>>> + void dump() const;<br>
>>> +<br>
>>> +private:<br>
>>> + ArrayRef<Node> Order;<br>
>>> + MapType Colors;<br>
>>> + std::set<Node> Needed;<br>
>>> +<br>
>>> + using NodeSet = std::set<Node>;<br>
>>> + std::map<Node,NodeSet> Edges;<br>
>>> +<br>
>>> + Node conj(Node Pos) {<br>
>>> + Node Num = Order.size();<br>
>>> + return (Pos < Num/2) ? Pos + Num/2 : Pos - Num/2;<br>
>>> + }<br>
>>> +<br>
>>> + uint8_t getColor(Node N) {<br>
>>> + auto F = Colors.find(N);<br>
>>> + return F != Colors.end() ? F->second : None;<br>
>>> + }<br>
>>> +<br>
>>> + std::pair<bool,uint8_t> getUniqueColor(const NodeSet &Nodes);<br>
>>> +<br>
>>> + void build();<br>
>>> + bool color();<br>
>>> +};<br>
>>> +<br>
>>> +std::pair<bool,uint8_t> Coloring::getUniqueColor(const NodeSet &Nodes) {<br>
>>> + uint8_t Color = None;<br>
>>> + for (Node N : Nodes) {<br>
>>> + uint8_t ColorN = getColor(N);<br>
>>> + if (ColorN == None)<br>
>>> + continue;<br>
>>> + if (Color == None)<br>
>>> + Color = ColorN;<br>
>>> + else if (Color != None && Color != ColorN)<br>
>>> + return { false, None };<br>
>>> + }<br>
>>> + return { true, Color };<br>
>>> +}<br>
>>> +<br>
>>> +void Coloring::build() {<br>
>>> + // Add Order[P] and Order[conj(P)] to Edges.<br>
>>> + for (unsigned P = 0; P != Order.size(); ++P) {<br>
>>> + Node I = Order[P];<br>
>>> + if (I != Ignore) {<br>
>>> + Needed.insert(I);<br>
>>> + Node PC = Order[conj(P)];<br>
>>> + if (PC != Ignore && PC != I)<br>
>>> + Edges[I].insert(PC);<br>
>>> + }<br>
>>> + }<br>
>>> + // Add I and conj(I) to Edges.<br>
>>> + for (unsigned I = 0; I != Order.size(); ++I) {<br>
>>> + if (!Needed.count(I))<br>
>>> + continue;<br>
>>> + Node C = conj(I);<br>
>>> + // This will create an entry in the edge table, even if I is not<br>
>>> + // connected to any other node. This is necessary, because it still<br>
>>> + // needs to be colored.<br>
>>> + NodeSet &Is = Edges[I];<br>
>>> + if (Needed.count(C))<br>
>>> + Is.insert(C);<br>
>>> + }<br>
>>> +}<br>
>>> +<br>
>>> +bool Coloring::color() {<br>
>>> + SetVector<Node> FirstQ;<br>
>>> + auto Enqueue = [this,&FirstQ] (Node N) {<br>
>>> + SetVector<Node> Q;<br>
>>> + Q.insert(N);<br>
>>> + for (unsigned I = 0; I != Q.size(); ++I) {<br>
>>> + NodeSet &Ns = Edges[Q[I]];<br>
>>> + Q.insert(Ns.begin(), Ns.end());<br>
>>> + }<br>
>>> + FirstQ.insert(Q.begin(), Q.end());<br>
>>> + };<br>
>>> + for (Node N : Needed)<br>
>>> + Enqueue(N);<br>
>>> +<br>
>>> + for (Node N : FirstQ) {<br>
>>> + if (Colors.count(N))<br>
>>> + continue;<br>
>>> + NodeSet &Ns = Edges[N];<br>
>>> + auto P = getUniqueColor(Ns);<br>
>>> + if (!P.first)<br>
>>> + return false;<br>
>>> + Colors[N] = other(P.second);<br>
>>> + }<br>
>>> +<br>
>>> + // First, color nodes that don't have any dups.<br>
>>> + for (auto E : Edges) {<br>
>>> + Node N = E.first;<br>
>>> + if (!Needed.count(conj(N)) || Colors.count(N))<br>
>>> + continue;<br>
>>> + auto P = getUniqueColor(E.second);<br>
>>> + if (!P.first)<br>
>>> + return false;<br>
>>> + Colors[N] = other(P.second);<br>
>>> + }<br>
>>> +<br>
>>> + // Now, nodes that are still uncolored. Since the graph can be modified<br>
>>> + // in this step, create a work queue.<br>
>>> + std::vector<Node> WorkQ;<br>
>>> + for (auto E : Edges) {<br>
>>> + Node N = E.first;<br>
>>> + if (!Colors.count(N))<br>
>>> + WorkQ.push_back(N);<br>
>>> + }<br>
>>> +<br>
>>> + for (unsigned I = 0; I < WorkQ.size(); ++I) {<br>
>>> + Node N = WorkQ[I];<br>
>>> + NodeSet &Ns = Edges[N];<br>
>>> + auto P = getUniqueColor(Ns);<br>
>>> + if (P.first) {<br>
>>> + Colors[N] = other(P.second);<br>
>>> + continue;<br>
>>> + }<br>
>>> +<br>
>>> + // Coloring failed. Split this node.<br>
>>> + Node C = conj(N);<br>
>>> + uint8_t ColorN = other(None);<br>
>>> + uint8_t ColorC = other(ColorN);<br>
>>> + NodeSet &Cs = Edges[C];<br>
>>> + NodeSet CopyNs = Ns;<br>
>>> + for (Node M : CopyNs) {<br>
>>> + uint8_t ColorM = getColor(M);<br>
>>> + if (ColorM == ColorC) {<br>
>>> + // Connect M with C, disconnect M from N.<br>
>>> + Cs.insert(M);<br>
>>> + Edges[M].insert(C);<br>
>>> + Ns.erase(M);<br>
>>> + Edges[M].erase(N);<br>
>>> + }<br>
>>> + }<br>
>>> + Colors[N] = ColorN;<br>
>>> + Colors[C] = ColorC;<br>
>>> + }<br>
>>> +<br>
>>> + // Explicitly assign "None" to all uncolored nodes.<br>
>>> + for (unsigned I = 0; I != Order.size(); ++I)<br>
>>> + if (Colors.count(I) == 0)<br>
>>> + Colors[I] = None;<br>
>>> +<br>
>>> + return true;<br>
>>> +}<br>
>>> +<br>
>>> +LLVM_DUMP_METHOD<br>
>>> +void Coloring::dump() const {<br>
>>> + dbgs() << "{ Order: {";<br>
>>> + for (unsigned I = 0; I != Order.size(); ++I) {<br>
>>> + Node P = Order[I];<br>
>>> + if (P != Ignore)<br>
>>> + dbgs() << ' ' << P;<br>
>>> + else<br>
>>> + dbgs() << " -";<br>
>>> + }<br>
>>> + dbgs() << " }\n";<br>
>>> + dbgs() << " Needed: {";<br>
>>> + for (Node N : Needed)<br>
>>> + dbgs() << ' ' << N;<br>
>>> + dbgs() << " }\n";<br>
>>> +<br>
>>> + dbgs() << " Edges: {\n";<br>
>>> + for (auto E : Edges) {<br>
>>> + dbgs() << " " << E.first << " -> {";<br>
>>> + for (auto N : E.second)<br>
>>> + dbgs() << ' ' << N;<br>
>>> + dbgs() << " }\n";<br>
>>> + }<br>
>>> + dbgs() << " }\n";<br>
>>> +<br>
>>> + static const char *const Names[] = { "None", "Red", "Black" };<br>
>>> + dbgs() << " Colors: {\n";<br>
>>> + for (auto C : Colors)<br>
>>> + dbgs() << " " << C.first << " -> " << Names[C.second] << "\n";<br>
>>> + dbgs() << " }\n}\n";<br>
>>> +}<br>
>>> +<br>
>>> +// Base class for reordering networks. They don't strictly need to be<br>
>>> +// permutations, as outputs with repeated occurrences of an input element<br>
>>> +// are allowed.<br>
>>> +struct PermNetwork {<br>
>>> + using Controls = std::vector<uint8_t>;<br>
>>> + using ElemType = int;<br>
>>> + static constexpr ElemType Ignore = ElemType(-1);<br>
>>> +<br>
>>> + enum : uint8_t {<br>
>>> + None,<br>
>>> + Pass,<br>
>>> + Switch<br>
>>> + };<br>
>>> + enum : uint8_t {<br>
>>> + Forward,<br>
>>> + Reverse<br>
>>> + };<br>
>>> +<br>
>>> + PermNetwork(ArrayRef<ElemType> Ord, unsigned Mult = 1) {<br>
>>> + Order.assign(Ord.data(), Ord.data()+Ord.size());<br>
>>> + Log = 0;<br>
>>> +<br>
>>> + unsigned S = Order.size();<br>
>>> + while (S >>= 1)<br>
>>> + ++Log;<br>
>>> +<br>
>>> + Table.resize(Order.size());<br>
>>> + for (RowType &Row : Table)<br>
>>> + Row.resize(Mult*Log, None);<br>
>>> + }<br>
>>> +<br>
>>> + void getControls(Controls &V, unsigned StartAt, uint8_t Dir) const {<br>
>>> + unsigned Size = Order.size();<br>
>>> + V.resize(Size);<br>
>>> + for (unsigned I = 0; I != Size; ++I) {<br>
>>> + unsigned W = 0;<br>
>>> + for (unsigned L = 0; L != Log; ++L) {<br>
>>> + unsigned C = ctl(I, StartAt+L) == Switch;<br>
>>> + if (Dir == Forward)<br>
>>> + W |= C << (Log-1-L);<br>
>>> + else<br>
>>> + W |= C << L;<br>
>>> + }<br>
>>> + assert(isUInt<8>(W));<br>
>>> + V[I] = uint8_t(W);<br>
>>> + }<br>
>>> + }<br>
>>> +<br>
>>> + uint8_t ctl(ElemType Pos, unsigned Step) const {<br>
>>> + return Table[Pos][Step];<br>
>>> + }<br>
>>> + unsigned size() const {<br>
>>> + return Order.size();<br>
>>> + }<br>
>>> + unsigned steps() const {<br>
>>> + return Log;<br>
>>> + }<br>
>>> +<br>
>>> +protected:<br>
>>> + unsigned Log;<br>
>>> + std::vector<ElemType> Order;<br>
>>> + using RowType = std::vector<uint8_t>;<br>
>>> + std::vector<RowType> Table;<br>
>>> +};<br>
>>> +<br>
>>> +struct ForwardDeltaNetwork : public PermNetwork {<br>
>>> + ForwardDeltaNetwork(ArrayRef<ElemType> Ord) : PermNetwork(Ord) {}<br>
>>> +<br>
>>> + bool run(Controls &V) {<br>
>>> + if (!route(Order.data(), Table.data(), size(), 0))<br>
>>> + return false;<br>
>>> + getControls(V, 0, Forward);<br>
>>> + return true;<br>
>>> + }<br>
>>> +<br>
>>> +private:<br>
>>> + bool route(ElemType *P, RowType *T, unsigned Size, unsigned Step);<br>
>>> +};<br>
>>> +<br>
>>> +struct ReverseDeltaNetwork : public PermNetwork {<br>
>>> + ReverseDeltaNetwork(ArrayRef<ElemType> Ord) : PermNetwork(Ord) {}<br>
>>> +<br>
>>> + bool run(Controls &V) {<br>
>>> + if (!route(Order.data(), Table.data(), size(), 0))<br>
>>> + return false;<br>
>>> + getControls(V, 0, Reverse);<br>
>>> + return true;<br>
>>> + }<br>
>>> +<br>
>>> +private:<br>
>>> + bool route(ElemType *P, RowType *T, unsigned Size, unsigned Step);<br>
>>> +};<br>
>>> +<br>
>>> +struct BenesNetwork : public PermNetwork {<br>
>>> + BenesNetwork(ArrayRef<ElemType> Ord) : PermNetwork(Ord, 2) {}<br>
>>> +<br>
>>> + bool run(Controls &F, Controls &R) {<br>
>>> + if (!route(Order.data(), Table.data(), size(), 0))<br>
>>> + return false;<br>
>>> +<br>
>>> + getControls(F, 0, Forward);<br>
>>> + getControls(R, Log, Reverse);<br>
>>> + return true;<br>
>>> + }<br>
>>> +<br>
>>> +private:<br>
>>> + bool route(ElemType *P, RowType *T, unsigned Size, unsigned Step);<br>
>>> +};<br>
>>> +<br>
>>> +<br>
>>> +bool ForwardDeltaNetwork::route(ElemType *P, RowType *T, unsigned Size,<br>
>>> + unsigned Step) {<br>
>>> + bool UseUp = false, UseDown = false;<br>
>>> + ElemType Num = Size;<br>
>>> +<br>
>>> + // Cannot use coloring here, because coloring is used to determine<br>
>>> + // the "big" switch, i.e. the one that changes halves, and in a forward<br>
>>> + // network, a color can be simultaneously routed to both halves in the<br>
>>> + // step we're working on.<br>
>>> + for (ElemType J = 0; J != Num; ++J) {<br>
>>> + ElemType I = P[J];<br>
>>> + // I is the position in the input,<br>
>>> + // J is the position in the output.<br>
>>> + if (I == Ignore)<br>
>>> + continue;<br>
>>> + uint8_t S;<br>
>>> + if (I < Num/2)<br>
>>> + S = (J < Num/2) ? Pass : Switch;<br>
>>> + else<br>
>>> + S = (J < Num/2) ? Switch : Pass;<br>
>>> +<br>
>>> + // U is the element in the table that needs to be updated.<br>
>>> + ElemType U = (S == Pass) ? I : (I < Num/2 ? I+Num/2 : I-Num/2);<br>
>>> + if (U < Num/2)<br>
>>> + UseUp = true;<br>
>>> + else<br>
>>> + UseDown = true;<br>
>>> + if (T[U][Step] != S && T[U][Step] != None)<br>
>>> + return false;<br>
>>> + T[U][Step] = S;<br>
>>> + }<br>
>>> +<br>
>>> + for (ElemType J = 0; J != Num; ++J)<br>
>>> + if (P[J] != Ignore && P[J] >= Num/2)<br>
>>> + P[J] -= Num/2;<br>
>>> +<br>
>>> + if (Step+1 < Log) {<br>
>>> + if (UseUp && !route(P, T, Size/2, Step+1))<br>
>>> + return false;<br>
>>> + if (UseDown && !route(P+Size/2, T+Size/2, Size/2, Step+1))<br>
>>> + return false;<br>
>>> + }<br>
>>> + return true;<br>
>>> +}<br>
>>> +<br>
>>> +bool ReverseDeltaNetwork::route(ElemType *P, RowType *T, unsigned Size,<br>
>>> + unsigned Step) {<br>
>>> + unsigned Pets = Log-1 - Step;<br>
>>> + bool UseUp = false, UseDown = false;<br>
>>> + ElemType Num = Size;<br>
>>> +<br>
>>> + // In this step half-switching occurs, so coloring can be used.<br>
>>> + Coloring G({P,Size});<br>
>>> + const Coloring::MapType &M = G.colors();<br>
>>> + if (M.empty())<br>
>>> + return false;<br>
>>> +<br>
>>> + uint8_t ColorUp = Coloring::None;<br>
>>> + for (ElemType J = 0; J != Num; ++J) {<br>
>>> + ElemType I = P[J];<br>
>>> + // I is the position in the input,<br>
>>> + // J is the position in the output.<br>
>>> + if (I == Ignore)<br>
>>> + continue;<br>
>>> + uint8_t C = M.at(I);<br>
>>> + if (C == Coloring::None)<br>
>>> + continue;<br>
>>> + // During "Step", inputs cannot switch halves, so if the "up" color<br>
>>> + // is still unknown, make sure that it is selected in such a way that<br>
>>> + // "I" will stay in the same half.<br>
>>> + bool InpUp = I < Num/2;<br>
>>> + if (ColorUp == Coloring::None)<br>
>>> + ColorUp = InpUp ? C : G.other(C);<br>
>>> + if ((C == ColorUp) != InpUp) {<br>
>>> + // If I should go to a different half than where it is now, give up.<br>
>>> + return false;<br>
>>> + }<br>
>>> +<br>
>>> + uint8_t S;<br>
>>> + if (InpUp) {<br>
>>> + S = (J < Num/2) ? Pass : Switch;<br>
>>> + UseUp = true;<br>
>>> + } else {<br>
>>> + S = (J < Num/2) ? Switch : Pass;<br>
>>> + UseDown = true;<br>
>>> + }<br>
>>> + T[J][Pets] = S;<br>
>>> + }<br>
>>> +<br>
>>> + // Reorder the working permutation according to the computed switch table<br>
>>> + // for the last step (i.e. Pets).<br>
>>> + for (ElemType J = 0; J != Size/2; ++J) {<br>
>>> + ElemType PJ = P[J]; // Current values of P[J]<br>
>>> + ElemType PC = P[J+Size/2]; // and P[conj(J)]<br>
>>> + ElemType QJ = PJ; // New values of P[J]<br>
>>> + ElemType QC = PC; // and P[conj(J)]<br>
>>> + if (T[J][Pets] == Switch)<br>
>>> + QC = PJ;<br>
>>> + if (T[J+Size/2][Pets] == Switch)<br>
>>> + QJ = PC;<br>
>>> + P[J] = QJ;<br>
>>> + P[J+Size/2] = QC;<br>
>>> + }<br>
>>> +<br>
>>> + for (ElemType J = 0; J != Num; ++J)<br>
>>> + if (P[J] != Ignore && P[J] >= Num/2)<br>
>>> + P[J] -= Num/2;<br>
>>> +<br>
>>> + if (Step+1 < Log) {<br>
>>> + if (UseUp && !route(P, T, Size/2, Step+1))<br>
>>> + return false;<br>
>>> + if (UseDown && !route(P+Size/2, T+Size/2, Size/2, Step+1))<br>
>>> + return false;<br>
>>> + }<br>
>>> + return true;<br>
>>> +}<br>
>>> +<br>
>>> +bool BenesNetwork::route(ElemType *P, RowType *T, unsigned Size,<br>
>>> + unsigned Step) {<br>
>>> + Coloring G({P,Size});<br>
>>> + const Coloring::MapType &M = G.colors();<br>
>>> + if (M.empty())<br>
>>> + return false;<br>
>>> + ElemType Num = Size;<br>
>>> +<br>
>>> + unsigned Pets = 2*Log-1 - Step;<br>
>>> + bool UseUp = false, UseDown = false;<br>
>>> +<br>
>>> + // Both assignments, i.e. Red->Up and Red->Down are valid, but they will<br>
>>> + // result in different controls. Let's pick the one where the first<br>
>>> + // control will be "Pass".<br>
>>> + uint8_t ColorUp = Coloring::None;<br>
>>> + for (ElemType J = 0; J != Num; ++J) {<br>
>>> + ElemType I = P[J];<br>
>>> + if (I == Ignore)<br>
>>> + continue;<br>
>>> + uint8_t C = M.at(I);<br>
>>> + if (C == Coloring::None)<br>
>>> + continue;<br>
>>> + if (ColorUp == Coloring::None) {<br>
>>> + ColorUp = (I < Num/2) ? Coloring::Red : Coloring::Black;<br>
>>> + }<br>
>>> + unsigned CI = (I < Num/2) ? I+Num/2 : I-Num/2;<br>
>>> + if (C == ColorUp) {<br>
>>> + if (I < Num/2)<br>
>>> + T[I][Step] = Pass;<br>
>>> + else<br>
>>> + T[CI][Step] = Switch;<br>
>>> + T[J][Pets] = (J < Num/2) ? Pass : Switch;<br>
>>> + UseUp = true;<br>
>>> + } else { // Down<br>
>>> + if (I < Num/2)<br>
>>> + T[CI][Step] = Switch;<br>
>>> + else<br>
>>> + T[I][Step] = Pass;<br>
>>> + T[J][Pets] = (J < Num/2) ? Switch : Pass;<br>
>>> + UseDown = true;<br>
>>> + }<br>
>>> + }<br>
>>> +<br>
>>> + // Reorder the working permutation according to the computed switch table<br>
>>> + // for the last step (i.e. Pets).<br>
>>> + for (ElemType J = 0; J != Num/2; ++J) {<br>
>>> + ElemType PJ = P[J]; // Current values of P[J]<br>
>>> + ElemType PC = P[J+Num/2]; // and P[conj(J)]<br>
>>> + ElemType QJ = PJ; // New values of P[J]<br>
>>> + ElemType QC = PC; // and P[conj(J)]<br>
>>> + if (T[J][Pets] == Switch)<br>
>>> + QC = PJ;<br>
>>> + if (T[J+Num/2][Pets] == Switch)<br>
>>> + QJ = PC;<br>
>>> + P[J] = QJ;<br>
>>> + P[J+Num/2] = QC;<br>
>>> + }<br>
>>> +<br>
>>> + for (ElemType J = 0; J != Num; ++J)<br>
>>> + if (P[J] != Ignore && P[J] >= Num/2)<br>
>>> + P[J] -= Num/2;<br>
>>> +<br>
>>> + if (Step+1 < Log) {<br>
>>> + if (UseUp && !route(P, T, Size/2, Step+1))<br>
>>> + return false;<br>
>>> + if (UseDown && !route(P+Size/2, T+Size/2, Size/2, Step+1))<br>
>>> + return false;<br>
>>> + }<br>
>>> + return true;<br>
>>> +}<br>
>>> +<br>
>>> +// --------------------------------------------------------------------<br>
>>> +// Support for building selection results (output instructions that are<br>
>>> +// parts of the final selection).<br>
>>> +<br>
>>> +struct OpRef {<br>
>>> + OpRef(SDValue V) : OpV(V) {}<br>
>>> + bool isValue() const { return OpV.getNode() != nullptr; }<br>
>>> + bool isValid() const { return isValue() || !(OpN & Invalid); }<br>
>>> + static OpRef res(int N) { return OpRef(Whole | (N & Index)); }<br>
>>> + static OpRef fail() { return OpRef(Invalid); }<br>
>>> +<br>
>>> + static OpRef lo(const OpRef &R) {<br>
>>> + assert(!R.isValue());<br>
>>> + return OpRef(R.OpN & (Undef | Index | LoHalf));<br>
>>> + }<br>
>>> + static OpRef hi(const OpRef &R) {<br>
>>> + assert(!R.isValue());<br>
>>> + return OpRef(R.OpN & (Undef | Index | HiHalf));<br>
>>> + }<br>
>>> + static OpRef undef(MVT Ty) { return OpRef(Undef | Ty.SimpleTy); }<br>
>>> +<br>
>>> + // Direct value.<br>
>>> + SDValue OpV = SDValue();<br>
>>> +<br>
>>> + // Reference to the operand of the input node:<br>
>>> + // If bit 31 is set, it's undef, otherwise, bits 27..0 are the<br>
>>> + // operand index (bit 28 marks an invalid reference):<br>
>>> + // If bit 30 is set, it's the high half of the operand.<br>
>>> + // If bit 29 is set, it's the low half of the operand.<br>
>>> + unsigned OpN = 0;<br>
>>> +<br>
>>> + enum : unsigned {<br>
>>> + Invalid = 0x10000000,<br>
>>> + LoHalf = 0x20000000,<br>
>>> + HiHalf = 0x40000000,<br>
>>> + Whole = LoHalf | HiHalf,<br>
>>> + Undef = 0x80000000,<br>
>>> + Index = 0x0FFFFFFF, // Mask of the index value.<br>
>>> + IndexBits = 28,<br>
>>> + };<br>
>>> +<br>
>>> + void print(raw_ostream &OS, const SelectionDAG &G) const;<br>
>>> +<br>
>>> +private:<br>
>>> + OpRef(unsigned N) : OpN(N) {}<br>
>>> +};<br>
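The flag layout used by OpRef can be tried out in isolation. The sketch below copies the mask values from the enum above (the index occupies the low 28 bits, as given by the Index mask) and mirrors res, lo and hi as free functions. makeRes, loOf, hiOf and indexOf are hypothetical names chosen for this illustration.<br>

```cpp
#include <cassert>
#include <cstdint>

// Mask values copied from the OpRef enum in the patch.
enum : uint32_t {
  Invalid = 0x10000000,
  LoHalf  = 0x20000000,
  HiHalf  = 0x40000000,
  Whole   = LoHalf | HiHalf,
  Undef   = 0x80000000,
  Index   = 0x0FFFFFFF // Low 28 bits hold the operand index.
};

// Reference to the whole result number N (mirrors OpRef::res).
uint32_t makeRes(int N) { return Whole | (uint32_t(N) & Index); }

// Keep only the low or the high half of a reference (mirrors lo/hi).
uint32_t loOf(uint32_t R) { return R & (Undef | Index | LoHalf); }
uint32_t hiOf(uint32_t R) { return R & (Undef | Index | HiHalf); }

// Recover the signed index by sign-extending the 28-bit field.
int32_t indexOf(uint32_t R) {
  int32_t X = int32_t(R & Index);
  if (X & 0x08000000)
    X -= 0x10000000;
  return X;
}
```

Negative indices survive the round trip because the 28-bit field is sign-extended on the way out, which matches the SignExtend32 call in OpRef::print.<br>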
>>> +<br>
>>> +struct NodeTemplate {<br>
>>> + NodeTemplate() = default;<br>
>>> + unsigned Opc = 0;<br>
>>> + MVT Ty = MVT::Other;<br>
>>> + std::vector<OpRef> Ops;<br>
>>> +<br>
>>> + void print(raw_ostream &OS, const SelectionDAG &G) const;<br>
>>> +};<br>
>>> +<br>
>>> +struct ResultStack {<br>
>>> + ResultStack(SDNode *Inp)<br>
>>> + : InpNode(Inp), InpTy(Inp->getValueType(0).getSimpleVT()) {}<br>
>>> + SDNode *InpNode;<br>
>>> + MVT InpTy;<br>
>>> + unsigned push(const NodeTemplate &Res) {<br>
>>> + List.push_back(Res);<br>
>>> + return List.size()-1;<br>
>>> + }<br>
>>> + unsigned push(unsigned Opc, MVT Ty, std::vector<OpRef> &&Ops) {<br>
>>> + NodeTemplate Res;<br>
>>> + Res.Opc = Opc;<br>
>>> + Res.Ty = Ty;<br>
>>> + Res.Ops = Ops;<br>
>>> + return push(Res);<br>
>>> + }<br>
>>> + bool empty() const { return List.empty(); }<br>
>>> + unsigned size() const { return List.size(); }<br>
>>> + unsigned top() const { return size()-1; }<br>
>>> + const NodeTemplate &operator[](unsigned I) const { return List[I]; }<br>
>>> + unsigned reset(unsigned NewTop) {<br>
>>> + List.resize(NewTop+1);<br>
>>> + return NewTop;<br>
>>> + }<br>
>>> +<br>
>>> + using BaseType = std::vector<NodeTemplate>;<br>
>>> + BaseType::iterator begin() { return List.begin(); }<br>
>>> + BaseType::iterator end() { return List.end(); }<br>
>>> + BaseType::const_iterator begin() const { return List.begin(); }<br>
>>> + BaseType::const_iterator end() const { return List.end(); }<br>
>>> +<br>
>>> + BaseType List;<br>
>>> +<br>
>>> + void print(raw_ostream &OS, const SelectionDAG &G) const;<br>
>>> +};<br>
>>> +<br>
>>> +void OpRef::print(raw_ostream &OS, const SelectionDAG &G) const {<br>
>>> + if (isValue()) {<br>
>>> + OpV.getNode()->print(OS, &G);<br>
>>> + return;<br>
>>> + }<br>
>>> + if (OpN & Invalid) {<br>
>>> + OS << "invalid";<br>
>>> + return;<br>
>>> + }<br>
>>> + if (OpN & Undef) {<br>
>>> + OS << "undef";<br>
>>> + return;<br>
>>> + }<br>
>>> + if ((OpN & Whole) != Whole) {<br>
>>> + assert((OpN & Whole) == LoHalf || (OpN & Whole) == HiHalf);<br>
>>> + if (OpN & LoHalf)<br>
>>> + OS << "lo ";<br>
>>> + else<br>
>>> + OS << "hi ";<br>
>>> + }<br>
>>> + OS << '#' << SignExtend32(OpN & Index, IndexBits);<br>
>>> +}<br>
>>> +<br>
>>> +void NodeTemplate::print(raw_ostream &OS, const SelectionDAG &G) const {<br>
>>> + const TargetInstrInfo &TII = *G.getSubtarget().getInstrInfo();<br>
>>> + OS << format("%8s", EVT(Ty).getEVTString().c_str()) << " "<br>
>>> + << TII.getName(Opc);<br>
>>> + bool Comma = false;<br>
>>> + for (const auto &R : Ops) {<br>
>>> + if (Comma)<br>
>>> + OS << ',';<br>
>>> + Comma = true;<br>
>>> + OS << ' ';<br>
>>> + R.print(OS, G);<br>
>>> + }<br>
>>> +}<br>
>>> +<br>
>>> +void ResultStack::print(raw_ostream &OS, const SelectionDAG &G) const {<br>
>>> + OS << "Input node:\n";<br>
>>> + InpNode->dumpr(&G);<br>
>>> + OS << "Result templates:\n";<br>
>>> + for (unsigned I = 0, E = List.size(); I != E; ++I) {<br>
>>> + OS << '[' << I << "] ";<br>
>>> + List[I].print(OS, G);<br>
>>> + OS << '\n';<br>
>>> + }<br>
>>> +}<br>
>>> +<br>
>>> +struct ShuffleMask {<br>
>>> + ShuffleMask(ArrayRef<int> M) : Mask(M) {<br>
>>> + for (unsigned I = 0, E = Mask.size(); I != E; ++I) {<br>
>>> + int M = Mask[I];<br>
>>> + if (M == -1)<br>
>>> + continue;<br>
>>> + MinSrc = (MinSrc == -1) ? M : std::min(MinSrc, M);<br>
>>> + MaxSrc = (MaxSrc == -1) ? M : std::max(MaxSrc, M);<br>
>>> + }<br>
>>> + }<br>
>>> +<br>
>>> + ArrayRef<int> Mask;<br>
>>> + int MinSrc = -1, MaxSrc = -1;<br>
>>> +<br>
>>> + ShuffleMask lo() const {<br>
>>> + size_t H = Mask.size()/2;<br>
>>> + return ShuffleMask({Mask.data(), H});<br>
>>> + }<br>
>>> + ShuffleMask hi() const {<br>
>>> + size_t H = Mask.size()/2;<br>
>>> + return ShuffleMask({Mask.data()+H, H});<br>
>>> + }<br>
>>> +};<br>
>>> +<br>
>>> +// --------------------------------------------------------------------<br>
>>> +// The HvxSelector class.<br>
>>> +<br>
>>> +static const HexagonTargetLowering &getHexagonLowering(SelectionDAG &G) {<br>
>>> + return static_cast<const HexagonTargetLowering&>(G.getTargetLoweringInfo());<br>
>>> +}<br>
>>> +static const HexagonSubtarget &getHexagonSubtarget(SelectionDAG &G) {<br>
>>> + return static_cast<const HexagonSubtarget&>(G.getSubtarget());<br>
>>> +}<br>
>>> +<br>
>>> +namespace llvm {<br>
>>> + struct HvxSelector {<br>
>>> + const HexagonTargetLowering &Lower;<br>
>>> + HexagonDAGToDAGISel &ISel;<br>
>>> + SelectionDAG &DAG;<br>
>>> + const HexagonSubtarget &HST;<br>
>>> + const unsigned HwLen;<br>
>>> +<br>
>>> + HvxSelector(HexagonDAGToDAGISel &HS, SelectionDAG &G)<br>
>>> + : Lower(getHexagonLowering(G)), ISel(HS), DAG(G),<br>
>>> + HST(getHexagonSubtarget(G)), HwLen(HST.getVectorLength()) {}<br>
>>> +<br>
>>> + MVT getSingleVT(MVT ElemTy) const {<br>
>>> + unsigned NumElems = HwLen / (ElemTy.getSizeInBits()/8);<br>
>>> + return MVT::getVectorVT(ElemTy, NumElems);<br>
>>> + }<br>
>>> +<br>
>>> + MVT getPairVT(MVT ElemTy) const {<br>
>>> + unsigned NumElems = (2*HwLen) / (ElemTy.getSizeInBits()/8);<br>
>>> + return MVT::getVectorVT(ElemTy, NumElems);<br>
>>> + }<br>
>>> +<br>
>>> + void selectShuffle(SDNode *N);<br>
>>> + void selectRor(SDNode *N);<br>
>>> +<br>
>>> + private:<br>
>>> + void materialize(const ResultStack &Results);<br>
>>> +<br>
>>> + SDValue getVectorConstant(ArrayRef<uint8_t> Data, const SDLoc &dl);<br>
>>> +<br>
>>> + enum : unsigned {<br>
>>> + None,<br>
>>> + PackMux,<br>
>>> + };<br>
>>> + OpRef concat(OpRef Va, OpRef Vb, ResultStack &Results);<br>
>>> + OpRef packs(ShuffleMask SM, OpRef Va, OpRef Vb, ResultStack &Results,<br>
>>> + MutableArrayRef<int> NewMask, unsigned Options = None);<br>
>>> + OpRef packp(ShuffleMask SM, OpRef Va, OpRef Vb, ResultStack &Results,<br>
>>> + MutableArrayRef<int> NewMask);<br>
>>> + OpRef zerous(ShuffleMask SM, OpRef Va, ResultStack &Results);<br>
>>> + OpRef vmuxs(ArrayRef<uint8_t> Bytes, OpRef Va, OpRef Vb,<br>
>>> + ResultStack &Results);<br>
>>> + OpRef vmuxp(ArrayRef<uint8_t> Bytes, OpRef Va, OpRef Vb,<br>
>>> + ResultStack &Results);<br>
>>> +<br>
>>> + OpRef shuffs1(ShuffleMask SM, OpRef Va, ResultStack &Results);<br>
>>> + OpRef shuffs2(ShuffleMask SM, OpRef Va, OpRef Vb, ResultStack &Results);<br>
>>> + OpRef shuffp1(ShuffleMask SM, OpRef Va, ResultStack &Results);<br>
>>> + OpRef shuffp2(ShuffleMask SM, OpRef Va, OpRef Vb, ResultStack &Results);<br>
>>> +<br>
>>> + OpRef butterfly(ShuffleMask SM, OpRef Va, ResultStack &Results);<br>
>>> + OpRef contracting(ShuffleMask SM, OpRef Va, OpRef Vb, ResultStack &Results);<br>
>>> + OpRef expanding(ShuffleMask SM, OpRef Va, ResultStack &Results);<br>
>>> + OpRef perfect(ShuffleMask SM, OpRef Va, ResultStack &Results);<br>
>>> +<br>
>>> + bool selectVectorConstants(SDNode *N);<br>
>>> + bool scalarizeShuffle(ArrayRef<int> Mask, const SDLoc &dl, MVT ResTy,<br>
>>> + SDValue Va, SDValue Vb, SDNode *N);<br>
>>> +<br>
>>> + };<br>
>>> +}<br>
>>> +<br>
>>> +// Return a submask of A that is shorter than A by |C| elements:<br>
>>> +// - if C > 0, return a submask of A that starts at position C,<br>
>>> +// - if C <= 0, return a submask of A that starts at 0 (reduce A by |C|).<br>
>>> +static ArrayRef<int> subm(ArrayRef<int> A, int C) {<br>
>>> + if (C > 0)<br>
>>> + return { A.data()+C, A.size()-C };<br>
>>> + return { A.data(), A.size()+C };<br>
>>> +}<br>
>>> +<br>
>>> +static void splitMask(ArrayRef<int> Mask, MutableArrayRef<int> MaskL,<br>
>>> + MutableArrayRef<int> MaskR) {<br>
>>> + unsigned VecLen = Mask.size();<br>
>>> + assert(MaskL.size() == VecLen && MaskR.size() == VecLen);<br>
>>> + for (unsigned I = 0; I != VecLen; ++I) {<br>
>>> + int M = Mask[I];<br>
>>> + if (M < 0) {<br>
>>> + MaskL[I] = MaskR[I] = -1;<br>
>>> + } else if (unsigned(M) < VecLen) {<br>
>>> + MaskL[I] = M;<br>
>>> + MaskR[I] = -1;<br>
>>> + } else {<br>
>>> + MaskL[I] = -1;<br>
>>> + MaskR[I] = M-VecLen;<br>
>>> + }<br>
>>> + }<br>
>>> +}<br>
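</blockquote><div>Inline note on splitMask(): a small standalone sketch of the same split on std::vector, assuming the usual shuffle-mask convention that indices below VecLen select from the first input, indices from VecLen upward select from the second, and -1 means undefined. Unlike the patch, this version sizes the output vectors itself.</div>

```cpp
#include <cassert>
#include <vector>

// Standalone version of splitMask() using std::vector instead of ArrayRef.
static void splitMask(const std::vector<int> &Mask, std::vector<int> &MaskL,
                      std::vector<int> &MaskR) {
  const int VecLen = int(Mask.size());
  MaskL.assign(VecLen, -1);
  MaskR.assign(VecLen, -1);
  for (int I = 0; I != VecLen; ++I) {
    int M = Mask[I];
    if (M < 0)
      continue;              // undefined in both outputs
    if (M < VecLen)
      MaskL[I] = M;          // element comes from the first input
    else
      MaskR[I] = M - VecLen; // rebase onto the second input
  }
}
```

<div>For the mask {0, 5, -1, 3} this gives MaskL = {0, -1, -1, 3} and MaskR = {-1, 1, -1, -1}. The positions where MaskL is not -1 are the ones shuffs2() marks with 0xFF before calling vmuxs().</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">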
>>> +<br>
>>> +static std::pair<int,unsigned> findStrip(ArrayRef<int> A, int Inc,<br>
>>> + unsigned MaxLen) {<br>
>>> + assert(A.size() > 0 && A.size() >= MaxLen);<br>
>>> + int F = A[0];<br>
>>> + int E = F;<br>
>>> + for (unsigned I = 1; I != MaxLen; ++I) {<br>
>>> + if (A[I] - E != Inc)<br>
>>> + return { F, I };<br>
>>> + E = A[I];<br>
>>> + }<br>
>>> + return { F, MaxLen };<br>
>>> +}<br>
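</blockquote><div>Inline note on findStrip(): a standalone sketch that pairs it with a generator for the vpacke/vpacko masks listed in the comment inside contracting(). findStrip is restated on std::vector; makePackMask is a hypothetical helper that is not part of the patch and exists only to produce example masks.</div>

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Returns the first element of A and the length of the leading run whose
// consecutive elements differ by Inc, looking at no more than MaxLen elements.
static std::pair<int, unsigned> findStrip(const std::vector<int> &A, int Inc,
                                          unsigned MaxLen) {
  assert(!A.empty() && A.size() >= MaxLen);
  int First = A[0];
  int Prev = First;
  for (unsigned I = 1; I != MaxLen; ++I) {
    if (A[I] - Prev != Inc)
      return {First, I};
    Prev = A[I];
  }
  return {First, MaxLen};
}

// Hypothetical helper: build the pack mask for N elements per input and
// strip length L. Start is 0 for the "even" form and L for the "odd" form.
static std::vector<int> makePackMask(int N, int L, int Start) {
  std::vector<int> Mask;
  for (int Base : {0, N})                  // first input, then second input
    for (int S = Start; S < N; S += 2 * L) // every other strip
      for (int K = 0; K != L; ++K)
        Mask.push_back(Base + S + K);
  return Mask;
}
```

<div>For N = 8 and L = 2 the even mask is 0, 1, 4, 5, 8, 9, 12, 13, and findStrip on it reports a strip that starts at 0 and has length 2, which is the (Strip.first, Strip.second) pair contracting() branches on.</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">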
>>> +<br>
>>> +static bool isUndef(ArrayRef<int> Mask) {<br>
>>> + for (int Idx : Mask)<br>
>>> + if (Idx != -1)<br>
>>> + return false;<br>
>>> + return true;<br>
>>> +}<br>
>>> +<br>
>>> +static bool isIdentity(ArrayRef<int> Mask) {<br>
>>> + unsigned Size = Mask.size();<br>
>>> + return findStrip(Mask, 1, Size) == std::make_pair(0, Size);<br>
>>> +}<br>
>>> +<br>
>>> +static bool isPermutation(ArrayRef<int> Mask) {<br>
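>>> + // Check by adding all numbers. This only works if there is no overflow,<br>
>>> + // and it is a necessary condition only: repeated indices can still add<br>
>>> + // up to the same total.<br>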
>>> + assert(Mask.size() < 0x00007FFF && "Sanity failure");<br>
>>> + int Sum = 0;<br>
>>> + for (int Idx : Mask) {<br>
>>> + if (Idx == -1)<br>
>>> + return false;<br>
>>> + Sum += Idx;<br>
>>> + }<br>
>>> + int N = Mask.size();<br>
>>> + return 2*Sum == N*(N-1);<br>
>>> +}<br>
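</blockquote><div>Inline note on isPermutation(): a standalone sketch contrasting the sum test with a stricter membership test. sumLooksLikePermutation restates the check from the patch on std::vector; isExactPermutation is a hypothetical alternative, not something the patch contains.</div>

```cpp
#include <cassert>
#include <vector>

// The check from the patch: no undefs, and the indices add up to
// 0 + 1 + ... + (N-1).
static bool sumLooksLikePermutation(const std::vector<int> &Mask) {
  int Sum = 0;
  for (int Idx : Mask) {
    if (Idx == -1)
      return false;
    Sum += Idx;
  }
  int N = int(Mask.size());
  return 2 * Sum == N * (N - 1);
}

// Hypothetical stricter variant: every index in [0, N) appears exactly once.
static bool isExactPermutation(const std::vector<int> &Mask) {
  std::vector<bool> Seen(Mask.size(), false);
  for (int Idx : Mask) {
    if (Idx < 0 || Idx >= int(Mask.size()) || Seen[Idx])
      return false;
    Seen[Idx] = true;
  }
  return true;
}
```

<div>The two agree on {2, 0, 1, 3}, but {0, 0, 3, 3} passes the sum test while failing the stricter one, so callers of the sum-based version need to rule out repeated indices by other means.</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">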
>>> +<br>
>>> +bool HvxSelector::selectVectorConstants(SDNode *N) {<br>
>>> + // Constant vectors are generated as loads from constant pools.<br>
>>> + // Since they are generated during the selection process, the main<br>
>>> + // selection algorithm is not aware of them. Select them directly<br>
>>> + // here.<br>
>>> + if (!N->isMachineOpcode() && N->getOpcode() == ISD::LOAD) {<br>
>>> + SDValue Addr = cast<LoadSDNode>(N)->getBasePtr();<br>
>>> + unsigned AddrOpc = Addr.getOpcode();<br>
>>> + if (AddrOpc == HexagonISD::AT_PCREL || AddrOpc == HexagonISD::CP) {<br>
>>> + if (Addr.getOperand(0).getOpcode() == ISD::TargetConstantPool) {<br>
>>> + ISel.Select(N);<br>
>>> + return true;<br>
>>> + }<br>
>>> + }<br>
>>> + }<br>
>>> +<br>
>>> + bool Selected = false;<br>
>>> + for (unsigned I = 0, E = N->getNumOperands(); I != E; ++I)<br>
>>> + Selected = selectVectorConstants(N->getOperand(I).getNode()) || Selected;<br>
>>> + return Selected;<br>
>>> +}<br>
>>> +<br>
>>> +void HvxSelector::materialize(const ResultStack &Results) {<br>
>>> + DEBUG_WITH_TYPE("isel", {<br>
>>> + dbgs() << "Materializing\n";<br>
>>> + Results.print(dbgs(), DAG);<br>
>>> + });<br>
>>> + if (Results.empty())<br>
>>> + return;<br>
>>> + const SDLoc &dl(Results.InpNode);<br>
>>> + std::vector<SDValue> Output;<br>
>>> +<br>
>>> + for (unsigned I = 0, E = Results.size(); I != E; ++I) {<br>
>>> + const NodeTemplate &Node = Results[I];<br>
>>> + std::vector<SDValue> Ops;<br>
>>> + for (const OpRef &R : Node.Ops) {<br>
>>> + assert(R.isValid());<br>
>>> + if (R.isValue()) {<br>
>>> + Ops.push_back(R.OpV);<br>
>>> + continue;<br>
>>> + }<br>
>>> + if (R.OpN & OpRef::Undef) {<br>
>>> + MVT::SimpleValueType SVT = MVT::SimpleValueType(R.OpN & OpRef::Index);<br>
>>> + Ops.push_back(ISel.selectUndef(dl, MVT(SVT)));<br>
>>> + continue;<br>
>>> + }<br>
>>> + // R is an index of a result.<br>
>>> + unsigned Part = R.OpN & OpRef::Whole;<br>
>>> + int Idx = SignExtend32(R.OpN & OpRef::Index, OpRef::IndexBits);<br>
>>> + if (Idx < 0)<br>
>>> + Idx += I;<br>
>>> + assert(Idx >= 0 && unsigned(Idx) < Output.size());<br>
>>> + SDValue Op = Output[Idx];<br>
>>> + MVT OpTy = Op.getValueType().getSimpleVT();<br>
>>> + if (Part != OpRef::Whole) {<br>
>>> + assert(Part == OpRef::LoHalf || Part == OpRef::HiHalf);<br>
>>> + if (Op.getOpcode() == HexagonISD::VCOMBINE) {<br>
>>> + Op = (Part == OpRef::HiHalf) ? Op.getOperand(0) : Op.getOperand(1);<br>
>>> + } else {<br>
>>> + MVT HalfTy = MVT::getVectorVT(OpTy.getVectorElementType(),<br>
>>> + OpTy.getVectorNumElements()/2);<br>
>>> + unsigned Sub = (Part == OpRef::LoHalf) ? Hexagon::vsub_lo<br>
>>> + : Hexagon::vsub_hi;<br>
>>> + Op = DAG.getTargetExtractSubreg(Sub, dl, HalfTy, Op);<br>
>>> + }<br>
>>> + }<br>
>>> + Ops.push_back(Op);<br>
>>> + } // for (R : Node.Ops)<br>
>>> +<br>
>>> + assert(Node.Ty != MVT::Other);<br>
>>> + SDNode *ResN = (Node.Opc == TargetOpcode::COPY)<br>
>>> + ? Ops.front().getNode()<br>
>>> + : DAG.getMachineNode(Node.Opc, dl, Node.Ty, Ops);<br>
>>> + Output.push_back(SDValue(ResN, 0));<br>
>>> + }<br>
>>> +<br>
>>> + SDNode *OutN = Output.back().getNode();<br>
>>> + SDNode *InpN = Results.InpNode;<br>
>>> + DEBUG_WITH_TYPE("isel", {<br>
>>> + dbgs() << "Generated node:\n";<br>
>>> + OutN->dumpr(&DAG);<br>
>>> + });<br>
>>> +<br>
>>> + ISel.ReplaceNode(InpN, OutN);<br>
>>> + selectVectorConstants(OutN);<br>
>>> + DAG.RemoveDeadNodes();<br>
>>> +}<br>
>>> +<br>
>>> +OpRef HvxSelector::concat(OpRef Lo, OpRef Hi, ResultStack &Results) {<br>
>>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});<br>
>>> + const SDLoc &dl(Results.InpNode);<br>
>>> + Results.push(TargetOpcode::REG_SEQUENCE, getPairVT(MVT::i8), {<br>
>>> + DAG.getTargetConstant(Hexagon::HvxWRRegClassID, dl, MVT::i32),<br>
>>> + Lo, DAG.getTargetConstant(Hexagon::vsub_lo, dl, MVT::i32),<br>
>>> + Hi, DAG.getTargetConstant(Hexagon::vsub_hi, dl, MVT::i32),<br>
>>> + });<br>
>>> + return OpRef::res(Results.top());<br>
>>> +}<br>
>>> +<br>
>>> +// Va, Vb are single vectors, SM can be arbitrarily long.<br>
>>> +OpRef HvxSelector::packs(ShuffleMask SM, OpRef Va, OpRef Vb,<br>
>>> + ResultStack &Results, MutableArrayRef<int> NewMask,<br>
>>> + unsigned Options) {<br>
>>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});<br>
>>> + if (!Va.isValid() || !Vb.isValid())<br>
>>> + return OpRef::fail();<br>
>>> +<br>
>>> + int VecLen = SM.Mask.size();<br>
>>> + MVT Ty = getSingleVT(MVT::i8);<br>
>>> +<br>
>>> + if (SM.MaxSrc - SM.MinSrc < int(HwLen)) {<br>
>>> + if (SM.MaxSrc < int(HwLen)) {<br>
>>> + memcpy(NewMask.data(), SM.Mask.data(), sizeof(int)*VecLen);<br>
>>> + return Va;<br>
>>> + }<br>
>>> + if (SM.MinSrc >= int(HwLen)) {<br>
>>> + for (int I = 0; I != VecLen; ++I) {<br>
>>> + int M = SM.Mask[I];<br>
>>> + if (M != -1)<br>
>>> + M -= HwLen;<br>
>>> + NewMask[I] = M;<br>
>>> + }<br>
>>> + return Vb;<br>
>>> + }<br>
>>> + const SDLoc &dl(Results.InpNode);<br>
>>> + SDValue S = DAG.getTargetConstant(SM.MinSrc, dl, MVT::i32);<br>
>>> + if (isUInt<3>(SM.MinSrc)) {<br>
>>> + Results.push(Hexagon::V6_valignbi, Ty, {Vb, Va, S});<br>
>>> + } else {<br>
>>> + Results.push(Hexagon::A2_tfrsi, MVT::i32, {S});<br>
>>> + unsigned Top = Results.top();<br>
>>> + Results.push(Hexagon::V6_valignb, Ty, {Vb, Va, OpRef::res(Top)});<br>
>>> + }<br>
>>> + for (int I = 0; I != VecLen; ++I) {<br>
>>> + int M = SM.Mask[I];<br>
>>> + if (M != -1)<br>
>>> + M -= SM.MinSrc;<br>
>>> + NewMask[I] = M;<br>
>>> + }<br>
>>> + return OpRef::res(Results.top());<br>
>>> + }<br>
>>> +<br>
>>> + if (Options & PackMux) {<br>
>>> + // If elements picked from Va and Vb have all different (source) indexes<br>
>>> + // (relative to the start of the argument), do a mux, and update the mask.<br>
>>> + BitVector Picked(HwLen);<br>
>>> + SmallVector<uint8_t,128> MuxBytes(HwLen);<br>
>>> + bool CanMux = true;<br>
>>> + for (int I = 0; I != VecLen; ++I) {<br>
>>> + int M = SM.Mask[I];<br>
>>> + if (M == -1)<br>
>>> + continue;<br>
>>> + if (M >= int(HwLen))<br>
>>> + M -= HwLen;<br>
>>> + else<br>
>>> + MuxBytes[M] = 0xFF;<br>
>>> + if (Picked[M]) {<br>
>>> + CanMux = false;<br>
>>> + break;<br>
>>> + }<br>
>>> + Picked.set(M);<br>
>>> + NewMask[I] = M;<br>
>>> + }<br>
>>> + if (CanMux)<br>
>>> + return vmuxs(MuxBytes, Va, Vb, Results);<br>
>>> + }<br>
>>> +<br>
>>> + return OpRef::fail();<br>
>>> +}<br>
>>> +<br>
>>> +OpRef HvxSelector::packp(ShuffleMask SM, OpRef Va, OpRef Vb,<br>
>>> + ResultStack &Results, MutableArrayRef<int> NewMask) {<br>
>>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});<br>
>>> + unsigned HalfMask = 0;<br>
>>> + unsigned LogHw = Log2_32(HwLen);<br>
>>> + for (int M : SM.Mask) {<br>
>>> + if (M == -1)<br>
>>> + continue;<br>
>>> + HalfMask |= (1u << (M >> LogHw));<br>
>>> + }<br>
>>> +<br>
>>> + if (HalfMask == 0)<br>
>>> + return OpRef::undef(getPairVT(MVT::i8));<br>
>>> +<br>
>>> + // If more than two halves are used, bail.<br>
>>> + // TODO: be more aggressive here?<br>
>>> + if (countPopulation(HalfMask) > 2)<br>
>>> + return OpRef::fail();<br>
>>> +<br>
>>> + MVT HalfTy = getSingleVT(MVT::i8);<br>
>>> +<br>
>>> + OpRef Inp[2] = { Va, Vb };<br>
>>> + OpRef Out[2] = { OpRef::undef(HalfTy), OpRef::undef(HalfTy) };<br>
>>> +<br>
>>> + uint8_t HalfIdx[4] = { 0xFF, 0xFF, 0xFF, 0xFF };<br>
>>> + unsigned Idx = 0;<br>
>>> + for (unsigned I = 0; I != 4; ++I) {<br>
>>> + if ((HalfMask & (1u << I)) == 0)<br>
>>> + continue;<br>
>>> + assert(Idx < 2);<br>
>>> + OpRef Op = Inp[I/2];<br>
>>> + Out[Idx] = (I & 1) ? OpRef::hi(Op) : OpRef::lo(Op);<br>
>>> + HalfIdx[I] = Idx++;<br>
>>> + }<br>
>>> +<br>
>>> + int VecLen = SM.Mask.size();<br>
>>> + for (int I = 0; I != VecLen; ++I) {<br>
>>> + int M = SM.Mask[I];<br>
>>> + if (M >= 0) {<br>
>>> + uint8_t Idx = HalfIdx[M >> LogHw];<br>
>>> + assert(Idx == 0 || Idx == 1);<br>
>>> + M = (M & (HwLen-1)) + HwLen*Idx;<br>
>>> + }<br>
>>> + NewMask[I] = M;<br>
>>> + }<br>
>>> +<br>
>>> + return concat(Out[0], Out[1], Results);<br>
>>> +}<br>
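</blockquote><div>Inline note on packp(): a standalone sketch of just the index remapping, assuming HwLen is a power of two and every mask entry is either -1 or below 4*HwLen. remapToTwoHalves is a hypothetical name; it leaves out the OpRef bookkeeping, and unlike packp() it does not special-case a fully undefined mask.</div>

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns false when the mask touches more than two of the four halves
// {Va.lo, Va.hi, Vb.lo, Vb.hi}. Otherwise fills NewMask with indices into
// the pair formed by the (at most two) halves in use, in ascending order.
static bool remapToTwoHalves(const std::vector<int> &Mask, unsigned HwLen,
                             std::vector<int> &NewMask) {
  unsigned LogHw = 0;
  while ((1u << LogHw) < HwLen)
    ++LogHw;

  // One bit per half that the mask reads from.
  unsigned HalfMask = 0;
  for (int M : Mask)
    if (M != -1)
      HalfMask |= 1u << (M >> LogHw);

  // Assign output slots 0 and 1 to the halves in use.
  uint8_t HalfIdx[4] = {0xFF, 0xFF, 0xFF, 0xFF};
  unsigned Used = 0;
  for (unsigned I = 0; I != 4; ++I) {
    if ((HalfMask & (1u << I)) == 0)
      continue;
    if (Used == 2)
      return false; // a third half would be needed
    HalfIdx[I] = uint8_t(Used++);
  }

  // Rewrite each index as (offset within its half) + HwLen * (slot).
  NewMask.clear();
  for (int M : Mask) {
    if (M >= 0)
      M = int((unsigned(M) & (HwLen - 1)) + HwLen * HalfIdx[M >> LogHw]);
    NewMask.push_back(M);
  }
  return true;
}
```

<div>With HwLen = 4, the mask {4, 9, -1, 7, 8, 5, 6, 11} reads only Va.hi and Vb.lo, so it is rewritten to {0, 5, -1, 3, 4, 1, 2, 7}; a mask such as {0, 4, 8} needs three halves and is rejected.</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">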
>>> +<br>
>>> +OpRef HvxSelector::zerous(ShuffleMask SM, OpRef Va, ResultStack &Results) {<br>
>>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});<br>
>>> +<br>
>>> + int VecLen = SM.Mask.size();<br>
>>> + SmallVector<uint8_t,128> UsedBytes(VecLen);<br>
>>> + bool HasUnused = false;<br>
>>> + for (int I = 0; I != VecLen; ++I) {<br>
>>> + if (SM.Mask[I] != -1)<br>
>>> + UsedBytes[I] = 0xFF;<br>
>>> + else<br>
>>> + HasUnused = true;<br>
>>> + }<br>
>>> + if (!HasUnused)<br>
>>> + return Va;<br>
>>> + SDValue B = getVectorConstant(UsedBytes, SDLoc(Results.InpNode));<br>
>>> + Results.push(Hexagon::V6_vand, getSingleVT(MVT::i8), {Va, OpRef(B)});<br>
>>> + return OpRef::res(Results.top());<br>
>>> +}<br>
>>> +<br>
>>> +OpRef HvxSelector::vmuxs(ArrayRef<uint8_t> Bytes, OpRef Va, OpRef Vb,<br>
>>> + ResultStack &Results) {<br>
>>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});<br>
>>> + MVT ByteTy = getSingleVT(MVT::i8);<br>
>>> + MVT BoolTy = MVT::getVectorVT(MVT::i1, 8*HwLen); // XXX<br>
>>> + const SDLoc &dl(Results.InpNode);<br>
>>> + SDValue B = getVectorConstant(Bytes, dl);<br>
>>> + Results.push(Hexagon::V6_vd0, ByteTy, {});<br>
>>> + Results.push(Hexagon::V6_veqb, BoolTy, {OpRef(B), OpRef::res(-1)});<br>
>>> + Results.push(Hexagon::V6_vmux, ByteTy, {OpRef::res(-1), Va, Vb});<br>
>>> + return OpRef::res(Results.top());<br>
>>> +}<br>
>>> +<br>
>>> +OpRef HvxSelector::vmuxp(ArrayRef<uint8_t> Bytes, OpRef Va, OpRef Vb,<br>
>>> + ResultStack &Results) {<br>
>>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});<br>
>>> + size_t S = Bytes.size() / 2;<br>
>>> + OpRef L = vmuxs({Bytes.data(), S}, OpRef::lo(Va), OpRef::lo(Vb), Results);<br>
>>> + OpRef H = vmuxs({Bytes.data()+S, S}, OpRef::hi(Va), OpRef::hi(Vb), Results);<br>
>>> + return concat(L, H, Results);<br>
>>> +}<br>
>>> +<br>
>>> +OpRef HvxSelector::shuffs1(ShuffleMask SM, OpRef Va, ResultStack &Results) {<br>
>>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});<br>
>>> + unsigned VecLen = SM.Mask.size();<br>
>>> + assert(HwLen == VecLen);<br>
>>> + assert(all_of(SM.Mask, [this](int M) { return M == -1 || M < int(HwLen); }));<br>
>>> +<br>
>>> + if (isIdentity(SM.Mask))<br>
>>> + return Va;<br>
>>> + if (isUndef(SM.Mask))<br>
>>> + return OpRef::undef(getSingleVT(MVT::i8));<br>
>>> +<br>
>>> + return butterfly(SM, Va, Results);<br>
>>> +}<br>
>>> +<br>
>>> +OpRef HvxSelector::shuffs2(ShuffleMask SM, OpRef Va, OpRef Vb,<br>
>>> + ResultStack &Results) {<br>
>>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});<br>
>>> + OpRef C = contracting(SM, Va, Vb, Results);<br>
>>> + if (C.isValid())<br>
>>> + return C;<br>
>>> +<br>
>>> + int VecLen = SM.Mask.size();<br>
>>> + SmallVector<int,128> NewMask(VecLen);<br>
>>> + OpRef P = packs(SM, Va, Vb, Results, NewMask);<br>
>>> + if (P.isValid())<br>
>>> + return shuffs1(ShuffleMask(NewMask), P, Results);<br>
>>> +<br>
>>> + SmallVector<int,128> MaskL(VecLen), MaskR(VecLen);<br>
>>> + splitMask(SM.Mask, MaskL, MaskR);<br>
>>> +<br>
>>> + OpRef L = shuffs1(ShuffleMask(MaskL), Va, Results);<br>
>>> + OpRef R = shuffs1(ShuffleMask(MaskR), Vb, Results);<br>
>>> + if (!L.isValid() || !R.isValid())<br>
>>> + return OpRef::fail();<br>
>>> +<br>
>>> + SmallVector<uint8_t,128> Bytes(VecLen);<br>
>>> + for (int I = 0; I != VecLen; ++I) {<br>
>>> + if (MaskL[I] != -1)<br>
>>> + Bytes[I] = 0xFF;<br>
>>> + }<br>
>>> + return vmuxs(Bytes, L, R, Results);<br>
>>> +}<br>
>>> +<br>
>>> +OpRef HvxSelector::shuffp1(ShuffleMask SM, OpRef Va, ResultStack &Results) {<br>
>>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});<br>
>>> + int VecLen = SM.Mask.size();<br>
>>> +<br>
>>> + SmallVector<int,128> PackedMask(VecLen);<br>
>>> + OpRef P = packs(SM, OpRef::lo(Va), OpRef::hi(Va), Results, PackedMask);<br>
>>> + if (P.isValid()) {<br>
>>> + ShuffleMask PM(PackedMask);<br>
>>> + OpRef E = expanding(PM, P, Results);<br>
>>> + if (E.isValid())<br>
>>> + return E;<br>
>>> +<br>
>>> + OpRef L = shuffs1(PM.lo(), P, Results);<br>
>>> + OpRef H = shuffs1(PM.hi(), P, Results);<br>
>>> + if (L.isValid() && H.isValid())<br>
>>> + return concat(L, H, Results);<br>
>>> + }<br>
>>> +<br>
>>> + OpRef R = perfect(SM, Va, Results);<br>
>>> + if (R.isValid())<br>
>>> + return R;<br>
>>> + // TODO commute the mask and try the opposite order of the halves.<br>
>>> +<br>
>>> + OpRef L = shuffs2(SM.lo(), OpRef::lo(Va), OpRef::hi(Va), Results);<br>
>>> + OpRef H = shuffs2(SM.hi(), OpRef::lo(Va), OpRef::hi(Va), Results);<br>
>>> + if (L.isValid() && H.isValid())<br>
>>> + return concat(L, H, Results);<br>
>>> +<br>
>>> + return OpRef::fail();<br>
>>> +}<br>
>>> +<br>
>>> +OpRef HvxSelector::shuffp2(ShuffleMask SM, OpRef Va, OpRef Vb,<br>
>>> + ResultStack &Results) {<br>
>>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});<br>
>>> + int VecLen = SM.Mask.size();<br>
>>> +<br>
>>> + SmallVector<int,256> PackedMask(VecLen);<br>
>>> + OpRef P = packp(SM, Va, Vb, Results, PackedMask);<br>
>>> + if (P.isValid())<br>
>>> + return shuffp1(ShuffleMask(PackedMask), P, Results);<br>
>>> +<br>
>>> + SmallVector<int,256> MaskL(VecLen), MaskR(VecLen);<br>
>>> + splitMask(SM.Mask, MaskL, MaskR);<br>
>>> +<br>
>>> + OpRef L = shuffp1(ShuffleMask(MaskL), Va, Results);<br>
>>> + OpRef R = shuffp1(ShuffleMask(MaskR), Vb, Results);<br>
>>> + if (!L.isValid() || !R.isValid())<br>
>>> + return OpRef::fail();<br>
>>> +<br>
>>> + // Mux the results.<br>
>>> + SmallVector<uint8_t,256> Bytes(VecLen);<br>
>>> + for (int I = 0; I != VecLen; ++I) {<br>
>>> + if (MaskL[I] != -1)<br>
>>> + Bytes[I] = 0xFF;<br>
>>> + }<br>
>>> + return vmuxp(Bytes, L, R, Results);<br>
>>> +}<br>
>>> +<br>
>>> +bool HvxSelector::scalarizeShuffle(ArrayRef<int> Mask, const SDLoc &dl,<br>
>>> + MVT ResTy, SDValue Va, SDValue Vb,<br>
>>> + SDNode *N) {<br>
>>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});<br>
>>> + MVT ElemTy = ResTy.getVectorElementType();<br>
>>> + assert(ElemTy == MVT::i8);<br>
>>> + unsigned VecLen = Mask.size();<br>
>>> + bool HavePairs = (2*HwLen == VecLen);<br>
>>> + MVT SingleTy = getSingleVT(MVT::i8);<br>
>>> +<br>
>>> + SmallVector<SDValue,128> Ops;<br>
>>> + for (int I : Mask) {<br>
>>> + if (I < 0) {<br>
>>> + Ops.push_back(ISel.selectUndef(dl, ElemTy));<br>
>>> + continue;<br>
>>> + }<br>
>>> + SDValue Vec;<br>
>>> + unsigned M = I;<br>
>>> + if (M < VecLen) {<br>
>>> + Vec = Va;<br>
>>> + } else {<br>
>>> + Vec = Vb;<br>
>>> + M -= VecLen;<br>
>>> + }<br>
>>> + if (HavePairs) {<br>
>>> + if (M < HwLen) {<br>
>>> + Vec = DAG.getTargetExtractSubreg(Hexagon::vsub_lo, dl, SingleTy, Vec);<br>
>>> + } else {<br>
>>> + Vec = DAG.getTargetExtractSubreg(Hexagon::vsub_hi, dl, SingleTy, Vec);<br>
>>> + M -= HwLen;<br>
>>> + }<br>
>>> + }<br>
>>> + SDValue Idx = DAG.getConstant(M, dl, MVT::i32);<br>
>>> + SDValue Ex = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, ElemTy, {Vec, Idx});<br>
>>> + SDValue L = Lower.LowerOperation(Ex, DAG);<br>
>>> + assert(L.getNode());<br>
>>> + Ops.push_back(L);<br>
>>> + }<br>
>>> +<br>
>>> + SDValue LV;<br>
>>> + if (HavePairs) {<br>
>>> + SDValue B0 = DAG.getBuildVector(SingleTy, dl, {Ops.data(), HwLen});<br>
>>> + SDValue L0 = Lower.LowerOperation(B0, DAG);<br>
>>> + SDValue B1 = DAG.getBuildVector(SingleTy, dl, {Ops.data()+HwLen, HwLen});<br>
>>> + SDValue L1 = Lower.LowerOperation(B1, DAG);<br>
>>> + // XXX CONCAT_VECTORS is legal for HVX vectors. Legalizing (lowering)<br>
>>> + // functions may expect to be called only for illegal operations, so<br>
>>> + // make sure that they are not called for legal ones. Develop a better<br>
>>> + // mechanism for dealing with this.<br>
>>> + LV = DAG.getNode(ISD::CONCAT_VECTORS, dl, ResTy, {L0, L1});<br>
>>> + } else {<br>
>>> + SDValue BV = DAG.getBuildVector(ResTy, dl, Ops);<br>
>>> + LV = Lower.LowerOperation(BV, DAG);<br>
>>> + }<br>
>>> +<br>
>>> + assert(!N->use_empty());<br>
>>> + ISel.ReplaceNode(N, LV.getNode());<br>
>>> + DAG.RemoveDeadNodes();<br>
>>> +<br>
>>> + std::deque<SDNode*> SubNodes;<br>
>>> + SubNodes.push_back(LV.getNode());<br>
>>> + for (unsigned I = 0; I != SubNodes.size(); ++I) {<br>
>>> + for (SDValue Op : SubNodes[I]->ops())<br>
>>> + SubNodes.push_back(Op.getNode());<br>
>>> + }<br>
>>> + while (!SubNodes.empty()) {<br>
>>> + SDNode *S = SubNodes.front();<br>
>>> + SubNodes.pop_front();<br>
>>> + if (S->use_empty())<br>
>>> + continue;<br>
>>> + // This isn't great, but users need to be selected before any nodes that<br>
>>> + // they use. (The reason is to match larger patterns, and avoid nodes that<br>
>>> + // cannot be matched on their own, e.g. ValueType, TokenFactor, etc.).<br>
>>> + bool PendingUser = llvm::any_of(S->uses(), [&SubNodes](const SDNode *U) {<br>
>>> + return llvm::any_of(SubNodes, [U](const SDNode *T) {<br>
>>> + return T == U;<br>
>>> + });<br>
>>> + });<br>
>>> + if (PendingUser)<br>
>>> + SubNodes.push_back(S);<br>
>>> + else<br>
>>> + ISel.Select(S);<br>
>>> + }<br>
>>> +<br>
>>> + DAG.RemoveDeadNodes();<br>
>>> + return true;<br>
>>> +}<br>
>>> +<br>
>>> +OpRef HvxSelector::contracting(ShuffleMask SM, OpRef Va, OpRef Vb,<br>
>>> + ResultStack &Results) {<br>
>>> + DEBUG_WITH_TYPE("isel", {dbgs() << __func__ << '\n';});<br>
>>> + if (!Va.isValid() || !Vb.isValid())<br>
>>> + return OpRef::fail();<br>
>>> +<br>
>>> + // Contracting shuffles, i.e. instructions that always discard some bytes<br>
>>> + // from the operand vectors.<br>
>>> + //<br>
>>> + // V6_vshuff{e,o}b<br>
>>> + // V6_vdealb4w<br>
>>> + // V6_vpack{e,o}{b,h}<br>
>>> +<br>
>>> + int VecLen = SM.Mask.size();<br>
>>> + std::pair<int,unsigned> Strip = findStrip(SM.Mask, 1, VecLen);<br>
>>> + MVT ResTy = getSingleVT(MVT::i8);<br>
>>> +<br>
>>> + // The following shuffles only work for bytes and halfwords. This requires<br>
>>> + // the strip length to be 1 or 2.<br>
>>> + if (Strip.second != 1 && Strip.second != 2)<br>
>>> + return OpRef::fail();<br>
>>> +<br>
>>> + // The patterns for the shuffles, in terms of the starting offsets of the<br>
>>> + // consecutive strips (L = length of the strip, N = VecLen):<br>
>>> + //<br>
>>> + // vpacke: 0, 2L, 4L ... N+0, N+2L, N+4L ... L = 1 or 2<br>
>>> + // vpacko: L, 3L, 5L ... N+L, N+3L, N+5L ... L = 1 or 2<br>
>>> + //<br>
>>> + // vshuffe: 0, N+0, 2L, N+2L, 4L ... L = 1 or 2<br>
>>> + // vshuffo: L, N+L, 3L, N+3L, 5L ... L = 1 or 2<br>
>>> + //<br>
>>> + // vdealb4w: 0, 4, 8 ... 2, 6, 10 ... N+0, N+4, N+8 ... N+2, N+6, N+10 ...<br>
>>> +<br>
>>> + // The value of the element in the mask following the strip will decide<br>
>>> + // what kind of a shuffle this can be.<br>
>>> + int NextInMask = SM.Mask[Strip.second];<br>
>>> +<br>
>>> + // Check if NextInMask could be 2L, 3L or 4, i.e. if it could be a mask<br>
>>> + // for vpack or vdealb4w. VecLen > 4, so NextInMask for vdealb4w would<br>
>>> + // satisfy this.<br>
>>> + if (NextInMask < VecLen) {<br>
>>> + // vpack{e,o} or vdealb4w<br>
>>> + if (Strip.first == 0 && Strip.second == 1 && NextInMask == 4) {<br>
>>> + int N = VecLen;<br>
>>> + // Check if this is vdealb4w (L=1).<br>
>>> + for (int I = 0; I != N/4; ++I)<br>
>>> + if (SM.Mask[I] != 4*I)<br>
>>> + return OpRef::fail();<br>
>>> + for (int I = 0; I != N/4; ++I)<br>
>>> + if (SM.Mask[I+N/4] != 2 + 4*I)<br>
>>> + return OpRef::fail();<br>
>>> + for (int I = 0; I != N/4; ++I)<br>
>>> + if (SM.Mask[I+N/2] != N + 4*I)<br>
>>> + return OpRef::fail();<br>
>>> + for (int I = 0; I != N/4; ++I)<br>
>>> + if (SM.Mask[I+3*N/4] != N+2 + 4*I)<br>
>>> + return OpRef::fail();<br>
>>> + // Matched mask for vdealb4w.<br>
>>> + Results.push(Hexagon::V6_vdealb4w, ResTy, {Vb, Va});<br>
>>> + return OpRef::res(Results.top());<br>
>>> + }<br>
>>> +<br>
>>> + // Check if this is vpack{e,o}.<br>
>>> + int N = VecLen;<br>
>>> + int L = Strip.second;<br>
>>> + // Check if the first strip starts at 0 or at L.<br>
>>> + if (Strip.first != 0 && Strip.first != L)<br>
>>> + return OpRef::fail();<br>
>>> + // Examine the rest of the mask.<br>
>>> + for (int I = L; I < N/2; I += L) {<br>
>>> + auto S = findStrip(subm(SM.Mask,I), 1, N-I);<br>
>>> + // Check whether the mask element at the beginning of each strip<br>
>>> + // increases by 2L each time.<br>
>>> + if (S.first - Strip.first != 2*I)<br>
>>> + return OpRef::fail();<br>
>>> + // Check whether each strip is of the same length.<br>
>>> + if (S.second != unsigned(L))<br>
>>> + return OpRef::fail();<br>
>>> + }<br>
>>> +<br>
>>> + // Strip.first == 0 => vpacke<br>
>>> + // Strip.first == L => vpacko<br>
>>> + assert(Strip.first == 0 || Strip.first == L);<br>
>>> + using namespace Hexagon;<br>
>>> + NodeTemplate Res;<br>
>>> + Res.Opc = Strip.second == 1 // Number of bytes.<br>
>>> + ? (Strip.first == 0 ? V6_vpackeb : V6_vpackob)<br>
>>> + : (Strip.first == 0 ? V6_vpackeh : V6_vpackoh);<br>
>>> + Res.Ty = ResTy;<br>
>>> + Res.Ops = { Vb, Va };<br>
>>> + Results.push(Res);<br>
>>> + r</blockquote></div>