<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">

<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta name=Generator content="Microsoft Word 14 (filtered medium)"><style><!--

/* Font Definitions */

@font-face

        {font-family:SimSun;

        panose-1:2 1 6 0 3 1 1 1 1 1;}

@font-face

        {font-family:SimSun;

        panose-1:2 1 6 0 3 1 1 1 1 1;}

@font-face

        {font-family:Calibri;

        panose-1:2 15 5 2 2 2 4 3 2 4;}

@font-face

        {font-family:Tahoma;

        panose-1:2 11 6 4 3 5 4 4 2 4;}

@font-face

        {font-family:"\@SimSun";

        panose-1:2 1 6 0 3 1 1 1 1 1;}

/* Style Definitions */

p.MsoNormal, li.MsoNormal, div.MsoNormal

        {margin:0in;

        margin-bottom:.0001pt;

        font-size:12.0pt;

        font-family:"Times New Roman","serif";}

a:link, span.MsoHyperlink

        {mso-style-priority:99;

        color:blue;

        text-decoration:underline;}

a:visited, span.MsoHyperlinkFollowed

        {mso-style-priority:99;

        color:purple;

        text-decoration:underline;}

p.MsoPlainText, li.MsoPlainText, div.MsoPlainText

        {mso-style-priority:99;

        mso-style-link:"Plain Text Char";

        margin:0in;

        margin-bottom:.0001pt;

        font-size:11.0pt;

        font-family:"Calibri","sans-serif";}

span.EmailStyle17

        {mso-style-type:personal-reply;

        font-family:"Calibri","sans-serif";

        color:#1F497D;}

span.PlainTextChar

        {mso-style-name:"Plain Text Char";

        mso-style-priority:99;

        mso-style-link:"Plain Text";

        font-family:"Calibri","sans-serif";}

.MsoChpDefault

        {mso-style-type:export-only;

        font-family:"Calibri","sans-serif";}

@page WordSection1

        {size:8.5in 11.0in;

        margin:1.0in 1.0in 1.0in 1.0in;}

div.WordSection1

        {page:WordSection1;}

--></style><!--[if gte mso 9]><xml>

<o:shapedefaults v:ext="edit" spidmax="1026" />

</xml><![endif]--><!--[if gte mso 9]><xml>

<o:shapelayout v:ext="edit">

<o:idmap v:ext="edit" data="1" />

</o:shapelayout></xml><![endif]--></head><body lang=EN-US link=blue vlink=purple><div class=WordSection1><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Hi Tim,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>We still don’t have a conclusion about the solution of that comment. But as we are close to the deadline, I create a new patch refactoring all the other comments. The solution about the 64bitVector is still the same as the old version. If it is better to implement in another way, I can keep on refactoring.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>The patches are attached and also available on http://llvm-reviews.chandlerc.com/D2211.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Review please. <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Thanks,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>-Hao<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'>From:</span></b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'> Jiangning Liu [mailto:liujiangning1@gmail.com] <br><b>Sent:</b> Monday, November 18, 2013 10:57 AM<br><b>To:</b> Hao Liu<br><b>Cc:</b> Tim Northover; llvm-commits@cs.uiuc.edu for LLVM; cfe-commits@cs.uiuc.edu<br><b>Subject:</b> Re: [PATCH] Implement aarch64 neon instruction class SIMD lsone and lsone-post - LLVM<o:p></o:p></span></p><p class=MsoNormal><o:p> </o:p></p><div><div><p class=MsoNormal><span style='font-family:"Arial","sans-serif"'>Hi Tim,<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-family:"Arial","sans-serif"'><o:p> </o:p></span></p></div><div><p class=MsoNormal><span style='font-family:"Arial","sans-serif"'>I think this scenario is different from table lookup. <o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-family:"Arial","sans-serif"'><o:p> </o:p></span></p></div><div><p class=MsoNormal><span style='font-family:"Arial","sans-serif"'>For table lookup, the intrinsic functions for 64-bit vector list defines a very special semantic, and it is quite different from 128-bit vecor list, so this is why we'd prefer to lower it in front-end to make it implemented with 128-bit vector list by doing packing action.<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-family:"Arial","sans-serif"'><o:p> </o:p></span></p></div><div><p class=MsoNormal><span style='font-family:"Arial","sans-serif"'>For this ldst_one instruction class, the 64-bit vector list can be very naturally mapped to 128-bit vector list. On one hand, using SUBREG_TO_REG and EXTRACT_SUBREG in table gen don't really make differences from directly generating them with CPP code. On the other hand, we still provide very lower level representation and enough flexibility in IR and SelectionDAG, because the intrinsic like int_arm_neon_vld3lane is very close to instructions, and using the value types attached to it's operands we can easily determine the scope requirement of the lane for instruction ld3 {vt.h - vt3.h}[lane], [xn] (0<=lane<=3 or 0<=lane<=7).<o:p></o:p></span></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><div><p class=MsoNormal><span style='font-family:"Arial","sans-serif"'>Also, I think we already have precedents using the same solution, for example, vset_lane_u8 and vsetq_lane_u8. If you have any more insights, let's discuss more about it. <o:p></o:p></span></p></div></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><div><p class=MsoNormal><span style='font-family:"Arial","sans-serif"'>Thanks,<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-family:"Arial","sans-serif"'>-Jiangning<o:p></o:p></span></p></div><p class=MsoNormal><o:p> </o:p></p><div><p class=MsoNormal>2013/11/17 Hao Liu <<a href="mailto:Hao.Liu@arm.com" target="_blank">Hao.Liu@arm.com</a>><o:p></o:p></p><p class=MsoNormal>Hi Tim,<br><br>I'm a little confused about this comment:<o:p></o:p></p><div><p class=MsoNormal>================<br>Comment at: lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:883<br>@@ +882,3 @@<o:p></o:p></p></div><p class=MsoNormal>+                               N->op_begin() + Vec0Idx + NumVecs);  if<o:p></o:p></p><div><p class=MsoNormal style='margin-bottom:12.0pt'>+ (is64BitVector)<br>+    for (unsigned i = 0; i < Regs.size(); i++)<br>----------------<br>I tend to think this complexity ought to be handled in the front-end (like the legacy TBL/TBX you did recently): a shufflvector extension before calling  bona fide 128-bit intrinsic.<br><br><o:p></o:p></p></div><p class=MsoNormal style='margin-bottom:12.0pt'>I have implemented in the front-end for vld2_lane. But I'm not sure whether the solution is correct. There are many shufflevectors, because we need to transfer the input from 64bit vector to 128bit, and also transfer the output from 128bit to 64bit.  For vld2_lane, there will be 4 more shufflevectors. And 8 more shufflevectors for vld4_lane.<br>I haven't completed all the work yet. I just wonder if I implement the solution in a correct way?<br>If it is correct, I can keep on completing this tomorrow.<br><br>+  case AArch64::BI__builtin_neon_vld2q_lane_v:<br>+    return EmitARMBuiltinExpr(ARM::BI__builtin_neon_vld2q_lane_v, E);<br>+  case AArch64::BI__builtin_neon_vld2_lane_v: {<br>+    Ops[2] = Builder.CreateBitCast(Ops[2], VTy);<br>+    Ops[3] = Builder.CreateBitCast(Ops[3], VTy);<br>+    Value *SV, *SV2;<br>+    // Build a vector containing sequential number like (0, 1, 2, ..., 15)<br>+    SmallVector<Constant*, 16> Indices, Indices2;<br>+    for (unsigned i = 0, e = VTy->getNumElements(); i != e; ++i) {<br>+      Indices.push_back(ConstantInt::get(Int32Ty, 2*i));<br>+      Indices.push_back(ConstantInt::get(Int32Ty, 2*i+1));<br>+      Indices2.push_back(ConstantInt::get(Int32Ty, i));<br>+    }<br>+    SV = llvm::ConstantVector::get(Indices);<br>+    SV2 = llvm::ConstantVector::get(Indices2);<br>+    Value *undef64 = UndefValue::get(VTy);<br>+    Ops[2] = Builder.CreateShuffleVector(Ops[2], undef64, SV, "lane");<br>+    Ops[3] = Builder.CreateShuffleVector(Ops[3], undef64, SV, "lane");<br>+    llvm::VectorType *VTy64 = VTy;<br>+    VTy = llvm::VectorType::get(VTy->getElementType(), VTy->getNumElements() * 2);<br>+    Function *F = CGM.getIntrinsic(Intrinsic::arm_neon_vld2lane, VTy);<br>+    Ops.push_back(Align);<br>+    Ops[1] = Builder.CreateCall(F, makeArrayRef(Ops).slice(1), "vld2_lane");<br>+    Value *Val0 = Builder.CreateExtractValue(Ops[1], 0);<br>+    Value *Val1 = Builder.CreateExtractValue(Ops[1], 1);<br>+    Value *undef128 = UndefValue::get(VTy);<br>+    Val0 = Builder.CreateShuffleVector(Val0, undef128, SV2, "lane");<br>+    Val1 = Builder.CreateShuffleVector(Val1, undef128, SV2, "lane");<br>+    llvm::StructType *STy = llvm::StructType::get(VTy64, VTy64, NULL);<br>+    Value *InsVal = Builder.CreateInsertValue(UndefValue::get(STy), Val0, 0);<br>+    InsVal = Builder.CreateInsertValue(InsVal, Val1, 1);<br>+    Ty = llvm::PointerType::getUnqual(InsVal->getType());<br>+    Ops[0] = Builder.CreateBitCast(Ops[0], Ty);<br>+    return Builder.CreateStore(InsVal, Ops[0]);<br>+  }<span style='color:#1F497D'><o:p></o:p></span></p><p class=MsoNormal>-----Original Message-----<span style='color:#1F497D'><o:p></o:p></span></p><p class=MsoPlainText>From: cfe-commits-bounces@cs.uiuc.edu [mailto:cfe-commits-bounces@cs.uiuc.edu] On Behalf Of Tim Northover<br>Sent: Friday, November 15, 2013 3:56 AM<br>To: t.p.northover@gmail.com; liujiangning1@gmail.com<br>Cc: llvm-commits@cs.uiuc.edu; cfe-commits@cs.uiuc.edu<br>Subject: Re: [PATCH] Implement aarch64 neon instruction class SIMD lsone and lsone-post - LLVM<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>================<o:p></o:p></p><p class=MsoPlainText>Comment at: lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:763<o:p></o:p></p><p class=MsoPlainText>@@ +762,3 @@<o:p></o:p></p><p class=MsoPlainText>+                        Operand,<o:p></o:p></p><p class=MsoPlainText>+                        CurDAG->getTargetConstant(AArch64::sub_64, <o:p></o:p></p><p class=MsoPlainText>+ MVT::i32));  return SDValue(Reg, 0);<o:p></o:p></p><p class=MsoPlainText>----------------<o:p></o:p></p><p class=MsoPlainText>Should this be SRIdx? If not, that variable is unused. It's a little worrying that this passes its tests actually.<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>================<o:p></o:p></p><p class=MsoPlainText>Comment at: lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:777<o:p></o:p></p><p class=MsoPlainText>@@ +776,3 @@<o:p></o:p></p><p class=MsoPlainText>+  default: llvm_unreachable("unhandled vld-dup type");  case MVT::v8i8:  <o:p></o:p></p><p class=MsoPlainText>+ OpcodeIndex = 0; break;  case MVT::v4i16: OpcodeIndex = 1; break;<o:p></o:p></p><p class=MsoPlainText>----------------<o:p></o:p></p><p class=MsoPlainText>This OpcodeIndex mapping is horribly non-obvious and hard-coded. It also seems to omit any f16 vectors, which seems odd.<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>I'd suggest separate 64-bit and 128-bit tables, and basing the index off of Log2_32(VT.getScalarType().getSizeInBits()) or some other way to avoid enumerating all types.<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>================<o:p></o:p></p><p class=MsoPlainText>Comment at: lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:803<o:p></o:p></p><p class=MsoPlainText>@@ +802,3 @@<o:p></o:p></p><p class=MsoPlainText>+<o:p></o:p></p><p class=MsoPlainText>+  std::vector<EVT> ResTys;<o:p></o:p></p><p class=MsoPlainText>+  ResTys.push_back(MVT::Untyped); // Type of result super register<o:p></o:p></p><p class=MsoPlainText>----------------<o:p></o:p></p><p class=MsoPlainText>SmallVector would be better. Or perhaps a ?: with a call to SelectionDAG::getSDVTList.<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>================<o:p></o:p></p><p class=MsoPlainText>Comment at: lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:856<o:p></o:p></p><p class=MsoPlainText>@@ +855,3 @@<o:p></o:p></p><p class=MsoPlainText>+  default: llvm_unreachable("unhandled vld/vst lane type");  case <o:p></o:p></p><p class=MsoPlainText>+ MVT::v16i8: OpcodeIndex = 0; break;  case MVT::v8i16: OpcodeIndex = 1; <o:p></o:p></p><p class=MsoPlainText>+ break;<o:p></o:p></p><p class=MsoPlainText>----------------<o:p></o:p></p><p class=MsoPlainText>Same odd OpcodeIndex.<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>================<o:p></o:p></p><p class=MsoPlainText>Comment at: lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:885<o:p></o:p></p><p class=MsoPlainText>@@ +884,3 @@<o:p></o:p></p><p class=MsoPlainText>+    for (unsigned i = 0; i < Regs.size(); i++)<o:p></o:p></p><p class=MsoPlainText>+      Regs[i] = getTargetSubregToReg(AArch64::sub_64, dl, VT, VT64, <o:p></o:p></p><p class=MsoPlainText>+ Regs[i]);  SDValue SuperReg = createQTuple(Regs);<o:p></o:p></p><p class=MsoPlainText>----------------<o:p></o:p></p><p class=MsoPlainText>Ah, I see why that function works now.<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>================<o:p></o:p></p><p class=MsoPlainText>Comment at: lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:833<o:p></o:p></p><p class=MsoPlainText>@@ +832,3 @@<o:p></o:p></p><p class=MsoPlainText>+SDNode *AArch64DAGToDAGISel::SelectVLDSTLane(SDNode *N, bool IsLoad,<o:p></o:p></p><p class=MsoPlainText>+                                             unsigned NumVecs, bool isUpdating,<o:p></o:p></p><p class=MsoPlainText>+                                             const uint16_t *Opcodes) {<o:p></o:p></p><p class=MsoPlainText>----------------<o:p></o:p></p><p class=MsoPlainText>"IsUpdating" according to coding standards. Especially as it's right next to "IsLoad" with the capital<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>================<o:p></o:p></p><p class=MsoPlainText>Comment at: lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:883<o:p></o:p></p><p class=MsoPlainText>@@ +882,3 @@<o:p></o:p></p><p class=MsoPlainText>+                               N->op_begin() + Vec0Idx + NumVecs);  if <o:p></o:p></p><p class=MsoPlainText>+ (is64BitVector)<o:p></o:p></p><p class=MsoPlainText>+    for (unsigned i = 0; i < Regs.size(); i++)<o:p></o:p></p><p class=MsoPlainText>----------------<o:p></o:p></p><p class=MsoPlainText>I tend to think this complexity ought to be handled in the front-end (like the legacy TBL/TBX you did recently): a shufflvector extension before calling  bona fide 128-bit intrinsic.<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>================<o:p></o:p></p><p class=MsoPlainText>Comment at: lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:838<o:p></o:p></p><p class=MsoPlainText>@@ +837,3 @@<o:p></o:p></p><p class=MsoPlainText>+  unsigned AddrOpIdx = isUpdating ? 1 : 2;  unsigned Vec0Idx = 3; // <o:p></o:p></p><p class=MsoPlainText>+ AddrOpIdx + (isUpdating ? 2 : 1)<o:p></o:p></p><p class=MsoPlainText>+<o:p></o:p></p><p class=MsoPlainText>----------------<o:p></o:p></p><p class=MsoPlainText>Commented code.<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText><a href="http://llvm-reviews.chandlerc.com/D2146"><span style='color:windowtext;text-decoration:none'>http://llvm-reviews.chandlerc.com/D2146</span></a><o:p></o:p></p><p class=MsoPlainText>_______________________________________________<o:p></o:p></p><p class=MsoPlainText>cfe-commits mailing list<o:p></o:p></p><p class=MsoPlainText><a href="mailto:cfe-commits@cs.uiuc.edu"><span style='color:windowtext;text-decoration:none'>cfe-commits@cs.uiuc.edu</span></a><o:p></o:p></p><p class=MsoNormal><a href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits"><span style='color:windowtext;text-decoration:none'>http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits</span></a><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p></o:p></span></p></div></div></div></div></body></html>