<div dir="ltr"><div>Hi Nadav, I believe the logic to keep widening the vector to the next power of two belongs in TargetLowering.h, because that is where we decide whether to widen the vector and what size to widen it to. In this case, we stop at 4xi8 without checking whether it is legal, even though the comment says ‘try to widen vector elements until a legal type is found’. </div>
<div><br></div><div>Also, there is a minor issue with the example in the comment, <3 x float> to <4 x float>: float vectors never enter this block of code, since we check that the vector element type (EltVT) is an integer.</div>
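<div><br></div><div>To illustrate the difference between widening once and widening until a legal type is found, here is a minimal, self-contained sketch. It does not use the LLVM API: ToyVT, isToyLegal, nextPow2, widenOnce, widenUntilLegal and the MaxElts cap are all hypothetical names, and the legality table is an assumption loosely modeled on 128-bit integer vectors, not the real x86 target description. nextPow2 assumes that NextPowerOf2 returns the power of two strictly greater than its argument.</div>

```cpp
#include <set>
#include <utility>

// Toy stand-in for a vector type: (element width in bits, number of elements).
// Hypothetical; this is not LLVM's EVT.
using ToyVT = std::pair<unsigned, unsigned>;

// Assumed legality table, loosely modeled on 128-bit integer vectors.
static bool isToyLegal(const ToyVT &VT) {
  static const std::set<ToyVT> Legal = {{8, 16}, {16, 8}, {32, 4}, {64, 2}};
  return Legal.count(VT) != 0;
}

// Smallest power of two strictly greater than N
// (assumed to match the contract of LLVM's NextPowerOf2).
static unsigned nextPow2(unsigned N) {
  unsigned P = 1;
  while (P <= N)
    P <<= 1;
  return P;
}

// Widen exactly once to the next power of two, without a legality check.
static ToyVT widenOnce(ToyVT VT) {
  return {VT.first, nextPow2(VT.second)};
}

// Keep widening until the type is legal. The MaxElts cap is an addition of
// this sketch so that the loop ends even when no wider type is legal.
static ToyVT widenUntilLegal(ToyVT VT, unsigned MaxElts = 64) {
  ToyVT NVT = widenOnce(VT);
  while (!isToyLegal(NVT) && NVT.second < MaxElts)
    NVT.second = nextPow2(NVT.second);
  return NVT;
}
```

<div>With this toy table, widenOnce on (8, 3) stops at (8, 4), while widenUntilLegal continues to (8, 16). The cap is only a design choice for the sketch: a loop with no bound would not terminate for an element type that has no legal vector type at any width.</div>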
<div><br></div><div>The patch below breaks a large number of test cases, so it is clearly not the ideal solution. It simply makes the code do what the comment says.</div><div><br></div><div>diff --git a/include/llvm/Target/TargetLowering.h b/include/llvm/Target/TargetLowering.h</div>
<div>index c3fa3cc..181f951 100644</div><div>--- a/include/llvm/Target/TargetLowering.h</div><div>+++ b/include/llvm/Target/TargetLowering.h</div><div>@@ -1474,10 +1474,14 @@ public:</div><div> // Try to widen vector elements until a legal type is found.</div>
<div> if (EltVT.isInteger()) {</div><div> // Vectors with a number of elements that is not a power of two are always</div><div>- // widened, for example <3 x float> -> <4 x float>.</div><div>
+ // widened, for example <3 x i8> -> <4 x i8>.</div><div> if (!VT.isPow2VectorType()) {</div><div> NumElts = (unsigned)NextPowerOf2(NumElts);</div><div> EVT NVT = EVT::getVectorVT(Context, EltVT, NumElts);</div>
<div>+ while (!isTypeLegal(NVT)) {</div><div>+ NumElts = (unsigned)NextPowerOf2(NumElts);</div><div>+ NVT = EVT::getVectorVT(Context, EltVT, NumElts);</div><div>+ }</div><div> return LegalizeKind(TypeWidenVector, NVT);</div>
<div> }</div><div><br></div><div><br></div><div>From: <a href="mailto:llvmdev-bounces@cs.uiuc.edu">llvmdev-bounces@cs.uiuc.edu</a> [mailto:<a href="mailto:llvmdev-bounces@cs.uiuc.edu">llvmdev-bounces@cs.uiuc.edu</a>] On Behalf Of Nadav Rotem</div>
<div>Sent: Monday, August 12, 2013 1:59 PM</div><div>To: Redmond, Paul</div><div>Cc: LLVM Developers Mailing List</div><div>Subject: Re: [LLVMdev] vector type legalization</div><div><br></div><div>This is a bug in the implementation of WidenVecRes_Binary. On line 1546 it assumes that “Widen” is the last phase of type-legalization and we check if the result is a legal type. But actually we want to continue and promote the elements of the vector. In other cases we may want to widen (to the next power of two) and later split in half because the vector is too big. </div>
<div><br></div><div>On Aug 12, 2013, at 10:46 AM, Redmond, Paul <<a href="mailto:paul.redmond@intel.com">paul.redmond@intel.com</a>> wrote:</div><div><br></div><div>Hi Nadav,</div><div><br></div><div>On 2013-08-12 12:59 PM, "Nadav Rotem" <<a href="mailto:nrotem@apple.com">nrotem@apple.com</a>> wrote:</div>
<div>Hi Paul, </div><div><br></div><div>You can read about it here:</div><div><a href="http://blog.llvm.org/2011/12/llvm-31-vector-changes.html">http://blog.llvm.org/2011/12/llvm-31-vector-changes.html</a></div><div>Hi,</div>
<div><br></div><div>I am trying to understand how vector type legalization works. In</div><div>particular, I'm looking at i8 vector types on x86 (with sse42 features)</div><div><br></div><div>v3i8 gets widened to v4i8 and then operations get unrolled (scalarized)</div>
<div>because v4i8 is not a legal type whereas v4i8 gets</div><div><br></div><div>This does not sound right. v3i8 -> v4i8 is okay. But the next step</div><div>should be v4i8 -> v4i32. The operation may be scalarized in the vector</div>
<div>legalization phase.</div><div><br></div><div>What I'm looking at is a v3i8 add. In DAGTypeLegalizer::WidenVecRes_Binary</div><div>the operation gets scalarized (DAG.UnrollVector). The input N is</div><div>"0x51c1d60: v3i8 = add 0x51c1860, 0x51c1c60 [ORD=5] [ID=0]" and the</div>
<div>WidenVT is v4i8. The code ends up in the NumElts == 1 path which causes</div><div>scalarization.</div><div><br></div><div>The debug dump shows "Widen node result 0: 0x563dd20: v3i8 = add</div><div>0x563d820, 0x563dc20 [ORD=5] [ID=0]". To me it doesn't look like it's</div>
<div>possible to both widen and promote an operation.</div><div><br></div><div>Paul</div><div><br></div><div>promoted to v4i32. Why doesn't v3i8 (or even v4i8) get widened to</div><div>v16i8? Alternatively, v3i8 could be widened to v4i8 then promoted to</div>
<div>v4i32 but this doesn't happen either.</div><div><br></div><div>Can anyone provide some insight into why vector type legalization works</div><div>the way it does?</div><div><br></div><div>Thanks,</div><div>paul</div>
<div><br></div><div>_______________________________________________</div><div>LLVM Developers mailing list</div><div><a href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a> <a href="http://llvm.cs.uiuc.edu">http://llvm.cs.uiuc.edu</a></div>
<div><a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a></div><div><br></div><div><br></div>-- <br>==<div>Sriram Murali<br>Intel of Canada, Waterloo</div><div>
(519) 404-0843<a href="http://synergy.ece.ubc.ca/sriram" target="_blank"></a></div><div><br></div>
</div>