<div dir="ltr">On 30 May 2013 20:36, Nadav Rotem <span dir="ltr"><<a href="mailto:nrotem@apple.com" target="_blank">nrotem@apple.com</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div style="word-wrap:break-word">Hi Nick, <div><br></div><div>Thanks for working on this. I have a few comments. </div><div><br></div><div><div><div class="im"><blockquote type="cite"><div style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">
<br>URL:<span> </span><a href="http://llvm.org/viewvc/llvm-project?rev=182976&view=rev" target="_blank">http://llvm.org/viewvc/llvm-project?rev=182976&view=rev</a><br>Log:<br>Reapply with r182909 with a fix to the calculation of the new indices for<br>
insertelement instructions.<br><br></div></blockquote><div dir="auto"><br></div></div><div dir="auto">Can you use the commit message to describe what the patch does ? </div><div class="im"><div dir="auto"><br></div><div dir="auto">
<br></div><blockquote type="cite"><div style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">}<br><br>+/// Return true if we can evaluate the specified expression tree if the vector<br>
+/// elements were shuffled in a different order.<br>+static bool CanEvaluateShuffled(Value *V, ArrayRef<int> Mask,<br>+ unsigned Depth = 100) {<br></div></blockquote><div><br></div></div>
<div>Depth = 100 ?? This is really high. It should be a much lower threshold. This can potentially search 2^100 paths because you are calling this recursion twice on binary operands. Please limit this to something like 5. </div>
</div></div></div></blockquote><div><br></div><div style>So I've updated it to 5 as you asked, but two points. LLVM has made a trade-off in many places (loop unrolling and the inliner's cost analysis for instance) that we spend more compile time optimizing code which uses vectors. Second, in order for this 2^n to ever happen, the function needs to have 2^n instructions in it. Reduced to a mere chain of five instructions, I hope the optimization still fires on real-world code, such as your backwards-iterating example where the loop vectorizer emits redundant shuffles. I don't want to lose the point of this optimization.</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div><div><div class="im">
<blockquote type="cite"><div style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">+ case Instruction::InsertElement: {<br>+ ConstantInt *CI = dyn_cast<ConstantInt>(I->getOperand(2));<br>
+ if (!CI) return false;<br>+ int ElementNumber = CI->getLimitedValue();<br>+<br>+ // Verify that 'CI' does not occur twice in Mask. A single 'insertelement'<br>+ // can't put an element into multiple indices.<br>
+ bool SeenOnce = false;<br>+ for (int i = 0, e = Mask.size(); i != e; ++i) {<br>+ if (Mask[i] == ElementNumber) {<br>+ if (SeenOnce)<br>+ return false;<br>+ SeenOnce = true;<br>
+ }<br>+ }<br>+ return CanEvaluateShuffled(I->getOperand(0), Mask, Depth-1);<br>+ }<br>+ }<br>+ return false;<br></div></blockquote><div><br></div></div><div>I would like to see more tests for the code that deals with insertelements.</div>
</div></div></div></blockquote><div><br></div>There was one case I wasn't testing, but as of r183080 I think we have them all?<br><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div style="word-wrap:break-word"><div><div><div> What happens if only some of the lanes are inserted ? Does it ever fail ?</div></div></div></div></blockquote><div><br></div><div style>Hm? This case is covered in the tests.</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div><div><div class="im">
<blockquote type="cite"><div style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px"> }<br><br>+ if (isa<UndefValue>(RHS) &&<br>+ // This isn't necessary for correctness, but the comment block below<br>
+ // claims that there are cases where folding two shuffles into one would<br>+ // cause worse codegen on some targets.<br>+ !isa<ShuffleVectorInst>(LHS) &&<br>+ CanEvaluateShuffled(LHS, Mask)) {<br>
</div></blockquote><div><br></div></div><div>What ?</div></div></div></div></blockquote><div><br></div><div style>Good spotting. Now that the patch doesn't do shuffle-shuffle folding, I'll go ahead and remove the extra check and comment block.</div>
<div style><br></div><div style>Nick</div><div style><br></div></div></div></div>