<html><head><meta http-equiv="Content-Type" content="text/html charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">Hi Nick, <div><br></div><div>Thanks for working on this. I have a few comments. </div><div><br></div><div><div><blockquote type="cite"><div style="letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br>URL:<span class="Apple-converted-space"> </span><a href="http://llvm.org/viewvc/llvm-project?rev=182976&view=rev">http://llvm.org/viewvc/llvm-project?rev=182976&view=rev</a><br>Log:<br>Reapply with r182909 with a fix to the calculation of the new indices for<br>insertelement instructions.<br><br></div></blockquote><div dir="auto"><br></div><div dir="auto">Can you use the commit message to describe what the patch does ? </div><div dir="auto"><br></div><div dir="auto"><br></div><blockquote type="cite"><div style="letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">}<br><br>+/// Return true if we can evaluate the specified expression tree if the vector<br>+/// elements were shuffled in a different order.<br>+static bool CanEvaluateShuffled(Value *V, ArrayRef<int> Mask,<br>+ unsigned Depth = 100) {<br></div></blockquote><div><br></div><div>Depth = 100 ?? This is really high. It should be a much lower threshold. This can potentially search 2^100 paths because you are calling this recursion twice on binary operands. Please limit this to something like 5. </div><div><div><br></div></div><blockquote type="cite"><div style="letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br>+ case Instruction::InsertElement: {<br>+ ConstantInt *CI = dyn_cast<ConstantInt>(I->getOperand(2));<br>+ if (!CI) return false;<br>+ int ElementNumber = CI->getLimitedValue();<br>+<br>+ // Verify that 'CI' does not occur twice in Mask. A single 'insertelement'<br>+ // can't put an element into multiple indices.<br>+ bool SeenOnce = false;<br>+ for (int i = 0, e = Mask.size(); i != e; ++i) {<br>+ if (Mask[i] == ElementNumber) {<br>+ if (SeenOnce)<br>+ return false;<br>+ SeenOnce = true;<br>+ }<br>+ }<br>+ return CanEvaluateShuffled(I->getOperand(0), Mask, Depth-1);<br>+ }<br>+ }<br>+ return false;<br></div></blockquote><div><br></div><div>I would like to see more tests for the code that deals with insertelements. What happens if only some of the lanes are inserted ? Does it ever fail ?</div><div><br></div><blockquote type="cite"><div style="letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"> }<br><br>+ if (isa<UndefValue>(RHS) &&<br>+ // This isn't necessary for correctness, but the comment block below<br>+ // claims that there are cases where folding two shuffles into one would<br>+ // cause worse codegen on some targets.<br>+ !isa<ShuffleVectorInst>(LHS) &&<br>+ CanEvaluateShuffled(LHS, Mask)) {<br></div></blockquote><div><br></div><div>What ?</div></div><br></div></body></html>