<html><head><meta http-equiv="Content-Type" content="text/html charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><br><div><div>On Jul 5, 2012, at 1:08 PM, Andrew Trick <<a href="mailto:atrick@apple.com">atrick@apple.com</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><meta http-equiv="Content-Type" content="text/html charset=us-ascii"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><br><div><div>On Jul 3, 2012, at 5:09 PM, Jakob Stoklund Olesen <<a href="mailto:stoklund@2pi.dk">stoklund@2pi.dk</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite">Author: stoklund<br>Date: Tue Jul  3 19:09:54 2012<br>New Revision: 159694<br><br>URL: <a href="http://llvm.org/viewvc/llvm-project?rev=159694&view=rev">http://llvm.org/viewvc/llvm-project?rev=159694&view=rev</a><br>Log:<br>Add an experimental early if-conversion pass, off by default.<br></blockquote></div></div></blockquote><br><blockquote type="cite"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div>You're running this after TailDup, probably because you want MachineDCE to run first. I'm not sure when EarlyTailDup is currently kicking in, but we generally want to try if-converting a pass before tailduping it.</div></div></blockquote><div><br></div><div>IIRC, EarlyTailDup is mostly focused on indirect branches. It definitely won't touch return blocks because the epilog size is unknown.</div><br><blockquote type="cite"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div>Ideal pass order is: DCE, IfCvt, TailDup, ReDCE only where TailDup did something.</div></div></blockquote><div><br></div><div>I didn't put a lot of thought into the pass ordering. I don't think DCE has a lot to do normally, so it is possible we can simply go: IfCvt, TailDup, DCE.</div><br><blockquote type="cite"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><div>+for (MachineFunction::iterator MFI = MF.begin(), MFE = MF.end(); MFI != MFE;</div><div>+       ++MFI)</div><div>+    if (tryConvertIf(MFI))</div><div><br></div><div>How does this iterator work if you're deleting blocks? Maybe a comment.</div></div></div></blockquote><div><br></div><div>It works because the if-conversion never deletes the head block. I'll add a comment.</div><br><blockquote type="cite"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><div>I think it would be nice if CFG transforms don't depend on arbitrary block layout order.</div><div>Instead of using a worklist, would you consider a postorder domtree walk? That might solve two problems at once (stable order + nested if-conversion). The parent domtree node won't be invalidated by the if-conversion itself.</div></div></div></blockquote><div><br></div><div>That would be fine, but it is just a performance optimization. The current implementation simply iterates to convergence, so the initial ordering shouldn't matter.</div><div><br></div><div>When I add heuristics, that may change. I can imagine the order affecting heuristics.</div><div><br></div><blockquote type="cite"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><div>Speaking of which, Are you considering incremental DomTree update? It should be easy for if-conversion.</div></div></div></blockquote><div><br></div><div>Yes, I considered it, as well as LoopInfo update. These analyses are not live in the current pass order, which is why I ignored them. I agree that it should be easy.</div><div><br></div><blockquote type="cite"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><div>+  if (Tail->pred_size() != 2)</div><div>+    return false;</div><div><br></div><div>That doesn't look right. How can we handle simple nested cases like this?</div><div><br></div><div><font face="Menlo">A</font></div><div><font face="Menlo">|\</font></div><div><font face="Menlo">B \</font></div><div><font face="Menlo">|\ \</font></div><div><font face="Menlo">C E F</font></div><div><font face="Menlo">| | |</font></div><div><font face="Menlo"> \|/</font></div><div><font face="Menlo">  D</font></div></div></div></blockquote><div><br></div><div>I chickened out. It is perfectly possible to handle this case, but it must be done 'inside out':</div><div><br></div><div>- Convert B -> C/E -> D. Rewrite the PHI in D instead of replacing it.</div><div>- Convert A -> B/F -> D. This time eliminate the PHI in D, and possibly join D with A.</div><div><br></div><div>I will look into this.</div><br><blockquote type="cite"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><div>Taking it a step further, don't we want to handle simple "or" conditions?</div><div><br></div><div><font face="Menlo">A</font></div><div><font face="Menlo">| \</font></div><div><font face="Menlo">B  F</font></div><div><font face="Menlo">|\  \</font></div><div><font face="Menlo">C E  |</font></div><div><font face="Menlo">|  \/</font></div><div><font face="Menlo">|  G</font></div><div><font face="Menlo">| /</font></div><div><font face="Menlo">D</font></div></div></div></blockquote><div><br></div><div>Possibly, it really depends on what turns out to be profitable. I wasn't really expecting that converting complicated structures would be profitable.</div><div><br></div><div>I think we should return to this when heuristics are in place.</div><br><blockquote type="cite"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div>You're looking for simple patterns with single block on a path, so assuming prior passes have all aggressively merged fall-through blocks. Maybe this assumption should be commented.</div></div></blockquote><div><br></div><div>I am not sure what you mean here. The only assumption made about layout is whether the tail block should be merged with the head block, or an unconditional branch is required.</div><br><blockquote type="cite"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><div><br></div><div><div>+ bool SSAIfConv::canSpeculateInstrs(MachineBasicBlock *MBB) {</div><div>...</div></div><div>+  for (MachineBasicBlock::iterator I = MBB->begin(),</div><div>+       E = MBB->getFirstTerminator(); I != E; ++I) {</div><div><br></div><div>Dumb question: When is it safe to completely ignore MI terminators like this? Are we guaranteed that they don't write physical registers or have any other side effects? This would be a good place for a comment.</div></div></div></blockquote><div><br></div><div>Terminators definitely can't write virtual registers, as Hal learned recently.</div><div><br></div><div>We don't really have hard rules about this stuff, current targets simply don't have side-effecting terminators. I imagine it would create all sorts of problems.</div><div><br></div><div>/jakob</div><div><br></div></div></body></html>