<div class="gmail_quote">On Tue, Oct 18, 2011 at 6:58 PM, Jakob Stoklund Olesen <span dir="ltr">&lt;<a href="mailto:stoklund@2pi.dk">stoklund@2pi.dk</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div style="word-wrap:break-word"><br><div><div class="im"><div>On Oct 18, 2011, at 5:22 PM, Chandler Carruth wrote:</div><br></div><div class="im"><blockquote type="cite"><span style="border-collapse:separate;font-family:Optima;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:-webkit-auto;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:medium"><div>
<div class="gmail_quote"><blockquote class="gmail_quote" style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0.8ex;border-left-width:1px;border-left-color:rgb(204, 204, 204);border-left-style:solid;padding-left:1ex">
<div style="word-wrap:break-word"><div><div><blockquote type="cite"><div class="gmail_quote"><div>As for why it should be an IR pass, mostly because once the selection dag runs through the code, we can never recover all of the freedom we have at the IR level. To start with, splicing MBBs around requires knowing about the terminators (which we do only some of the time), and it requires re-writing them a touch to account for the different fall-through pattern. To make matters worse, at this point we don't have the nicely analyzable 'switch' terminator (I think), and so the existing MBB placement code just bails on non-branch-exit blocks.</div>
</div></blockquote><div><br></div></div><div>Those are all the wrong reasons for not doing the right thing.</div></div></div></blockquote><div><br></div><div>Sorry, I'm not trying to do the wrong thing because of this... Currently, it feels like a trade-off in terms of cost/benefit. It's not yet clear to me that the benefit of doing this analysis in the CodeGen layer outweighs the cost and I was trying to clarify what the costs I perceive are.</div>
</div></div></span></blockquote></div></div><br><div>I think it's mostly about understanding how MBBs work.</div></div></blockquote><div><br></div><div>Indeed, that seems to be the case. =D Thanks for explaining things below, it helped me a lot.</div>
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div style="word-wrap:break-word"><div>Ignoring calls and returns, most machines have three kinds of branches:</div>
<div><br></div><div>1. Unconditional</div><div>2. Conditional</div><div>3. Indirect.</div><div><br></div><div>The AnalyzeBranch() function understands the first two kinds, so if that function returns false (that is, the analysis succeeded) you can move the successors around, and you know that placing a successor immediately after the block and calling updateTerminator() will give you a fall-through.</div>
<div><br></div><div>If AnalyzeBranch() fails, you can still check if the last instruction in the block is an unpredicated barrier. If so, it is still safe to move the successors around, but that block will never be a fall-through. The canFallThrough() function implements this check.</div>
<div><br></div><div>If the last instruction in the block is predicated or not a barrier, you must keep it together with its layout successor. This should only happen in rare cases where it is necessary. For example, I am planning to lower invoke instructions into call instructions that are terminators. This is necessary to accurately model control flow to landing pads. Such a call instruction must fall through to its layout successor.</div>
<div><br></div><div>Some experimental targets don't implement AnalyzeBranch, so everything looks like an indirect branch. Those targets get the code placement they deserve.</div><div><br></div><div>I am not claiming the API is awesome, but the information you need is there, and you have the same freedom as for IR.</div>
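<div><br></div><div>The three cases above can be summarized in a small standalone model. This is only a sketch: BlockInfo, Placement and classify() are hypothetical names invented for illustration and are not part of LLVM. The two flags stand in for "AnalyzeBranch() returned false" and "the last instruction is an unpredicated barrier".</div>
<pre>
```cpp
// Hypothetical stand-in for what a layout pass knows about a block's
// terminator. Not an LLVM type.
struct BlockInfo {
  bool branchAnalyzable;          // models AnalyzeBranch() returning false (success)
  bool endsInUnpredicatedBarrier; // last instruction is a barrier and not predicated
};

// The three layout situations described in the text.
enum class Placement {
  FreeWithFallThrough,     // successors may move; a fall-through can be created
  FreeNoFallThrough,       // successors may move; block never falls through
  PinnedToLayoutSuccessor  // block must stay with its layout successor
};

// Classify a block following the order of checks given in the text.
Placement classify(BlockInfo B) {
  if (B.branchAnalyzable)
    return Placement::FreeWithFallThrough;
  if (B.endsInUnpredicatedBarrier)
    return Placement::FreeNoFallThrough;
  return Placement::PinnedToLayoutSuccessor;
}
```
</pre>
<div>On a target that does not implement AnalyzeBranch, the first flag would never be set in this model, so every block ends up in one of the last two categories.</div>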
<div><br></div><div>We explicitly designed the branch weights so switch lowering could annotate all the new branches with exact weights. It would be a shame to ignore that information.</div><div><br></div><div>So the benefits are:</div>
<div><br></div><div>- Profile-driven fall-through layout of lowered switches. That should be a pretty big deal.</div><div>- Proper placement of split critical edges.</div><div>- The ability to implement stuff like: "Don't put too many branches in a fetch group, or you'll freak out the branch predictor".</div>
</div></blockquote><div><br></div><div>These all seem like really good reasons, and your explanation helps me a lot. I'll take a stab at re-implementing this on MBBs. In the meantime, I've attached my IR-level patch, as most of the interesting logic will remain the same. Please be gentle, it's my first proper LLVM pass, and my knowledge of optimization pass research and papers is limited. The part I'm least pleased with is the computation of weight for each chain in an SCC of chains, but so far I've not come up with anything better. =] I've tested this on several ad-hoc test cases, but nothing really thorough. Going to try moving it to the codegen layer first.</div>
<div><br></div><div><br></div><div>One question that remains, mostly to ensure I've understood you correctly: is a switch represented as an N-ary conditional terminator, and can its targets be freely intermingled with other MBBs?</div>
<div><br></div><div>I'll attach an updated patch to work on MBBs when I have one...</div></div>