<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Isn’t it the same problem as what ARMConstantIslandPass is trying to address?<div class=""><br class=""></div><div class="">— </div><div class="">Mehdi</div><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Jan 6, 2017, at 2:33 PM, Sean Silva via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">After looking at this for a while, I do not think that this problem is NP-hard. With a finite "short branch" displacement k, I was not able to come up with a gadget that could create global constraints as would be needed to e.g. model an instance of 3SAT or vertex cover in terms of this problem. <div class=""><br class=""></div><div class="">The problem is hard though. 
I believe that it is likely to be exponential in the "short branch" displacement k, and k is typically "pretty big".<div class=""><br class=""></div><div class="">-- Sean Silva<br class=""><div class="gmail_extra"><br class=""><div class="gmail_quote">On Fri, Jan 6, 2017 at 1:12 PM, Sean Silva <span dir="ltr" class=""><<a href="mailto:chisophugis@gmail.com" target="_blank" class="">chisophugis@gmail.com</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class=""><br class=""><div class="gmail_extra"><br class=""><div class="gmail_quote"><div class=""><div class="h5">On Fri, Jan 6, 2017 at 12:41 AM, Bruce Hoult via llvm-dev <span dir="ltr" class=""><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class=""><br class=""><div class="gmail_extra"><br class=""><div class="gmail_quote"><div class=""><div class="m_3166449507986801982h5">On Fri, Jan 6, 2017 at 6:21 AM, Rui Ueyama via llvm-dev <span dir="ltr" class=""><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote"><span class="m_3166449507986801982m_-8474489496312911250gmail-">On Thu, Jan 5, 2017 at 8:15 PM, Peter Smith <span dir="ltr" class=""><<a href="mailto:peter.smith@linaro.org" target="_blank" class="">peter.smith@linaro.org</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Hello Rui,<br 
class="">

<br class="">

Thanks for the comments.<br class="">

<br class="">

- Synthetic sections and rewriting relocations<br class="">

I think that this would definitely be worth trying. It should remove<br class="">

the need for thunks to be represented in the core data structures, and<br class="">

would allow .<br class=""></blockquote><div class=""><br class=""></div></span><div class="">Creating symbols for thunks would have another benefit: it makes disassembled output easier to read because thunks have names.</div><span class="m_3166449507986801982m_-8474489496312911250gmail-"><div class=""> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

It would also mean that we wouldn't have to associate symbols with<br class="">

thunks as the relocations would directly target the thunks. ARM<br class="">

interworking makes reusing thunks more difficult as not every thunk is<br class="">

compatible with every caller. For example:<br class="">

ARM B target and Thumb2 B.W target can't reuse the same thunk even if<br class="">

in range as the branch instruction can't change state.<br class="">

<br class="">

I think it is worth an experiment to make the existing implementation<br class="">

of thunks use synthetic sections and rewriting relocations before<br class="">

trying to implement range extension thunks.<br class="">

<br class="">

- Yes, the scan is linear; it is essentially:<br class="">

do<br class="">

    assign addresses to input sections<br class="">

    for each relocation<br class="">

        if (thunk needed)<br class="">

            create thunk or reuse existing one<br class="">

while (thunks were added)<br class="">

<br class="">
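A minimal sketch of that loop, under stated assumptions: the section model, the<br class="">
names RANGE, THUNK_SIZE, assign_addresses and add_thunks, and the choice to put<br class="">
each thunk at the end of the caller's own section are all hypothetical and are<br class="">
not taken from the linker. Thunk reuse and ARM/Thumb compatibility are left out.<br class="">
<pre class="">
```python
# Hypothetical model: sections are laid out back to back, and each relocation
# is a branch from (section, offset) to the start of a target section.
RANGE = 1024 * 1024   # assumed short-branch reach in bytes
THUNK_SIZE = 16       # assumed size of one thunk in bytes


def assign_addresses(sizes):
    """Return the start address of each section when laid out in order."""
    addrs, cursor = [], 0
    for size in sizes:
        addrs.append(cursor)
        cursor += size
    return addrs


def add_thunks(sizes, relocs):
    """Repeat layout and scan until a full pass adds no thunk.

    sizes:  list of section sizes; mutated, because each new thunk is
            appended to the section containing the caller (assumed to be
            reachable from the caller, i.e. sections shorter than RANGE).
    relocs: list of (src_section, src_offset, dst_section) tuples.
    Returns the set of relocation indices routed through a thunk.
    """
    thunked = set()
    while True:
        addrs = assign_addresses(sizes)      # addresses for this pass
        added = False
        for i, (src, off, dst) in enumerate(relocs):
            if i in thunked:
                continue
            if abs(addrs[dst] - (addrs[src] + off)) > RANGE:
                sizes[src] += THUNK_SIZE     # create a thunk after the caller
                thunked.add(i)
                added = True
        if not added:                        # fixed point reached
            return thunked


half = RANGE // 2
# The second branch is exactly in range on the first pass and only goes out
# of range after the first thunk shifts the layout, so a second pass is needed.
print(sorted(add_thunks([half, half, 100, 2 * RANGE], [(0, 0, 3), (2, 0, 0)])))
# prints [0, 1]
```
</pre>
Each relocation gets at most one thunk in this model, so the loop terminates<br class="">
after at most one pass per relocation plus a final pass that adds nothing.<br class="">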

There's quite a lot of complexity that can be added with respect to<br class="">

the placement of thunks within the output section. For example if<br class="">

there is a caller with a low address and a caller with a high address,<br class="">

both might be able to reuse a thunk placed in the middle. I think it<br class="">

is worth starting simple though.</blockquote><div class=""><br class=""></div></span><div class="">I agree. I believe that computing the best thunk positions is NP-hard, but the best layout and a layout produced by a naive algorithm wouldn't be that different.</div></div></div></div></blockquote><div class=""><br class=""></div></div></div><div style="font-size:12.8px" class="">Correct conclusion, but there's no way the problem is NP.</div><div style="font-size:12.8px" class=""><br class=""></div><div style="font-size:12.8px" class="">Reordering functions (or even individual basic blocks) to minimize the needed thunks is a complex problem.</div><div style="font-size:12.8px" class=""><br class=""></div><div style="font-size:12.8px" class="">But you're not doing that. Once an ordering is selected a simple greedy algorithm is optimal.</div><div style="font-size:12.8px" class=""><br class=""></div><div style="font-size:12.8px" class="">There is no cost difference between a thunk that is right next to the short jump and a thunk that is only juuust within range. So you can find the lowest address jump needing a thunk to a particular target and put the thunk the maximum possible distance after it (after the end of a function, or even after any unconditional branch). Find everything else within range of that thunk and fix it up. Repeat.</div></div></div></div></blockquote><div class=""><br class=""></div></div></div><div class="">I don't think this analysis is correct. Assume a 1M branch displacement for short jumps. 
Consider:</div><div class=""><br class=""></div><div class="">secA: 512K (contains a jump "jumpA" at offset 0 that needs a thunk (it doesn't matter where it needs to jump to, just that it definitely needs a thunk))</div><div class="">secB: 512K</div><div class="">secC: 512K (contains a jump "jumpC" at offset 0 that jumps to offset 0 in secA (i.e., just barely in range of a short jump))</div><div class=""><br class=""></div><div class="">If the thunk for jumpA is placed between secB and secC (as it would be based on your description) it will push the branch at the beginning of secC out of range, forcing another thunk to be needed. In this small example, the thunk for jumpA must be placed before secA in order to avoid needing a thunk for jumpC. In other words, placing thunks can cause you to need even more thunks.</div><span class="HOEnZb"><font color="#888888" class=""><div class=""><br class=""></div><div class="">-- Sean Silva</div><div class=""><br class=""></div><div class=""> </div></font></span><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote"><div style="font-size:12.8px" class=""><br class=""></div><div style="font-size:12.8px" class="">Other algorithms will give smaller average displacements to the thunks, but there is no advantage in that. 
No other algorithm will generate fewer thunks.</div><div style="font-size:12.8px" class=""><br class=""></div><div style="font-size:12.8px" class="">That's assuming all short branches have the same code size and displacement distance.</div><div style="font-size:12.8px" class=""><br class=""></div><div style="font-size:12.8px" class="">If there are multiple branch distances and code sizes (and you have a choice between them at given call sites) then it's still just a simple dynamic programming problem, solvable in linear [1] time by trying each branch size at the first available call site, with a cache of the minimum cost assuming the first 0, 1, 2 .. N call sites have already been covered.</div><div style="font-size:12.8px" class=""><br class=""></div><div class=""><span style="font-size:12.8px" class="">[1] or at least nCallSites * nBranchSizes</span></div><div class=""> </div></div></div></div>
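<div class="">The secA/secB/secC counterexample raised in the reply above can be checked with a little arithmetic. This is only a sketch: the 16-byte thunk size and the function name jump_c_in_range are assumptions made for illustration, and any positive thunk size gives the same result.</div>
<pre class="">
```python
K = 1024
RANGE = 1024 * K   # the 1M short-branch reach assumed in the example
THUNK = 16         # hypothetical thunk size in bytes


def jump_c_in_range(thunk_before_sec_a):
    """Lay out secA, secB, secC (512K each) plus one thunk for jumpA.

    Returns True if jumpC (offset 0 of secC) can still reach offset 0 of
    secA with a short branch.
    """
    sec_a = THUNK if thunk_before_sec_a else 0
    sec_b = sec_a + 512 * K
    sec_c = sec_b + 512 * K
    if not thunk_before_sec_a:
        sec_c += THUNK          # thunk sits between secB and secC
    return RANGE >= sec_c - sec_a


print(jump_c_in_range(thunk_before_sec_a=False))  # prints False
print(jump_c_in_range(thunk_before_sec_a=True))   # prints True
```
</pre>
<div class="">With the thunk between secB and secC, the distance from jumpC to secA grows to 1M plus the thunk size, so jumpC needs a thunk of its own; with the thunk before secA, both sections shift together and the distance stays at exactly 1M.</div>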

<br class=""></span><span class="">______________________________<wbr class="">_________________<br class="">

LLVM Developers mailing list<br class="">

<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a><br class="">

<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank" class="">http://lists.llvm.org/cgi-bin/<wbr class="">mailman/listinfo/llvm-dev</a><br class="">

<br class=""></span></blockquote></div><br class=""></div></div>

</blockquote></div><br class=""></div></div></div></div>

</div></blockquote></div><br class=""></div></body></html>