[llvm-dev] RFC: LLD range extension thunks

Fri Jan 6 15:23:29 PST 2017

I think so. It uses essentially the same iterative approach being discussed
in this thread.

On Fri, Jan 6, 2017 at 3:12 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:

> Isn’t it the same problem as what ARMConstantIslandPass is trying to
> address?
>
> —
> Mehdi
>
> On Jan 6, 2017, at 2:33 PM, Sean Silva via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> After looking at this for a while, I do not think that this problem is
> NP-hard. With a finite "short branch" displacement k, I was not able to
> come up with a gadget that could create global constraints as would be
> needed to e.g. model an instance of 3SAT or vertex cover in terms of this
> problem.
>
> The problem is hard though. I believe that it is likely to be exponential
> in the "short branch" displacement k, and k is typically "pretty big".
>
> -- Sean Silva
>
> On Fri, Jan 6, 2017 at 1:12 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>>
>>
>> On Fri, Jan 6, 2017 at 12:41 AM, Bruce Hoult via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>>
>>>
>>> On Fri, Jan 6, 2017 at 6:21 AM, Rui Ueyama via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> On Thu, Jan 5, 2017 at 8:15 PM, Peter Smith <peter.smith at linaro.org>
>>>> wrote:
>>>>
>>>>> Hello Rui,
>>>>>
>>>>> Thanks for the comments
>>>>>
>>>>> - Synthetic sections and rewriting relocations
>>>>> I think that this would definitely be worth trying. It should remove
>>>>> the need for thunks to be represented in the core data structures, and
>>>>> would allow .
>>>>>
>>>>
>>>> Creating symbols for thunks would have another benefit: it makes
>>>> disassembled output easier to read because thunks have names.
>>>>
>>>>
>>>>> It would also mean that we wouldn't have to associate symbols with
>>>>> thunks as the relocations would directly target the thunks. ARM
>>>>> interworking makes reusing thunks more difficult as not every thunk is
>>>>> compatible with every caller. For example:
>>>>> ARM B target and Thumb2 B.W target can't reuse the same thunk even if
>>>>> in range as the branch instruction can't change state.
>>>>>
>>>>> I think it is worth an experiment to make the existing implementation
>>>>> of thunks use synthetic sections and rewriting relocations before
>>>>> trying to implement range extension thunks.
>>>>>
>>>>> - Yes the scan is linear it is essentially:
>>>>> do
>>>>>     assign addresses to input sections
>>>>>     for each relocation
>>>>>         if (thunk needed)
>>>>>             create thunk or reuse existing one
>>>>> while (no more thunks added)
>>>>>
>>>>> There's quite a lot of complexity that can be added with respect to
>>>>> the placement of thunks within the output section. For example if
>>>>> there is a caller with a low address and a caller with a high address,
>>>>> both might be able to reuse a thunk placed in the middle. I think it
>>>>> is worth starting simple though.
>>>>
>>>>
>>>> I agree. I believe that computing the best thunk positions is NP-hard,
>>>> but the best layout and a layout produced by a naive algorithm wouldn't be
>>>> that different.
>>>>
>>>
>>> Correct conclusion, but there's no way the problem is NP.
>>>
>>> Reordering functions (or even individual basic blocks) to minimize the
>>> needed thunks is a complex problem.
>>>
>>> But you're not doing that. Once an ordering is selected a simple greedy
>>> algorithm is optimal.
>>>
>>> There is no cost difference between a thunk that is right next to the
>>> short jump and a thunk that is only juuust within range. So you can find
>>> the lowest address jump needing a thunk to a particular target and put the
>>> thunk the maximum possible distance after it (after the end of a function,
>>> or even after any unconditional branch). Find everything else within range
>>> of that thunk and fix it up. Repeat.
>>>
>>
>> I don't think this analysis is correct. Assume a 1M branch displacement
>> for short jumps. Consider:
>>
>> secA: 512K (contains a jump "jumpA" at offset 0 that needs a thunk (it
>> doesn't matter where it needs to jump to, just that it definitely needs a
>> thunk))
>> secB: 512K
>> secC: 512K (contains a jump "jumpC" at offset 0 that jumps to offset 0 in
>> secA (i.e., just barely in range of a short jump))
>>
>> If the thunk for jumpA is placed between secB and secC (as it would be
>> based on your description) it will push the branch at the beginning of secC
>> out of range, forcing another thunk to be needed. In this small example,
>> the thunk for jumpA must be placed before secA in order to avoid needing a
>> thunk for jumpC. In other words, placing thunks can cause you to need even
>> more thunks.
>>
>> -- Sean Silva
>>
>>
>>
>>>
>>> Other algorithms will give smaller average displacements to the thunks,
>>> but there is no advantage in that. No other algorithm will generate fewer
>>> thunks.
>>>
>>> That's assuming all short branches have the same code size and
>>> displacement distance.
>>>
>>> If there are multiple branch distances and code sizes (and you have a
>>> choice between them at given call sites) then it's still just a simple
>>> dynamic programming problem, solvable in linear [1] time by trying each
>>> branch size at the first available call site, with a cache of the minimum
>>> cost assuming the first 0, 1, 2 .. N call sites have already been covered.
>>>
>>> [1] or at least nCallSites * nBranchSizes
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170106/800b9c2e/attachment.html>