[llvm-dev] Global variant of regbankselect

Mon Nov 30 17:33:43 PST 2020

For AArch64, we’re currently working on catching up to SelectionDAG in terms of runtime performance, and we still have quite a bit to do on that front. Because of that we haven’t spent much time looking at where we see potential for a global regbankselect algorithm.

I think this is overall a nice idea to explore but unfortunately we don’t have any test cases or much bandwidth to spend on this at this time. I can see us perhaps having some time to look at this in a few months.

Thanks,
Amara

> On Nov 30, 2020, at 3:56 PM, Quentin Colombet <qcolombet at apple.com> wrote:
> 
> Hi Gabriel,
> 
> Thanks for pushing on that front!
> 
> To answer your question (with a few years old context), the reason we never really looked into a global/optimal algorithm is because the current ones are good enough. Back when I was working on AArch64, we didn’t have a test case where such scheme would prove worth it.
> 
> Again this may have changed since last time I looked into it. Amara and Jessica (CC’ed) would know for sure.
> 
> Cheers,
> -Quentin
> 
>> On Nov 29, 2020, at 11:38 PM, Gabriel Hjort Åkerlund via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> 
>> Hi all,
>> 
>> I have made several improvements and bug fixes to the global variant of regbankselect (available on https://reviews.llvm.org/D90304), and is now in a more mature state compared to its first draft. I ran a few tests to compare run times against the greedy version for AArch64's ARM64 target, and on those 9 testcases the global variant increases runtime of regbankselect with 10% (geometric mean). At most the runtime increased by 54%, and at least by 0%. There is also a patch for removing duplicates mappings, which will reduce these numbers once accepted (https://reviews.llvm.org/D92161). The quality of the code generated was equal to that of greedy.
>> 
>> In running these tests, however, I discovered that RegisterBankInfo's getInstrAlternativeMappings() of AArch64 does not compute all possible mappings. Most often an empty array is returned, and instead the target relies on getInstrMapping() to return the most appropriate mapping depending in the current context. But that assumes that register banks have already been selected for all registers used an instruction, which is the case for fast and greedy mode but not for global mode. Hence, for global mode to work it requires the target to give all possible mappnings, even when a register bank has not yet been selected for all input registers. For example, in order to run the tests above I had to modify the AArch64 backend to return possible mappings for G_STORE in order for global regbankselect to make suitable selections.
>> 
>> So in order to proceed we would need to augment the backends to give all alternatives, but I doubt the backend maintainers would be interested of doing this before getting proof that the global variant actually improves code quality. Hence I would like to have a testcase where it is known that the greedy variant does a sub-optimal selection. Does anyone know of such a testcase?
>> 
>> Cheers,
>> Gabriel Hjort Åkerlund 
>> 
>> 
>> -----Original Message-----
>> From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Gabriel Hjort Åkerlund via llvm-dev
>> Sent: den 2 november 2020 15:03
>> To: Nicolai Hähnle <nhaehnle at gmail.com>
>> Cc: llvm-dev at lists.llvm.org
>> Subject: Re: [llvm-dev] Optimal variant of regbankselect
>> 
>> Hi Nicolai,
>> 
>> Thanks for your response!
>> 
>> Fair enough, I will rename it to "global" as that is a property that we can all agree on. And if it happens to compute the optimal selection, then that's just a bonus. =)
>> 
>> Cheers,
>> Gabriel
>> 
>> -----Original Message-----
>> From: Nicolai Hähnle <nhaehnle at gmail.com>
>> Sent: den 30 oktober 2020 16:49
>> To: Gabriel Hjort Åkerlund <gabriel.hjort.akerlund at ericsson.com>
>> Cc: llvm-dev at lists.llvm.org
>> Subject: Re: [llvm-dev] Optimal variant of regbankselect
>> 
>> Hi Gabriel,
>> 
>> it's cool that you try to do this, but I think you need some more proof before calling this optimal :)
>> 
>> Put briefly, dynamic programming approaches for this kind of mapping/matching can be optimal if the underlying dependency structure is a tree. If it's a more general DAG, optimality goes out of the window. In your dynamic programming approach, my understanding is that you remember the cost of every possible "realization"/"mapping" of every node individually. So, you end up computing N*M pieces of information, where N is the number of nodes and M is the number of options to choose from at each node.
>> 
>> Unfortunately, the final selection of choices only works in a tree, because you basically have to choose each node's "realization"/"mapping" based on a single successor.
>> 
>> As soon as you're in a general DAG, where a node has to satisfy
>> *multiple* successors, that no longer works. You could theoretically extend the dynamic programming approach, but only by computing information about the cost of *correlated* choices for multiple nodes simultaneously. But then, your performance can go out of the window because you need to compute something like N*M^K pieces of information, where K is a bound on the number of nodes whose choices you need to consider simultaneously. K cannot be bounded in general (other than by some trivial function of the number of nodes in the graph), which means you end up with an exponential worst case if you want to solve this optimally. I would actually expect optimal register bank selection on a general DAG to be NP-complete, but I haven't thought about it too deeply.
>> 
>> Cheers,
>> Nicolai
>> 
>> On Thu, Oct 29, 2020 at 5:41 AM Gabriel Hjort Åkerlund via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>> 
>>> Hi Dominik,
>>> 
>>> 
>>> 
>>> (Sorry for spamming; the first reply only went to you and not the
>>> list.)
>>> 
>>> 
>>> 
>>> Cool that you want to try it out! There will for sure be some bugs in it, so please let me know if/when you find one and I’ll fix it. And if you could make a testcase out of it, that would be superb (as there’s currently a complete lack of tests).
>>> 
>>> 
>>> 
>>> Although I haven’t measured it, I expect the compilation time to take about 3x more compared to greedy as it makes three passes over the instructions. However, the most amount of work is done in the first pass, which is comparable to the pass made in greedy. So hopefully it’s less than 3x, but this should really be measured over a set of functions to get an accurate figure. Also, there will most likely be improvements to be had to decrease compilation time.
>>> 
>>> 
>>> 
>>> Cheers,
>>> 
>>> Gabriel
>>> 
>>> 
>>> 
>>> From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Dominik 
>>> Montada via llvm-dev
>>> Sent: den 28 oktober 2020 14:39
>>> To: llvm-dev at lists.llvm.org
>>> Subject: Re: [llvm-dev] Optimal variant of regbankselect
>>> 
>>> 
>>> 
>>> Hi Gabriel,
>>> 
>>> thank you so much for doing this! I'll try out the patch in our downstream implementation right away. Do you know how big of an impact this has on compile time compared to fast and greedy?
>>> 
>>> Cheers,
>>> 
>>> Dominik
>>> 
>>> Am 28.10.20 um 14:32 schrieb Gabriel Hjort Åkerlund via llvm-dev:
>>> 
>>> Hi all,
>>> 
>>> 
>>> 
>>> I have made an attempt of implementing an optimal variant of the register bank selector (regbankselect). The code is available for review at https://protect2.fireeye.com/v1/url?k=a79abb9d-f8018298-a79afb06-86d2114eab2f-6b4f526e957bcbf4&q=1&e=c647977b-0505-4914-b9c4-94e1266e5d86&u=https%3A%2F%2Freviews.llvm.org%2FD90304, and I would greatly appreciate if anyone interested can provide their comments. I have run a few tests the regbankselect-*.mir testcases for AAarch64 and it seems to work, but more tests are surely needed to increase confidence in the implementation. I also tried using AMDGPU, but that backend does not provide the full list of InstructionMappings for a given MachineInstr, which is needed in order to compute the optimal selection of register banks.
>>> 
>>> 
>>> 
>>> Cheers,
>>> 
>>> Gabriel Hjort Åkerlund
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> 
>>> LLVM Developers mailing list
>>> 
>>> llvm-dev at lists.llvm.org
>>> 
>>> https://protect2.fireeye.com/v1/url?k=bec6fed6-e15dc7d3-bec6be4d-86d21
>>> 14eab2f-5a9736227c84c02e&q=1&e=c647977b-0505-4914-b9c4-94e1266e5d86&u=
>>> https%3A%2F%2Flists.llvm.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fllvm-dev
>>> 
>>> --
>>> 
>>> ----------------------------------------------------------------------
>>> 
>>> Dominik Montada                   Email: dominik.montada at hightec-rt.com
>>> 
>>> HighTec EDV-Systeme GmbH          Phone: +49 681 92613 19
>>> 
>>> Europaallee 19                    Fax:   +49-681-92613-26
>>> 
>>> D-66113 Saarbrücken               WWW: https://protect2.fireeye.com/v1/url?k=1a1aa8a6-44ab73c6-1a1ae83d-86e2237f51fb-3c96ae1cad03cc84&q=1&e=083f110d-1869-45bf-91fb-ac31bacb8334&u=http%3A%2F%2Fwww.hightec-rt.com%2F
>>> 
>>> 
>>> 
>>> Managing Director: Vera Strothmann
>>> 
>>> Register Court: Saarbrücken, HRB 10445, VAT ID: DE 138344222
>>> 
>>> 
>>> 
>>> This e-mail may contain confidential and/or privileged information. If
>>> 
>>> you are not the intended recipient please notify the sender 
>>> immediately
>>> 
>>> and destroy this e-mail. Any unauthorised copying, disclosure or
>>> 
>>> distribution of the material in this e-mail is strictly forbidden.
>>> 
>>> ---
>>> 
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://protect2.fireeye.com/v1/url?k=069925c1-59021cc4-0699655a-86d21
>>> 14eab2f-f72692289fd37c37&q=1&e=c647977b-0505-4914-b9c4-94e1266e5d86&u=
>>> https%3A%2F%2Flists.llvm.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fllvm-dev
>> 
>> 
>> 
>> --
>> Lerne, wie die Welt wirklich ist,
>> aber vergiss niemals, wie sie sein sollte.
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>