[llvm-dev] Register allocators and constrained classes at -O0

Quentin Colombet via llvm-dev llvm-dev at lists.llvm.org
Thu Jul 22 09:56:44 PDT 2021


Hi James,

> On Jul 21, 2021, at 10:36 PM, Nagurne, James via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> Thanks for the response, Matthias.
> 
> - Shuffling operands is viable, but I had only considered this somewhat lightheartedly since it seems very fragile to tie the chances that we get an unallocatable situation to something as seemingly unrelated as operand order.
> 
> - The option to effectively separate base/src and index allocation sets is also viable, but becomes more problematic as the ISA has fewer and fewer registers. Say if we remove all but R0-R3, it becomes extremely easy to get into pathological situations if we keep the 'general' set to 2 registers and the constrained set to the other 2, especially due to the relative rarity of base+index memory operations relative to other types of operations. I do admit that should this situation occur, the fault lies somewhat on overzealous architecture designers, but it's a possibility to consider nonetheless.
> 
> - Fixing index to R0 or R1 prior to allocation does have merit, whose spirit is reflected in the AMDGPU "staged" allocation mechanism. The target could tell the allocator to go through and assign the constrained registers first, and then go through to allocate the rest. I think this is a good solution in the long run, and am very happy that it has been shared upstream.
> 
> I did find it interesting that you used 'greedy' in the description for the fast allocator, given that there is a 'Greedy' allocator. I also feel that fast seems far greedier.
> 
> Can you or anyone else see an issue with always running the Greedy (tuned Basic) allocator at all optimization levels, or have history as to why Fast was chosen as the default at -O0?

Always running the greedy allocator would be ideal because it would reduce the maintenance burden of the allocators.
Now, the problem is that greedy is just too slow even when we disable some non essential heuristics. Switching to that allocator would be a pretty big compile time regression for our users. I don’t remember the numbers on top of my head but it was substantial.

That said, if you’re okay with that I believe you could tweak the pipeline for your target to always use the greedy allocator. You may want to make TargetPassConfig::getOptimizeRegAlloc virtual in that case.

Cheers,
-Quentin
> I can imagine it might involve the debugability of values as their live ranges are split.
> 
> JB
> From: Matthias Braun <matthiasb at fb.com>
> Sent: Wednesday, July 21, 2021 6:03:29 PM
> To: Nagurne, James
> Cc: llvm-dev at lists.llvm.org
> Subject: [EXTERNAL] Re: [llvm-dev] Register allocators and constrained classes at -O0
>  
> * This problem is independent of allocation priorities (you can always have R2-R7 already assigned for other values in the surrounding code).
> * The fast register allocator by default assigns registers in a greedy fashion one operand at a time and won't be able to revert the choice within an instructions. The really unusual situation here is that picking certain registers for the earlier "base" and "src" operands can make it impossible to pick one for constrained regclass, I don't remember seeing anything similar for other targets, they either have operands fixed to 1physical register or the register class contains more registers than number of operands...
> 
> Some ideas to work around the problems (though I guess none of them is great):
> * Could shuffle the order of operands so the constraint one comes first?
> * Could use a register class without R0 and R1 for the "base" and "src" inputs
> * Could fix "index" to R0 or R1 upfront.
> 
> 
> > On Jul 21, 2021, at 2:35 PM, Nagurne, James via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> > 
> > Hi all, I’ve run into an issue involving the fast allocator and constrained register classes that I’m looking for pointers on.
> >  
> > Consider a target with a register store instruction using an addressing mode with register base and offset:
> >                 *(base + index) = src;
> >  
> > This instruction may be modeled in tablegen with the operands: (outs) (ins REGCLASS:$src, REGCLASS:$base, REGCLASS_CONSTRAINED:$index)
> >  
> > In my particular case, REGCLASS contains 8 registers, R0-R7, and REGCLASS_CONSTRAINED has only 2, R0 and R1. The allocation order for REGCLASS is specifically set up to avoid R0 and R1 via an order that rotates the list left to (R2-R7, R0, R1).
> >  
> > In a piece of code at -O0, I’m at a point where I’ve allocated R2-R7, and the fast allocator sees a store of the form above. It chooses R0 and R1 for $src and $base. Because R0 and R1 are taken, $index can’t be allocated at all, and the allocator emits an error.
> >  
> > I first thought that the backend should be able to prioritize the constrained register first and found the AllocationPriority field in tablegen, but then noted that it was only used in the greedy allocator. I then looked at the fast allocator and noted that it seems to have no method for the target to affect priority or ordering. This is probably by design, so I pose my conundrum to the community since I can’t imagine that this hasn’t come up at some point in the past.
> >  
> > My current options seem to be:
> >        • Update REGCLASS to not contain R0 or R1 in the allocation order, but this seems like we’ll end up spilling in cases where we really don’t need to
> >        • Run the greedy allocator at -O0. It seems like no other target does something like this currently
> >                • NVPTX and WebAssembly override createTargetRegisterAllocator to return nullptr
> >                • Overriding addRegAssignAndRewriteFast to just return addRegAssignAndRewriteOptimized seems to fix the allocation problem, but I wonder if there’s a fundamental issue with how the target has been defined.
> >        • Perhaps attempt to adapt the recently-committed staged-allocation mechanism fromhttps://reviews.llvm.org/rGeebe841a47cb <https://reviews.llvm.org/rGeebe841a47cb> to allocate constrained registers first
> >  
> > Regards,
> > J.B. Nagurne
> > Code Generation
> > Texas Instruments
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210722/d64a3c45/attachment.html>


More information about the llvm-dev mailing list