[PATCH] D14719: Track pristine registers in terms of register units.

Tue Nov 17 15:11:20 PST 2015

Well, the default code in PEI::assignCalleeSavedSpillSlots() only inserts callee-saved registers into the CalleeSavedInfo vector, so I guess that assumption will hold for most targets.
I must admit I still don't understand why you can't just put R16+R17 instead of D8 into the CSI set.

In any case, I can live with iterating over subregs in getPristineRegs(). If we ever have to teach the targets with many register tuples about calls, that will be the least of our problems, I think. The usage in LivePhysRegs looks fine to me without explicit subregister iteration, as LivePhysRegs::addReg()/removeReg() already deal with that.
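
As a minimal sketch of the subregister iteration discussed above: the code below uses a toy register table, not the real LLVM classes. The names pristineRegs and SubRegTable, and the string register names, are hypothetical and exist only for illustration. The idea is that when a saved register such as D8 appears in the CSI list, its subregisters (R16, R17) must also stop being reported as pristine.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy register model (hypothetical, not the LLVM API): each register name
// maps to the list of its subregisters.
using SubRegTable = std::map<std::string, std::vector<std::string>>;

// Start from all callee-saved registers, then drop every register that was
// actually saved, together with all of its subregisters.
std::set<std::string> pristineRegs(const std::vector<std::string> &calleeSaved,
                                   const std::vector<std::string> &savedInCSI,
                                   const SubRegTable &subRegs) {
  std::set<std::string> pristine(calleeSaved.begin(), calleeSaved.end());
  for (const std::string &reg : savedInCSI) {
    pristine.erase(reg);
    auto it = subRegs.find(reg);
    if (it == subRegs.end())
      continue; // register has no subregisters in this toy table
    for (const std::string &sub : it->second)
      pristine.erase(sub);
  }
  return pristine;
}
```

With callee-saved registers {R16, R17, R18, R19, D8, D9} and only D8 in the CSI list, this leaves R18, R19 and D9 as pristine. Note that the sketch only walks down to subregisters; it does not handle the opposite case where only R16 is saved and D8 should no longer count as pristine.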

While switching PristineRegs to return register units would seem nicer long term, I'd rather not do this right now, as there are still several users around that expect lists of physregs and not regunits. Translating physregs into regunits tends to be easier than the other way around, so it's probably best to stay with pristine regs being a list of physregs unless we also somehow manage to switch the typical users to regunits.
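
To illustrate why the physreg-to-regunit direction is the easy one, here is a small sketch over a toy unit table. Everything in it (UnitTable, toRegUnits, coveredPhysRegs, the unit numbering) is a hypothetical stand-in and not the LLVM API. Going from physregs to units is a plain union, while going back from a unit set has no single answer: units {0, 1} could be reported as D8, as R16 plus R17, or as all three.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy table (hypothetical): each physreg maps to the register units it covers.
using UnitTable = std::map<std::string, std::set<int>>;

// physregs -> regunits: simply take the union of the units of each register.
std::set<int> toRegUnits(const std::vector<std::string> &regs,
                         const UnitTable &units) {
  std::set<int> out;
  for (const std::string &reg : regs) {
    auto it = units.find(reg);
    if (it != units.end())
      out.insert(it->second.begin(), it->second.end());
  }
  return out;
}

// regunits -> physregs: return every register fully covered by the unit set.
// The caller still has to decide which of the overlapping answers it wants.
std::vector<std::string> coveredPhysRegs(const std::set<int> &live,
                                         const UnitTable &units) {
  std::vector<std::string> out;
  for (const auto &entry : units)
    if (std::includes(live.begin(), live.end(),
                      entry.second.begin(), entry.second.end()))
      out.push_back(entry.first);
  return out;
}
```

With the table {D8: {0, 1}, R16: {0}, R17: {1}}, both {D8} and {R16, R17} translate to the same unit set, and translating that set back yields three overlapping candidates that a physreg-based user would have to disambiguate.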

- Matthias

> On Nov 17, 2015, at 2:09 PM, Krzysztof Parzyszek via llvm-commits <llvm-commits at lists.llvm.org> wrote:
> 
> On 11/17/2015 3:55 PM, Quentin Colombet wrote:
>> 
>> Looks like one possible target-specific fix could be to return 32-bit registers and pair them at prologue/epilogue time.
>> Side question: what happens if you have an odd number of callee-saved registers to push?
> 
> We have a custom version of assignCalleeSavedSpillSlots, where we populate the CSI vector with the actually saved registers; that vector then becomes the return value from getCalleeSavedInfo.  In a different function in HexagonFrameLowering (running during PEI) we go over that vector to generate the actual loads and stores.  We would have to do the assignment twice: once to fill the CSI vector, and a second time to do the real saving/restoring.
> 
> If only one half of a pair needs to be saved, in general we save/restore the whole pair.  In terms of execution time and code size it makes no difference.  We can lose 4 bytes on the stack, but the stack alignment between calls is 8 bytes anyway, so it's not a big deal.
> 
> Our customers have some code that uses fixed registers (e.g. -ffixed-r18), and in such cases the assign-spill-slot function will be careful to avoid any references to such registers.  In these cases we will only store half of the pair, but we will still reserve a stack slot for the fixed register.  One reason for this is that an 8-byte store must be aligned to an 8-byte boundary: if we wanted to use that space, we would have to break up the remaining pairs into 4-byte stores.  We could coalesce the 4-byte stores at the end of the spill area to avoid this, but I'm not sure if we do that now.
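
The slot reservation described above can be sketched as follows. This is a toy layout routine, not the HexagonFrameLowering code: RegPair, Store, layoutSpills, the positive offsets, and the choice to put the low half of a pair at the lower offset are all assumptions made for illustration. Each pair always gets a full 8-byte slot, so later pairs stay 8-byte aligned even when one half of an earlier pair is a fixed register and is skipped.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical description of a register pair and of one emitted store.
struct RegPair { std::string name, lo, hi; };
struct Store { std::string reg; int offset; int size; };

// Lay out spill stores for the given pairs. A pair with no fixed half gets
// one 8-byte store; a pair with one fixed half gets a single 4-byte store
// for the other half. The full 8-byte slot is reserved in every case.
std::vector<Store> layoutSpills(const std::vector<RegPair> &pairs,
                                const std::set<std::string> &fixedRegs,
                                int &areaSize) {
  std::vector<Store> stores;
  int offset = 0;
  for (const RegPair &p : pairs) {
    bool loFixed = fixedRegs.count(p.lo) != 0;
    bool hiFixed = fixedRegs.count(p.hi) != 0;
    if (!loFixed && !hiFixed)
      stores.push_back({p.name, offset, 8});
    else if (!loFixed)
      stores.push_back({p.lo, offset, 4});
    else if (!hiFixed)
      stores.push_back({p.hi, offset + 4, 4});
    offset += 8; // keep the next pair on an 8-byte boundary
  }
  areaSize = offset;
  return stores;
}
```

For the pairs D8 (R16, R17), D9 (R18, R19) and D10 (R20, R21) with R18 fixed, the sketch emits an 8-byte store of D8 at offset 0, a 4-byte store of R19 at offset 12, and an 8-byte store of D10 at offset 16. The 4 bytes at offset 8 stay unused, which is the cost of keeping D10 aligned.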
> 
> -Krzysztof
> 
> -- 
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation