[PATCH] D38143: Dynamic stack alignment for Thumb1

Momchil Velikov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 25 05:04:52 PDT 2017


chill added inline comments.


================
Comment at: lib/Target/ARM/Thumb1FrameLowering.cpp:357
+    const unsigned NrBitsToZero = countTrailingZeros(MFI.getMaxAlignment());
+    // Emit the following sequence, using R4 as a temporary, since we cannot use
+    // SP as a source or destination register for the shifts:
----------------
asl wrote:
> rengolin wrote:
> > hard-coding r4 here is bound to create problems. You need to make sure it's saved earlier (and popped later).
> > 
> > You also add four instructions to the prologue, which in Thumb1 is not great. It' better than bad codegen, of course, but you need to make sure how often the realignment will hit (from being fatal, I'm guessing not often), or if there's another way to do this (I can't think of anything).
> > 
> > Welcoming comments from people with more Thumb1 experience than myself.
> Do we have register scavenger available here, so we could scavenge a spare register? Or it's too early to to this - before the frame is set up ?
Thanks for pointing that out. I've fixed the patch to add R4 to the set of callee saved register.

Unfortunately, I don't know of a better way to do the alignment in Thumb1. The only other alternative can think of (mov + bic) would need one more scratch register and would have more limiting range of possible alignments.


================
Comment at: lib/Target/ARM/Thumb1FrameLowering.cpp:357
+    const unsigned NrBitsToZero = countTrailingZeros(MFI.getMaxAlignment());
+    // Emit the following sequence, using R4 as a temporary, since we cannot use
+    // SP as a source or destination register for the shifts:
----------------
chill wrote:
> asl wrote:
> > rengolin wrote:
> > > hard-coding r4 here is bound to create problems. You need to make sure it's saved earlier (and popped later).
> > > 
> > > You also add four instructions to the prologue, which in Thumb1 is not great. It' better than bad codegen, of course, but you need to make sure how often the realignment will hit (from being fatal, I'm guessing not often), or if there's another way to do this (I can't think of anything).
> > > 
> > > Welcoming comments from people with more Thumb1 experience than myself.
> > Do we have register scavenger available here, so we could scavenge a spare register? Or it's too early to to this - before the frame is set up ?
> Thanks for pointing that out. I've fixed the patch to add R4 to the set of callee saved register.
> 
> Unfortunately, I don't know of a better way to do the alignment in Thumb1. The only other alternative can think of (mov + bic) would need one more scratch register and would have more limiting range of possible alignments.
I've experimented with using a virtual scratch register and having the register scavenger later find a suitable hard register for it. I works ... kinda. The register R4 is used as a scratch anyway in function epilogue to restore the SP from the frame pointer, so using some other available register does not buy as  anything. There's the option of using a virtual scratch register in epilogue too - but (besides triggering some bugs) the `llvm::emitThumbRegPlusImmediate` does not emit great code if the destination register is not a low register, e.g. we may get:

```
        movs    r1, #-8
        rsbs    r1, r1, #0
        add     r1, r7
        mov     sp, r1
```

Also,  this approach of using R4 as scratch is also employed by ARM/Thumb2 stack align code and I think this should be addressed by a separate patch.


https://reviews.llvm.org/D38143





More information about the llvm-commits mailing list