[PATCH] D17294: Fix for PR 26500

Kit Barton via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 18 15:05:47 PST 2016


kbarton added inline comments.

================
Comment at: lib/Target/PowerPC/PPCFrameLowering.cpp:601
@@ +600,3 @@
+  // If we need the second one, initially set it to R0 as well.
+  if (ScratchRegisterCR) {
+    assert(ScratchRegister &&
----------------
nemanjai wrote:
> Well, there's no guarantee at this point that R12 is available. If I don't do this, we can easily end up in the same situation that prompted the bug. The idea is that I assume we don't have the second scratch register available until one is found. The code that reorders the CR/LR spills depends on the two registers actually being the same.
See comment below. 

================
Comment at: lib/Target/PowerPC/PPCFrameLowering.cpp:616
@@ +615,3 @@
+
+  // Don't consider callee-saved registers
+  for (int i = 0; CSRegs[i]; i++)
----------------
nemanjai wrote:
> How about:
> We shouldn't use callee-saved registers as scratch registers as they may be available when looking for a candidate block for shrink wrapping but not available when the actual prologue/epilogue is being emitted.
... because they have been added as live-in to the prologue block by the PrologueEpilogueInserter. 

================
Comment at: lib/Target/PowerPC/PPCFrameLowering.cpp:623
@@ -596,3 +622,3 @@
   // R0 must be used (as recommended by the ABI)
-  if (Reg == 0)
+  if (BV.empty() || (BV.count() < 2 && Needs2ScratchRegs))
     return false;
----------------
I don't think you want this BV.count() < 2 check here.

As it stands now, if there aren't at least 2 bits available, this will return false (indicating you can't find 2 different registers). 
This ties in with the if condition above on line 601. It's possible that there is one register available, in which case both ScratchRegister and ScratchRegisterCR are set to R0 (line 604) but then this returns false here. The effect of that is shrink wrapping will be disabled, and the same temp register will be used for both the CR and LR (i.e., lost both opportunities). 

I think this logic needs to be reworked here to account for the 3 different scenarios:
1. No scratch registers available: set ScratchRegister to R0, ScratchRegisterCR to R12, return false;
2. One scratch register available: set both ScratchRegisters to available reg; return true;
3. Two scratch registers available: set ScratchRegister and ScratchRegisterCR using the bit vector iterator, return true. 

================
Comment at: lib/Target/PowerPC/PPCFrameLowering.cpp:629
@@ +628,3 @@
+
+  if (ScratchRegisterCR && BV.find_next(*ScratchRegister) != -1)
+    *ScratchRegisterCR = BV.find_next(*ScratchRegister);
----------------
Why would find_next return -1? You've already checked that there are at least 2 bits set.

================
Comment at: lib/Target/PowerPC/PPCFrameLowering.cpp:1106-1107
@@ -1033,3 +1105,4 @@
 
-  findScratchRegister(&MBB, true, &ScratchReg);
+  findScratchRegister(&MBB, true, MustSaveCR, &ScratchReg, &TempReg);
+
   assert(ScratchReg && "No scratch register!");
----------------
nemanjai wrote:
> I think that this would have the potential of causing some functions not to be shrink wrapped unnecessarily. At the very least, it will make blocks that don't have two available scratch registers invalid for epilogue insertion when we technically could use the block as epilogue. This is so because as I mentioned in the comment above, the emitEpilogue function never has a hard requirement for two scratch registers - if we don't have two, we just use the same one for spilling the CR and LR.
> 
> So now the question is, do we want that? Using a single register will cause the CR/LR spills to have increased overall latency. On the other hand, if no blocks are available for epilogue insertion, we don't shrink wrap. I can implement it either way.
I think we should use two scratch registers, if they are available, otherwise just use one. There should be other ways we can overlap the instructions to hide the latencies, if need be. 
However, I would put a comment in both canUseAsEpilogue and emitEpilogue saying that this. 

================
Comment at: lib/Target/PowerPC/PPCFrameLowering.h:46
@@ -44,1 +45,3 @@
+   * to be realigned) the function will check for and set two registers.
+   *
    * \param[in] MBB The machine basic block to find an available register for
----------------
nemanjai wrote:
> I've described that in the return section. Do you want me to add something here as well?
Sorry, I missed that.
But you should update to indicate R0 and R12 are used as the defaults.


Repository:
  rL LLVM

http://reviews.llvm.org/D17294





More information about the llvm-commits mailing list