[llvm-commits] [llvm] r109035 - in /llvm/trunk/lib/Target/X86: X86ISelLowering.cpp X86InstrInfo.cpp

Nate Begeman natebegeman at mac.com
Wed Jul 21 14:54:07 PDT 2010

On Jul 21, 2010, at 2:37 PM, Anton Korobeynikov wrote:

> Hello, Nate
>> Can you explain a bit about how prolog/epilog expansion is broken?  Sorry, I'm not 100% following this yet.
> If you'll look inside prologue / epilogue expansion code, you'll find
> that it finds the place to insert stack frame allocation /
> deallocation stuff just via scanning the first / last MBB and looking
> for push'es / pop's skipping them. It's ok since currently push / pop
> is inserted inserted only by this callee-saved save / restore code.
> Now problem: in order to save / restore high xmm registers one must
> use movaps. And this case is not handled by prologue / epilogue
> expansion code. The problem is that we cannot handle it the same way
> as push / pop: this instruction can normally be generated by the
> codegen. I saw the testcases, when movaps was used for vector reg
> reload from the spill slot just in the end of MBB. The situation was
> even worse with post-RA scheduler, since it can move this movaps quite
> far.
> Thus:
> 1. If we'll skip movaps blindly, we'll insert stack frame allocation /
> deallocation in wrong place
> 2. If we won't skip movaps,  we'll insert stack frame allocation /
> deallocation in wrong place
> The only quick workaround found was to use
> {load,store}RegFromStackSlot for all callee-saved regs, since it uses
> absolute offsets and is not influenced by this erratic behavior.

Ah, ok, so because of a PEI deficiency, we have to store all callee-save registers with move instructions to fixed stack offsets for win64?  If so, how about this, which should at least never generate illegal instructions:

Index: lib/Target/X86/X86InstrInfo.cpp
--- lib/Target/X86/X86InstrInfo.cpp	(revision 109035)
+++ lib/Target/X86/X86InstrInfo.cpp	(working copy)
@@ -2062,6 +2062,7 @@
   DebugLoc DL = MBB.findDebugLoc(MI);
   bool is64Bit = TM.getSubtarget<X86Subtarget>().is64Bit();
+  bool isWin64 = TM.getSubtarget<X86Subtarget>().isTargetWin64();
   unsigned SlotSize = is64Bit ? 8 : 4;
   MachineFunction &MF = *MBB.getParent();
@@ -2077,6 +2078,17 @@
     if (Reg == FPReg)
       // X86RegisterInfo::emitPrologue will handle spilling of frame register.
+    // The Prolog/Epilog Inserter code looks for pushes/pops, but xmm regs can
+    // only be saved with xmm move instructions. As a workaround to prevent
+    // stack frame manipulation code from being inserted in the wrong place,
+    // all registers on win64 should be saved with storeRegToStackSlot();
+    if (isWin64) {
+      const TargetRegisterClass *RC = TRI->getMinimalPhysRegClass(Reg);
+      storeRegToStackSlot(MBB, MI, Reg, true, CSI[i-1].getFrameIdx(), RC, &RI);
+      continue;
+    }

And something similar for the load path. 

>> If I recall correctly, PowerPC has an extremely similar issue with where LR/CR are stored.  I'll revert that part of the patch and work on a "more correct" one, but it seems strange that win64 wouldn't save callee-save regs in this routine, and instead choose to save them some other way.
> It will. But since high xmm regs need to be saved, they (high xmm
> regs) will be saved / restored into a wrong place => not preserved
> across the call.
> -- 
> With best regards, Anton Korobeynikov
> Faculty of Mathematics and Mechanics, Saint Petersburg State University

More information about the llvm-commits mailing list