[PATCH] D20003: X86CallFrameOpt: a first step towards optimizing inalloca calls (PR27076)

Michael Kuperstein via llvm-commits llvm-commits at lists.llvm.org
Fri May 6 00:12:50 PDT 2016


mkuper added a comment.

Hi Hans,
I have a few small comments, none of them very important.

Regarding the general direction - are you sure this can be extended to the more complicated cases in a sane way?
The reason I went in this direction for the regular mov -> push transformation was that the amount of different patterns we actually generate for call sequences is pretty limited. So it seemed a dumb linear instruction search will be able to cover most cases. There were some exceptions - I think one was a pseudo instruction expansion inside the call sequence that created control flow, which meant the FrameSetup and FrameDestroy ended up in different basic blocks. But this was very rare.
I'm not really familiar with the IR we generate for inalloca, but the nested stuff seems scary.


================
Comment at: lib/Target/X86/X86CallFrameOptimization.cpp:363
@@ +362,3 @@
+
+  // Often, there's an instruction here that moves the _chkstk amount into a
+  // virtual register. Ignore it for now; we'll look into that below.
----------------
Is this often or always? It seems like we're only going to match the vreg case, right?

================
Comment at: lib/Target/X86/X86CallFrameOptimization.cpp:384
@@ +383,3 @@
+  if (!I->isCall() || !I->getOperand(0).isSymbol() ||
+      StringRef(I->getOperand(0).getSymbolName()) != "_chkstk")
+    return false;
----------------
We use _alloca() for cygwin/mingw.
Handling only _chkstk sounds reasonable, but I'm not sure I'm a fan of having this string hardcoded here. Perhaps factor the code that decides on the symbol name out of X86FrameLowering::emitStackProbeCall() and call that here as well? 

================
Comment at: lib/Target/X86/X86CallFrameOptimization.cpp:458
@@ -359,3 +457,3 @@
   // Scan the call setup sequence for the pattern we're looking for.
   // We only handle a simple case - a sequence of store instructions that
   // push a sequence of stack-slot-aligned values onto the stack, with
----------------
As long as you're touching this - the comment is no longer correct. Could you please delete it?

================
Comment at: lib/Target/X86/X86CallFrameOptimization.cpp:658
@@ +657,3 @@
+  } else {
+    // Remove the incalloca call setup (in reverse to delete uses before defs).
+    Context.InAllocaSetup.FrameDestroy->eraseFromParent();
----------------
incalloca -> inalloca

================
Comment at: lib/Target/X86/X86CallFrameOptimization.cpp:674
@@ +673,3 @@
+    Context.InAllocaSetup.ChkstkCall->eraseFromParent();
+    Context.InAllocaSetup.AmountMov->eraseFromParent();
+    if (MRI->use_empty(
----------------
Is it possible for ChkstkResultCopy and AmountMov to have more uses?
I don't see why that would happen, but I don't see anything completely preventing it either.

================
Comment at: test/CodeGen/X86/inalloca.ll:1
@@ -1,2 +1,2 @@
-; RUN: llc < %s -mtriple=i686-pc-win32 | FileCheck %s
+; RUN: llc < %s -mtriple=i686-pc-win32 -no-x86-call-frame-opt | FileCheck %s
 
----------------
Are you sure you want to have these tests run only with -no-x86-call-frame-opt?
This way we don't actually cover the "normal" case of x86-call-frame-opt + thiscall, etc.


http://reviews.llvm.org/D20003





More information about the llvm-commits mailing list