[PATCH] D13361: Support for emitting inline stack probes

Reid Kleckner via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 4 09:49:18 PST 2015


rnk added inline comments.

================
Comment at: lib/Target/X86/X86FrameLowering.cpp:457
@@ +456,3 @@
+
+  for (; (MBBI != MBBEnd) && (ChkStkStub == nullptr); MBBI++) {
+    if (MBBI->isCall()) {
----------------
This can be simplified to:
  for (MachineInstr &MI : PrologueMBB) {
    ...

You can recover iterators from MachineInstrs with the getIterator() method.

================
Comment at: lib/Target/X86/X86FrameLowering.cpp:459
@@ +458,3 @@
+    if (MBBI->isCall()) {
+      for (const auto &MO : MBBI->operands()) {
+        if (MO.isSymbol() && (ChkStkStubSymbol == MO.getSymbolName()))
----------------
I don't think you need this loop, all the x86 CALL instructions appear to only take one operand. This should be fine, assuming it passes tests:
  if (MI.isCall() && MI.getOperand(0).isSymbol() &&
     ChkStubStubSymbol == MI.getOperand(0).getSymbolName()) {
    ChkStkStub = &MI;
    break;
  }

================
Comment at: lib/Target/X86/X86FrameLowering.cpp:471
@@ +470,3 @@
+
+    ChkStkStub->removeFromParent();
+
----------------
I think you want `eraseFromParent()` here to avoid a leak.

================
Comment at: lib/Target/X86/X86FrameLowering.cpp:502
@@ +501,3 @@
+  //    LimitReg = gs magic thread env access
+  //    if FinalReg >= LimitReg goto ContinueMBB
+  // RoundBB:
----------------
Is this check correct for a downwards growing stack? LimitReg here is StackLimit in NT_TIB, which is the minimum allowed value for this thread's SP, right? I think we want to enter the loop if FinalReg >= LimitReg.

What are we supposed to do anyway if we attempt to probe beyond StackLimit? I would expect normal code that doesn't probe will crash when accessing beyond StackLimit, so why can't we skip this check and crash somewhere in the loop?

================
Comment at: lib/Target/X86/X86FrameLowering.cpp:519
@@ +518,3 @@
+
+  MachineFunction::iterator MBBIter = MBB;
+  ++MBBIter;
----------------
Duncan has been working to remove the implicit iterator conversions around iplist elements, so the new idiom for this is `MMBIter = MBB->getIterator()`.

================
Comment at: lib/Target/X86/X86FrameLowering.cpp:567
@@ +566,3 @@
+    // Emit the saves.
+    BuildMI(&MBB, DL, TII.get(X86::MOV64mr))
+      .addReg(X86::RSP)
----------------
There's a utility called addRegOffset() for forming basereg+displacement memory operands that don't use a segment register, index register, or scale. In this case it would probably look like:
  addRegOffset(BuildMI(&MBB, DL, TII.get(X86::MOV64mr)),
               X86::RSP, false, RCXShadowSlot)
    .addReg(X86::RCX);
You can also use this idiom below to shorten the other memory accesses and LEAs. If you think it makes things less clear, then I don't feel too strongly about using it everywhere.

================
Comment at: lib/Target/X86/X86FrameLowering.cpp:635-636
@@ +634,4 @@
+      .addReg(0);
+  // Probe by storing a byte onto the stack.
+  BuildMI(LoopMBB, DL, TII.get(X86::MOV8mi))
+      .addReg(ProbeReg)
----------------
MSVC CRT and mingw CRT both use TEST for this purpose. Is there a reason why the CLR wants to do a write here instead of a read?

================
Comment at: lib/Target/X86/X86FrameLowering.cpp:684-691
@@ +683,10 @@
+    }
+    for (MachineBasicBlock::iterator MI = RoundMBB->begin();
+         MI != RoundMBB->end(); ++MI) {
+      MI->setFlag(MachineInstr::FrameSetup);
+    }
+    for (MachineBasicBlock::iterator MI = LoopMBB->begin();
+         MI != LoopMBB->end(); ++MI) {
+      MI->setFlag(MachineInstr::FrameSetup);
+    }
+    for (MachineBasicBlock::iterator CMBBI = ContinueMBB->begin();
----------------
These middle loops can both be range-based.

================
Comment at: test/CodeGen/X86/win_coreclr_chkstk.ll:6
@@ +5,3 @@
+; if more than 4096 bytes are allocated on the stack.
+
+; Prolog stack allocation >= 4096 bytes will require the probe sequence
----------------
The rest of these tests are looking for the presence or absence of the inlined stack probe. We should have one or two tests that pattern match the whole loop. It'll be brittle to future changes to the probe, but we should only do it once or twice.

================
Comment at: test/CodeGen/X86/win_coreclr_chkstk.ll:102
@@ +101,3 @@
+define i32 @test_probe_size() "stack-probe-size"="8192" nounwind {
+; WIN_X64-NOT:  movq    %gs:16, %rcx
+; LINUX-NOT:    movq    %gs:16, %rcx
----------------
It's good to bracket negative checks with something that will match closely to them, so that they don't match something from previous code, something like:
  ; WIN_X64: test_probe_size:
  ; WIN_X64-NOT: mov %gs:16, %rcx
  ; WIN_X64: retq


Repository:
  rL LLVM

http://reviews.llvm.org/D13361





More information about the llvm-commits mailing list