[llvm] r243010 - [X86] Allow load folding into PUSH instructions

Sean Silva chisophugis at gmail.com
Sun Jul 26 23:23:35 PDT 2015


On Sun, Jul 26, 2015 at 10:49 PM, Kuperstein, Michael M <
michael.m.kuperstein at intel.com> wrote:

>  It seems that the feature was originally introduced to avoid the memory
> form of call instructions on Atom CPUs (including Silvermont).
>
> But there’s nothing really special about memory-form calls. The same
> applies to memory-form pushes, for example.
>
> This is based on the following guidance from the x86 optimization manual:
>
>
>
> For Atom (Bonnell):
>
> “For Intel Atom processors,  minimize the presence of complex instructions
> requiring MSROM to take advantage the optimal decode  bandwidth provided by
> the two decode units.”
>
>
>
> For Silvermont:
>
> “MSROM instructions should be avoided if possible. A good example is the
> memory form of PUSH and CALL. It will often be better to perform a load
> into a register and then perform the register version of  PUSH and CALL.”
>
>
>
> I’ve wanted to use MSROM explicitly to make cross-referencing with the
> manual easy, but, as I said, I’m not attached to that.
>

Oops, sorry. I didn't make that connection. Is MSROM just the general Intel
term for "microcoded"?


>
>
> Regarding your other question in the commit – there are, of course, plenty
> of microcoded instructions (there’s a table in the optimization manual),
> but the memory-form PUSHes and CALLs are probably the only ones that are
> both common and have faster alternative sequences. Off the top of my head,
> another one that has a faster alternative sequence is LEAVE, but we never
> generate it to begin with.
>

Which instructions are microcoded varies from processor to processor, so a
flag meaning "avoid microcoded instructions" doesn't make sense in general.
I think it makes more sense for the flag to denote a certain family of
instructions that, for a particular family of CPU's, are microcoded, then
we can apply that flag to the relevant CPU's. I haven't looked in detail,
but would something like AtomAvoidMicrocodedMemForms or something like that
make sense for the name?

-- Sean Silva


>
>
> Thanks,
>
>   Michael
>
>
>
> *From:* Sean Silva [mailto:chisophugis at gmail.com]
> *Sent:* Monday, July 27, 2015 03:51
>
> *To:* Kuperstein, Michael M
> *Cc:* llvm-commits at cs.uiuc.edu
> *Subject:* Re: [llvm] r243010 - [X86] Allow load folding into PUSH
> instructions
>
>
>
>
>
>
>
> On Sun, Jul 26, 2015 at 12:11 AM, Kuperstein, Michael M <
> michael.m.kuperstein at intel.com> wrote:
>
> Yes.
>
> I’m open to suggestions about the name, btw – I’m not at all attached to
> having “msrom” in it.
>
>
>
> I've updated the comment to not use the acronym MSROM in r243256.
>
>
>
> Do you have a better explanation of what subset of instructions this
> feature should really be guarding (now or in the future)?
>
>
>
> -- Sean Silva
>
>
>
>
>
> Michael
>
>
>
> *From:* Sean Silva [mailto:chisophugis at gmail.com]
> *Sent:* Friday, July 24, 2015 07:29
> *To:* Kuperstein, Michael M
> *Cc:* llvm-commits at cs.uiuc.edu
> *Subject:* Re: [llvm] r243010 - [X86] Allow load folding into PUSH
> instructions
>
>
>
>
>
>
>
> On Thu, Jul 23, 2015 at 5:23 AM, Michael Kuperstein <
> michael.m.kuperstein at intel.com> wrote:
>
> Author: mkuper
> Date: Thu Jul 23 07:23:45 2015
> New Revision: 243010
>
> URL: http://llvm.org/viewvc/llvm-project?rev=243010&view=rev
> Log:
> [X86] Allow load folding into PUSH instructions
>
> Adds pushes to the folding tables.
> This also required a fix to the TD definition, since the memory forms of
> the push instructions did not have the right mayLoad/mayStore flags.
>
> Differential Revision: http://reviews.llvm.org/D11340
>
> Added:
>     llvm/trunk/test/CodeGen/X86/fold-push.ll
> Modified:
>     llvm/trunk/lib/Target/X86/X86.td
>     llvm/trunk/lib/Target/X86/X86InstrInfo.cpp
>     llvm/trunk/lib/Target/X86/X86InstrInfo.td
>
> Modified: llvm/trunk/lib/Target/X86/X86.td
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86.td?rev=243010&r1=243009&r2=243010&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86.td (original)
> +++ llvm/trunk/lib/Target/X86/X86.td Thu Jul 23 07:23:45 2015
> @@ -181,6 +181,11 @@ def FeatureSlowDivide64 : SubtargetFeatu
>  def FeaturePadShortFunctions : SubtargetFeature<"pad-short-functions",
>                                       "PadShortFunctions", "true",
>                                       "Pad short functions">;
> +// TODO: This feature ought to be renamed.
> +// What it really refers to are CPUs where instruction that cause MSROM
> +// lookups are expensive, so alternative sequences should be preferred.
>
>
>
> What is "MSROM" in this context? Do you just mean that the instruction is
> microcoded?
>
>
>
> -- Sean Silva
>
>
>
>  +// The best examples of this are the memory forms of CALL and PUSH
> +// instructions, which should be avoided in favor of a MOV + register
> CALL/PUSH.
>  def FeatureCallRegIndirect : SubtargetFeature<"call-reg-indirect",
>                                       "CallRegIndirect", "true",
>                                       "Call register indirect">;
>
> Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.cpp?rev=243010&r1=243009&r2=243010&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86InstrInfo.cpp (original)
> +++ llvm/trunk/lib/Target/X86/X86InstrInfo.cpp Thu Jul 23 07:23:45 2015
> @@ -332,6 +332,9 @@ X86InstrInfo::X86InstrInfo(X86Subtarget
>      { X86::MUL8r,       X86::MUL8m,         TB_FOLDED_LOAD },
>      { X86::PEXTRDrr,    X86::PEXTRDmr,      TB_FOLDED_STORE },
>      { X86::PEXTRQrr,    X86::PEXTRQmr,      TB_FOLDED_STORE },
> +    { X86::PUSH16r,     X86::PUSH16rmm,     TB_FOLDED_LOAD },
> +    { X86::PUSH32r,     X86::PUSH32rmm,     TB_FOLDED_LOAD },
> +    { X86::PUSH64r,     X86::PUSH64rmm,     TB_FOLDED_LOAD },
>      { X86::SETAEr,      X86::SETAEm,        TB_FOLDED_STORE },
>      { X86::SETAr,       X86::SETAm,         TB_FOLDED_STORE },
>      { X86::SETBEr,      X86::SETBEm,        TB_FOLDED_STORE },
> @@ -4878,10 +4881,14 @@ MachineInstr *X86InstrInfo::foldMemoryOp
>    bool isCallRegIndirect = Subtarget.callRegIndirect();
>    bool isTwoAddrFold = false;
>
> -  // For CPUs that favor the register form of a call,
> -  // do not fold loads into calls.
> -  if (isCallRegIndirect &&
> -    (MI->getOpcode() == X86::CALL32r || MI->getOpcode() == X86::CALL64r))
> +  // For CPUs that favor the register form of a call or push,
> +  // do not fold loads into calls or pushes, unless optimizing for size
> +  // aggressively.
> +  if (isCallRegIndirect &&
> +      !MF.getFunction()->hasFnAttribute(Attribute::MinSize) &&
> +      (MI->getOpcode() == X86::CALL32r || MI->getOpcode() == X86::CALL64r
> ||
> +       MI->getOpcode() == X86::PUSH16r || MI->getOpcode() == X86::PUSH32r
> ||
> +       MI->getOpcode() == X86::PUSH64r))
>      return nullptr;
>
>    unsigned NumOps = MI->getDesc().getNumOperands();
>
> Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.td
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.td?rev=243010&r1=243009&r2=243010&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86InstrInfo.td (original)
> +++ llvm/trunk/lib/Target/X86/X86InstrInfo.td Thu Jul 23 07:23:45 2015
> @@ -1022,12 +1022,8 @@ def PUSH32r  : I<0x50, AddRegFrm, (outs)
>                   IIC_PUSH_REG>, OpSize32, Requires<[Not64BitMode]>;
>  def PUSH16rmr: I<0xFF, MRM6r, (outs), (ins GR16:$reg), "push{w}\t$reg",[],
>                   IIC_PUSH_REG>, OpSize16;
> -def PUSH16rmm: I<0xFF, MRM6m, (outs), (ins i16mem:$src),
> "push{w}\t$src",[],
> -                 IIC_PUSH_MEM>, OpSize16;
>  def PUSH32rmr: I<0xFF, MRM6r, (outs), (ins GR32:$reg), "push{l}\t$reg",[],
>                   IIC_PUSH_REG>, OpSize32, Requires<[Not64BitMode]>;
> -def PUSH32rmm: I<0xFF, MRM6m, (outs), (ins i32mem:$src),
> "push{l}\t$src",[],
> -                 IIC_PUSH_MEM>, OpSize32, Requires<[Not64BitMode]>;
>
>  def PUSH16i8 : Ii8<0x6a, RawFrm, (outs), (ins i16i8imm:$imm),
>                     "push{w}\t$imm", [], IIC_PUSH_IMM>, OpSize16;
> @@ -1041,6 +1037,14 @@ def PUSHi32  : Ii32<0x68, RawFrm, (outs)
>                     "push{l}\t$imm", [], IIC_PUSH_IMM>, OpSize32,
>                     Requires<[Not64BitMode]>;
>  } // mayStore, SchedRW
> +
> +let mayLoad = 1, mayStore = 1, SchedRW = [WriteRMW] in {
> +def PUSH16rmm: I<0xFF, MRM6m, (outs), (ins i16mem:$src),
> "push{w}\t$src",[],
> +                 IIC_PUSH_MEM>, OpSize16;
> +def PUSH32rmm: I<0xFF, MRM6m, (outs), (ins i32mem:$src),
> "push{l}\t$src",[],
> +                 IIC_PUSH_MEM>, OpSize32, Requires<[Not64BitMode]>;
> +} // mayLoad, mayStore, SchedRW
> +
>  }
>
>  let Defs = [ESP, EFLAGS], Uses = [ESP], mayLoad = 1, hasSideEffects=0,
> @@ -1073,9 +1077,11 @@ def PUSH64r  : I<0x50, AddRegFrm, (outs)
>                   IIC_PUSH_REG>, OpSize32, Requires<[In64BitMode]>;
>  def PUSH64rmr: I<0xFF, MRM6r, (outs), (ins GR64:$reg), "push{q}\t$reg",
> [],
>                   IIC_PUSH_REG>, OpSize32, Requires<[In64BitMode]>;
> +} // mayStore, SchedRW
> +let mayLoad = 1, mayStore = 1, SchedRW = [WriteRMW] in {
>  def PUSH64rmm: I<0xFF, MRM6m, (outs), (ins i64mem:$src), "push{q}\t$src",
> [],
>                   IIC_PUSH_MEM>, OpSize32, Requires<[In64BitMode]>;
> -} // mayStore, SchedRW
> +} // mayLoad, mayStore, SchedRW
>  }
>
>  let Defs = [RSP], Uses = [RSP], hasSideEffects = 0, mayStore = 1,
>
> Added: llvm/trunk/test/CodeGen/X86/fold-push.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/fold-push.ll?rev=243010&view=auto
>
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/fold-push.ll (added)
> +++ llvm/trunk/test/CodeGen/X86/fold-push.ll Thu Jul 23 07:23:45 2015
> @@ -0,0 +1,40 @@
> +; RUN: llc < %s -mtriple=i686-windows | FileCheck %s -check-prefix=CHECK
> -check-prefix=NORMAL
> +; RUN: llc < %s -mtriple=i686-windows -mattr=call-reg-indirect |
> FileCheck %s -check-prefix=CHECK -check-prefix=SLM
> +
> +declare void @foo(i32 %r)
> +
> +define void @test(i32 %a, i32 %b) optsize {
> +; CHECK-LABEL: test:
> +; CHECK: movl [[EAX:%e..]], (%esp)
> +; CHECK-NEXT: pushl [[EAX]]
> +; CHECK-NEXT: calll
> +; CHECK-NEXT: addl $4, %esp
> +; CHECK: nop
> +; NORMAL: pushl (%esp)
> +; SLM: movl (%esp), [[RELOAD:%e..]]
> +; SLM-NEXT: pushl [[RELOAD]]
> +; CHECK: calll
> +; CHECK-NEXT: addl $4, %esp
> +  %c = add i32 %a, %b
> +  call void @foo(i32 %c)
> +  call void asm sideeffect "nop",
> "~{ax},~{bx},~{cx},~{dx},~{bp},~{si},~{di}"()
> +  call void @foo(i32 %c)
> +  ret void
> +}
> +
> +define void @test_min(i32 %a, i32 %b) minsize {
> +; CHECK-LABEL: test_min:
> +; CHECK: movl [[EAX:%e..]], (%esp)
> +; CHECK-NEXT: pushl [[EAX]]
> +; CHECK-NEXT: calll
> +; CHECK-NEXT: addl $4, %esp
> +; CHECK: nop
> +; CHECK: pushl (%esp)
> +; CHECK: calll
> +; CHECK-NEXT: addl $4, %esp
> +  %c = add i32 %a, %b
> +  call void @foo(i32 %c)
> +  call void asm sideeffect "nop",
> "~{ax},~{bx},~{cx},~{dx},~{bp},~{si},~{di}"()
> +  call void @foo(i32 %c)
> +  ret void
> +}
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
>
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
>
>
>
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150726/36d12c65/attachment.html>


More information about the llvm-commits mailing list