[llvm] r243010 - [X86] Allow load folding into PUSH instructions

Tue Jul 28 17:44:26 PDT 2015

On Mon, Jul 27, 2015 at 12:37 AM, Kuperstein, Michael M <
michael.m.kuperstein at intel.com> wrote:

>  Yeah, MSROM is short for “microcode sequencer ROM”.
>
> I agree it’s a niche term, it’s just that this is what the optimization
> manual happens to use.
>
>
>
> And, yes, “avoid microcoded instructions” is way too generic, but the
> problem is that this specific behavior is not really tied to a specific
> architecture (especially since Atom in LLVM is a synonym for Bonnell, so
> Atom and Silvermont are distinct), but to an architecture family. So I’d
> rather avoid using “Atom” as part of the name. Plus, I’d prefer try to keep
> it reasonably short. I guess using Atom is better than the alternative,
> though.
>
> How about FeatureAtomAvoidMicrocode and -mattr=-atom-avoid-microcode (or
> FeatureAtomAvoidMSROM and -atom-avoid-msrom?)
>

I like that. We use Atom to mean Bonnell AFAICT just because we mean "Atom"
as in "compatible with all Atom CPU's" and Bonnell happens to be the lowest
common denominator (I think?), so I think using Atom like this makes sense.

-- Sean Silva

>
>
> Michael
>
>
>
> *From:* Sean Silva [mailto:chisophugis at gmail.com]
> *Sent:* Monday, July 27, 2015 09:24
>
> *To:* Kuperstein, Michael M
> *Cc:* llvm-commits at cs.uiuc.edu
> *Subject:* Re: [llvm] r243010 - [X86] Allow load folding into PUSH
> instructions
>
>
>
>
>
>
>
> On Sun, Jul 26, 2015 at 10:49 PM, Kuperstein, Michael M <
> michael.m.kuperstein at intel.com> wrote:
>
> It seems that the feature was originally introduced to avoid the memory
> form of call instructions on Atom CPUs (including Silvermont).
>
> But there’s nothing really special about memory-form calls. The same
> applies to memory-form pushes, for example.
>
> This is based on the following guidance from the x86 optimization manual:
>
>
>
> For Atom (Bonnell):
>
> “For Intel Atom processors,  minimize the presence of complex instructions
> requiring MSROM to take advantage the optimal decode  bandwidth provided by
> the two decode units.”
>
>
>
> For Silvermont:
>
> “MSROM instructions should be avoided if possible. A good example is the
> memory form of PUSH and CALL. It will often be better to perform a load
> into a register and then perform the register version of  PUSH and CALL.”
>
>
>
> I’ve wanted to use MSROM explicitly to make cross-referencing with the
> manual easy, but, as I said, I’m not attached to that.
>
>
>
> Oops, sorry. I didn't make that connection. Is MSROM just the general
> Intel term for "microcoded"?
>
>
>
>
>
> Regarding your other question in the commit – there are, of course, plenty
> of microcoded instructions (there’s a table in the optimization manual),
> but the memory-form PUSHes and CALLs are probably the only ones that are
> both common and have faster alternative sequences. Off the top of my head,
> another one that has a faster alternative sequence is LEAVE, but we never
> generate it to begin with.
>
>
>
> Which instructions are microcoded varies from processor to processor, so a
> flag meaning "avoid microcoded instructions" doesn't make sense in general.
> I think it makes more sense for the flag to denote a certain family of
> instructions that, for a particular family of CPU's, are microcoded, then
> we can apply that flag to the relevant CPU's. I haven't looked in detail,
> but would something like AtomAvoidMicrocodedMemForms or something like that
> make sense for the name?
>
>
>
> -- Sean Silva
>
>
>
>
>
> Thanks,
>
>   Michael
>
>
>
> *From:* Sean Silva [mailto:chisophugis at gmail.com]
> *Sent:* Monday, July 27, 2015 03:51
>
>
> *To:* Kuperstein, Michael M
> *Cc:* llvm-commits at cs.uiuc.edu
> *Subject:* Re: [llvm] r243010 - [X86] Allow load folding into PUSH
> instructions
>
>
>
>
>
>
>
> On Sun, Jul 26, 2015 at 12:11 AM, Kuperstein, Michael M <
> michael.m.kuperstein at intel.com> wrote:
>
> Yes.
>
> I’m open to suggestions about the name, btw – I’m not at all attached to
> having “msrom” in it.
>
>
>
> I've updated the comment to not use the acronym MSROM in r243256.
>
>
>
> Do you have a better explanation of what subset of instructions this
> feature should really be guarding (now or in the future)?
>
>
>
> -- Sean Silva
>
>
>
>
>
> Michael
>
>
>
> *From:* Sean Silva [mailto:chisophugis at gmail.com]
> *Sent:* Friday, July 24, 2015 07:29
> *To:* Kuperstein, Michael M
> *Cc:* llvm-commits at cs.uiuc.edu
> *Subject:* Re: [llvm] r243010 - [X86] Allow load folding into PUSH
> instructions
>
>
>
>
>
>
>
> On Thu, Jul 23, 2015 at 5:23 AM, Michael Kuperstein <
> michael.m.kuperstein at intel.com> wrote:
>
> Author: mkuper
> Date: Thu Jul 23 07:23:45 2015
> New Revision: 243010
>
> URL: http://llvm.org/viewvc/llvm-project?rev=243010&view=rev
> Log:
> [X86] Allow load folding into PUSH instructions
>
> Adds pushes to the folding tables.
> This also required a fix to the TD definition, since the memory forms of
> the push instructions did not have the right mayLoad/mayStore flags.
>
> Differential Revision: http://reviews.llvm.org/D11340
>
> Added:
>     llvm/trunk/test/CodeGen/X86/fold-push.ll
> Modified:
>     llvm/trunk/lib/Target/X86/X86.td
>     llvm/trunk/lib/Target/X86/X86InstrInfo.cpp
>     llvm/trunk/lib/Target/X86/X86InstrInfo.td
>
> Modified: llvm/trunk/lib/Target/X86/X86.td
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86.td?rev=243010&r1=243009&r2=243010&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86.td (original)
> +++ llvm/trunk/lib/Target/X86/X86.td Thu Jul 23 07:23:45 2015
> @@ -181,6 +181,11 @@ def FeatureSlowDivide64 : SubtargetFeatu
>  def FeaturePadShortFunctions : SubtargetFeature<"pad-short-functions",
>                                       "PadShortFunctions", "true",
>                                       "Pad short functions">;
> +// TODO: This feature ought to be renamed.
> +// What it really refers to are CPUs where instruction that cause MSROM
> +// lookups are expensive, so alternative sequences should be preferred.
>
>
>
> What is "MSROM" in this context? Do you just mean that the instruction is
> microcoded?
>
>
>
> -- Sean Silva
>
>
>
>  +// The best examples of this are the memory forms of CALL and PUSH
> +// instructions, which should be avoided in favor of a MOV + register
> CALL/PUSH.
>  def FeatureCallRegIndirect : SubtargetFeature<"call-reg-indirect",
>                                       "CallRegIndirect", "true",
>                                       "Call register indirect">;
>
> Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.cpp?rev=243010&r1=243009&r2=243010&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86InstrInfo.cpp (original)
> +++ llvm/trunk/lib/Target/X86/X86InstrInfo.cpp Thu Jul 23 07:23:45 2015
> @@ -332,6 +332,9 @@ X86InstrInfo::X86InstrInfo(X86Subtarget
>      { X86::MUL8r,       X86::MUL8m,         TB_FOLDED_LOAD },
>      { X86::PEXTRDrr,    X86::PEXTRDmr,      TB_FOLDED_STORE },
>      { X86::PEXTRQrr,    X86::PEXTRQmr,      TB_FOLDED_STORE },
> +    { X86::PUSH16r,     X86::PUSH16rmm,     TB_FOLDED_LOAD },
> +    { X86::PUSH32r,     X86::PUSH32rmm,     TB_FOLDED_LOAD },
> +    { X86::PUSH64r,     X86::PUSH64rmm,     TB_FOLDED_LOAD },
>      { X86::SETAEr,      X86::SETAEm,        TB_FOLDED_STORE },
>      { X86::SETAr,       X86::SETAm,         TB_FOLDED_STORE },
>      { X86::SETBEr,      X86::SETBEm,        TB_FOLDED_STORE },
> @@ -4878,10 +4881,14 @@ MachineInstr *X86InstrInfo::foldMemoryOp
>    bool isCallRegIndirect = Subtarget.callRegIndirect();
>    bool isTwoAddrFold = false;
>
> -  // For CPUs that favor the register form of a call,
> -  // do not fold loads into calls.
> -  if (isCallRegIndirect &&
> -    (MI->getOpcode() == X86::CALL32r || MI->getOpcode() == X86::CALL64r))
> +  // For CPUs that favor the register form of a call or push,
> +  // do not fold loads into calls or pushes, unless optimizing for size
> +  // aggressively.
> +  if (isCallRegIndirect &&
> +      !MF.getFunction()->hasFnAttribute(Attribute::MinSize) &&
> +      (MI->getOpcode() == X86::CALL32r || MI->getOpcode() == X86::CALL64r
> ||
> +       MI->getOpcode() == X86::PUSH16r || MI->getOpcode() == X86::PUSH32r
> ||
> +       MI->getOpcode() == X86::PUSH64r))
>      return nullptr;
>
>    unsigned NumOps = MI->getDesc().getNumOperands();
>
> Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.td
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.td?rev=243010&r1=243009&r2=243010&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86InstrInfo.td (original)
> +++ llvm/trunk/lib/Target/X86/X86InstrInfo.td Thu Jul 23 07:23:45 2015
> @@ -1022,12 +1022,8 @@ def PUSH32r  : I<0x50, AddRegFrm, (outs)
>                   IIC_PUSH_REG>, OpSize32, Requires<[Not64BitMode]>;
>  def PUSH16rmr: I<0xFF, MRM6r, (outs), (ins GR16:$reg), "push{w}\t$reg",[],
>                   IIC_PUSH_REG>, OpSize16;
> -def PUSH16rmm: I<0xFF, MRM6m, (outs), (ins i16mem:$src),
> "push{w}\t$src",[],
> -                 IIC_PUSH_MEM>, OpSize16;
>  def PUSH32rmr: I<0xFF, MRM6r, (outs), (ins GR32:$reg), "push{l}\t$reg",[],
>                   IIC_PUSH_REG>, OpSize32, Requires<[Not64BitMode]>;
> -def PUSH32rmm: I<0xFF, MRM6m, (outs), (ins i32mem:$src),
> "push{l}\t$src",[],
> -                 IIC_PUSH_MEM>, OpSize32, Requires<[Not64BitMode]>;
>
>  def PUSH16i8 : Ii8<0x6a, RawFrm, (outs), (ins i16i8imm:$imm),
>                     "push{w}\t$imm", [], IIC_PUSH_IMM>, OpSize16;
> @@ -1041,6 +1037,14 @@ def PUSHi32  : Ii32<0x68, RawFrm, (outs)
>                     "push{l}\t$imm", [], IIC_PUSH_IMM>, OpSize32,
>                     Requires<[Not64BitMode]>;
>  } // mayStore, SchedRW
> +
> +let mayLoad = 1, mayStore = 1, SchedRW = [WriteRMW] in {
> +def PUSH16rmm: I<0xFF, MRM6m, (outs), (ins i16mem:$src),
> "push{w}\t$src",[],
> +                 IIC_PUSH_MEM>, OpSize16;
> +def PUSH32rmm: I<0xFF, MRM6m, (outs), (ins i32mem:$src),
> "push{l}\t$src",[],
> +                 IIC_PUSH_MEM>, OpSize32, Requires<[Not64BitMode]>;
> +} // mayLoad, mayStore, SchedRW
> +
>  }
>
>  let Defs = [ESP, EFLAGS], Uses = [ESP], mayLoad = 1, hasSideEffects=0,
> @@ -1073,9 +1077,11 @@ def PUSH64r  : I<0x50, AddRegFrm, (outs)
>                   IIC_PUSH_REG>, OpSize32, Requires<[In64BitMode]>;
>  def PUSH64rmr: I<0xFF, MRM6r, (outs), (ins GR64:$reg), "push{q}\t$reg",
> [],
>                   IIC_PUSH_REG>, OpSize32, Requires<[In64BitMode]>;
> +} // mayStore, SchedRW
> +let mayLoad = 1, mayStore = 1, SchedRW = [WriteRMW] in {
>  def PUSH64rmm: I<0xFF, MRM6m, (outs), (ins i64mem:$src), "push{q}\t$src",
> [],
>                   IIC_PUSH_MEM>, OpSize32, Requires<[In64BitMode]>;
> -} // mayStore, SchedRW
> +} // mayLoad, mayStore, SchedRW
>  }
>
>  let Defs = [RSP], Uses = [RSP], hasSideEffects = 0, mayStore = 1,
>
> Added: llvm/trunk/test/CodeGen/X86/fold-push.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/fold-push.ll?rev=243010&view=auto
>
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/fold-push.ll (added)
> +++ llvm/trunk/test/CodeGen/X86/fold-push.ll Thu Jul 23 07:23:45 2015
> @@ -0,0 +1,40 @@
> +; RUN: llc < %s -mtriple=i686-windows | FileCheck %s -check-prefix=CHECK
> -check-prefix=NORMAL
> +; RUN: llc < %s -mtriple=i686-windows -mattr=call-reg-indirect |
> FileCheck %s -check-prefix=CHECK -check-prefix=SLM
> +
> +declare void @foo(i32 %r)
> +
> +define void @test(i32 %a, i32 %b) optsize {
> +; CHECK-LABEL: test:
> +; CHECK: movl [[EAX:%e..]], (%esp)
> +; CHECK-NEXT: pushl [[EAX]]
> +; CHECK-NEXT: calll
> +; CHECK-NEXT: addl $4, %esp
> +; CHECK: nop
> +; NORMAL: pushl (%esp)
> +; SLM: movl (%esp), [[RELOAD:%e..]]
> +; SLM-NEXT: pushl [[RELOAD]]
> +; CHECK: calll
> +; CHECK-NEXT: addl $4, %esp
> +  %c = add i32 %a, %b
> +  call void @foo(i32 %c)
> +  call void asm sideeffect "nop",
> "~{ax},~{bx},~{cx},~{dx},~{bp},~{si},~{di}"()
> +  call void @foo(i32 %c)
> +  ret void
> +}
> +
> +define void @test_min(i32 %a, i32 %b) minsize {
> +; CHECK-LABEL: test_min:
> +; CHECK: movl [[EAX:%e..]], (%esp)
> +; CHECK-NEXT: pushl [[EAX]]
> +; CHECK-NEXT: calll
> +; CHECK-NEXT: addl $4, %esp
> +; CHECK: nop
> +; CHECK: pushl (%esp)
> +; CHECK: calll
> +; CHECK-NEXT: addl $4, %esp
> +  %c = add i32 %a, %b
> +  call void @foo(i32 %c)
> +  call void asm sideeffect "nop",
> "~{ax},~{bx},~{cx},~{dx},~{bp},~{si},~{di}"()
> +  call void @foo(i32 %c)
> +  ret void
> +}
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
>
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
>
>
>
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
>
>
>
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150728/5011c0e9/attachment.html>