[llvm-commits] [RFC/PATCH] introduce 'UseSSEx' predicates

Michael Liao michael.liao at intel.com
Thu Aug 30 09:58:25 PDT 2012


Committed as r162919 following your feedback. Thanks for review!

BTW, there're other places where AddedComplexity = 1 is used for
AVX/SSE. I will refine them once this commit doesn't break builds. (I
only verified it with regression test in both release/debug builds
locally.)

Yours
- Michael

On Wed, 2012-08-29 at 20:49 -0700, Craig Topper wrote:
> @@ -6913,7 +6913,8 @@ defm PCMPESTRI   :
> SS42AI_pcmpestri<"pcmpestri">;
>  // crc intrinsic instruction
>  // This set of instructions are only rm, the only difference is the
> size
>  // of r and m.
> -let Constraints = "$src1 = $dst" in {
> +let Constraints = "$src1 = $dst",
> +    Predicates = [HasSSE42] in {
>    def CRC32r32m8  : SS42FI<0xF0, MRMSrcMem, (outs GR32:$dst),
>                        (ins GR32:$src1, i8mem:$src2),
>                        "crc32{b} \t{$src2, $src1|$src1, $src2}",
> 
> This isn't necessary. SS42FI is only used by CRC32.
> 
> @@ -6804,7 +6804,7 @@ multiclass pseudo_pcmpistrm<string asm> {
>  let Defs = [EFLAGS], usesCustomInserter = 1 in {
>    let AddedComplexity = 1 in
>      defm VPCMPISTRM128 : pseudo_pcmpistrm<"#VPCMPISTRM128">,
> Requires<[HasAVX]>;
> -  defm PCMPISTRM128 : pseudo_pcmpistrm<"#PCMPISTRM128">,
> Requires<[HasSSE42]>;
> +  defm PCMPISTRM128 : pseudo_pcmpistrm<"#PCMPISTRM128">,
> Requires<[UseSSE42]>;
>  }
>  
>  let Defs = [XMM0, EFLAGS], neverHasSideEffects = 1, Predicates =
> [HasAVX] in {
> @@ -6842,7 +6842,7 @@ multiclass pseudo_pcmpestrm<string asm> {
>  let Defs = [EFLAGS], Uses = [EAX, EDX], usesCustomInserter = 1 in {
>    let AddedComplexity = 1 in
>      defm VPCMPESTRM128 : pseudo_pcmpestrm<"#VPCMPESTRM128">,
> Requires<[HasAVX]>;
> -  defm PCMPESTRM128 : pseudo_pcmpestrm<"#PCMPESTRM128">,
> Requires<[HasSSE42]>;
> +  defm PCMPESTRM128 : pseudo_pcmpestrm<"#PCMPESTRM128">,
> Requires<[UseSSE42]>;
>  }
>  
>  let Predicates = [HasAVX],
> 
> Can you remove the AddedComplexity = 1 here?
> 
> Other than those two things. I think it looks good.
> 
> On Wed, Aug 29, 2012 at 8:32 PM, Craig Topper <craig.topper at gmail.com>
> wrote:
>         I'm in agreement with the plan. Haven't done a detailed review
>         of the patch yet. Hopefully I can get to that later tonight.
>         
>         
>         On Wed, Aug 29, 2012 at 4:21 PM, Eli Friedman
>         <eli.friedman at gmail.com> wrote:
>                 On Tue, Aug 28, 2012 at 11:25 AM, Michael Liao
>                 <michael.liao at intel.com> wrote:
>                 > Ping again for comment on this proposal. - Michael
>                 >
>                 > On Sun, 2012-08-26 at 02:57 -0700, Michael Liao
>                 wrote:
>                 >> On Fri, 2012-08-24 at 19:23 -0700, Eli Friedman
>                 wrote:
>                 >> > On Fri, Aug 24, 2012 at 6:55 PM, Michael Liao
>                 <michael.liao at intel.com> wrote:
>                 >> > > On Fri, 2012-08-24 at 18:14 -0700, Eli Friedman
>                 wrote:
>                 >> > >> On Wed, Aug 22, 2012 at 10:27 AM, Michael Liao
>                 <michael.liao at intel.com> wrote:
>                 >> > >> > Hi
>                 >> > >> >
>                 >> > >> > The attached patch adds several new
>                 predicates, namely UseSSE1, UseSSE2,
>                 >> > >> > UseSSE3, UseSSSE3, UseSSE41, and UseSSE42.
>                 >> > >> >
>                 >> > >> > As the penalty of inter-mixing SSE and AVX
>                 instructions, we need prevent
>                 >> > >> > SSE legacy insn from being generated except
>                 explicitly specified through
>                 >> > >> > some intrinsics. For patterns both supported
>                 by both SSE and AVX, so
>                 >> > >> > far, we force AVX insn will be tried first
>                 relying on AddedComplexity or
>                 >> > >> > td location. It's error-prone and introduces
>                 bugs accidentally.
>                 >> > >> >
>                 >> > >> > 'UseSSEx' is disabled when AVX is turned on.
>                 For insns both supported by
>                 >> > >> > AVX and SSE insns, we need this predicate to
>                 force VEX encoding only.
>                 >> > >> >
>                 >> > >> > For insns not inherited by AVX, we still use
>                 the previous predicates,
>                 >> > >> > i.e. 'HasSSEx'. So far, these insns fall
>                 into the following categories:
>                 >> > >> >   * SSE insns with MMX operands
>                 >> > >> >   * SSE insns with GPR/mem operands only
>                 (xFENCE, PREFETCH, CLFLUSH,
>                 >> > >> >     CRC, and etc.)
>                 >> > >> >   * SSE4A insns.
>                 >> > >> >   * MMX insns.
>                 >> > >> >   * x87 insns added by SSE.
>                 >> > >> >
>                 >> > >> > With this patch, several inter-mixing cases
>                 are found and fixed from
>                 >> > >> > regression tests.
>                 >> > >>
>                 >> > >> ---
>                 a/test/ExecutionEngine/MCJIT/test-common-symbols.ll
>                 >> > >> +++
>                 b/test/ExecutionEngine/MCJIT/test-common-symbols.ll
>                 >> > >> @@ -1,4 +1,4 @@
>                 >> > >> -; RUN: %lli -use-mcjit -O0
>                 -disable-lazy-compilation=false %s
>                 >> > >> +; RUN: %lli -use-mcjit
>                 -disable-lazy-compilation=false %s
>                 >> > >>
>                 >> > >>  ; The intention of this test is to verify
>                 that symbols mapped to COMMON in ELF
>                 >> > >>  ; work as expected.
>                 >> > >>
>                 >> > >> What is the point of this change?
>                 >> > >
>                 >> > > So far, MCJIT cannot support the creation of
>                 '.rodata' section. This's
>                 >> > > work around. I will file a bug for MCJIT.
>                 >> >
>                 >> > Oh, I see...
>                 >> >
>                 >> > How hard would it be to fix fast-isel so it can
>                 select vcvtsi2sdq?
>                 >>
>                 >> Regard to the failure of MCJIT with this patch,
>                 finally I find the root
>                 >> cause. Target constant rematerialization in X86
>                 fast-isel doesn't handle
>                 >> large code model correctly and create a PC-rel
>                 addressing which crashes
>                 >> when the displacement is out of 32-bit immediate
>                 range. I added a simple
>                 >> fix similar to large code model is handled in X86
>                 fast-isel in
>                 >> elsewhere.
>                 >>
>                 >> The patch is revised and attached here. Not sure
>                 it's ok to commit.
>                 
>                 
>                 The fast-isel changes look fine; please commit as a
>                 separate patch
>                 with its own testcase.
>                 
>                 The other changes also look reasonable, but I would
>                 prefer if someone
>                 else familiar with AVX could take a look as well.
>                 
>                 Please file a bug on the fast-isel floating-point
>                 constant issue once
>                 you commit this.
>                 
>                 -Eli
>                 _______________________________________________
>                 llvm-commits mailing list
>                 llvm-commits at cs.uiuc.edu
>                 http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>                 
>         
>         
>         
>         
>         -- 
>         ~Craig
> 
> 
> 
> -- 
> ~Craig





More information about the llvm-commits mailing list