[llvm-commits] [RFC/PATCH] introduce 'UseSSEx' predicates
Michael Liao
michael.liao at intel.com
Thu Aug 30 09:58:25 PDT 2012
Committed as r162919 following your feedback. Thanks for review!
BTW, there're other places where AddedComplexity = 1 is used for
AVX/SSE. I will refine them once this commit doesn't break builds. (I
only verified it with regression test in both release/debug builds
locally.)
Yours
- Michael
On Wed, 2012-08-29 at 20:49 -0700, Craig Topper wrote:
> @@ -6913,7 +6913,8 @@ defm PCMPESTRI :
> SS42AI_pcmpestri<"pcmpestri">;
> // crc intrinsic instruction
> // This set of instructions are only rm, the only difference is the
> size
> // of r and m.
> -let Constraints = "$src1 = $dst" in {
> +let Constraints = "$src1 = $dst",
> + Predicates = [HasSSE42] in {
> def CRC32r32m8 : SS42FI<0xF0, MRMSrcMem, (outs GR32:$dst),
> (ins GR32:$src1, i8mem:$src2),
> "crc32{b} \t{$src2, $src1|$src1, $src2}",
>
> This isn't necessary. SS42FI is only used by CRC32.
>
> @@ -6804,7 +6804,7 @@ multiclass pseudo_pcmpistrm<string asm> {
> let Defs = [EFLAGS], usesCustomInserter = 1 in {
> let AddedComplexity = 1 in
> defm VPCMPISTRM128 : pseudo_pcmpistrm<"#VPCMPISTRM128">,
> Requires<[HasAVX]>;
> - defm PCMPISTRM128 : pseudo_pcmpistrm<"#PCMPISTRM128">,
> Requires<[HasSSE42]>;
> + defm PCMPISTRM128 : pseudo_pcmpistrm<"#PCMPISTRM128">,
> Requires<[UseSSE42]>;
> }
>
> let Defs = [XMM0, EFLAGS], neverHasSideEffects = 1, Predicates =
> [HasAVX] in {
> @@ -6842,7 +6842,7 @@ multiclass pseudo_pcmpestrm<string asm> {
> let Defs = [EFLAGS], Uses = [EAX, EDX], usesCustomInserter = 1 in {
> let AddedComplexity = 1 in
> defm VPCMPESTRM128 : pseudo_pcmpestrm<"#VPCMPESTRM128">,
> Requires<[HasAVX]>;
> - defm PCMPESTRM128 : pseudo_pcmpestrm<"#PCMPESTRM128">,
> Requires<[HasSSE42]>;
> + defm PCMPESTRM128 : pseudo_pcmpestrm<"#PCMPESTRM128">,
> Requires<[UseSSE42]>;
> }
>
> let Predicates = [HasAVX],
>
> Can you remove the AddedComplexity = 1 here?
>
> Other than those two things. I think it looks good.
>
> On Wed, Aug 29, 2012 at 8:32 PM, Craig Topper <craig.topper at gmail.com>
> wrote:
> I'm in agreement with the plan. Haven't done a detailed review
> of the patch yet. Hopefully I can get to that later tonight.
>
>
> On Wed, Aug 29, 2012 at 4:21 PM, Eli Friedman
> <eli.friedman at gmail.com> wrote:
> On Tue, Aug 28, 2012 at 11:25 AM, Michael Liao
> <michael.liao at intel.com> wrote:
> > Ping again for comment on this proposal. - Michael
> >
> > On Sun, 2012-08-26 at 02:57 -0700, Michael Liao
> wrote:
> >> On Fri, 2012-08-24 at 19:23 -0700, Eli Friedman
> wrote:
> >> > On Fri, Aug 24, 2012 at 6:55 PM, Michael Liao
> <michael.liao at intel.com> wrote:
> >> > > On Fri, 2012-08-24 at 18:14 -0700, Eli Friedman
> wrote:
> >> > >> On Wed, Aug 22, 2012 at 10:27 AM, Michael Liao
> <michael.liao at intel.com> wrote:
> >> > >> > Hi
> >> > >> >
> >> > >> > The attached patch adds several new
> predicates, namely UseSSE1, UseSSE2,
> >> > >> > UseSSE3, UseSSSE3, UseSSE41, and UseSSE42.
> >> > >> >
> >> > >> > As the penalty of inter-mixing SSE and AVX
> instructions, we need prevent
> >> > >> > SSE legacy insn from being generated except
> explicitly specified through
> >> > >> > some intrinsics. For patterns both supported
> by both SSE and AVX, so
> >> > >> > far, we force AVX insn will be tried first
> relying on AddedComplexity or
> >> > >> > td location. It's error-prone and introduces
> bugs accidentally.
> >> > >> >
> >> > >> > 'UseSSEx' is disabled when AVX is turned on.
> For insns both supported by
> >> > >> > AVX and SSE insns, we need this predicate to
> force VEX encoding only.
> >> > >> >
> >> > >> > For insns not inherited by AVX, we still use
> the previous predicates,
> >> > >> > i.e. 'HasSSEx'. So far, these insns fall
> into the following categories:
> >> > >> > * SSE insns with MMX operands
> >> > >> > * SSE insns with GPR/mem operands only
> (xFENCE, PREFETCH, CLFLUSH,
> >> > >> > CRC, and etc.)
> >> > >> > * SSE4A insns.
> >> > >> > * MMX insns.
> >> > >> > * x87 insns added by SSE.
> >> > >> >
> >> > >> > With this patch, several inter-mixing cases
> are found and fixed from
> >> > >> > regression tests.
> >> > >>
> >> > >> ---
> a/test/ExecutionEngine/MCJIT/test-common-symbols.ll
> >> > >> +++
> b/test/ExecutionEngine/MCJIT/test-common-symbols.ll
> >> > >> @@ -1,4 +1,4 @@
> >> > >> -; RUN: %lli -use-mcjit -O0
> -disable-lazy-compilation=false %s
> >> > >> +; RUN: %lli -use-mcjit
> -disable-lazy-compilation=false %s
> >> > >>
> >> > >> ; The intention of this test is to verify
> that symbols mapped to COMMON in ELF
> >> > >> ; work as expected.
> >> > >>
> >> > >> What is the point of this change?
> >> > >
> >> > > So far, MCJIT cannot support the creation of
> '.rodata' section. This's
> >> > > work around. I will file a bug for MCJIT.
> >> >
> >> > Oh, I see...
> >> >
> >> > How hard would it be to fix fast-isel so it can
> select vcvtsi2sdq?
> >>
> >> Regard to the failure of MCJIT with this patch,
> finally I find the root
> >> cause. Target constant rematerialization in X86
> fast-isel doesn't handle
> >> large code model correctly and create a PC-rel
> addressing which crashes
> >> when the displacement is out of 32-bit immediate
> range. I added a simple
> >> fix similar to large code model is handled in X86
> fast-isel in
> >> elsewhere.
> >>
> >> The patch is revised and attached here. Not sure
> it's ok to commit.
>
>
> The fast-isel changes look fine; please commit as a
> separate patch
> with its own testcase.
>
> The other changes also look reasonable, but I would
> prefer if someone
> else familiar with AVX could take a look as well.
>
> Please file a bug on the fast-isel floating-point
> constant issue once
> you commit this.
>
> -Eli
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
>
>
>
> --
> ~Craig
>
>
>
> --
> ~Craig
More information about the llvm-commits
mailing list