[llvm-commits] [RFC/PATCH] introduce 'UseSSEx' predicates

Craig Topper craig.topper at gmail.com
Wed Aug 29 20:49:40 PDT 2012


@@ -6913,7 +6913,8 @@ defm PCMPESTRI   : SS42AI_pcmpestri<"pcmpestri">;
 // crc intrinsic instruction
 // This set of instructions are only rm, the only difference is the size
 // of r and m.
-let Constraints = "$src1 = $dst" in {
+let Constraints = "$src1 = $dst",
+    Predicates = [HasSSE42] in {
   def CRC32r32m8  : SS42FI<0xF0, MRMSrcMem, (outs GR32:$dst),
                       (ins GR32:$src1, i8mem:$src2),
                       "crc32{b} \t{$src2, $src1|$src1, $src2}",

This isn't necessary. SS42FI is only used by CRC32.

@@ -6804,7 +6804,7 @@ multiclass pseudo_pcmpistrm<string asm> {
 let Defs = [EFLAGS], usesCustomInserter = 1 in {
   let AddedComplexity = 1 in
     defm VPCMPISTRM128 : pseudo_pcmpistrm<"#VPCMPISTRM128">,
Requires<[HasAVX]>;
-  defm PCMPISTRM128 : pseudo_pcmpistrm<"#PCMPISTRM128">,
Requires<[HasSSE42]>;
+  defm PCMPISTRM128 : pseudo_pcmpistrm<"#PCMPISTRM128">,
Requires<[UseSSE42]>;
 }

 let Defs = [XMM0, EFLAGS], neverHasSideEffects = 1, Predicates = [HasAVX]
in {
@@ -6842,7 +6842,7 @@ multiclass pseudo_pcmpestrm<string asm> {
 let Defs = [EFLAGS], Uses = [EAX, EDX], usesCustomInserter = 1 in {
   let AddedComplexity = 1 in
     defm VPCMPESTRM128 : pseudo_pcmpestrm<"#VPCMPESTRM128">,
Requires<[HasAVX]>;
-  defm PCMPESTRM128 : pseudo_pcmpestrm<"#PCMPESTRM128">,
Requires<[HasSSE42]>;
+  defm PCMPESTRM128 : pseudo_pcmpestrm<"#PCMPESTRM128">,
Requires<[UseSSE42]>;
 }

 let Predicates = [HasAVX],

Can you remove the AddedComplexity = 1 here?

Other than those two things. I think it looks good.

On Wed, Aug 29, 2012 at 8:32 PM, Craig Topper <craig.topper at gmail.com>wrote:

> I'm in agreement with the plan. Haven't done a detailed review of the
> patch yet. Hopefully I can get to that later tonight.
>
>
> On Wed, Aug 29, 2012 at 4:21 PM, Eli Friedman <eli.friedman at gmail.com>wrote:
>
>> On Tue, Aug 28, 2012 at 11:25 AM, Michael Liao <michael.liao at intel.com>
>> wrote:
>> > Ping again for comment on this proposal. - Michael
>> >
>> > On Sun, 2012-08-26 at 02:57 -0700, Michael Liao wrote:
>> >> On Fri, 2012-08-24 at 19:23 -0700, Eli Friedman wrote:
>> >> > On Fri, Aug 24, 2012 at 6:55 PM, Michael Liao <
>> michael.liao at intel.com> wrote:
>> >> > > On Fri, 2012-08-24 at 18:14 -0700, Eli Friedman wrote:
>> >> > >> On Wed, Aug 22, 2012 at 10:27 AM, Michael Liao <
>> michael.liao at intel.com> wrote:
>> >> > >> > Hi
>> >> > >> >
>> >> > >> > The attached patch adds several new predicates, namely UseSSE1,
>> UseSSE2,
>> >> > >> > UseSSE3, UseSSSE3, UseSSE41, and UseSSE42.
>> >> > >> >
>> >> > >> > As the penalty of inter-mixing SSE and AVX instructions, we
>> need prevent
>> >> > >> > SSE legacy insn from being generated except explicitly
>> specified through
>> >> > >> > some intrinsics. For patterns both supported by both SSE and
>> AVX, so
>> >> > >> > far, we force AVX insn will be tried first relying on
>> AddedComplexity or
>> >> > >> > td location. It's error-prone and introduces bugs accidentally.
>> >> > >> >
>> >> > >> > 'UseSSEx' is disabled when AVX is turned on. For insns both
>> supported by
>> >> > >> > AVX and SSE insns, we need this predicate to force VEX encoding
>> only.
>> >> > >> >
>> >> > >> > For insns not inherited by AVX, we still use the previous
>> predicates,
>> >> > >> > i.e. 'HasSSEx'. So far, these insns fall into the following
>> categories:
>> >> > >> >   * SSE insns with MMX operands
>> >> > >> >   * SSE insns with GPR/mem operands only (xFENCE, PREFETCH,
>> CLFLUSH,
>> >> > >> >     CRC, and etc.)
>> >> > >> >   * SSE4A insns.
>> >> > >> >   * MMX insns.
>> >> > >> >   * x87 insns added by SSE.
>> >> > >> >
>> >> > >> > With this patch, several inter-mixing cases are found and fixed
>> from
>> >> > >> > regression tests.
>> >> > >>
>> >> > >> --- a/test/ExecutionEngine/MCJIT/test-common-symbols.ll
>> >> > >> +++ b/test/ExecutionEngine/MCJIT/test-common-symbols.ll
>> >> > >> @@ -1,4 +1,4 @@
>> >> > >> -; RUN: %lli -use-mcjit -O0 -disable-lazy-compilation=false %s
>> >> > >> +; RUN: %lli -use-mcjit -disable-lazy-compilation=false %s
>> >> > >>
>> >> > >>  ; The intention of this test is to verify that symbols mapped to
>> COMMON in ELF
>> >> > >>  ; work as expected.
>> >> > >>
>> >> > >> What is the point of this change?
>> >> > >
>> >> > > So far, MCJIT cannot support the creation of '.rodata' section.
>> This's
>> >> > > work around. I will file a bug for MCJIT.
>> >> >
>> >> > Oh, I see...
>> >> >
>> >> > How hard would it be to fix fast-isel so it can select vcvtsi2sdq?
>> >>
>> >> Regard to the failure of MCJIT with this patch, finally I find the root
>> >> cause. Target constant rematerialization in X86 fast-isel doesn't
>> handle
>> >> large code model correctly and create a PC-rel addressing which crashes
>> >> when the displacement is out of 32-bit immediate range. I added a
>> simple
>> >> fix similar to large code model is handled in X86 fast-isel in
>> >> elsewhere.
>> >>
>> >> The patch is revised and attached here. Not sure it's ok to commit.
>>
>> The fast-isel changes look fine; please commit as a separate patch
>> with its own testcase.
>>
>> The other changes also look reasonable, but I would prefer if someone
>> else familiar with AVX could take a look as well.
>>
>> Please file a bug on the fast-isel floating-point constant issue once
>> you commit this.
>>
>> -Eli
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>
>
>
> --
> ~Craig
>



-- 
~Craig
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120829/b773ce65/attachment.html>


More information about the llvm-commits mailing list