[llvm] r249731 - Move the MMX subtarget feature out of the SSE set of features and into
Chandler Carruth via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 9 10:40:53 PDT 2015
On Thu, Oct 8, 2015 at 1:11 PM Eric Christopher via llvm-commits <llvm-commits at lists.llvm.org> wrote:
> Author: echristo
> Date: Thu Oct 8 15:10:06 2015
> New Revision: 249731
>
> URL: http://llvm.org/viewvc/llvm-project?rev=249731&view=rev
> Log:
> Move the MMX subtarget feature out of the SSE set of features and into
> its own variable.
>
> This is needed so that we can explicitly turn off MMX without turning
> off SSE and also so that we can diagnose feature set incompatibilities
> that involve MMX without SSE.
>
> Rationale:
>
> // sse3
> __m128d test_mm_addsub_pd(__m128d A, __m128d B) {
> return _mm_addsub_pd(A, B);
> }
>
> // mmx
> void shift(__m64 a, __m64 b, int c) {
> _mm_slli_pi16(a, c);
> _mm_slli_pi32(a, c);
> _mm_slli_si64(a, c);
> _mm_srli_pi16(a, c);
> _mm_srli_pi32(a, c);
> _mm_srli_si64(a, c);
> _mm_srai_pi16(a, c);
> _mm_srai_pi32(a, c);
> }
>
> clang -msse3 -mno-mmx file.c -c
>
> For this code we should be able to explicitly turn off MMX
> without affecting the compilation of the SSE3 function and then
> diagnose and error on compiling the MMX function.
>
> This matches the existing gcc behavior and follows the spirit of
> the SSE/MMX separation in llvm where we can (and do) turn off
> MMX code generation except in the presence of intrinsics.
>
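The rationale above can be illustrated with a simplified model of implied features. In LLVM's subtarget feature handling, disabling a feature also disables every feature that lists it as implied, which is why -mno-mmx used to take SSE down with it. The sketch below is not LLVM's code: FeatureSet, oldImplies, newImplies and sse3SurvivesNoMMX are hypothetical names, and the tables only mirror the sse/sse2/sse3 part of X86.td.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: feature name -> features it implies.
using ImpliesTable = std::map<std::string, std::vector<std::string>>;

struct FeatureSet {
  const ImpliesTable &Implies;
  std::set<std::string> Enabled;

  explicit FeatureSet(const ImpliesTable &T) : Implies(T) {}

  // Turning a feature on also turns on everything it implies.
  void enable(const std::string &F) {
    if (!Enabled.insert(F).second)
      return; // already enabled
    auto It = Implies.find(F);
    if (It == Implies.end())
      return;
    for (const auto &Dep : It->second)
      enable(Dep);
  }

  // Turning a feature off also turns off every enabled feature that
  // implies it, transitively.
  void disable(const std::string &F) {
    Enabled.erase(F);
    for (const auto &Entry : Implies) {
      if (!Enabled.count(Entry.first))
        continue;
      for (const auto &Dep : Entry.second) {
        if (Dep == F) {
          disable(Entry.first);
          break;
        }
      }
    }
  }

  bool has(const std::string &F) const { return Enabled.count(F) != 0; }
};

// Before the change: "sse" implied "mmx" (and "cmov").
ImpliesTable oldImplies() {
  return {{"sse", {"mmx", "cmov"}}, {"sse2", {"sse"}}, {"sse3", {"sse2"}}};
}

// After the change: "sse" implies only "cmov".
ImpliesTable newImplies() {
  return {{"sse", {"cmov"}}, {"sse2", {"sse"}}, {"sse3", {"sse2"}}};
}

// Models "-msse3 -mno-mmx": is SSE3 still on once MMX is turned off?
bool sse3SurvivesNoMMX(const ImpliesTable &T) {
  FeatureSet FS(T);
  FS.enable("sse3");
  FS.disable("mmx");
  return FS.has("sse3") && !FS.has("mmx");
}

// Small self-check tying the model back to the commit message.
void checkFeatureModel() {
  assert(!sse3SurvivesNoMMX(oldImplies()));
  assert(sse3SurvivesNoMMX(newImplies()));
}
```

Under these assumptions, calling checkFeatureModel() passes: with the old table, turning off "mmx" cascades through sse, sse2 and sse3, while with the new table SSE3 stays enabled.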
> Updated a couple of tests, but primarily tested with a couple of tests
> for turning on only mmx and only sse.
>
> This is paired with a patch to clang to take advantage of this behavior.
>
> Added:
> llvm/trunk/test/CodeGen/X86/mmx-only.ll
> llvm/trunk/test/CodeGen/X86/sse-only.ll
> Modified:
> llvm/trunk/lib/Target/X86/X86.td
> llvm/trunk/lib/Target/X86/X86Subtarget.cpp
> llvm/trunk/lib/Target/X86/X86Subtarget.h
> llvm/trunk/test/CodeGen/X86/mmx-intrinsics.ll
> llvm/trunk/test/CodeGen/X86/mult-alt-x86.ll
>
> Modified: llvm/trunk/lib/Target/X86/X86.td
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86.td?rev=249731&r1=249730&r2=249731&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86.td (original)
> +++ llvm/trunk/lib/Target/X86/X86.td Thu Oct 8 15:10:06 2015
> @@ -37,14 +37,17 @@ def FeatureCMOV : SubtargetFeature<"c
> def FeaturePOPCNT : SubtargetFeature<"popcnt", "HasPOPCNT", "true",
> "Support POPCNT instruction">;
>
> -
> -def FeatureMMX : SubtargetFeature<"mmx","X86SSELevel", "MMX",
> +// The MMX subtarget feature is separate from the rest of the SSE features
> +// because it's important (for odd compatibility reasons) to be able to
> +// turn it off explicitly while allowing SSE+ to be on.
> +def FeatureMMX : SubtargetFeature<"mmx","HasMMX", "true",
> "Enable MMX instructions">;
> +
> def FeatureSSE1 : SubtargetFeature<"sse", "X86SSELevel", "SSE1",
> "Enable SSE instructions",
> // SSE codegen depends on cmovs, and all
> // SSE1+ processors support them.
> - [FeatureMMX, FeatureCMOV]>;
> + [FeatureCMOV]>;
> def FeatureSSE2 : SubtargetFeature<"sse2", "X86SSELevel", "SSE2",
> "Enable SSE2 instructions",
> [FeatureSSE1]>;
> @@ -219,184 +222,241 @@ def : Proc<"pentium-mmx", [FeatureSl
> def : Proc<"i686", [FeatureSlowUAMem16]>;
> def : Proc<"pentiumpro", [FeatureSlowUAMem16, FeatureCMOV]>;
> def : Proc<"pentium2", [FeatureSlowUAMem16, FeatureMMX, FeatureCMOV]>;
> -def : Proc<"pentium3", [FeatureSlowUAMem16, FeatureSSE1]>;
> -def : Proc<"pentium3m", [FeatureSlowUAMem16, FeatureSSE1,
> +def : Proc<"pentium3", [FeatureSlowUAMem16, FeatureMMX, FeatureSSE1]>;
> +def : Proc<"pentium3m", [FeatureSlowUAMem16, FeatureMMX, FeatureSSE1,
> FeatureSlowBTMem]>;
> -def : Proc<"pentium-m", [FeatureSlowUAMem16, FeatureSSE2,
> +def : Proc<"pentium-m", [FeatureSlowUAMem16, FeatureMMX, FeatureSSE2,
> FeatureSlowBTMem]>;
> -def : Proc<"pentium4", [FeatureSlowUAMem16, FeatureSSE2]>;
> -def : Proc<"pentium4m", [FeatureSlowUAMem16, FeatureSSE2,
> +def : Proc<"pentium4", [FeatureSlowUAMem16, FeatureMMX, FeatureSSE2]>;
> +def : Proc<"pentium4m", [FeatureSlowUAMem16, FeatureMMX, FeatureSSE2,
> FeatureSlowBTMem]>;
>
> // Intel Core Duo.
> -def : ProcessorModel<"yonah", SandyBridgeModel,
> - [FeatureSlowUAMem16, FeatureSSE3, FeatureSlowBTMem]>;
> +def : ProcessorModel<
> + "yonah", SandyBridgeModel,
> + [ FeatureSlowUAMem16, FeatureMMX, FeatureSSE3, FeatureSlowBTMem ]>;
>
> // NetBurst.
> -def : Proc<"prescott", [FeatureSlowUAMem16, FeatureSSE3, FeatureSlowBTMem]>;
> -def : Proc<"nocona", [FeatureSlowUAMem16, FeatureSSE3, FeatureCMPXCHG16B,
> - FeatureSlowBTMem]>;
> +def : Proc<"prescott",
> + [ FeatureSlowUAMem16, FeatureMMX, FeatureSSE3, FeatureSlowBTMem ]>;
> +def : Proc<"nocona", [
> + FeatureSlowUAMem16,
> + FeatureMMX,
> + FeatureSSE3,
> + FeatureCMPXCHG16B,
> + FeatureSlowBTMem
> +]>;
>
Ow. You reformatted every list in the same commit. =/ I would have much
preferred leaving them alone, or doing that separately. This diff is
moderately unreadable now.
>
> // Intel Core 2 Solo/Duo.
> -def : ProcessorModel<"core2", SandyBridgeModel,
> - [FeatureSlowUAMem16, FeatureSSSE3, FeatureCMPXCHG16B,
> - FeatureSlowBTMem]>;
> -def : ProcessorModel<"penryn", SandyBridgeModel,
> - [FeatureSlowUAMem16, FeatureSSE41, FeatureCMPXCHG16B,
> - FeatureSlowBTMem]>;
> +def : ProcessorModel<"core2", SandyBridgeModel, [
> + FeatureSlowUAMem16,
> + FeatureMMX,
> + FeatureSSSE3,
> + FeatureCMPXCHG16B,
> + FeatureSlowBTMem
> +]>;
> +def : ProcessorModel<"penryn", SandyBridgeModel, [
> + FeatureSlowUAMem16,
> + FeatureMMX,
> + FeatureSSE41,
> + FeatureCMPXCHG16B,
> + FeatureSlowBTMem
> +]>;
>
> // Atom CPUs.
> class BonnellProc<string Name> : ProcessorModel<Name, AtomModel, [
> - ProcIntelAtom,
> - FeatureSlowUAMem16,
> - FeatureSSSE3,
> - FeatureCMPXCHG16B,
> - FeatureMOVBE,
> - FeatureSlowBTMem,
> - FeatureLeaForSP,
> - FeatureSlowDivide32,
> - FeatureSlowDivide64,
> - FeatureCallRegIndirect,
> - FeatureLEAUsesAG,
> - FeaturePadShortFunctions
> - ]>;
> + ProcIntelAtom,
> + FeatureSlowUAMem16,
> + FeatureMMX,
> + FeatureSSSE3,
> + FeatureCMPXCHG16B,
> + FeatureMOVBE,
> + FeatureSlowBTMem,
> + FeatureLeaForSP,
> + FeatureSlowDivide32,
> + FeatureSlowDivide64,
> + FeatureCallRegIndirect,
> + FeatureLEAUsesAG,
> + FeaturePadShortFunctions
> +]>;
> def : BonnellProc<"bonnell">;
> def : BonnellProc<"atom">; // Pin the generic name to the baseline.
>
> class SilvermontProc<string Name> : ProcessorModel<Name, SLMModel, [
> - ProcIntelSLM,
> - FeatureSSE42,
> - FeatureCMPXCHG16B,
> - FeatureMOVBE,
> - FeaturePOPCNT,
> - FeaturePCLMUL,
> - FeatureAES,
> - FeatureSlowDivide64,
> - FeatureCallRegIndirect,
> - FeaturePRFCHW,
> - FeatureSlowLEA,
> - FeatureSlowIncDec,
> - FeatureSlowBTMem
> - ]>;
> + ProcIntelSLM,
> + FeatureMMX,
> + FeatureSSE42,
> + FeatureCMPXCHG16B,
> + FeatureMOVBE,
> + FeaturePOPCNT,
> + FeaturePCLMUL,
> + FeatureAES,
> + FeatureSlowDivide64,
> + FeatureCallRegIndirect,
> + FeaturePRFCHW,
> + FeatureSlowLEA,
> + FeatureSlowIncDec,
> + FeatureSlowBTMem
> +]>;
> def : SilvermontProc<"silvermont">;
> def : SilvermontProc<"slm">; // Legacy alias.
>
> // "Arrandale" along with corei3 and corei5
> class NehalemProc<string Name> : ProcessorModel<Name, SandyBridgeModel, [
> - FeatureSSE42,
> - FeatureCMPXCHG16B,
> - FeatureSlowBTMem,
> - FeaturePOPCNT
> - ]>;
> + FeatureMMX,
> + FeatureSSE42,
> + FeatureCMPXCHG16B,
> + FeatureSlowBTMem,
> + FeaturePOPCNT
> +]>;
> def : NehalemProc<"nehalem">;
> def : NehalemProc<"corei7">;
>
> // Westmere is a similar machine to nehalem with some additional features.
> // Westmere is the corei3/i5/i7 path from nehalem to sandybridge
> class WestmereProc<string Name> : ProcessorModel<Name, SandyBridgeModel, [
> - FeatureSSE42,
> - FeatureCMPXCHG16B,
> - FeatureSlowBTMem,
> - FeaturePOPCNT,
> - FeatureAES,
> - FeaturePCLMUL
> - ]>;
> + FeatureMMX,
> + FeatureSSE42,
> + FeatureCMPXCHG16B,
> + FeatureSlowBTMem,
> + FeaturePOPCNT,
> + FeatureAES,
> + FeaturePCLMUL
> +]>;
> def : WestmereProc<"westmere">;
>
> // SSE is not listed here since llvm treats AVX as a reimplementation of SSE,
> // rather than a superset.
> class SandyBridgeProc<string Name> : ProcessorModel<Name, SandyBridgeModel, [
> - FeatureAVX,
> - FeatureCMPXCHG16B,
> - FeatureSlowBTMem,
> - FeatureSlowUAMem32,
> - FeaturePOPCNT,
> - FeatureAES,
> - FeaturePCLMUL
> - ]>;
> + FeatureMMX,
> + FeatureAVX,
> + FeatureCMPXCHG16B,
> + FeatureSlowBTMem,
> + FeatureSlowUAMem32,
> + FeaturePOPCNT,
> + FeatureAES,
> + FeaturePCLMUL
> +]>;
> def : SandyBridgeProc<"sandybridge">;
> def : SandyBridgeProc<"corei7-avx">; // Legacy alias.
>
> class IvyBridgeProc<string Name> : ProcessorModel<Name, SandyBridgeModel, [
> - FeatureAVX,
> - FeatureCMPXCHG16B,
> - FeatureSlowBTMem,
> - FeatureSlowUAMem32,
> - FeaturePOPCNT,
> - FeatureAES,
> - FeaturePCLMUL,
> - FeatureRDRAND,
> - FeatureF16C,
> - FeatureFSGSBase
> - ]>;
> + FeatureMMX,
> + FeatureAVX,
> + FeatureCMPXCHG16B,
> + FeatureSlowBTMem,
> + FeatureSlowUAMem32,
> + FeaturePOPCNT,
> + FeatureAES,
> + FeaturePCLMUL,
> + FeatureRDRAND,
> + FeatureF16C,
> + FeatureFSGSBase
> +]>;
> def : IvyBridgeProc<"ivybridge">;
> def : IvyBridgeProc<"core-avx-i">; // Legacy alias.
>
> class HaswellProc<string Name> : ProcessorModel<Name, HaswellModel, [
> - FeatureAVX2,
> - FeatureCMPXCHG16B,
> - FeatureSlowBTMem,
> - FeaturePOPCNT,
> - FeatureAES,
> - FeaturePCLMUL,
> - FeatureRDRAND,
> - FeatureF16C,
> - FeatureFSGSBase,
> - FeatureMOVBE,
> - FeatureLZCNT,
> - FeatureBMI,
> - FeatureBMI2,
> - FeatureFMA,
> - FeatureRTM,
> - FeatureHLE,
> - FeatureSlowIncDec
> - ]>;
> + FeatureMMX,
> + FeatureAVX2,
> + FeatureCMPXCHG16B,
> + FeatureSlowBTMem,
> + FeaturePOPCNT,
> + FeatureAES,
> + FeaturePCLMUL,
> + FeatureRDRAND,
> + FeatureF16C,
> + FeatureFSGSBase,
> + FeatureMOVBE,
> + FeatureLZCNT,
> + FeatureBMI,
> + FeatureBMI2,
> + FeatureFMA,
> + FeatureRTM,
> + FeatureHLE,
> + FeatureSlowIncDec
> +]>;
> def : HaswellProc<"haswell">;
> def : HaswellProc<"core-avx2">; // Legacy alias.
>
> class BroadwellProc<string Name> : ProcessorModel<Name, HaswellModel, [
> - FeatureAVX2,
> - FeatureCMPXCHG16B,
> - FeatureSlowBTMem,
> - FeaturePOPCNT,
> - FeatureAES,
> - FeaturePCLMUL,
> - FeatureRDRAND,
> - FeatureF16C,
> - FeatureFSGSBase,
> - FeatureMOVBE,
> - FeatureLZCNT,
> - FeatureBMI,
> - FeatureBMI2,
> - FeatureFMA,
> - FeatureRTM,
> - FeatureHLE,
> - FeatureADX,
> - FeatureRDSEED,
> - FeatureSlowIncDec
> - ]>;
> + FeatureMMX,
> + FeatureAVX2,
> + FeatureCMPXCHG16B,
> + FeatureSlowBTMem,
> + FeaturePOPCNT,
> + FeatureAES,
> + FeaturePCLMUL,
> + FeatureRDRAND,
> + FeatureF16C,
> + FeatureFSGSBase,
> + FeatureMOVBE,
> + FeatureLZCNT,
> + FeatureBMI,
> + FeatureBMI2,
> + FeatureFMA,
> + FeatureRTM,
> + FeatureHLE,
> + FeatureADX,
> + FeatureRDSEED,
> + FeatureSlowIncDec
> +]>;
> def : BroadwellProc<"broadwell">;
>
> // FIXME: define KNL model
> -class KnightsLandingProc<string Name> : ProcessorModel<Name, HaswellModel,
> - [FeatureAVX512, FeatureERI, FeatureCDI, FeaturePFI,
> - FeatureCMPXCHG16B, FeaturePOPCNT,
> - FeatureAES, FeaturePCLMUL, FeatureRDRAND, FeatureF16C,
> - FeatureFSGSBase, FeatureMOVBE, FeatureLZCNT, FeatureBMI,
> - FeatureBMI2, FeatureFMA, FeatureRTM, FeatureHLE,
> - FeatureSlowIncDec, FeatureMPX]>;
> +class KnightsLandingProc<string Name> : ProcessorModel<Name, HaswellModel, [
> + FeatureMMX,
> + FeatureAVX512,
> + FeatureERI,
> + FeatureCDI,
> + FeaturePFI,
> + FeatureCMPXCHG16B,
> + FeaturePOPCNT,
> + FeatureAES,
> + FeaturePCLMUL,
> + FeatureRDRAND,
> + FeatureF16C,
> + FeatureFSGSBase,
> + FeatureMOVBE,
> + FeatureLZCNT,
> + FeatureBMI,
> + FeatureBMI2,
> + FeatureFMA,
> + FeatureRTM,
> + FeatureHLE,
> + FeatureSlowIncDec,
> + FeatureMPX
> +]>;
> def : KnightsLandingProc<"knl">;
>
> // FIXME: define SKX model
> -class SkylakeProc<string Name> : ProcessorModel<Name, HaswellModel,
> - [FeatureAVX512, FeatureCDI,
> - FeatureDQI, FeatureBWI, FeatureVLX,
> - FeatureCMPXCHG16B, FeatureSlowBTMem,
> - FeaturePOPCNT, FeatureAES, FeaturePCLMUL, FeatureRDRAND,
> - FeatureF16C, FeatureFSGSBase, FeatureMOVBE, FeatureLZCNT,
> - FeatureBMI, FeatureBMI2, FeatureFMA, FeatureRTM,
> - FeatureHLE, FeatureADX, FeatureRDSEED, FeatureSlowIncDec,
> - FeatureMPX]>;
> +class SkylakeProc<string Name> : ProcessorModel<Name, HaswellModel, [
> + FeatureMMX,
> + FeatureAVX512,
> + FeatureCDI,
> + FeatureDQI,
> + FeatureBWI,
> + FeatureVLX,
> + FeatureCMPXCHG16B,
> + FeatureSlowBTMem,
> + FeaturePOPCNT,
> + FeatureAES,
> + FeaturePCLMUL,
> + FeatureRDRAND,
> + FeatureF16C,
> + FeatureFSGSBase,
> + FeatureMOVBE,
> + FeatureLZCNT,
> + FeatureBMI,
> + FeatureBMI2,
> + FeatureFMA,
> + FeatureRTM,
> + FeatureHLE,
> + FeatureADX,
> + FeatureRDSEED,
> + FeatureSlowIncDec,
> + FeatureMPX
> +]>;
> def : SkylakeProc<"skylake">;
> def : SkylakeProc<"skx">; // Legacy alias.
>
> @@ -447,52 +507,117 @@ def : Proc<"barcelona", [FeatureSS
> FeatureSlowSHLD]>;
>
> // Bobcat
> -def : Proc<"btver1", [FeatureSSSE3, FeatureSSE4A, FeatureCMPXCHG16B,
> - FeaturePRFCHW, FeatureLZCNT, FeaturePOPCNT,
> - FeatureSlowSHLD]>;
> +def : Proc<"btver1", [
> + FeatureMMX,
> + FeatureSSSE3,
> + FeatureSSE4A,
> + FeatureCMPXCHG16B,
> + FeaturePRFCHW,
> + FeatureLZCNT,
> + FeaturePOPCNT,
> + FeatureSlowSHLD
> +]>;
>
> // Jaguar
> -def : ProcessorModel<"btver2", BtVer2Model,
> - [FeatureAVX, FeatureSSE4A, FeatureCMPXCHG16B,
> - FeaturePRFCHW, FeatureAES, FeaturePCLMUL,
> - FeatureBMI, FeatureF16C, FeatureMOVBE,
> - FeatureLZCNT, FeaturePOPCNT,
> - FeatureSlowSHLD]>;
> +def : ProcessorModel<"btver2", BtVer2Model, [
> + FeatureMMX,
> + FeatureAVX,
> + FeatureSSE4A,
> + FeatureCMPXCHG16B,
> + FeaturePRFCHW,
> + FeatureAES,
> + FeaturePCLMUL,
> + FeatureBMI,
> + FeatureF16C,
> + FeatureMOVBE,
> + FeatureLZCNT,
> + FeaturePOPCNT,
> + FeatureSlowSHLD
> +]>;
>
> // Bulldozer
> -def : Proc<"bdver1", [FeatureXOP, FeatureFMA4, FeatureCMPXCHG16B,
> - FeatureAES, FeaturePRFCHW, FeaturePCLMUL,
> - FeatureAVX, FeatureSSE4A, FeatureLZCNT,
> - FeaturePOPCNT, FeatureSlowSHLD]>;
> +def : Proc<"bdver1", [
> + FeatureXOP,
> + FeatureFMA4,
> + FeatureCMPXCHG16B,
> + FeatureAES,
> + FeaturePRFCHW,
> + FeaturePCLMUL,
> + FeatureMMX,
> + FeatureAVX,
> + FeatureSSE4A,
> + FeatureLZCNT,
> + FeaturePOPCNT,
> + FeatureSlowSHLD
> +]>;
> // Piledriver
> -def : Proc<"bdver2", [FeatureXOP, FeatureFMA4, FeatureCMPXCHG16B,
> - FeatureAES, FeaturePRFCHW, FeaturePCLMUL,
> - FeatureAVX, FeatureSSE4A, FeatureF16C,
> - FeatureLZCNT, FeaturePOPCNT, FeatureBMI,
> - FeatureTBM, FeatureFMA, FeatureSlowSHLD]>;
> +def : Proc<"bdver2", [
> + FeatureXOP,
> + FeatureFMA4,
> + FeatureCMPXCHG16B,
> + FeatureAES,
> + FeaturePRFCHW,
> + FeaturePCLMUL,
> + FeatureMMX,
> + FeatureAVX,
> + FeatureSSE4A,
> + FeatureF16C,
> + FeatureLZCNT,
> + FeaturePOPCNT,
> + FeatureBMI,
> + FeatureTBM,
> + FeatureFMA,
> + FeatureSlowSHLD
> +]>;
>
> // Steamroller
> -def : Proc<"bdver3", [FeatureXOP, FeatureFMA4, FeatureCMPXCHG16B,
> - FeatureAES, FeaturePRFCHW, FeaturePCLMUL,
> - FeatureAVX, FeatureSSE4A, FeatureF16C,
> - FeatureLZCNT, FeaturePOPCNT, FeatureBMI,
> - FeatureTBM, FeatureFMA, FeatureSlowSHLD,
> - FeatureFSGSBase]>;
> +def : Proc<"bdver3", [
> + FeatureXOP,
> + FeatureFMA4,
> + FeatureCMPXCHG16B,
> + FeatureAES,
> + FeaturePRFCHW,
> + FeaturePCLMUL,
> + FeatureMMX,
> + FeatureAVX,
> + FeatureSSE4A,
> + FeatureF16C,
> + FeatureLZCNT,
> + FeaturePOPCNT,
> + FeatureBMI,
> + FeatureTBM,
> + FeatureFMA,
> + FeatureSlowSHLD,
> + FeatureFSGSBase
> +]>;
>
> // Excavator
> -def : Proc<"bdver4", [FeatureAVX2, FeatureXOP, FeatureFMA4,
> - FeatureCMPXCHG16B, FeatureAES, FeaturePRFCHW,
> - FeaturePCLMUL, FeatureF16C, FeatureLZCNT,
> - FeaturePOPCNT, FeatureBMI, FeatureBMI2,
> - FeatureTBM, FeatureFMA, FeatureSSE4A,
> - FeatureFSGSBase]>;
> +def : Proc<"bdver4", [
> + FeatureMMX,
> + FeatureAVX2,
> + FeatureXOP,
> + FeatureFMA4,
> + FeatureCMPXCHG16B,
> + FeatureAES,
> + FeaturePRFCHW,
> + FeaturePCLMUL,
> + FeatureF16C,
> + FeatureLZCNT,
> + FeaturePOPCNT,
> + FeatureBMI,
> + FeatureBMI2,
> + FeatureTBM,
> + FeatureFMA,
> + FeatureSSE4A,
> + FeatureFSGSBase
> +]>;
>
> def : Proc<"geode", [FeatureSlowUAMem16, Feature3DNowA]>;
>
> def : Proc<"winchip-c6", [FeatureSlowUAMem16, FeatureMMX]>;
> def : Proc<"winchip2", [FeatureSlowUAMem16, Feature3DNow]>;
> def : Proc<"c3", [FeatureSlowUAMem16, Feature3DNow]>;
> -def : Proc<"c3-2", [FeatureSlowUAMem16, FeatureSSE1]>;
> +def : Proc<"c3-2", [ FeatureSlowUAMem16, FeatureMMX, FeatureSSE1 ]>;
>
> // We also provide a generic 64-bit specific x86 processor model which tries to
> // be good for modern chips without enabling instruction set encodings past the
> @@ -504,8 +629,9 @@ def : Proc<"c3-2", [FeatureSl
> // covers a huge swath of x86 processors. If there are specific scheduling
> // knobs which need to be tuned differently for AMD chips, we might consider
> // forming a common base for them.
> -def : ProcessorModel<"x86-64", SandyBridgeModel,
> - [FeatureSSE2, Feature64Bit, FeatureSlowBTMem]>;
> +def : ProcessorModel<
> + "x86-64", SandyBridgeModel,
> + [ FeatureMMX, FeatureSSE2, Feature64Bit, FeatureSlowBTMem ]>;
>
>
> //===----------------------------------------------------------------------===//
> // Register File Description
>
> Modified: llvm/trunk/lib/Target/X86/X86Subtarget.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Subtarget.cpp?rev=249731&r1=249730&r2=249731&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86Subtarget.cpp (original)
> +++ llvm/trunk/lib/Target/X86/X86Subtarget.cpp Thu Oct 8 15:10:06 2015
> @@ -228,9 +228,10 @@ void X86Subtarget::initSubtargetFeatures
> }
>
> void X86Subtarget::initializeEnvironment() {
> - X86SSELevel = NoMMXSSE;
> + X86SSELevel = NoSSE;
> X863DNowLevel = NoThreeDNow;
> HasCMov = false;
> + HasMMX = false;
> HasX86_64 = false;
> HasPOPCNT = false;
> HasSSE4A = false;
>
> Modified: llvm/trunk/lib/Target/X86/X86Subtarget.h
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Subtarget.h?rev=249731&r1=249730&r2=249731&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86Subtarget.h (original)
> +++ llvm/trunk/lib/Target/X86/X86Subtarget.h Thu Oct 8 15:10:06 2015
> @@ -47,7 +47,7 @@ class X86Subtarget final : public X86Gen
>
> protected:
> enum X86SSEEnum {
> - NoMMXSSE, MMX, SSE1, SSE2, SSE3, SSSE3, SSE41, SSE42, AVX, AVX2, AVX512F
> + NoSSE, SSE1, SSE2, SSE3, SSSE3, SSE41, SSE42, AVX, AVX2, AVX512F
> };
>
> enum X863DNowEnum {
> @@ -64,7 +64,7 @@ protected:
> /// Which PIC style to use
> PICStyles::Style PICStyle;
>
> - /// MMX, SSE1, SSE2, SSE3, SSSE3, SSE41, SSE42, or none supported.
> + /// SSE1, SSE2, SSE3, SSSE3, SSE41, SSE42, or none supported.
> X86SSEEnum X86SSELevel;
>
> /// 3DNow, 3DNow Athlon, or none supported.
> @@ -74,6 +74,9 @@ protected:
> /// (generally pentium pro+).
> bool HasCMov;
>
> + /// True if this processor supports MMX instructions.
> + bool HasMMX;
> +
> /// True if the processor supports X86-64 instructions.
> bool HasX86_64;
>
> @@ -319,7 +322,7 @@ public:
> void setPICStyle(PICStyles::Style Style) { PICStyle = Style; }
>
> bool hasCMov() const { return HasCMov; }
> - bool hasMMX() const { return X86SSELevel >= MMX; }
> + bool hasMMX() const { return HasMMX; }
> bool hasSSE1() const { return X86SSELevel >= SSE1; }
> bool hasSSE2() const { return X86SSELevel >= SSE2; }
> bool hasSSE3() const { return X86SSELevel >= SSE3; }
>
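The effect of the X86Subtarget.h hunks can be seen in isolation with a small stand-alone model. This is a sketch under stated assumptions, not the real class: OldSubtarget, NewSubtarget and the two helper functions are hypothetical names, and only the enumerators, the X86SSELevel and HasMMX members, and the hasMMX() bodies come from the diff (the enum is truncated at SSE3 for brevity).

```cpp
#include <cassert>

// Hypothetical stand-in for the subtarget state before r249731.
struct OldSubtarget {
  enum X86SSEEnum { NoMMXSSE, MMX, SSE1, SSE2, SSE3 };
  X86SSEEnum X86SSELevel = NoMMXSSE;
  // MMX was the lowest rung of the SSE ladder, so any SSE level implied it.
  bool hasMMX() const { return X86SSELevel >= MMX; }
  bool hasSSE3() const { return X86SSELevel >= SSE3; }
};

// Hypothetical stand-in for the subtarget state after r249731.
struct NewSubtarget {
  enum X86SSEEnum { NoSSE, SSE1, SSE2, SSE3 };
  X86SSEEnum X86SSELevel = NoSSE;
  bool HasMMX = false; // tracked independently of the SSE level
  bool hasMMX() const { return HasMMX; }
  bool hasSSE3() const { return X86SSELevel >= SSE3; }
};

// Can the old representation express "SSE3 on, MMX off"?
bool oldCanExpressSSE3WithoutMMX() {
  OldSubtarget S;
  S.X86SSELevel = OldSubtarget::SSE3;
  return S.hasSSE3() && !S.hasMMX();
}

// Can the new representation express "SSE3 on, MMX off"?
bool newCanExpressSSE3WithoutMMX() {
  NewSubtarget S;
  S.X86SSELevel = NewSubtarget::SSE3; // HasMMX stays false
  return S.hasSSE3() && !S.hasMMX();
}

// Self-check: only the split representation can express the combination.
void checkSubtargetModels() {
  assert(!oldCanExpressSSE3WithoutMMX());
  assert(newCanExpressSSE3WithoutMMX());
}
```

With a single ordered enum, the state "SSE3 without MMX" has no encoding at all, which is the case the `-msse3 -mno-mmx` example in the log needs. Splitting MMX into its own boolean makes that state representable, at the cost of every processor definition having to list FeatureMMX explicitly.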
> Modified: llvm/trunk/test/CodeGen/X86/mmx-intrinsics.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/mmx-intrinsics.ll?rev=249731&r1=249730&r2=249731&view=diff
>
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/mmx-intrinsics.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/mmx-intrinsics.ll Thu Oct 8 15:10:06 2015
> @@ -1,7 +1,7 @@
> ; RUN: llc < %s -march=x86 -mattr=+mmx,+ssse3,-avx | FileCheck %s --check-prefix=ALL --check-prefix=X86
> -; RUN: llc < %s -march=x86 -mattr=+avx | FileCheck %s --check-prefix=ALL --check-prefix=X86
> +; RUN: llc < %s -march=x86 -mattr=+mmx,+avx | FileCheck %s --check-prefix=ALL --check-prefix=X86
> ; RUN: llc < %s -march=x86-64 -mattr=+mmx,+ssse3,-avx | FileCheck %s --check-prefix=ALL --check-prefix=X64
> -; RUN: llc < %s -march=x86-64 -mattr=+avx | FileCheck %s --check-prefix=ALL --check-prefix=X64
> +; RUN: llc < %s -march=x86-64 -mattr=+mmx,+avx | FileCheck %s --check-prefix=ALL --check-prefix=X64
>
> declare x86_mmx @llvm.x86.ssse3.phadd.w(x86_mmx, x86_mmx) nounwind readnone
>
>
> Added: llvm/trunk/test/CodeGen/X86/mmx-only.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/mmx-only.ll?rev=249731&view=auto
>
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/mmx-only.ll (added)
> +++ llvm/trunk/test/CodeGen/X86/mmx-only.ll Thu Oct 8 15:10:06 2015
> @@ -0,0 +1,21 @@
> +; RUN: llc < %s -march=x86 -mattr=+mmx | FileCheck %s
> +; RUN: llc < %s -march=x86 -mattr=+mmx,-sse | FileCheck %s
> +
> +; Test that turning off sse doesn't turn off mmx.
> +
> +declare x86_mmx @llvm.x86.mmx.pcmpgt.d(x86_mmx, x86_mmx) nounwind readnone
> +
> +define i64 @test88(<1 x i64> %a, <1 x i64> %b) nounwind readnone {
> +; CHECK-LABEL: @test88
> +; CHECK: pcmpgtd
> +entry:
> + %0 = bitcast <1 x i64> %b to <2 x i32>
> + %1 = bitcast <1 x i64> %a to <2 x i32>
> + %mmx_var.i = bitcast <2 x i32> %1 to x86_mmx
> + %mmx_var1.i = bitcast <2 x i32> %0 to x86_mmx
> + %2 = tail call x86_mmx @llvm.x86.mmx.pcmpgt.d(x86_mmx %mmx_var.i, x86_mmx %mmx_var1.i) nounwind
> + %3 = bitcast x86_mmx %2 to <2 x i32>
> + %4 = bitcast <2 x i32> %3 to <1 x i64>
> + %5 = extractelement <1 x i64> %4, i32 0
> + ret i64 %5
> +}
>
> Modified: llvm/trunk/test/CodeGen/X86/mult-alt-x86.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/mult-alt-x86.ll?rev=249731&r1=249730&r2=249731&view=diff
>
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/mult-alt-x86.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/mult-alt-x86.ll Thu Oct 8 15:10:06 2015
> @@ -1,4 +1,4 @@
> -; RUN: llc < %s -march=x86 -mattr=+sse2 -no-integrated-as
> +; RUN: llc < %s -march=x86 -mattr=+mmx,+sse2 -no-integrated-as
> ; ModuleID = 'mult-alt-x86.c'
> target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f80:128:128-v64:64:64-v128:128:128-a0:0:64-f80:32:32-n8:16:32"
> target triple = "i686-pc-win32"
>
> Added: llvm/trunk/test/CodeGen/X86/sse-only.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/sse-only.ll?rev=249731&view=auto
>
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/sse-only.ll (added)
> +++ llvm/trunk/test/CodeGen/X86/sse-only.ll Thu Oct 8 15:10:06 2015
> @@ -0,0 +1,19 @@
> +; RUN: llc < %s -march=x86 -mattr=+sse2,-mmx | FileCheck %s
> +
> +; Test that turning off mmx doesn't turn off sse
> +
> +define void @test1(<2 x double>* %r, <2 x double>* %A, double %B) nounwind {
> +; CHECK-LABEL: test1:
> +; CHECK: ## BB#0:
> +; CHECK-NEXT: movl {{[0-9]+}}(%esp), %eax
> +; CHECK-NEXT: movl {{[0-9]+}}(%esp), %ecx
> +; CHECK-NEXT: movapd (%ecx), %xmm0
> +; CHECK-NEXT: movlpd {{[0-9]+}}(%esp), %xmm0
> +; CHECK-NEXT: movapd %xmm0, (%eax)
> +; CHECK-NEXT: retl
> + %tmp3 = load <2 x double>, <2 x double>* %A, align 16
> + %tmp7 = insertelement <2 x double> undef, double %B, i32 0
> + %tmp9 = shufflevector <2 x double> %tmp3, <2 x double> %tmp7, <2 x i32> < i32 2, i32 1 >
> + store <2 x double> %tmp9, <2 x double>* %r, align 16
> + ret void
> +}
>
>