[llvm] 4584685 - [CostModel][X86] Support cost kind specific look up tables

Simon Pilgrim via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 25 04:34:29 PDT 2022


Author: Simon Pilgrim
Date: 2022-08-25T12:23:36+01:00
New Revision: 45846854a2c1414c27bc819033f6de588dea56fe

URL: https://github.com/llvm/llvm-project/commit/45846854a2c1414c27bc819033f6de588dea56fe
DIFF: https://github.com/llvm/llvm-project/commit/45846854a2c1414c27bc819033f6de588dea56fe.diff

LOG: [CostModel][X86] Support cost kind specific look up tables

Most of our cost model tables have been created assuming cost kind == recip-throughput. But we're starting to see passes wanting to get accurate costs for the other kinds as well. Some of these can be determined procedurally (e.g. codesize by default could just be the split count after type legalization), but others are going to need to be handled in cost tables - this is especially true for x86 which has so many ISA combinations.

I've created a 'CostKindCosts' struct which can hold cost values for the 4 cost kinds, defaulting to -1U for unknown cost, this can be used with the existing CostTblEntryT/CostTableLookup template code. I've also added a [TargetCostKind] accessor to make it much easier to look up individual <Optional> costs.

This just changes the ISD::SELECT costs to check the effect (and also to check that the ISD::SETCC are correctly handled for default/None cost kinds) - the plan would be to slowly extend this and move the CostKindTblEntry type somewhere generic to allow other targets to use it once its matured.

I'm also going to resurrect D103695 so that it can help with latency/codesize/sizelatency coverage testing.

For sizelatency - IIRC the definition was vague to let it be target specific - I've tried to use typical uop counts so they're comparable to MicroOpBufferSize etc.

Differential Revision: https://reviews.llvm.org/D132216

Added: 
    

Modified: 
    llvm/lib/Target/X86/X86TargetTransformInfo.cpp
    llvm/test/Analysis/CostModel/X86/intrinsic-cost-kinds.ll
    llvm/test/Analysis/CostModel/X86/select-codesize.ll
    llvm/test/Analysis/CostModel/X86/select-latency.ll
    llvm/test/Analysis/CostModel/X86/select-sizelatency.ll
    llvm/test/Transforms/PhaseOrdering/X86/vector-reductions-logical.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
index b6a1f1d3dbffe..dd4249fd66267 100644
--- a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
+++ b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
@@ -13,29 +13,39 @@
 ///
 //===----------------------------------------------------------------------===//
 /// About Cost Model numbers used below it's necessary to say the following:
-/// the numbers correspond to some "generic" X86 CPU instead of usage of
-/// concrete CPU model. Usually the numbers correspond to CPU where the feature
-/// apeared at the first time. For example, if we do Subtarget.hasSSE42() in
+/// the numbers correspond to some "generic" X86 CPU instead of usage of a
+/// specific CPU model. Usually the numbers correspond to the CPU where the
+/// feature first appeared. For example, if we do Subtarget.hasSSE42() in
 /// the lookups below the cost is based on Nehalem as that was the first CPU
-/// to support that feature level and thus has most likely the worst case cost.
+/// to support that feature level and thus has most likely the worst case cost,
+/// although we may discard an outlying worst cost from one CPU (e.g. Atom).
+///
 /// Some examples of other technologies/CPUs:
 ///   SSE 3   - Pentium4 / Athlon64
 ///   SSE 4.1 - Penryn
-///   SSE 4.2 - Nehalem
-///   AVX     - Sandy Bridge
-///   AVX2    - Haswell
+///   SSE 4.2 - Nehalem / Silvermont
+///   AVX     - Sandy Bridge / Jaguar / Bulldozer
+///   AVX2    - Haswell / Ryzen
 ///   AVX-512 - Xeon Phi / Skylake
+///
 /// And some examples of instruction target dependent costs (latency)
 ///                   divss     sqrtss          rsqrtss
-///   AMD K7            11-16     19              3
-///   Piledriver        9-24      13-15           5
-///   Jaguar            14        16              2
-///   Pentium II,III    18        30              2
-///   Nehalem           7-14      7-18            3
-///   Haswell           10-13     11              5
-/// TODO: Develop and implement  the target dependent cost model and
-/// specialize cost numbers for 
diff erent Cost Model Targets such as throughput,
-/// code size, latency and uop count.
+///   AMD K7          11-16     19              3
+///   Piledriver      9-24      13-15           5
+///   Jaguar          14        16              2
+///   Pentium II,III  18        30              2
+///   Nehalem         7-14      7-18            3
+///   Haswell         10-13     11              5
+///
+/// Interpreting the 4 TargetCostKind types:
+/// TCK_RecipThroughput and TCK_Latency should try to match the worst case
+/// values reported by the CPU scheduler models (and llvm-mca).
+/// TCK_CodeSize should match the instruction count (e.g. divss = 1), NOT the
+/// actual encoding size of the instruction.
+/// TCK_SizeAndLatency should match the worst case micro-op counts reported by
+/// by the CPU scheduler models (and llvm-mca), to ensure that they are
+/// compatible with the MicroOpBufferSize and LoopMicroOpBufferSize values which are
+/// often used as the cost thresholds where TCK_SizeAndLatency is requested.
 //===----------------------------------------------------------------------===//
 
 #include "X86TargetTransformInfo.h"
@@ -57,6 +67,43 @@ using namespace llvm;
 //
 //===----------------------------------------------------------------------===//
 
+// Helper struct to store/access costs for each cost kind.
+// TODO: Move this to allow other targets to use it?
+struct CostKindCosts {
+  unsigned RecipThroughputCost;
+  unsigned LatencyCost;
+  unsigned CodeSizeCost;
+  unsigned SizeAndLatencyCost;
+
+  CostKindCosts(unsigned RecipThroughput = ~0U, unsigned Latency = ~0U,
+                unsigned CodeSize = ~0U, unsigned SizeAndLatency = ~0U)
+      : RecipThroughputCost(RecipThroughput), LatencyCost(Latency),
+        CodeSizeCost(CodeSize), SizeAndLatencyCost(SizeAndLatency) {}
+
+  llvm::Optional<unsigned>
+  operator[](TargetTransformInfo::TargetCostKind Kind) const {
+    unsigned Cost = ~0U;
+    switch (Kind) {
+    case TargetTransformInfo::TCK_RecipThroughput:
+      Cost = RecipThroughputCost;
+      break;
+    case TargetTransformInfo::TCK_Latency:
+      Cost = LatencyCost;
+      break;
+    case TargetTransformInfo::TCK_CodeSize:
+      Cost = CodeSizeCost;
+      break;
+    case TargetTransformInfo::TCK_SizeAndLatency:
+      Cost = SizeAndLatencyCost;
+      break;
+    }
+    if (Cost == ~0U)
+      return None;
+    return Cost;
+  }
+};
+using CostKindTblEntry = CostTblEntryT<CostKindCosts>;
+
 TargetTransformInfo::PopcntSupportKind
 X86TTIImpl::getPopcntSupport(unsigned TyWidth) {
   assert(isPowerOf2_32(TyWidth) && "Ty width must be power of 2");
@@ -2645,16 +2692,6 @@ InstructionCost X86TTIImpl::getCmpSelInstrCost(unsigned Opcode, Type *ValTy,
                                                CmpInst::Predicate VecPred,
                                                TTI::TargetCostKind CostKind,
                                                const Instruction *I) {
-  // Assume a 3cy latency for fp select ops.
-  if (CostKind == TTI::TCK_Latency && Opcode == Instruction::Select)
-    if (ValTy->getScalarType()->isFloatingPointTy())
-      return 3;
-
-  // TODO: Handle other cost kinds.
-  if (CostKind != TTI::TCK_RecipThroughput)
-    return BaseT::getCmpSelInstrCost(Opcode, ValTy, CondTy, VecPred, CostKind,
-                                     I);
-
   // Legalize the type.
   std::pair<InstructionCost, MVT> LT = getTypeLegalizationCost(ValTy);
 
@@ -2717,166 +2754,180 @@ InstructionCost X86TTIImpl::getCmpSelInstrCost(unsigned Opcode, Type *ValTy,
     }
   }
 
-  static const CostTblEntry SLMCostTbl[] = {
+  static const CostKindTblEntry SLMCostTbl[] = {
     // slm pcmpeq/pcmpgt throughput is 2
-    { ISD::SETCC,   MVT::v2i64,   2 },
+    { ISD::SETCC,   MVT::v2i64,   { 2 } },
     // slm pblendvb/blendvpd/blendvps throughput is 4
-    { ISD::SELECT,  MVT::v2f64,   4 }, // vblendvpd
-    { ISD::SELECT,  MVT::v4f32,   4 }, // vblendvps
-    { ISD::SELECT,  MVT::v2i64,   4 }, // pblendvb
-    { ISD::SELECT,  MVT::v8i32,   4 }, // pblendvb
-    { ISD::SELECT,  MVT::v8i16,   4 }, // pblendvb
-    { ISD::SELECT,  MVT::v16i8,   4 }, // pblendvb
+    { ISD::SELECT,  MVT::v2f64,   { 4, 4, 1, 3 } }, // vblendvpd
+    { ISD::SELECT,  MVT::v4f32,   { 4, 4, 1, 3 } }, // vblendvps
+    { ISD::SELECT,  MVT::v2i64,   { 4, 4, 1, 3 } }, // pblendvb
+    { ISD::SELECT,  MVT::v8i32,   { 4, 4, 1, 3 } }, // pblendvb
+    { ISD::SELECT,  MVT::v8i16,   { 4, 4, 1, 3 } }, // pblendvb
+    { ISD::SELECT,  MVT::v16i8,   { 4, 4, 1, 3 } }, // pblendvb
   };
 
-  static const CostTblEntry AVX512BWCostTbl[] = {
-    { ISD::SETCC,   MVT::v32i16,  1 },
-    { ISD::SETCC,   MVT::v64i8,   1 },
+  static const CostKindTblEntry AVX512BWCostTbl[] = {
+    { ISD::SETCC,   MVT::v32i16,  { 1 } },
+    { ISD::SETCC,   MVT::v64i8,   { 1 } },
 
-    { ISD::SELECT,  MVT::v32i16,  1 },
-    { ISD::SELECT,  MVT::v64i8,   1 },
+    { ISD::SELECT,  MVT::v32i16,  { 1, 1, 1, 1 } },
+    { ISD::SELECT,  MVT::v64i8,   { 1, 1, 1, 1 } },
   };
 
-  static const CostTblEntry AVX512CostTbl[] = {
-    { ISD::SETCC,   MVT::v8i64,   1 },
-    { ISD::SETCC,   MVT::v16i32,  1 },
-    { ISD::SETCC,   MVT::v8f64,   1 },
-    { ISD::SETCC,   MVT::v16f32,  1 },
-
-    { ISD::SELECT,  MVT::v8i64,   1 },
-    { ISD::SELECT,  MVT::v4i64,   1 },
-    { ISD::SELECT,  MVT::v2i64,   1 },
-    { ISD::SELECT,  MVT::v16i32,  1 },
-    { ISD::SELECT,  MVT::v8i32,   1 },
-    { ISD::SELECT,  MVT::v4i32,   1 },
-    { ISD::SELECT,  MVT::v8f64,   1 },
-    { ISD::SELECT,  MVT::v4f64,   1 },
-    { ISD::SELECT,  MVT::v2f64,   1 },
-    { ISD::SELECT,  MVT::f64,     1 },
-    { ISD::SELECT,  MVT::v16f32,  1 },
-    { ISD::SELECT,  MVT::v8f32 ,  1 },
-    { ISD::SELECT,  MVT::v4f32,   1 },
-    { ISD::SELECT,  MVT::f32  ,   1 },
-
-    { ISD::SETCC,   MVT::v32i16,  2 }, // FIXME: should probably be 4
-    { ISD::SETCC,   MVT::v64i8,   2 }, // FIXME: should probably be 4
-
-    { ISD::SELECT,  MVT::v32i16,  2 },
-    { ISD::SELECT,  MVT::v16i16,  1 },
-    { ISD::SELECT,  MVT::v8i16,   1 },
-    { ISD::SELECT,  MVT::v64i8,   2 },
-    { ISD::SELECT,  MVT::v32i8,   1 },
-    { ISD::SELECT,  MVT::v16i8,   1 },
+  static const CostKindTblEntry AVX512CostTbl[] = {
+    { ISD::SETCC,   MVT::v8i64,   { 1 } },
+    { ISD::SETCC,   MVT::v16i32,  { 1 } },
+    { ISD::SETCC,   MVT::v8f64,   { 1 } },
+    { ISD::SETCC,   MVT::v16f32,  { 1 } },
+
+    { ISD::SELECT,  MVT::v8i64,   { 1, 1, 1, 1 } },
+    { ISD::SELECT,  MVT::v4i64,   { 1, 1, 1, 1 } },
+    { ISD::SELECT,  MVT::v2i64,   { 1, 1, 1, 1 } },
+    { ISD::SELECT,  MVT::v16i32,  { 1, 1, 1, 1 } },
+    { ISD::SELECT,  MVT::v8i32,   { 1, 1, 1, 1 } },
+    { ISD::SELECT,  MVT::v4i32,   { 1, 1, 1, 1 } },
+    { ISD::SELECT,  MVT::v8f64,   { 1, 1, 1, 1 } },
+    { ISD::SELECT,  MVT::v4f64,   { 1, 1, 1, 1 } },
+    { ISD::SELECT,  MVT::v2f64,   { 1, 1, 1, 1 } },
+    { ISD::SELECT,  MVT::f64,     { 1, 1, 1, 1 } },
+    { ISD::SELECT,  MVT::v16f32,  { 1, 1, 1, 1 } },
+    { ISD::SELECT,  MVT::v8f32 ,  { 1, 1, 1, 1 } },
+    { ISD::SELECT,  MVT::v4f32,   { 1, 1, 1, 1 } },
+    { ISD::SELECT,  MVT::f32  ,   { 1, 1, 1, 1 } },
+
+    { ISD::SETCC,   MVT::v32i16,  { 2 } }, // FIXME: should probably be 4
+    { ISD::SETCC,   MVT::v64i8,   { 2 } }, // FIXME: should probably be 4
+
+    { ISD::SELECT,  MVT::v32i16,  { 2, 2, 4, 4 } },
+    { ISD::SELECT,  MVT::v16i16,  { 1, 1, 1, 1 } },
+    { ISD::SELECT,  MVT::v8i16,   { 1, 1, 1, 1 } },
+    { ISD::SELECT,  MVT::v64i8,   { 2, 2, 4, 4 } },
+    { ISD::SELECT,  MVT::v32i8,   { 1, 1, 1, 1 } },
+    { ISD::SELECT,  MVT::v16i8,   { 1, 1, 1, 1 } },
   };
 
-  static const CostTblEntry AVX2CostTbl[] = {
-    { ISD::SETCC,   MVT::v4i64,   1 },
-    { ISD::SETCC,   MVT::v8i32,   1 },
-    { ISD::SETCC,   MVT::v16i16,  1 },
-    { ISD::SETCC,   MVT::v32i8,   1 },
-
-    { ISD::SELECT,  MVT::v4f64,   2 }, // vblendvpd
-    { ISD::SELECT,  MVT::v8f32,   2 }, // vblendvps
-    { ISD::SELECT,  MVT::v4i64,   2 }, // pblendvb
-    { ISD::SELECT,  MVT::v8i32,   2 }, // pblendvb
-    { ISD::SELECT,  MVT::v16i16,  2 }, // pblendvb
-    { ISD::SELECT,  MVT::v32i8,   2 }, // pblendvb
+  static const CostKindTblEntry AVX2CostTbl[] = {
+    { ISD::SETCC,   MVT::v4i64,   { 1 } },
+    { ISD::SETCC,   MVT::v8i32,   { 1 } },
+    { ISD::SETCC,   MVT::v16i16,  { 1 } },
+    { ISD::SETCC,   MVT::v32i8,   { 1 } },
+
+    { ISD::SELECT,  MVT::v4f64,   { 2, 2, 1, 2 } }, // vblendvpd
+    { ISD::SELECT,  MVT::v8f32,   { 2, 2, 1, 2 } }, // vblendvps
+    { ISD::SELECT,  MVT::v4i64,   { 2, 2, 1, 2 } }, // pblendvb
+    { ISD::SELECT,  MVT::v8i32,   { 2, 2, 1, 2 } }, // pblendvb
+    { ISD::SELECT,  MVT::v16i16,  { 2, 2, 1, 2 } }, // pblendvb
+    { ISD::SELECT,  MVT::v32i8,   { 2, 2, 1, 2 } }, // pblendvb
   };
 
-  static const CostTblEntry AVX1CostTbl[] = {
-    { ISD::SETCC,   MVT::v4f64,   1 },
-    { ISD::SETCC,   MVT::v8f32,   1 },
+  static const CostKindTblEntry AVX1CostTbl[] = {
+    { ISD::SETCC,   MVT::v4f64,   { 1 } },
+    { ISD::SETCC,   MVT::v8f32,   { 1 } },
     // AVX1 does not support 8-wide integer compare.
-    { ISD::SETCC,   MVT::v4i64,   4 },
-    { ISD::SETCC,   MVT::v8i32,   4 },
-    { ISD::SETCC,   MVT::v16i16,  4 },
-    { ISD::SETCC,   MVT::v32i8,   4 },
-
-    { ISD::SELECT,  MVT::v4f64,   3 }, // vblendvpd
-    { ISD::SELECT,  MVT::v8f32,   3 }, // vblendvps
-    { ISD::SELECT,  MVT::v4i64,   3 }, // vblendvpd
-    { ISD::SELECT,  MVT::v8i32,   3 }, // vblendvps
-    { ISD::SELECT,  MVT::v16i16,  3 }, // vandps + vandnps + vorps
-    { ISD::SELECT,  MVT::v32i8,   3 }, // vandps + vandnps + vorps
+    { ISD::SETCC,   MVT::v4i64,   { 4 } },
+    { ISD::SETCC,   MVT::v8i32,   { 4 } },
+    { ISD::SETCC,   MVT::v16i16,  { 4 } },
+    { ISD::SETCC,   MVT::v32i8,   { 4 } },
+
+    { ISD::SELECT,  MVT::v4f64,   { 3, 3, 1, 2 } }, // vblendvpd
+    { ISD::SELECT,  MVT::v8f32,   { 3, 3, 1, 2 } }, // vblendvps
+    { ISD::SELECT,  MVT::v4i64,   { 3, 3, 1, 2 } }, // vblendvpd
+    { ISD::SELECT,  MVT::v8i32,   { 3, 3, 1, 2 } }, // vblendvps
+    { ISD::SELECT,  MVT::v16i16,  { 3, 3, 3, 3 } }, // vandps + vandnps + vorps
+    { ISD::SELECT,  MVT::v32i8,   { 3, 3, 3, 3 } }, // vandps + vandnps + vorps
   };
 
-  static const CostTblEntry SSE42CostTbl[] = {
-    { ISD::SETCC,   MVT::v2i64,   1 },
+  static const CostKindTblEntry SSE42CostTbl[] = {
+    { ISD::SETCC,   MVT::v2i64,   { 1 } },
   };
 
-  static const CostTblEntry SSE41CostTbl[] = {
-    { ISD::SETCC,   MVT::v2f64,   1 },
-    { ISD::SETCC,   MVT::v4f32,   1 },
-
-    { ISD::SELECT,  MVT::v2f64,   2 }, // blendvpd
-    { ISD::SELECT,  MVT::f64,     2 }, // blendvpd
-    { ISD::SELECT,  MVT::v4f32,   2 }, // blendvps
-    { ISD::SELECT,  MVT::f32  ,   2 }, // blendvps
-    { ISD::SELECT,  MVT::v2i64,   2 }, // pblendvb
-    { ISD::SELECT,  MVT::v4i32,   2 }, // pblendvb
-    { ISD::SELECT,  MVT::v8i16,   2 }, // pblendvb
-    { ISD::SELECT,  MVT::v16i8,   2 }, // pblendvb
+  static const CostKindTblEntry SSE41CostTbl[] = {
+    { ISD::SETCC,   MVT::v2f64,   { 1 } },
+    { ISD::SETCC,   MVT::v4f32,   { 1 } },
+
+    { ISD::SELECT,  MVT::v2f64,   { 2, 2, 1, 2 } }, // blendvpd
+    { ISD::SELECT,  MVT::f64,     { 2, 2, 1, 2 } }, // blendvpd
+    { ISD::SELECT,  MVT::v4f32,   { 2, 2, 1, 2 } }, // blendvps
+    { ISD::SELECT,  MVT::f32  ,   { 2, 2, 1, 2 } }, // blendvps
+    { ISD::SELECT,  MVT::v2i64,   { 2, 2, 1, 2 } }, // pblendvb
+    { ISD::SELECT,  MVT::v4i32,   { 2, 2, 1, 2 } }, // pblendvb
+    { ISD::SELECT,  MVT::v8i16,   { 2, 2, 1, 2 } }, // pblendvb
+    { ISD::SELECT,  MVT::v16i8,   { 2, 2, 1, 2 } }, // pblendvb
   };
 
-  static const CostTblEntry SSE2CostTbl[] = {
-    { ISD::SETCC,   MVT::v2f64,   2 },
-    { ISD::SETCC,   MVT::f64,     1 },
-    { ISD::SETCC,   MVT::v2i64,   5 }, // pcmpeqd/pcmpgtd expansion
-    { ISD::SETCC,   MVT::v4i32,   1 },
-    { ISD::SETCC,   MVT::v8i16,   1 },
-    { ISD::SETCC,   MVT::v16i8,   1 },
-
-    { ISD::SELECT,  MVT::v2f64,   2 }, // andpd + andnpd + orpd
-    { ISD::SELECT,  MVT::f64,     2 }, // andpd + andnpd + orpd
-    { ISD::SELECT,  MVT::v2i64,   2 }, // pand + pandn + por
-    { ISD::SELECT,  MVT::v4i32,   2 }, // pand + pandn + por
-    { ISD::SELECT,  MVT::v8i16,   2 }, // pand + pandn + por
-    { ISD::SELECT,  MVT::v16i8,   2 }, // pand + pandn + por
+  static const CostKindTblEntry SSE2CostTbl[] = {
+    { ISD::SETCC,   MVT::v2f64,   { 2 } },
+    { ISD::SETCC,   MVT::f64,     { 1 } },
+    { ISD::SETCC,   MVT::v2i64,   { 5 } }, // pcmpeqd/pcmpgtd expansion
+    { ISD::SETCC,   MVT::v4i32,   { 1 } },
+    { ISD::SETCC,   MVT::v8i16,   { 1 } },
+    { ISD::SETCC,   MVT::v16i8,   { 1 } },
+
+    { ISD::SELECT,  MVT::v2f64,   { 2, 2, 3, 3 } }, // andpd + andnpd + orpd
+    { ISD::SELECT,  MVT::f64,     { 2, 2, 3, 3 } }, // andpd + andnpd + orpd
+    { ISD::SELECT,  MVT::v2i64,   { 2, 2, 3, 3 } }, // pand + pandn + por
+    { ISD::SELECT,  MVT::v4i32,   { 2, 2, 3, 3 } }, // pand + pandn + por
+    { ISD::SELECT,  MVT::v8i16,   { 2, 2, 3, 3 } }, // pand + pandn + por
+    { ISD::SELECT,  MVT::v16i8,   { 2, 2, 3, 3 } }, // pand + pandn + por
   };
 
-  static const CostTblEntry SSE1CostTbl[] = {
-    { ISD::SETCC,   MVT::v4f32,   2 },
-    { ISD::SETCC,   MVT::f32,     1 },
+  static const CostKindTblEntry SSE1CostTbl[] = {
+    { ISD::SETCC,   MVT::v4f32,   { 2 } },
+    { ISD::SETCC,   MVT::f32,     { 1 } },
 
-    { ISD::SELECT,  MVT::v4f32,   2 }, // andps + andnps + orps
-    { ISD::SELECT,  MVT::f32,     2 }, // andps + andnps + orps
+    { ISD::SELECT,  MVT::v4f32,   { 2, 2, 3, 3 } }, // andps + andnps + orps
+    { ISD::SELECT,  MVT::f32,     { 2, 2, 3, 3 } }, // andps + andnps + orps
   };
 
   if (ST->useSLMArithCosts())
     if (const auto *Entry = CostTableLookup(SLMCostTbl, ISD, MTy))
-      return LT.first * (ExtraCost + Entry->Cost);
+      if (auto KindCost = Entry->Cost[CostKind])
+        return LT.first * (ExtraCost + KindCost.value());
 
   if (ST->hasBWI())
     if (const auto *Entry = CostTableLookup(AVX512BWCostTbl, ISD, MTy))
-      return LT.first * (ExtraCost + Entry->Cost);
+      if (auto KindCost = Entry->Cost[CostKind])
+        return LT.first * (ExtraCost + KindCost.value());
 
   if (ST->hasAVX512())
     if (const auto *Entry = CostTableLookup(AVX512CostTbl, ISD, MTy))
-      return LT.first * (ExtraCost + Entry->Cost);
+      if (auto KindCost = Entry->Cost[CostKind])
+        return LT.first * (ExtraCost + KindCost.value());
 
   if (ST->hasAVX2())
     if (const auto *Entry = CostTableLookup(AVX2CostTbl, ISD, MTy))
-      return LT.first * (ExtraCost + Entry->Cost);
+      if (auto KindCost = Entry->Cost[CostKind])
+        return LT.first * (ExtraCost + KindCost.value());
 
   if (ST->hasAVX())
     if (const auto *Entry = CostTableLookup(AVX1CostTbl, ISD, MTy))
-      return LT.first * (ExtraCost + Entry->Cost);
+      if (auto KindCost = Entry->Cost[CostKind])
+        return LT.first * (ExtraCost + KindCost.value());
 
   if (ST->hasSSE42())
     if (const auto *Entry = CostTableLookup(SSE42CostTbl, ISD, MTy))
-      return LT.first * (ExtraCost + Entry->Cost);
+      if (auto KindCost = Entry->Cost[CostKind])
+        return LT.first * (ExtraCost + KindCost.value());
 
   if (ST->hasSSE41())
     if (const auto *Entry = CostTableLookup(SSE41CostTbl, ISD, MTy))
-      return LT.first * (ExtraCost + Entry->Cost);
+      if (auto KindCost = Entry->Cost[CostKind])
+        return LT.first * (ExtraCost + KindCost.value());
 
   if (ST->hasSSE2())
     if (const auto *Entry = CostTableLookup(SSE2CostTbl, ISD, MTy))
-      return LT.first * (ExtraCost + Entry->Cost);
+      if (auto KindCost = Entry->Cost[CostKind])
+        return LT.first * (ExtraCost + KindCost.value());
 
   if (ST->hasSSE1())
     if (const auto *Entry = CostTableLookup(SSE1CostTbl, ISD, MTy))
-      return LT.first * (ExtraCost + Entry->Cost);
+      if (auto KindCost = Entry->Cost[CostKind])
+        return LT.first * (ExtraCost + KindCost.value());
+
+  // Assume a 3cy latency for fp select ops.
+  if (CostKind == TTI::TCK_Latency && Opcode == Instruction::Select)
+    if (ValTy->getScalarType()->isFloatingPointTy())
+      return 3;
 
   return BaseT::getCmpSelInstrCost(Opcode, ValTy, CondTy, VecPred, CostKind, I);
 }

diff  --git a/llvm/test/Analysis/CostModel/X86/intrinsic-cost-kinds.ll b/llvm/test/Analysis/CostModel/X86/intrinsic-cost-kinds.ll
index b930fc8de9bee..efb494c6a5449 100644
--- a/llvm/test/Analysis/CostModel/X86/intrinsic-cost-kinds.ll
+++ b/llvm/test/Analysis/CostModel/X86/intrinsic-cost-kinds.ll
@@ -82,17 +82,17 @@ define void @smax(i32 %a, i32 %b, <16 x i32> %va, <16 x i32> %vb) {
 ;
 ; LATE-LABEL: 'smax'
 ; LATE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %s = call i32 @llvm.smax.i32(i32 %a, i32 %b)
-; LATE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)
+; LATE-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)
 ; LATE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
 ;
 ; SIZE-LABEL: 'smax'
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %s = call i32 @llvm.smax.i32(i32 %a, i32 %b)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 13 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
 ;
 ; SIZE_LATE-LABEL: 'smax'
 ; SIZE_LATE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %s = call i32 @llvm.smax.i32(i32 %a, i32 %b)
-; SIZE_LATE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)
+; SIZE_LATE-NEXT:  Cost Model: Found an estimated cost of 13 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)
 ; SIZE_LATE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
 ;
   %s = call i32 @llvm.smax.i32(i32 %a, i32 %b)
@@ -290,17 +290,17 @@ define void @fshl(i32 %a, i32 %b, i32 %c, <16 x i32> %va, <16 x i32> %vb, <16 x
 ;
 ; LATE-LABEL: 'fshl'
 ; LATE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)
-; LATE-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)
+; LATE-NEXT:  Cost Model: Found an estimated cost of 14 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)
 ; LATE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
 ;
 ; SIZE-LABEL: 'fshl'
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 18 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
 ;
 ; SIZE_LATE-LABEL: 'fshl'
 ; SIZE_LATE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)
-; SIZE_LATE-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)
+; SIZE_LATE-NEXT:  Cost Model: Found an estimated cost of 18 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)
 ; SIZE_LATE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
 ;
   %s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)

diff  --git a/llvm/test/Analysis/CostModel/X86/select-codesize.ll b/llvm/test/Analysis/CostModel/X86/select-codesize.ll
index 84527013b5c72..c801a5fcb257d 100644
--- a/llvm/test/Analysis/CostModel/X86/select-codesize.ll
+++ b/llvm/test/Analysis/CostModel/X86/select-codesize.ll
@@ -1,36 +1,131 @@
 ; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=x86_64-- -mattr=+sse2 | FileCheck %s
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=x86_64-- -mattr=+sse4.1 | FileCheck %s
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=x86_64-- -mattr=+avx | FileCheck %s
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=x86_64-- -mattr=+avx2 | FileCheck %s
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=x86_64-- -mattr=+avx512f,+avx512vl  | FileCheck %s
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=x86_64-- -mattr=+avx512bw,+avx512vl | FileCheck %s
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=x86_64-- -mattr=+sse2 | FileCheck %s -check-prefixes=SSE2
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=x86_64-- -mattr=+sse4.1 | FileCheck %s -check-prefixes=SSE4
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=x86_64-- -mattr=+avx | FileCheck %s -check-prefixes=AVX,AVX1
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=x86_64-- -mattr=+avx2 | FileCheck %s -check-prefixes=AVX,AVX2
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=x86_64-- -mattr=+avx512f,+avx512vl  | FileCheck %s -check-prefixes=AVX512,AVX512F
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=x86_64-- -mattr=+avx512bw,+avx512vl | FileCheck %s -check-prefixes=AVX512,AVX512BW
 ;
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=x86_64-- -mcpu=slm | FileCheck %s
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=x86_64-- -mcpu=goldmont | FileCheck %s
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=x86_64-- -mcpu=btver2 | FileCheck %s
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=x86_64-- -mcpu=slm | FileCheck %s --check-prefixes=SSE4
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=x86_64-- -mcpu=goldmont | FileCheck %s --check-prefixes=SSE4
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=x86_64-- -mcpu=btver2 | FileCheck %s --check-prefixes=AVX,AVX1
 
 ; Verify the cost of vector select instructions.
 
 define i32 @test_select() {
-; CHECK-LABEL: 'test_select'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+; SSE2-LABEL: 'test_select'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; SSE4-LABEL: 'test_select'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; AVX1-LABEL: 'test_select'
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; AVX2-LABEL: 'test_select'
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; AVX512F-LABEL: 'test_select'
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; AVX512BW-LABEL: 'test_select'
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %I64 = select i1 undef, i64 undef, i64 undef
   %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
@@ -56,16 +151,49 @@ define i32 @test_select() {
 }
 
 define i32 @test_select_fp() {
-; CHECK-LABEL: 'test_select_fp'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %F64 = select i1 undef, double undef, double undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2F64 = select <2 x i1> undef, <2 x double> undef, <2 x double> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4F64 = select <4 x i1> undef, <4 x double> undef, <4 x double> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8F64 = select <8 x i1> undef, <8 x double> undef, <8 x double> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %F32 = select i1 undef, float undef, float undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4F32 = select <4 x i1> undef, <4 x float> undef, <4 x float> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8F32 = select <8 x i1> undef, <8 x float> undef, <8 x float> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16F32 = select <16 x i1> undef, <16 x float> undef, <16 x float> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+; SSE2-LABEL: 'test_select_fp'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %F64 = select i1 undef, double undef, double undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2F64 = select <2 x i1> undef, <2 x double> undef, <2 x double> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V4F64 = select <4 x i1> undef, <4 x double> undef, <4 x double> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %V8F64 = select <8 x i1> undef, <8 x double> undef, <8 x double> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %F32 = select i1 undef, float undef, float undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4F32 = select <4 x i1> undef, <4 x float> undef, <4 x float> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V8F32 = select <8 x i1> undef, <8 x float> undef, <8 x float> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %V16F32 = select <16 x i1> undef, <16 x float> undef, <16 x float> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; SSE4-LABEL: 'test_select_fp'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %F64 = select i1 undef, double undef, double undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2F64 = select <2 x i1> undef, <2 x double> undef, <2 x double> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4F64 = select <4 x i1> undef, <4 x double> undef, <4 x double> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V8F64 = select <8 x i1> undef, <8 x double> undef, <8 x double> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %F32 = select i1 undef, float undef, float undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4F32 = select <4 x i1> undef, <4 x float> undef, <4 x float> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8F32 = select <8 x i1> undef, <8 x float> undef, <8 x float> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V16F32 = select <16 x i1> undef, <16 x float> undef, <16 x float> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; AVX-LABEL: 'test_select_fp'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %F64 = select i1 undef, double undef, double undef
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2F64 = select <2 x i1> undef, <2 x double> undef, <2 x double> undef
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4F64 = select <4 x i1> undef, <4 x double> undef, <4 x double> undef
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8F64 = select <8 x i1> undef, <8 x double> undef, <8 x double> undef
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %F32 = select i1 undef, float undef, float undef
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4F32 = select <4 x i1> undef, <4 x float> undef, <4 x float> undef
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8F32 = select <8 x i1> undef, <8 x float> undef, <8 x float> undef
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16F32 = select <16 x i1> undef, <16 x float> undef, <16 x float> undef
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; AVX512-LABEL: 'test_select_fp'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %F64 = select i1 undef, double undef, double undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2F64 = select <2 x i1> undef, <2 x double> undef, <2 x double> undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4F64 = select <4 x i1> undef, <4 x double> undef, <4 x double> undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8F64 = select <8 x i1> undef, <8 x double> undef, <8 x double> undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %F32 = select i1 undef, float undef, float undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4F32 = select <4 x i1> undef, <4 x float> undef, <4 x float> undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8F32 = select <8 x i1> undef, <8 x float> undef, <8 x float> undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16F32 = select <16 x i1> undef, <16 x float> undef, <16 x float> undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %F64 = select i1 undef, double undef, double undef
   %V2F64 = select <2 x i1> undef, <2 x double> undef, <2 x double> undef
@@ -84,45 +212,105 @@ define i32 @test_select_fp() {
 ; Integers of the same size should also use those instructions.
 
 define <2 x i64> @test_2i64(<2 x i64> %a, <2 x i64> %b) {
-; CHECK-LABEL: 'test_2i64'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x i64> %a, <2 x i64> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x i64> %sel
+; SSE2-LABEL: 'test_2i64'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x i64> %a, <2 x i64> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x i64> %sel
+;
+; SSE4-LABEL: 'test_2i64'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x i64> %a, <2 x i64> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x i64> %sel
+;
+; AVX-LABEL: 'test_2i64'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x i64> %a, <2 x i64> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x i64> %sel
+;
+; AVX512-LABEL: 'test_2i64'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x i64> %a, <2 x i64> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x i64> %sel
 ;
   %sel = select <2 x i1> <i1 true, i1 false>, <2 x i64> %a, <2 x i64> %b
   ret <2 x i64> %sel
 }
 
 define <2 x double> @test_2double(<2 x double> %a, <2 x double> %b) {
-; CHECK-LABEL: 'test_2double'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x double> %a, <2 x double> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x double> %sel
+; SSE2-LABEL: 'test_2double'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x double> %a, <2 x double> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x double> %sel
+;
+; SSE4-LABEL: 'test_2double'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x double> %a, <2 x double> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x double> %sel
+;
+; AVX-LABEL: 'test_2double'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x double> %a, <2 x double> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x double> %sel
+;
+; AVX512-LABEL: 'test_2double'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x double> %a, <2 x double> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x double> %sel
 ;
   %sel = select <2 x i1> <i1 true, i1 false>, <2 x double> %a, <2 x double> %b
   ret <2 x double> %sel
 }
 
 define <4 x i32> @test_4i32(<4 x i32> %a, <4 x i32> %b) {
-; CHECK-LABEL: 'test_4i32'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %a, <4 x i32> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i32> %sel
+; SSE2-LABEL: 'test_4i32'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %a, <4 x i32> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i32> %sel
+;
+; SSE4-LABEL: 'test_4i32'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %a, <4 x i32> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i32> %sel
+;
+; AVX-LABEL: 'test_4i32'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %a, <4 x i32> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i32> %sel
+;
+; AVX512-LABEL: 'test_4i32'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %a, <4 x i32> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i32> %sel
 ;
   %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %a, <4 x i32> %b
   ret <4 x i32> %sel
 }
 
 define <4 x float> @test_4float(<4 x float> %a, <4 x float> %b) {
-; CHECK-LABEL: 'test_4float'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x float> %a, <4 x float> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x float> %sel
+; SSE2-LABEL: 'test_4float'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x float> %a, <4 x float> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x float> %sel
+;
+; SSE4-LABEL: 'test_4float'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x float> %a, <4 x float> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x float> %sel
+;
+; AVX-LABEL: 'test_4float'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x float> %a, <4 x float> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x float> %sel
+;
+; AVX512-LABEL: 'test_4float'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x float> %a, <4 x float> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x float> %sel
 ;
   %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x float> %a, <4 x float> %b
   ret <4 x float> %sel
 }
 
 define <16 x i8> @test_16i8(<16 x i8> %a, <16 x i8> %b) {
-; CHECK-LABEL: 'test_16i8'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <16 x i8> %a, <16 x i8> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i8> %sel
+; SSE2-LABEL: 'test_16i8'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <16 x i8> %a, <16 x i8> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i8> %sel
+;
+; SSE4-LABEL: 'test_16i8'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <16 x i8> %a, <16 x i8> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i8> %sel
+;
+; AVX-LABEL: 'test_16i8'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <16 x i8> %a, <16 x i8> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i8> %sel
+;
+; AVX512-LABEL: 'test_16i8'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <16 x i8> %a, <16 x i8> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i8> %sel
 ;
   %sel = select <16 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <16 x i8> %a, <16 x i8> %b
   ret <16 x i8> %sel
@@ -132,54 +320,134 @@ define <16 x i8> @test_16i8(<16 x i8> %a, <16 x i8> %b) {
 ; Integers of the same size should also use those instructions.
 
 define <4 x i64> @test_4i64(<4 x i64> %a, <4 x i64> %b) {
-; CHECK-LABEL: 'test_4i64'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i64> %a, <4 x i64> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i64> %sel
+; SSE2-LABEL: 'test_4i64'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i64> %a, <4 x i64> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i64> %sel
+;
+; SSE4-LABEL: 'test_4i64'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i64> %a, <4 x i64> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i64> %sel
+;
+; AVX-LABEL: 'test_4i64'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i64> %a, <4 x i64> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i64> %sel
+;
+; AVX512-LABEL: 'test_4i64'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i64> %a, <4 x i64> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i64> %sel
 ;
   %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i64> %a, <4 x i64> %b
   ret <4 x i64> %sel
 }
 
 define <4 x double> @test_4double(<4 x double> %a, <4 x double> %b) {
-; CHECK-LABEL: 'test_4double'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %a, <4 x double> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x double> %sel
+; SSE2-LABEL: 'test_4double'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %a, <4 x double> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x double> %sel
+;
+; SSE4-LABEL: 'test_4double'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %a, <4 x double> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x double> %sel
+;
+; AVX-LABEL: 'test_4double'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %a, <4 x double> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x double> %sel
+;
+; AVX512-LABEL: 'test_4double'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %a, <4 x double> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x double> %sel
 ;
   %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %a, <4 x double> %b
   ret <4 x double> %sel
 }
 
 define <8 x i32> @test_8i32(<8 x i32> %a, <8 x i32> %b) {
-; CHECK-LABEL: 'test_8i32'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 false>, <8 x i32> %a, <8 x i32> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x i32> %sel
+; SSE2-LABEL: 'test_8i32'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 false>, <8 x i32> %a, <8 x i32> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x i32> %sel
+;
+; SSE4-LABEL: 'test_8i32'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 false>, <8 x i32> %a, <8 x i32> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x i32> %sel
+;
+; AVX-LABEL: 'test_8i32'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 false>, <8 x i32> %a, <8 x i32> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x i32> %sel
+;
+; AVX512-LABEL: 'test_8i32'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 false>, <8 x i32> %a, <8 x i32> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x i32> %sel
 ;
   %sel = select <8 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 false>, <8 x i32> %a, <8 x i32> %b
   ret <8 x i32> %sel
 }
 
 define <8 x float> @test_8float(<8 x float> %a, <8 x float> %b) {
-; CHECK-LABEL: 'test_8float'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <8 x float> %a, <8 x float> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x float> %sel
+; SSE2-LABEL: 'test_8float'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <8 x float> %a, <8 x float> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x float> %sel
+;
+; SSE4-LABEL: 'test_8float'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <8 x float> %a, <8 x float> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x float> %sel
+;
+; AVX-LABEL: 'test_8float'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <8 x float> %a, <8 x float> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x float> %sel
+;
+; AVX512-LABEL: 'test_8float'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <8 x float> %a, <8 x float> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x float> %sel
 ;
   %sel = select <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <8 x float> %a, <8 x float> %b
   ret <8 x float> %sel
 }
 
 define <16 x i16> @test_16i16(<16 x i16> %a, <16 x i16> %b) {
-; CHECK-LABEL: 'test_16i16'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i16> %sel
+; SSE2-LABEL: 'test_16i16'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i16> %sel
+;
+; SSE4-LABEL: 'test_16i16'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i16> %sel
+;
+; AVX1-LABEL: 'test_16i16'
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i16> %sel
+;
+; AVX2-LABEL: 'test_16i16'
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i16> %sel
+;
+; AVX512-LABEL: 'test_16i16'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i16> %sel
 ;
   %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
   ret <16 x i16> %sel
 }
 
 define <32 x i8> @test_32i8(<32 x i8> %a, <32 x i8> %b) {
-; CHECK-LABEL: 'test_32i8'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <32 x i8> %sel
+; SSE2-LABEL: 'test_32i8'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <32 x i8> %sel
+;
+; SSE4-LABEL: 'test_32i8'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <32 x i8> %sel
+;
+; AVX1-LABEL: 'test_32i8'
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <32 x i8> %sel
+;
+; AVX2-LABEL: 'test_32i8'
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <32 x i8> %sel
+;
+; AVX512-LABEL: 'test_32i8'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <32 x i8> %sel
 ;
   %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
   ret <32 x i8> %sel

diff  --git a/llvm/test/Analysis/CostModel/X86/select-latency.ll b/llvm/test/Analysis/CostModel/X86/select-latency.ll
index 787fb726710a1..7759b94be5a50 100644
--- a/llvm/test/Analysis/CostModel/X86/select-latency.ll
+++ b/llvm/test/Analysis/CostModel/X86/select-latency.ll
@@ -1,36 +1,131 @@
 ; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=latency -mtriple=x86_64-- -mattr=+sse2 | FileCheck %s
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=latency -mtriple=x86_64-- -mattr=+sse4.1 | FileCheck %s
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=latency -mtriple=x86_64-- -mattr=+avx | FileCheck %s
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=latency -mtriple=x86_64-- -mattr=+avx2 | FileCheck %s
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=latency -mtriple=x86_64-- -mattr=+avx512f,+avx512vl  | FileCheck %s
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=latency -mtriple=x86_64-- -mattr=+avx512bw,+avx512vl | FileCheck %s
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=latency -mtriple=x86_64-- -mattr=+sse2 | FileCheck %s -check-prefixes=SSE
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=latency -mtriple=x86_64-- -mattr=+sse4.1 | FileCheck %s -check-prefixes=SSE
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=latency -mtriple=x86_64-- -mattr=+avx | FileCheck %s -check-prefixes=AVX,AVX1
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=latency -mtriple=x86_64-- -mattr=+avx2 | FileCheck %s -check-prefixes=AVX,AVX2
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=latency -mtriple=x86_64-- -mattr=+avx512f,+avx512vl  | FileCheck %s -check-prefixes=AVX512,AVX512F
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=latency -mtriple=x86_64-- -mattr=+avx512bw,+avx512vl | FileCheck %s -check-prefixes=AVX512,AVX512BW
 ;
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=latency -mtriple=x86_64-- -mcpu=slm | FileCheck %s
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=latency -mtriple=x86_64-- -mcpu=goldmont | FileCheck %s
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=latency -mtriple=x86_64-- -mcpu=btver2 | FileCheck %s
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=latency -mtriple=x86_64-- -mcpu=slm | FileCheck %s --check-prefixes=SLM
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=latency -mtriple=x86_64-- -mcpu=goldmont | FileCheck %s --check-prefixes=SSE
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=latency -mtriple=x86_64-- -mcpu=btver2 | FileCheck %s --check-prefixes=AVX,AVX1
 
 ; Verify the cost of vector select instructions.
 
 define i32 @test_select() {
-; CHECK-LABEL: 'test_select'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+; SSE-LABEL: 'test_select'
+; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; AVX1-LABEL: 'test_select'
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; AVX2-LABEL: 'test_select'
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; AVX512F-LABEL: 'test_select'
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; AVX512BW-LABEL: 'test_select'
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; SLM-LABEL: 'test_select'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %I64 = select i1 undef, i64 undef, i64 undef
   %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
@@ -56,16 +151,60 @@ define i32 @test_select() {
 }
 
 define i32 @test_select_fp() {
-; CHECK-LABEL: 'test_select_fp'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %F64 = select i1 undef, double undef, double undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2F64 = select <2 x i1> undef, <2 x double> undef, <2 x double> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4F64 = select <4 x i1> undef, <4 x double> undef, <4 x double> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8F64 = select <8 x i1> undef, <8 x double> undef, <8 x double> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %F32 = select i1 undef, float undef, float undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4F32 = select <4 x i1> undef, <4 x float> undef, <4 x float> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8F32 = select <8 x i1> undef, <8 x float> undef, <8 x float> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16F32 = select <16 x i1> undef, <16 x float> undef, <16 x float> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+; SSE-LABEL: 'test_select_fp'
+; SSE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %F64 = select i1 undef, double undef, double undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = select <2 x i1> undef, <2 x double> undef, <2 x double> undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V4F64 = select <4 x i1> undef, <4 x double> undef, <4 x double> undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V8F64 = select <8 x i1> undef, <8 x double> undef, <8 x double> undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %F32 = select i1 undef, float undef, float undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4F32 = select <4 x i1> undef, <4 x float> undef, <4 x float> undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V8F32 = select <8 x i1> undef, <8 x float> undef, <8 x float> undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V16F32 = select <16 x i1> undef, <16 x float> undef, <16 x float> undef
+; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; AVX1-LABEL: 'test_select_fp'
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %F64 = select i1 undef, double undef, double undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = select <2 x i1> undef, <2 x double> undef, <2 x double> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4F64 = select <4 x i1> undef, <4 x double> undef, <4 x double> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V8F64 = select <8 x i1> undef, <8 x double> undef, <8 x double> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %F32 = select i1 undef, float undef, float undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4F32 = select <4 x i1> undef, <4 x float> undef, <4 x float> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8F32 = select <8 x i1> undef, <8 x float> undef, <8 x float> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V16F32 = select <16 x i1> undef, <16 x float> undef, <16 x float> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; AVX2-LABEL: 'test_select_fp'
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %F64 = select i1 undef, double undef, double undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = select <2 x i1> undef, <2 x double> undef, <2 x double> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4F64 = select <4 x i1> undef, <4 x double> undef, <4 x double> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V8F64 = select <8 x i1> undef, <8 x double> undef, <8 x double> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %F32 = select i1 undef, float undef, float undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4F32 = select <4 x i1> undef, <4 x float> undef, <4 x float> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8F32 = select <8 x i1> undef, <8 x float> undef, <8 x float> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V16F32 = select <16 x i1> undef, <16 x float> undef, <16 x float> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; AVX512-LABEL: 'test_select_fp'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %F64 = select i1 undef, double undef, double undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2F64 = select <2 x i1> undef, <2 x double> undef, <2 x double> undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4F64 = select <4 x i1> undef, <4 x double> undef, <4 x double> undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8F64 = select <8 x i1> undef, <8 x double> undef, <8 x double> undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %F32 = select i1 undef, float undef, float undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4F32 = select <4 x i1> undef, <4 x float> undef, <4 x float> undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8F32 = select <8 x i1> undef, <8 x float> undef, <8 x float> undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16F32 = select <16 x i1> undef, <16 x float> undef, <16 x float> undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; SLM-LABEL: 'test_select_fp'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %F64 = select i1 undef, double undef, double undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V2F64 = select <2 x i1> undef, <2 x double> undef, <2 x double> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V4F64 = select <4 x i1> undef, <4 x double> undef, <4 x double> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %V8F64 = select <8 x i1> undef, <8 x double> undef, <8 x double> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %F32 = select i1 undef, float undef, float undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V4F32 = select <4 x i1> undef, <4 x float> undef, <4 x float> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V8F32 = select <8 x i1> undef, <8 x float> undef, <8 x float> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %V16F32 = select <16 x i1> undef, <16 x float> undef, <16 x float> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %F64 = select i1 undef, double undef, double undef
   %V2F64 = select <2 x i1> undef, <2 x double> undef, <2 x double> undef
@@ -84,45 +223,105 @@ define i32 @test_select_fp() {
 ; Integers of the same size should also use those instructions.
 
 define <2 x i64> @test_2i64(<2 x i64> %a, <2 x i64> %b) {
-; CHECK-LABEL: 'test_2i64'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x i64> %a, <2 x i64> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x i64> %sel
+; SSE-LABEL: 'test_2i64'
+; SSE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x i64> %a, <2 x i64> %b
+; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x i64> %sel
+;
+; AVX-LABEL: 'test_2i64'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x i64> %a, <2 x i64> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x i64> %sel
+;
+; AVX512-LABEL: 'test_2i64'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x i64> %a, <2 x i64> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x i64> %sel
+;
+; SLM-LABEL: 'test_2i64'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x i64> %a, <2 x i64> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x i64> %sel
 ;
   %sel = select <2 x i1> <i1 true, i1 false>, <2 x i64> %a, <2 x i64> %b
   ret <2 x i64> %sel
 }
 
 define <2 x double> @test_2double(<2 x double> %a, <2 x double> %b) {
-; CHECK-LABEL: 'test_2double'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x double> %a, <2 x double> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x double> %sel
+; SSE-LABEL: 'test_2double'
+; SSE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x double> %a, <2 x double> %b
+; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x double> %sel
+;
+; AVX-LABEL: 'test_2double'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x double> %a, <2 x double> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x double> %sel
+;
+; AVX512-LABEL: 'test_2double'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x double> %a, <2 x double> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x double> %sel
+;
+; SLM-LABEL: 'test_2double'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x double> %a, <2 x double> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x double> %sel
 ;
   %sel = select <2 x i1> <i1 true, i1 false>, <2 x double> %a, <2 x double> %b
   ret <2 x double> %sel
 }
 
 define <4 x i32> @test_4i32(<4 x i32> %a, <4 x i32> %b) {
-; CHECK-LABEL: 'test_4i32'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %a, <4 x i32> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i32> %sel
+; SSE-LABEL: 'test_4i32'
+; SSE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %a, <4 x i32> %b
+; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i32> %sel
+;
+; AVX-LABEL: 'test_4i32'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %a, <4 x i32> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i32> %sel
+;
+; AVX512-LABEL: 'test_4i32'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %a, <4 x i32> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i32> %sel
+;
+; SLM-LABEL: 'test_4i32'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %a, <4 x i32> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i32> %sel
 ;
   %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %a, <4 x i32> %b
   ret <4 x i32> %sel
 }
 
 define <4 x float> @test_4float(<4 x float> %a, <4 x float> %b) {
-; CHECK-LABEL: 'test_4float'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x float> %a, <4 x float> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x float> %sel
+; SSE-LABEL: 'test_4float'
+; SSE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x float> %a, <4 x float> %b
+; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x float> %sel
+;
+; AVX-LABEL: 'test_4float'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x float> %a, <4 x float> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x float> %sel
+;
+; AVX512-LABEL: 'test_4float'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x float> %a, <4 x float> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x float> %sel
+;
+; SLM-LABEL: 'test_4float'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x float> %a, <4 x float> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x float> %sel
 ;
   %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x float> %a, <4 x float> %b
   ret <4 x float> %sel
 }
 
 define <16 x i8> @test_16i8(<16 x i8> %a, <16 x i8> %b) {
-; CHECK-LABEL: 'test_16i8'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <16 x i8> %a, <16 x i8> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i8> %sel
+; SSE-LABEL: 'test_16i8'
+; SSE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <16 x i8> %a, <16 x i8> %b
+; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i8> %sel
+;
+; AVX-LABEL: 'test_16i8'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <16 x i8> %a, <16 x i8> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i8> %sel
+;
+; AVX512-LABEL: 'test_16i8'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <16 x i8> %a, <16 x i8> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i8> %sel
+;
+; SLM-LABEL: 'test_16i8'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <16 x i8> %a, <16 x i8> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i8> %sel
 ;
   %sel = select <16 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <16 x i8> %a, <16 x i8> %b
   ret <16 x i8> %sel
@@ -132,54 +331,150 @@ define <16 x i8> @test_16i8(<16 x i8> %a, <16 x i8> %b) {
 ; Integers of the same size should also use those instructions.
 
 define <4 x i64> @test_4i64(<4 x i64> %a, <4 x i64> %b) {
-; CHECK-LABEL: 'test_4i64'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i64> %a, <4 x i64> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i64> %sel
+; SSE-LABEL: 'test_4i64'
+; SSE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i64> %a, <4 x i64> %b
+; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i64> %sel
+;
+; AVX1-LABEL: 'test_4i64'
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i64> %a, <4 x i64> %b
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i64> %sel
+;
+; AVX2-LABEL: 'test_4i64'
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i64> %a, <4 x i64> %b
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i64> %sel
+;
+; AVX512-LABEL: 'test_4i64'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i64> %a, <4 x i64> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i64> %sel
+;
+; SLM-LABEL: 'test_4i64'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i64> %a, <4 x i64> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i64> %sel
 ;
   %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i64> %a, <4 x i64> %b
   ret <4 x i64> %sel
 }
 
 define <4 x double> @test_4double(<4 x double> %a, <4 x double> %b) {
-; CHECK-LABEL: 'test_4double'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %a, <4 x double> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x double> %sel
+; SSE-LABEL: 'test_4double'
+; SSE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %a, <4 x double> %b
+; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x double> %sel
+;
+; AVX1-LABEL: 'test_4double'
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %a, <4 x double> %b
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x double> %sel
+;
+; AVX2-LABEL: 'test_4double'
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %a, <4 x double> %b
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x double> %sel
+;
+; AVX512-LABEL: 'test_4double'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %a, <4 x double> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x double> %sel
+;
+; SLM-LABEL: 'test_4double'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %a, <4 x double> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x double> %sel
 ;
   %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %a, <4 x double> %b
   ret <4 x double> %sel
 }
 
 define <8 x i32> @test_8i32(<8 x i32> %a, <8 x i32> %b) {
-; CHECK-LABEL: 'test_8i32'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 false>, <8 x i32> %a, <8 x i32> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x i32> %sel
+; SSE-LABEL: 'test_8i32'
+; SSE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 false>, <8 x i32> %a, <8 x i32> %b
+; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x i32> %sel
+;
+; AVX1-LABEL: 'test_8i32'
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 false>, <8 x i32> %a, <8 x i32> %b
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x i32> %sel
+;
+; AVX2-LABEL: 'test_8i32'
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 false>, <8 x i32> %a, <8 x i32> %b
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x i32> %sel
+;
+; AVX512-LABEL: 'test_8i32'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 false>, <8 x i32> %a, <8 x i32> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x i32> %sel
+;
+; SLM-LABEL: 'test_8i32'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 false>, <8 x i32> %a, <8 x i32> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x i32> %sel
 ;
   %sel = select <8 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 false>, <8 x i32> %a, <8 x i32> %b
   ret <8 x i32> %sel
 }
 
 define <8 x float> @test_8float(<8 x float> %a, <8 x float> %b) {
-; CHECK-LABEL: 'test_8float'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <8 x float> %a, <8 x float> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x float> %sel
+; SSE-LABEL: 'test_8float'
+; SSE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <8 x float> %a, <8 x float> %b
+; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x float> %sel
+;
+; AVX1-LABEL: 'test_8float'
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <8 x float> %a, <8 x float> %b
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x float> %sel
+;
+; AVX2-LABEL: 'test_8float'
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <8 x float> %a, <8 x float> %b
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x float> %sel
+;
+; AVX512-LABEL: 'test_8float'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <8 x float> %a, <8 x float> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x float> %sel
+;
+; SLM-LABEL: 'test_8float'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <8 x float> %a, <8 x float> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x float> %sel
 ;
   %sel = select <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <8 x float> %a, <8 x float> %b
   ret <8 x float> %sel
 }
 
 define <16 x i16> @test_16i16(<16 x i16> %a, <16 x i16> %b) {
-; CHECK-LABEL: 'test_16i16'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i16> %sel
+; SSE-LABEL: 'test_16i16'
+; SSE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
+; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i16> %sel
+;
+; AVX1-LABEL: 'test_16i16'
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i16> %sel
+;
+; AVX2-LABEL: 'test_16i16'
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i16> %sel
+;
+; AVX512-LABEL: 'test_16i16'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i16> %sel
+;
+; SLM-LABEL: 'test_16i16'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i16> %sel
 ;
   %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
   ret <16 x i16> %sel
 }
 
 define <32 x i8> @test_32i8(<32 x i8> %a, <32 x i8> %b) {
-; CHECK-LABEL: 'test_32i8'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <32 x i8> %sel
+; SSE-LABEL: 'test_32i8'
+; SSE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
+; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <32 x i8> %sel
+;
+; AVX1-LABEL: 'test_32i8'
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <32 x i8> %sel
+;
+; AVX2-LABEL: 'test_32i8'
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <32 x i8> %sel
+;
+; AVX512-LABEL: 'test_32i8'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <32 x i8> %sel
+;
+; SLM-LABEL: 'test_32i8'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <32 x i8> %sel
 ;
   %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
   ret <32 x i8> %sel

diff  --git a/llvm/test/Analysis/CostModel/X86/select-sizelatency.ll b/llvm/test/Analysis/CostModel/X86/select-sizelatency.ll
index e2e96746beccd..8b621aea2379d 100644
--- a/llvm/test/Analysis/CostModel/X86/select-sizelatency.ll
+++ b/llvm/test/Analysis/CostModel/X86/select-sizelatency.ll
@@ -1,36 +1,150 @@
 ; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=size-latency -mtriple=x86_64-- -mattr=+sse2 | FileCheck %s
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=size-latency -mtriple=x86_64-- -mattr=+sse4.1 | FileCheck %s
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=size-latency -mtriple=x86_64-- -mattr=+avx | FileCheck %s
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=size-latency -mtriple=x86_64-- -mattr=+avx2 | FileCheck %s
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=size-latency -mtriple=x86_64-- -mattr=+avx512f,+avx512vl  | FileCheck %s
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=size-latency -mtriple=x86_64-- -mattr=+avx512bw,+avx512vl | FileCheck %s
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=size-latency -mtriple=x86_64-- -mattr=+sse2 | FileCheck %s -check-prefixes=SSE2
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=size-latency -mtriple=x86_64-- -mattr=+sse4.1 | FileCheck %s -check-prefixes=SSE4
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=size-latency -mtriple=x86_64-- -mattr=+avx | FileCheck %s -check-prefixes=AVX,AVX1
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=size-latency -mtriple=x86_64-- -mattr=+avx2 | FileCheck %s -check-prefixes=AVX,AVX2
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=size-latency -mtriple=x86_64-- -mattr=+avx512f,+avx512vl  | FileCheck %s -check-prefixes=AVX512,AVX512F
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=size-latency -mtriple=x86_64-- -mattr=+avx512bw,+avx512vl | FileCheck %s -check-prefixes=AVX512,AVX512BW
 ;
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=size-latency -mtriple=x86_64-- -mcpu=slm | FileCheck %s
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=size-latency -mtriple=x86_64-- -mcpu=goldmont | FileCheck %s
-; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=size-latency -mtriple=x86_64-- -mcpu=btver2 | FileCheck %s
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=size-latency -mtriple=x86_64-- -mcpu=slm | FileCheck %s --check-prefixes=SLM
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=size-latency -mtriple=x86_64-- -mcpu=goldmont | FileCheck %s --check-prefixes=SSE4
+; RUN: opt < %s -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=size-latency -mtriple=x86_64-- -mcpu=btver2 | FileCheck %s --check-prefixes=AVX,AVX1
 
 ; Verify the cost of vector select instructions.
 
 define i32 @test_select() {
-; CHECK-LABEL: 'test_select'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+; SSE2-LABEL: 'test_select'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; SSE4-LABEL: 'test_select'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; AVX1-LABEL: 'test_select'
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; AVX2-LABEL: 'test_select'
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; AVX512F-LABEL: 'test_select'
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; AVX512BW-LABEL: 'test_select'
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
+; AVX512BW-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; SLM-LABEL: 'test_select'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I64 = select i1 undef, i64 undef, i64 undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V4I64 = select <4 x i1> undef, <4 x i64> undef, <4 x i64> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %V8I64 = select <8 x i1> undef, <8 x i64> undef, <8 x i64> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I32 = select i1 undef, i32 undef, i32 undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4I32 = select <4 x i1> undef, <4 x i32> undef, <4 x i32> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V8I32 = select <8 x i1> undef, <8 x i32> undef, <8 x i32> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V16I32 = select <16 x i1> undef, <16 x i32> undef, <16 x i32> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I16 = select i1 undef, i16 undef, i16 undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8I16 = select <8 x i1> undef, <8 x i16> undef, <8 x i16> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V16I16 = select <16 x i1> undef, <16 x i16> undef, <16 x i16> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %V32I16 = select <32 x i1> undef, <32 x i16> undef, <32 x i16> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %I8 = select i1 undef, i8 undef, i8 undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16I8 = select <16 x i1> undef, <16 x i8> undef, <16 x i8> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V32I8 = select <32 x i1> undef, <32 x i8> undef, <32 x i8> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %V64I8 = select <64 x i1> undef, <64 x i8> undef, <64 x i8> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %I64 = select i1 undef, i64 undef, i64 undef
   %V2I64 = select <2 x i1> undef, <2 x i64> undef, <2 x i64> undef
@@ -56,16 +170,60 @@ define i32 @test_select() {
 }
 
 define i32 @test_select_fp() {
-; CHECK-LABEL: 'test_select_fp'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %F64 = select i1 undef, double undef, double undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2F64 = select <2 x i1> undef, <2 x double> undef, <2 x double> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4F64 = select <4 x i1> undef, <4 x double> undef, <4 x double> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8F64 = select <8 x i1> undef, <8 x double> undef, <8 x double> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %F32 = select i1 undef, float undef, float undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4F32 = select <4 x i1> undef, <4 x float> undef, <4 x float> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8F32 = select <8 x i1> undef, <8 x float> undef, <8 x float> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16F32 = select <16 x i1> undef, <16 x float> undef, <16 x float> undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+; SSE2-LABEL: 'test_select_fp'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %F64 = select i1 undef, double undef, double undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2F64 = select <2 x i1> undef, <2 x double> undef, <2 x double> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V4F64 = select <4 x i1> undef, <4 x double> undef, <4 x double> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %V8F64 = select <8 x i1> undef, <8 x double> undef, <8 x double> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %F32 = select i1 undef, float undef, float undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4F32 = select <4 x i1> undef, <4 x float> undef, <4 x float> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V8F32 = select <8 x i1> undef, <8 x float> undef, <8 x float> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %V16F32 = select <16 x i1> undef, <16 x float> undef, <16 x float> undef
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; SSE4-LABEL: 'test_select_fp'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %F64 = select i1 undef, double undef, double undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = select <2 x i1> undef, <2 x double> undef, <2 x double> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V4F64 = select <4 x i1> undef, <4 x double> undef, <4 x double> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V8F64 = select <8 x i1> undef, <8 x double> undef, <8 x double> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %F32 = select i1 undef, float undef, float undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4F32 = select <4 x i1> undef, <4 x float> undef, <4 x float> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V8F32 = select <8 x i1> undef, <8 x float> undef, <8 x float> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V16F32 = select <16 x i1> undef, <16 x float> undef, <16 x float> undef
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; AVX-LABEL: 'test_select_fp'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %F64 = select i1 undef, double undef, double undef
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = select <2 x i1> undef, <2 x double> undef, <2 x double> undef
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4F64 = select <4 x i1> undef, <4 x double> undef, <4 x double> undef
+; AVX-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V8F64 = select <8 x i1> undef, <8 x double> undef, <8 x double> undef
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %F32 = select i1 undef, float undef, float undef
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4F32 = select <4 x i1> undef, <4 x float> undef, <4 x float> undef
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8F32 = select <8 x i1> undef, <8 x float> undef, <8 x float> undef
+; AVX-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V16F32 = select <16 x i1> undef, <16 x float> undef, <16 x float> undef
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; AVX512-LABEL: 'test_select_fp'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %F64 = select i1 undef, double undef, double undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2F64 = select <2 x i1> undef, <2 x double> undef, <2 x double> undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4F64 = select <4 x i1> undef, <4 x double> undef, <4 x double> undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8F64 = select <8 x i1> undef, <8 x double> undef, <8 x double> undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %F32 = select i1 undef, float undef, float undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V4F32 = select <4 x i1> undef, <4 x float> undef, <4 x float> undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8F32 = select <8 x i1> undef, <8 x float> undef, <8 x float> undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V16F32 = select <16 x i1> undef, <16 x float> undef, <16 x float> undef
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
+;
+; SLM-LABEL: 'test_select_fp'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %F64 = select i1 undef, double undef, double undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2F64 = select <2 x i1> undef, <2 x double> undef, <2 x double> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V4F64 = select <4 x i1> undef, <4 x double> undef, <4 x double> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %V8F64 = select <8 x i1> undef, <8 x double> undef, <8 x double> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %F32 = select i1 undef, float undef, float undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4F32 = select <4 x i1> undef, <4 x float> undef, <4 x float> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V8F32 = select <8 x i1> undef, <8 x float> undef, <8 x float> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %V16F32 = select <16 x i1> undef, <16 x float> undef, <16 x float> undef
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %F64 = select i1 undef, double undef, double undef
   %V2F64 = select <2 x i1> undef, <2 x double> undef, <2 x double> undef
@@ -84,45 +242,125 @@ define i32 @test_select_fp() {
 ; Integers of the same size should also use those instructions.
 
 define <2 x i64> @test_2i64(<2 x i64> %a, <2 x i64> %b) {
-; CHECK-LABEL: 'test_2i64'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x i64> %a, <2 x i64> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x i64> %sel
+; SSE2-LABEL: 'test_2i64'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x i64> %a, <2 x i64> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x i64> %sel
+;
+; SSE4-LABEL: 'test_2i64'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x i64> %a, <2 x i64> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x i64> %sel
+;
+; AVX-LABEL: 'test_2i64'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x i64> %a, <2 x i64> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x i64> %sel
+;
+; AVX512-LABEL: 'test_2i64'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x i64> %a, <2 x i64> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x i64> %sel
+;
+; SLM-LABEL: 'test_2i64'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x i64> %a, <2 x i64> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x i64> %sel
 ;
   %sel = select <2 x i1> <i1 true, i1 false>, <2 x i64> %a, <2 x i64> %b
   ret <2 x i64> %sel
 }
 
 define <2 x double> @test_2double(<2 x double> %a, <2 x double> %b) {
-; CHECK-LABEL: 'test_2double'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x double> %a, <2 x double> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x double> %sel
+; SSE2-LABEL: 'test_2double'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x double> %a, <2 x double> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x double> %sel
+;
+; SSE4-LABEL: 'test_2double'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x double> %a, <2 x double> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x double> %sel
+;
+; AVX-LABEL: 'test_2double'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x double> %a, <2 x double> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x double> %sel
+;
+; AVX512-LABEL: 'test_2double'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x double> %a, <2 x double> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x double> %sel
+;
+; SLM-LABEL: 'test_2double'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <2 x i1> <i1 true, i1 false>, <2 x double> %a, <2 x double> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <2 x double> %sel
 ;
   %sel = select <2 x i1> <i1 true, i1 false>, <2 x double> %a, <2 x double> %b
   ret <2 x double> %sel
 }
 
 define <4 x i32> @test_4i32(<4 x i32> %a, <4 x i32> %b) {
-; CHECK-LABEL: 'test_4i32'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %a, <4 x i32> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i32> %sel
+; SSE2-LABEL: 'test_4i32'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %a, <4 x i32> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i32> %sel
+;
+; SSE4-LABEL: 'test_4i32'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %a, <4 x i32> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i32> %sel
+;
+; AVX-LABEL: 'test_4i32'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %a, <4 x i32> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i32> %sel
+;
+; AVX512-LABEL: 'test_4i32'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %a, <4 x i32> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i32> %sel
+;
+; SLM-LABEL: 'test_4i32'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %a, <4 x i32> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i32> %sel
 ;
   %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %a, <4 x i32> %b
   ret <4 x i32> %sel
 }
 
 define <4 x float> @test_4float(<4 x float> %a, <4 x float> %b) {
-; CHECK-LABEL: 'test_4float'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x float> %a, <4 x float> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x float> %sel
+; SSE2-LABEL: 'test_4float'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x float> %a, <4 x float> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x float> %sel
+;
+; SSE4-LABEL: 'test_4float'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x float> %a, <4 x float> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x float> %sel
+;
+; AVX-LABEL: 'test_4float'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x float> %a, <4 x float> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x float> %sel
+;
+; AVX512-LABEL: 'test_4float'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x float> %a, <4 x float> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x float> %sel
+;
+; SLM-LABEL: 'test_4float'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x float> %a, <4 x float> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x float> %sel
 ;
   %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x float> %a, <4 x float> %b
   ret <4 x float> %sel
 }
 
 define <16 x i8> @test_16i8(<16 x i8> %a, <16 x i8> %b) {
-; CHECK-LABEL: 'test_16i8'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <16 x i8> %a, <16 x i8> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i8> %sel
+; SSE2-LABEL: 'test_16i8'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <16 x i8> %a, <16 x i8> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i8> %sel
+;
+; SSE4-LABEL: 'test_16i8'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <16 x i8> %a, <16 x i8> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i8> %sel
+;
+; AVX-LABEL: 'test_16i8'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <16 x i8> %a, <16 x i8> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i8> %sel
+;
+; AVX512-LABEL: 'test_16i8'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <16 x i8> %a, <16 x i8> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i8> %sel
+;
+; SLM-LABEL: 'test_16i8'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <16 x i8> %a, <16 x i8> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i8> %sel
 ;
   %sel = select <16 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <16 x i8> %a, <16 x i8> %b
   ret <16 x i8> %sel
@@ -132,54 +370,158 @@ define <16 x i8> @test_16i8(<16 x i8> %a, <16 x i8> %b) {
 ; Integers of the same size should also use those instructions.
 
 define <4 x i64> @test_4i64(<4 x i64> %a, <4 x i64> %b) {
-; CHECK-LABEL: 'test_4i64'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i64> %a, <4 x i64> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i64> %sel
+; SSE2-LABEL: 'test_4i64'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i64> %a, <4 x i64> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i64> %sel
+;
+; SSE4-LABEL: 'test_4i64'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i64> %a, <4 x i64> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i64> %sel
+;
+; AVX-LABEL: 'test_4i64'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i64> %a, <4 x i64> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i64> %sel
+;
+; AVX512-LABEL: 'test_4i64'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i64> %a, <4 x i64> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i64> %sel
+;
+; SLM-LABEL: 'test_4i64'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i64> %a, <4 x i64> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x i64> %sel
 ;
   %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i64> %a, <4 x i64> %b
   ret <4 x i64> %sel
 }
 
 define <4 x double> @test_4double(<4 x double> %a, <4 x double> %b) {
-; CHECK-LABEL: 'test_4double'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %a, <4 x double> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x double> %sel
+; SSE2-LABEL: 'test_4double'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %a, <4 x double> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x double> %sel
+;
+; SSE4-LABEL: 'test_4double'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %a, <4 x double> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x double> %sel
+;
+; AVX-LABEL: 'test_4double'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %a, <4 x double> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x double> %sel
+;
+; AVX512-LABEL: 'test_4double'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %a, <4 x double> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x double> %sel
+;
+; SLM-LABEL: 'test_4double'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %a, <4 x double> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <4 x double> %sel
 ;
   %sel = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %a, <4 x double> %b
   ret <4 x double> %sel
 }
 
 define <8 x i32> @test_8i32(<8 x i32> %a, <8 x i32> %b) {
-; CHECK-LABEL: 'test_8i32'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 false>, <8 x i32> %a, <8 x i32> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x i32> %sel
+; SSE2-LABEL: 'test_8i32'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 false>, <8 x i32> %a, <8 x i32> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x i32> %sel
+;
+; SSE4-LABEL: 'test_8i32'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 false>, <8 x i32> %a, <8 x i32> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x i32> %sel
+;
+; AVX-LABEL: 'test_8i32'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 false>, <8 x i32> %a, <8 x i32> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x i32> %sel
+;
+; AVX512-LABEL: 'test_8i32'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 false>, <8 x i32> %a, <8 x i32> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x i32> %sel
+;
+; SLM-LABEL: 'test_8i32'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 false>, <8 x i32> %a, <8 x i32> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x i32> %sel
 ;
   %sel = select <8 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 false>, <8 x i32> %a, <8 x i32> %b
   ret <8 x i32> %sel
 }
 
 define <8 x float> @test_8float(<8 x float> %a, <8 x float> %b) {
-; CHECK-LABEL: 'test_8float'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <8 x float> %a, <8 x float> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x float> %sel
+; SSE2-LABEL: 'test_8float'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <8 x float> %a, <8 x float> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x float> %sel
+;
+; SSE4-LABEL: 'test_8float'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <8 x float> %a, <8 x float> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x float> %sel
+;
+; AVX-LABEL: 'test_8float'
+; AVX-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <8 x float> %a, <8 x float> %b
+; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x float> %sel
+;
+; AVX512-LABEL: 'test_8float'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <8 x float> %a, <8 x float> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x float> %sel
+;
+; SLM-LABEL: 'test_8float'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %sel = select <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <8 x float> %a, <8 x float> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <8 x float> %sel
 ;
   %sel = select <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <8 x float> %a, <8 x float> %b
   ret <8 x float> %sel
 }
 
 define <16 x i16> @test_16i16(<16 x i16> %a, <16 x i16> %b) {
-; CHECK-LABEL: 'test_16i16'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i16> %sel
+; SSE2-LABEL: 'test_16i16'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i16> %sel
+;
+; SSE4-LABEL: 'test_16i16'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i16> %sel
+;
+; AVX1-LABEL: 'test_16i16'
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i16> %sel
+;
+; AVX2-LABEL: 'test_16i16'
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i16> %sel
+;
+; AVX512-LABEL: 'test_16i16'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i16> %sel
+;
+; SLM-LABEL: 'test_16i16'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <16 x i16> %sel
 ;
   %sel = select <16 x i1> <i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false>, <16 x i16> %a, <16 x i16> %b
   ret <16 x i16> %sel
 }
 
 define <32 x i8> @test_32i8(<32 x i8> %a, <32 x i8> %b) {
-; CHECK-LABEL: 'test_32i8'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <32 x i8> %sel
+; SSE2-LABEL: 'test_32i8'
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
+; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <32 x i8> %sel
+;
+; SSE4-LABEL: 'test_32i8'
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
+; SSE4-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <32 x i8> %sel
+;
+; AVX1-LABEL: 'test_32i8'
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
+; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <32 x i8> %sel
+;
+; AVX2-LABEL: 'test_32i8'
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
+; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <32 x i8> %sel
+;
+; AVX512-LABEL: 'test_32i8'
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
+; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <32 x i8> %sel
+;
+; SLM-LABEL: 'test_32i8'
+; SLM-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
+; SLM-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret <32 x i8> %sel
 ;
   %sel = select <32 x i1> <i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true>, <32 x i8> %a, <32 x i8> %b
   ret <32 x i8> %sel

diff  --git a/llvm/test/Transforms/PhaseOrdering/X86/vector-reductions-logical.ll b/llvm/test/Transforms/PhaseOrdering/X86/vector-reductions-logical.ll
index 2a01774b3ad5d..aeff193be49eb 100644
--- a/llvm/test/Transforms/PhaseOrdering/X86/vector-reductions-logical.ll
+++ b/llvm/test/Transforms/PhaseOrdering/X86/vector-reductions-logical.ll
@@ -11,18 +11,14 @@ define float @test_merge_allof_v4sf(<4 x float> %t) {
 ; CHECK-NEXT:    [[TMP0:%.*]] = fcmp uge <4 x float> [[T_FR]], zeroinitializer
 ; CHECK-NEXT:    [[TMP1:%.*]] = bitcast <4 x i1> [[TMP0]] to i4
 ; CHECK-NEXT:    [[TMP2:%.*]] = icmp eq i4 [[TMP1]], 0
-; CHECK-NEXT:    br i1 [[TMP2]], label [[RETURN:%.*]], label [[LOR_LHS_FALSE:%.*]]
-; CHECK:       lor.lhs.false:
 ; CHECK-NEXT:    [[TMP3:%.*]] = fcmp ule <4 x float> [[T_FR]], <float 1.000000e+00, float 1.000000e+00, float 1.000000e+00, float 1.000000e+00>
 ; CHECK-NEXT:    [[TMP4:%.*]] = bitcast <4 x i1> [[TMP3]] to i4
 ; CHECK-NEXT:    [[TMP5:%.*]] = icmp eq i4 [[TMP4]], 0
+; CHECK-NEXT:    [[OR_COND:%.*]] = or i1 [[TMP2]], [[TMP5]]
 ; CHECK-NEXT:    [[SHIFT:%.*]] = shufflevector <4 x float> [[T_FR]], <4 x float> poison, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
 ; CHECK-NEXT:    [[TMP6:%.*]] = fadd <4 x float> [[T_FR]], [[SHIFT]]
 ; CHECK-NEXT:    [[ADD:%.*]] = extractelement <4 x float> [[TMP6]], i64 0
-; CHECK-NEXT:    [[SPEC_SELECT:%.*]] = select i1 [[TMP5]], float 0.000000e+00, float [[ADD]]
-; CHECK-NEXT:    br label [[RETURN]]
-; CHECK:       return:
-; CHECK-NEXT:    [[RETVAL_0:%.*]] = phi float [ 0.000000e+00, [[ENTRY:%.*]] ], [ [[SPEC_SELECT]], [[LOR_LHS_FALSE]] ]
+; CHECK-NEXT:    [[RETVAL_0:%.*]] = select i1 [[OR_COND]], float 0.000000e+00, float [[ADD]]
 ; CHECK-NEXT:    ret float [[RETVAL_0]]
 ;
 entry:
@@ -171,18 +167,14 @@ define float @test_separate_allof_v4sf(<4 x float> %t) {
 ; CHECK-NEXT:    [[TMP0:%.*]] = fcmp uge <4 x float> [[T_FR]], zeroinitializer
 ; CHECK-NEXT:    [[TMP1:%.*]] = bitcast <4 x i1> [[TMP0]] to i4
 ; CHECK-NEXT:    [[TMP2:%.*]] = icmp eq i4 [[TMP1]], 0
-; CHECK-NEXT:    br i1 [[TMP2]], label [[RETURN:%.*]], label [[IF_END:%.*]]
-; CHECK:       if.end:
 ; CHECK-NEXT:    [[TMP3:%.*]] = fcmp ule <4 x float> [[T_FR]], <float 1.000000e+00, float 1.000000e+00, float 1.000000e+00, float 1.000000e+00>
 ; CHECK-NEXT:    [[TMP4:%.*]] = bitcast <4 x i1> [[TMP3]] to i4
 ; CHECK-NEXT:    [[TMP5:%.*]] = icmp eq i4 [[TMP4]], 0
+; CHECK-NEXT:    [[OR_COND:%.*]] = or i1 [[TMP2]], [[TMP5]]
 ; CHECK-NEXT:    [[SHIFT:%.*]] = shufflevector <4 x float> [[T_FR]], <4 x float> poison, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
 ; CHECK-NEXT:    [[TMP6:%.*]] = fadd <4 x float> [[T_FR]], [[SHIFT]]
 ; CHECK-NEXT:    [[ADD:%.*]] = extractelement <4 x float> [[TMP6]], i64 0
-; CHECK-NEXT:    [[SPEC_SELECT:%.*]] = select i1 [[TMP5]], float 0.000000e+00, float [[ADD]]
-; CHECK-NEXT:    br label [[RETURN]]
-; CHECK:       return:
-; CHECK-NEXT:    [[RETVAL_0:%.*]] = phi float [ 0.000000e+00, [[ENTRY:%.*]] ], [ [[SPEC_SELECT]], [[IF_END]] ]
+; CHECK-NEXT:    [[RETVAL_0:%.*]] = select i1 [[OR_COND]], float 0.000000e+00, float [[ADD]]
 ; CHECK-NEXT:    ret float [[RETVAL_0]]
 ;
 entry:
@@ -257,18 +249,14 @@ define float @test_separate_anyof_v4sf(<4 x float> %t) {
 ; CHECK-NEXT:    [[TMP0:%.*]] = fcmp olt <4 x float> [[T_FR]], zeroinitializer
 ; CHECK-NEXT:    [[TMP1:%.*]] = bitcast <4 x i1> [[TMP0]] to i4
 ; CHECK-NEXT:    [[DOTNOT:%.*]] = icmp eq i4 [[TMP1]], 0
-; CHECK-NEXT:    br i1 [[DOTNOT]], label [[IF_END:%.*]], label [[RETURN:%.*]]
-; CHECK:       if.end:
 ; CHECK-NEXT:    [[TMP2:%.*]] = fcmp ogt <4 x float> [[T_FR]], <float 1.000000e+00, float 1.000000e+00, float 1.000000e+00, float 1.000000e+00>
 ; CHECK-NEXT:    [[TMP3:%.*]] = bitcast <4 x i1> [[TMP2]] to i4
 ; CHECK-NEXT:    [[DOTNOT6:%.*]] = icmp eq i4 [[TMP3]], 0
+; CHECK-NEXT:    [[OR_COND:%.*]] = and i1 [[DOTNOT]], [[DOTNOT6]]
 ; CHECK-NEXT:    [[SHIFT:%.*]] = shufflevector <4 x float> [[T_FR]], <4 x float> poison, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
 ; CHECK-NEXT:    [[TMP4:%.*]] = fadd <4 x float> [[T_FR]], [[SHIFT]]
 ; CHECK-NEXT:    [[ADD:%.*]] = extractelement <4 x float> [[TMP4]], i64 0
-; CHECK-NEXT:    [[SPEC_SELECT:%.*]] = select i1 [[DOTNOT6]], float [[ADD]], float 0.000000e+00
-; CHECK-NEXT:    br label [[RETURN]]
-; CHECK:       return:
-; CHECK-NEXT:    [[RETVAL_0:%.*]] = phi float [ 0.000000e+00, [[ENTRY:%.*]] ], [ [[SPEC_SELECT]], [[IF_END]] ]
+; CHECK-NEXT:    [[RETVAL_0:%.*]] = select i1 [[OR_COND]], float [[ADD]], float 0.000000e+00
 ; CHECK-NEXT:    ret float [[RETVAL_0]]
 ;
 entry:


        


More information about the llvm-commits mailing list