[PATCH] D20057: [X86][SSE] Improve cost model for i64 vector comparisons on pre-SSE42 targets

Simon Pilgrim via llvm-commits llvm-commits at lists.llvm.org
Sun May 8 09:01:26 PDT 2016


RKSimon created this revision.
RKSimon added reviewers: silvas, ab, andreadb, spatel.
RKSimon added a subscriber: llvm-commits.
RKSimon set the repository for this revision to rL LLVM.

As discussed on PR24888, PCMPGTQ for v2i64 comparisons is not available until SSE42, but the cost models don't reflect this, resulting in over-optimistic vectorization.

This patch adds SSE2 'base level' costs that match what a typical target is capable of, and reduces the v2i64 cost only at SSE42.
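
To illustrate the tiered lookup, here is a minimal standalone sketch. It is not the LLVM code: VecTy, SSELevel, CostEntry, lookup and compareCost are hypothetical stand-ins for MVT, the subtarget feature checks, CostTblEntry, CostTableLookup and getCmpSelInstrCost, and NumParts plays the role of LT.first. Only the table values are taken from the patch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for LLVM's MVT types and subtarget feature levels.
enum class VecTy { v2i64, v4i32, v8i16, v16i8, v2f64, v4f32 };
enum class SSELevel { SSE2, SSE3, SSSE3, SSE41, SSE42 };

struct CostEntry {
  VecTy Ty;
  int Cost;
};

// Baseline compare costs from SSE2 onwards (values taken from the patch).
static const CostEntry SSE2Costs[] = {
    {VecTy::v2i64, 8}, {VecTy::v4i32, 1}, {VecTy::v8i16, 1}, {VecTy::v16i8, 1}};

// Entries that only apply once SSE42 is available (values from the patch).
static const CostEntry SSE42Costs[] = {
    {VecTy::v2f64, 1}, {VecTy::v4f32, 1}, {VecTy::v2i64, 1}};

// Linear search of a cost table; returns nullptr when the type is not listed.
template <std::size_t N>
const CostEntry *lookup(const CostEntry (&Tbl)[N], VecTy Ty) {
  for (const CostEntry &E : Tbl)
    if (E.Ty == Ty)
      return &E;
  return nullptr;
}

// Checks the most capable table first, then falls back to the SSE2 baseline.
// Returns the table cost scaled by NumParts, or -1 when no table matches
// (the real code defers to the base implementation in that case).
int compareCost(SSELevel Level, VecTy Ty, int NumParts) {
  assert(NumParts > 0 && "Invalid split count");
  if (Level >= SSELevel::SSE42)
    if (const CostEntry *E = lookup(SSE42Costs, Ty))
      return NumParts * E->Cost;
  if (const CostEntry *E = lookup(SSE2Costs, Ty))
    return NumParts * E->Cost;
  return -1;
}
```

In this sketch compareCost(SSELevel::SSE41, VecTy::v2i64, 1) yields the new baseline of 8, while the same query at SSELevel::SSE42 yields 1. The narrower integer types fall through to the SSE2 table at every level, which is why the patch can drop them from the SSE42 table.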

Technically SSE41 provides a PCMPEQQ v2i64 equality test, but getCmpSelInstrCost doesn't give us a way to discriminate between comparison types, so we can't easily make use of it. If it did, we could split the cost of integer equality and greater-than tests to give a better costing for each.
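
As a rough sketch of what such a split could look like if the predicate were visible to the cost function: CmpKind, v2i64CompareCost and the two feature flags below are hypothetical and do not exist in LLVM, and the cost of 1 for the SSE41 equality case is an assumption. The 8 and the SSE42 cost of 1 are the values used in this patch.

```cpp
#include <cassert>

// Hypothetical predicate kinds; getCmpSelInstrCost takes no such parameter.
enum class CmpKind { Equality, GreaterThan };

// Illustrative v2i64 compare cost that distinguishes the predicate.
// HasSSE41 and HasSSE42 are assumed feature flags supplied by the caller.
int v2i64CompareCost(bool HasSSE41, bool HasSSE42, CmpKind Kind) {
  assert((!HasSSE42 || HasSSE41) && "SSE42 implies SSE41");
  if (HasSSE42)
    return 1; // PCMPEQQ and PCMPGTQ are both available.
  if (HasSSE41 && Kind == CmpKind::Equality)
    return 1; // Assumed cost: PCMPEQQ handles the equality test directly.
  return 8;   // SSE2 baseline cost from this patch.
}
```

With this shape only greater-than tests on SSE41 would keep the pessimistic baseline; equality tests would already be costed as a single instruction.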

Repository:
  rL LLVM

http://reviews.llvm.org/D20057

Files:
  lib/Target/X86/X86TargetTransformInfo.cpp
  test/Analysis/CostModel/X86/cmp.ll

Index: test/Analysis/CostModel/X86/cmp.ll
===================================================================
--- test/Analysis/CostModel/X86/cmp.ll
+++ test/Analysis/CostModel/X86/cmp.ll
@@ -87,18 +87,18 @@
   ;AVX:   cost of 1 {{.*}} icmp
   %H = icmp eq <4 x i32> undef, undef
 
-  ;SSE2:  cost of 1 {{.*}} icmp
-  ;SSE3:  cost of 1 {{.*}} icmp
-  ;SSSE3: cost of 1 {{.*}} icmp
-  ;SSE41: cost of 1 {{.*}} icmp
+  ;SSE2:  cost of 8 {{.*}} icmp
+  ;SSE3:  cost of 8 {{.*}} icmp
+  ;SSSE3: cost of 8 {{.*}} icmp
+  ;SSE41: cost of 8 {{.*}} icmp
   ;SSE42: cost of 1 {{.*}} icmp
   ;AVX:   cost of 1 {{.*}} icmp
   %I = icmp eq <2 x i64> undef, undef
 
-  ;SSE2:  cost of 2 {{.*}} icmp
-  ;SSE3:  cost of 2 {{.*}} icmp
-  ;SSSE3: cost of 2 {{.*}} icmp
-  ;SSE41: cost of 2 {{.*}} icmp
+  ;SSE2:  cost of 16 {{.*}} icmp
+  ;SSE3:  cost of 16 {{.*}} icmp
+  ;SSSE3: cost of 16 {{.*}} icmp
+  ;SSE41: cost of 16 {{.*}} icmp
   ;SSE42: cost of 2 {{.*}} icmp
   ;AVX1:  cost of 4 {{.*}} icmp
   ;AVX2:  cost of 1 {{.*}} icmp
Index: lib/Target/X86/X86TargetTransformInfo.cpp
===================================================================
--- lib/Target/X86/X86TargetTransformInfo.cpp
+++ lib/Target/X86/X86TargetTransformInfo.cpp
@@ -857,13 +857,17 @@
   int ISD = TLI->InstructionOpcodeToISD(Opcode);
   assert(ISD && "Invalid opcode");
 
+  static const CostTblEntry SSE2CostTbl[] = {
+    { ISD::SETCC,   MVT::v2i64,   8 },
+    { ISD::SETCC,   MVT::v4i32,   1 },
+    { ISD::SETCC,   MVT::v8i16,   1 },
+    { ISD::SETCC,   MVT::v16i8,   1 },
+  };
+
   static const CostTblEntry SSE42CostTbl[] = {
     { ISD::SETCC,   MVT::v2f64,   1 },
     { ISD::SETCC,   MVT::v4f32,   1 },
     { ISD::SETCC,   MVT::v2i64,   1 },
-    { ISD::SETCC,   MVT::v4i32,   1 },
-    { ISD::SETCC,   MVT::v8i16,   1 },
-    { ISD::SETCC,   MVT::v16i8,   1 },
   };
 
   static const CostTblEntry AVX1CostTbl[] = {
@@ -906,6 +910,10 @@
     if (const auto *Entry = CostTableLookup(SSE42CostTbl, ISD, MTy))
       return LT.first * Entry->Cost;
 
+  if (ST->hasSSE2())
+    if (const auto *Entry = CostTableLookup(SSE2CostTbl, ISD, MTy))
+      return LT.first * Entry->Cost;
+
   return BaseT::getCmpSelInstrCost(Opcode, ValTy, CondTy);
 }
 

