[PATCH] D40008: [X86][TTI] update costs of interleaved load\store of i64\double

Dorit Nuzman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 15 10:28:21 PST 2017


dorit added a comment.

I think it would be nice to make the testcases smaller. Right now you have something like this:
for (…) {
 Dst[2*i] = Dst[2*i] + Src[2*i] * k;
 Dst[2*i+1] = Dst[2*i+1] + Src[2*i+1] * k;
}
...which actually tests both strided loads and strided stores.
So you could either use one test to check both the load and the store costs (and even then you probably don't need both a mul and an add just to check the memop costs),
or, if you want to separate the load and store cases, the load test could be something like:
for (…) {
 s += Src[2*i];
 s += Src[2*i+1];
}
The store test could be something like:
for (…) {
 Dst[2*i] = k1;
 Dst[2*i+1] = k2;
}
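
Written out in full, the two reduced kernels might look roughly like the C below. This is only a sketch: the function names (load_test, store_test), the int64_t element type, and the explicit trip-count parameter n are my assumptions for illustration and are not taken from the patch. The same shape would apply with double in place of int64_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Load-only kernel (hypothetical name): two stride-2 loads per
 * iteration, reduced into a scalar, so no strided store is involved. */
int64_t load_test(const int64_t *Src, size_t n) {
    int64_t s = 0;
    for (size_t i = 0; i < n; ++i) {
        s += Src[2 * i];
        s += Src[2 * i + 1];
    }
    return s;
}

/* Store-only kernel (hypothetical name): two stride-2 stores of
 * loop-invariant values per iteration, so no strided load is involved. */
void store_test(int64_t *Dst, size_t n, int64_t k1, int64_t k2) {
    for (size_t i = 0; i < n; ++i) {
        Dst[2 * i] = k1;
        Dst[2 * i + 1] = k2;
    }
}
```

Each function touches 2*n consecutive elements, so the only interleaved access group in the loop is the one whose cost the test is meant to check.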



================
Comment at: test/Analysis/CostModel/X86/interleaved-store-i64.ll:2
+; REQUIRES: asserts
+; RUN: opt -S -loop-vectorize -debug-only=loop-vectorize -mcpu=skylake %s 2>&1 | FileCheck %s
+target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
----------------
I see that some of the interleave tests in this directory use -mcpu=core_avx2 and some use -mcpu=skylake. I wonder which one we want to use here?


https://reviews.llvm.org/D40008




