[PATCH] Improve the cost evaluation of LSR

Mon May 4 13:10:07 PDT 2015

Something to consider when evaluating changes in this area: I see no perf difference between the existing and proposed codegen for the loops seen in https://llvm.org/bugs/show_bug.cgi?id=23384#c0 when running on an Intel Haswell system, but there is a >10% win on an AMD Jaguar system when using the complex addressing modes. The Jaguar core is narrower (2-wide issue) and has much less micro-architectural shock absorption than Haswell. Wins and losses may be more apparent on smaller cores such as Jaguar or Atom.

FWIW, I applied the patch, ran the benchmarking subset of test-suite on the Jaguar system, and saw a 0.7% geomean improvement after filtering out lots of noisy tests. Benchmarks/McGill/chomp is the largest improvement: +31%; Benchmarks/Stanford/FloatMM is the worst regression: -10%. I haven't analyzed the results any more than that yet.
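
For anyone redoing the summary, here is a minimal sketch of how a geomean over per-benchmark ratios could be computed. The helper names and the ratio convention (above 1.0 means faster with the patch) are assumptions for illustration, not something taken from test-suite. Only the two extremes quoted above are used as inputs, so the printed value is not the 0.7% figure, which covers the whole filtered set.

```python
import math


def geomean(ratios):
    """Geometric mean of strictly positive ratios (e.g. per-benchmark speedups)."""
    ratios = list(ratios)
    if not ratios:
        raise ValueError("need at least one ratio")
    if any(r <= 0 for r in ratios):
        raise ValueError("ratios must be positive")
    # Average in log space, then map back; this avoids overflow on long lists.
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


def percent_change(ratio):
    """Express a ratio as a signed percentage relative to 1.0."""
    return (ratio - 1.0) * 100.0


# Assumed convention: ratio > 1.0 is an improvement with the patch applied.
# These are only the two extremes quoted in the message, not the full run.
extremes = {
    "Benchmarks/McGill/chomp": 1.31,      # quoted as +31%
    "Benchmarks/Stanford/FloatMM": 0.90,  # quoted as -10%
}

g = geomean(extremes.values())
print(f"geomean of the two quoted extremes: {percent_change(g):+.1f}%")
```

A geomean is the usual choice here because ratios compose multiplicatively: a 2x win and a 2x loss cancel to exactly 1.0, which an arithmetic mean would not report.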

REPOSITORY
  rL LLVM

http://reviews.llvm.org/D9429
