[PATCH] D42981: [COST] Fix cost model of load instructions on X86

Thu Feb 25 07:49:27 PST 2021

ABataev added a comment.

In D42981#2587680 <https://reviews.llvm.org/D42981#2587680>, @RKSimon wrote:

> I have a few concerns
>
> - we're increasing register pressure (x86 fp scalars share regs with the vector types)
> - I'm not certain we are accounting for the impact of increased AGU usage - even though the load has been folded "for free", are we correctly handling the gep costs?

Yes, probably. That's why I ran thorough internal performance testing of this patch, and one of my colleagues did the same; we got almost the same results. We see about a +9% gain on one of the tests and +2% and +3% gains on two other tests (for AVX2), with no significant changes for AVX512, so mostly the older targets are affected. No significant perf losses were found during testing.

================
Comment at: llvm/test/Transforms/LoopVectorize/X86/pr34438.ll:35
+; CHECK-NEXT:    store <4 x float> [[TMP7]], <4 x float>* [[TMP8]], align 4, !llvm.access.group !0
+; CHECK-NEXT:    [[INDEX_NEXT]] = add i64 [[INDEX]], 4
 ; CHECK-NEXT:    [[TMP9:%.*]] = icmp eq i64 [[INDEX_NEXT]], 8
----------------
RKSimon wrote:
> This looks like a VECTOR regression on AVX targets?
Looks like it prefers 256-bit vectors rather than 2 x 256 + SPLIT vector instructions (8 x i64). I think this is fine for older targets.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D42981/new/

https://reviews.llvm.org/D42981