[PATCH] D110971: [X86][Costmodel] Load/store i8 Stride=4 VF=32 interleaving costs

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Oct 2 15:06:49 PDT 2021


lebedev.ri added a comment.

In D110971#3038358 <https://reviews.llvm.org/D110971#3038358>, @RKSimon wrote:

> I've no objections,

Oh, good! Then I basically intend to completely fill out the {i8,i16,i32,i64} x {2,3,4,6} permutation matrix; we are quite close, actually.
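
For a sense of scale, the matrix mentioned above can be enumerated with a few lines of Python. This is only an illustrative sketch: the names `ELEMENT_TYPES`, `STRIDES` and `cost_matrix_cells` are made up for this example and are not part of LLVM, and it ignores the vectorization factor, which each patch also has to pick (this one is VF=32, per its title).

```python
from itertools import product

# Element types and interleaving strides named in the comment above.
ELEMENT_TYPES = ("i8", "i16", "i32", "i64")
STRIDES = (2, 3, 4, 6)


def cost_matrix_cells():
    """Return every (element type, stride) pair of the matrix, row by row."""
    return list(product(ELEMENT_TYPES, STRIDES))


if __name__ == "__main__":
    cells = cost_matrix_cells()
    print(len(cells), "cells in total")
    for elt_ty, stride in cells:
        print(f"{elt_ty} Stride={stride}")
```

The patch under review corresponds to the ("i8", 4) cell.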

> although I think we'd gain more by ensuring we have cost numbers of the right magnitude for SSE2 first and adjust for later targets.

I agree that the baseline costs are bogus, but, as discussed previously, improving them requires better generic shuffle cost modelling.

> As you said, it's all rather tedious though - I don't know if we've passed the point where automating more of this with scripting (either the cost extraction or the CHECKs) would be useful?

It would be good to have the update-check-lines script for this, yes.
Automatic cost extraction is somewhat more tedious than I had originally anticipated,
because most of the time I have to manually adjust the assembly to weed out the loads/stores.

After I'm done with these costs, there are two general things I'll want to look into:

- We have a gigantic elephant in the room: if some lanes aren't demanded, we should just scale the cost by the number of demanded lanes, not fall back to the baseline cost. This will have a monumental impact.
- I suspect our legality-driven cost model for constant-masked loads/stores/scatters/gathers is misguided.
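
The first point can be sketched roughly as follows. This is a standalone illustration of the proposed scaling, not the LLVM implementation: the function name `scaled_interleave_cost`, its parameters, the round-up policy and the numbers in the usage lines are all assumptions made for this example.

```python
def scaled_interleave_cost(full_cost, factor, demanded_lanes):
    """Scale the cost of a fully-demanded interleaved access.

    full_cost      -- tuned cost when all `factor` lanes are demanded (hypothetical)
    factor         -- the interleaving stride, e.g. 4
    demanded_lanes -- set of lane indices in [0, factor) that are actually used
    """
    lanes = set(demanded_lanes)
    if not lanes:
        raise ValueError("at least one lane must be demanded")
    if any(lane < 0 or lane >= factor for lane in lanes):
        raise ValueError("lane index out of range for this factor")
    # Proportional scaling, rounded up (an assumed policy for this sketch)
    # so that a partially-demanded access never becomes free.
    return (full_cost * len(lanes) + factor - 1) // factor


if __name__ == "__main__":
    # Hypothetical numbers, not taken from the patch.
    print(scaled_interleave_cost(12, 4, {0, 1, 2, 3}))
    print(scaled_interleave_cost(12, 4, {0, 2}))
```

The alternative criticized above would ignore `full_cost` whenever `demanded_lanes` is not the full set and return the generic baseline cost instead.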


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110971/new/

https://reviews.llvm.org/D110971


