[PATCH] D100099: [X86][CostModel] Try to fix cost computation load/stores of non-power-of-two vectors

Thu Apr 15 09:22:38 PDT 2021

lebedev.ri added a comment.

This is probably still imprecise for small remainder sub-vectors.
E.g. the load cost for `<3 x float>` with 8-byte alignment should be 1: https://godbolt.org/z/r3ncvMvaf

================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:3236
+        std::pair<int, MVT> LST = TLI->getTypeLegalizationCost(DL, SubTy);
+        if (!LST.second.isVector()) {
+          APInt DemandedElts =
----------------
Hm, I wonder if we also need to add the `getShuffleCost(SK_ExtractSubvector, ...)` cost
(with the wide vector type widened to the next power of two).

================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:3222-3230
+      SmallVector<unsigned, CHAR_BIT * sizeof(NumElem)> Factors;
+      for (unsigned Bit = 0; Bit != CHAR_BIT * sizeof(NumElem); ++Bit) {
+        unsigned Factor = unsigned(1) << Bit;
+        if (NumElem & Factor)
+          Factors.emplace_back(Factor);
+      }
+      assert(std::accumulate(Factors.begin(), Factors.end(), unsigned(0)) ==
----------------
lebedev.ri wrote:
> RKSimon wrote:
> > ABataev wrote:
> > > Why not just something like this:
> > > ```
> > > unsigned Factor = 0;
> > > for (; NumElem > 0; NumElem -= Factor) {
> > >   Factor = PowerOf2Floor(NumElem);
> > >   .....
> > > }
> > > ```
> > +1 Having Factor updated in the condition as well as being used in increment block is difficult to grok
> Note that I have already addressed @ABataev's comment; it was about an earlier patch version:
> https://reviews.llvm.org/D100099?id=336063#change-OQEVJvBxQbDZ
... or are you suggesting moving the `Factor` computation out of the condition?
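
For reference, here is a minimal standalone sketch of the two decompositions discussed in this thread: the bit-scan form from the current diff, and the largest-first loop along the lines of the earlier suggestion. The helper names (`decomposeBitScan`, `decomposeLargestFirst`, `powerOf2Floor`) are hypothetical and only for illustration; `std::vector` stands in for `SmallVector`, and the local `powerOf2Floor` is a stand-in for LLVM's `PowerOf2Floor`, not its actual implementation.

```cpp
#include <cassert>
#include <climits>
#include <numeric>
#include <vector>

// Hypothetical helper: collect the set bits of NumElem as power-of-two
// factors, smallest first (e.g. 7 -> 1, 2, 4). Mirrors the loop in the diff.
std::vector<unsigned> decomposeBitScan(unsigned NumElem) {
  std::vector<unsigned> Factors;
  for (unsigned Bit = 0; Bit != CHAR_BIT * sizeof(NumElem); ++Bit) {
    unsigned Factor = 1u << Bit;
    if (NumElem & Factor)
      Factors.push_back(Factor);
  }
  // The factors must sum back to the original element count.
  assert(std::accumulate(Factors.begin(), Factors.end(), 0u) == NumElem);
  return Factors;
}

// Stand-in for PowerOf2Floor: largest power of two <= V. Requires V > 0.
unsigned powerOf2Floor(unsigned V) {
  unsigned P = 1;
  while (P <= V / 2)
    P *= 2;
  return P;
}

// Hypothetical helper: peel off the largest power of two until nothing is
// left, largest first (e.g. 7 -> 4, 2, 1). Factor is computed in the body,
// not in the loop condition.
std::vector<unsigned> decomposeLargestFirst(unsigned NumElem) {
  std::vector<unsigned> Factors;
  while (NumElem > 0) {
    unsigned Factor = powerOf2Floor(NumElem);
    Factors.push_back(Factor);
    NumElem -= Factor;
  }
  return Factors;
}
```

Both produce the same set of factors and differ only in order, so either should work for summing per-sub-vector costs; the second keeps the `Factor` update out of the loop header.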

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100099/new/

https://reviews.llvm.org/D100099