[PATCH] D89767: [CostModel][X86] teach TTI calculate cost of chain of vector inserts/extracts more precisely and correctly:In each 128-lane, if there is at least one index is demanded and not all indices are demanded...
    Pengfei Wang via Phabricator via llvm-commits 
    llvm-commits at lists.llvm.org
       
    Tue Oct 20 07:42:19 PDT 2020
    
    
  
pengfei added inline comments.
================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:3088
+        // inserti128.
+        // Case#3: inserting into 4,5,6,7 index needs 4*vpinsrd + inserti128.
+        unsigned Num128Lanes = LT.second.getSizeInBits() / 128 * LT.first;
----------------
Why does inserting the 5th element in Case#2 need extracti128 while Case#3 doesn't?
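
To make the question concrete, here is a minimal standalone sketch of the per-lane classification the patch title describes (a 128-bit lane with no, some, or all indices demanded). It is not code from the patch: `LaneUse`, `classifyLanes`, and the plain `uint64_t` bitmask are hypothetical names chosen for illustration, and no costs are assigned.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of the patch under review.
enum class LaneUse { None, Partial, Full };

// Classify each 128-bit lane by how many of its elements are demanded.
// Bit i of DemandedElts set means element i is inserted/extracted.
// Assumes NumElts <= 64 and that NumElts is a multiple of Num128Lanes.
std::vector<LaneUse> classifyLanes(uint64_t DemandedElts, unsigned NumElts,
                                   unsigned Num128Lanes) {
  std::vector<LaneUse> Result;
  unsigned EltsPerLane = NumElts / Num128Lanes;
  for (unsigned Lane = 0; Lane < Num128Lanes; ++Lane) {
    // Mask covering exactly the elements of this lane.
    uint64_t LaneMask = ((uint64_t(1) << EltsPerLane) - 1)
                        << (Lane * EltsPerLane);
    uint64_t Bits = DemandedElts & LaneMask;
    if (Bits == 0)
      Result.push_back(LaneUse::None);
    else if (Bits == LaneMask)
      Result.push_back(LaneUse::Full);
    else
      Result.push_back(LaneUse::Partial);
  }
  return Result;
}
```

For an 8 x i32 vector, demanding only one upper-lane element (index 5 here, chosen for illustration) gives `{None, Partial}`, whereas demanding indices 4,5,6,7 gives `{None, Full}`. Presumably that difference is what separates the two cases, but the comment should say so explicitly.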
================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:3089
+        // Case#3: inserting into 4,5,6,7 index needs 4*vpinsrd + inserti128.
+        unsigned Num128Lanes = LT.second.getSizeInBits() / 128 * LT.first;
+        unsigned NumElts = LT.second.getVectorNumElements() * LT.first;
----------------
Do we need to skip this if LT.second.getSizeInBits() == 128?
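
For reference, a small standalone sketch of what the two expressions evaluate to. Plain integers stand in for the LLVM values, which is an assumption made for illustration: `SizeInBits` for `LT.second.getSizeInBits()`, `EltsPerReg` for `LT.second.getVectorNumElements()`, and `NumRegs` for `LT.first`. `LaneInfo` and `computeLaneInfo` are hypothetical names, not part of the patch.

```cpp
// Hypothetical stand-in for the two values computed in the patch.
struct LaneInfo {
  unsigned Num128Lanes;
  unsigned NumElts;
};

// Mirrors the arithmetic of the two quoted lines using plain integers.
// Note the integer division: SizeInBits / 128 is evaluated first.
inline LaneInfo computeLaneInfo(unsigned SizeInBits, unsigned EltsPerReg,
                                unsigned NumRegs) {
  LaneInfo Info;
  Info.Num128Lanes = SizeInBits / 128 * NumRegs;
  Info.NumElts = EltsPerReg * NumRegs;
  return Info;
}
```

With a 128-bit type and `LT.first == 1` (for example 4 x i32) this yields `Num128Lanes == 1` and `NumElts == 4`, so there is only a single lane and no cross-lane insert/extract to account for. A 256-bit 8 x i32 yields 2 lanes and 8 elements.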
Repository:
  rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89767/new/
https://reviews.llvm.org/D89767