[llvm] 2eaef53 - [TTI] `BasicTTIImplBase::getInterleavedMemoryOpCost()`: fix load discounting

Roman Lebedev via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 22 04:09:18 PDT 2021


Author: Roman Lebedev
Date: 2021-10-22T14:08:58+03:00
New Revision: 2eaef530232e1fbf12dec087487346dcaaf97b1c

URL: https://github.com/llvm/llvm-project/commit/2eaef530232e1fbf12dec087487346dcaaf97b1c
DIFF: https://github.com/llvm/llvm-project/commit/2eaef530232e1fbf12dec087487346dcaaf97b1c.diff

LOG: [TTI] `BasicTTIImplBase::getInterleavedMemoryOpCost()`: fix load discounting

The math here is:
Cost of 1 load = cost of n loads / n
Cost of live loads = num live loads * Cost of 1 load
Cost of live loads = num live loads * (cost of n loads / n)
Cost of live loads = cost of n loads * (num live loads / n)

But, all the variables here are integers,
and integer division rounds down,
but this calculation clearly expects float semantics.

Instead multiply upfront, and then perform round-up-division.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D112302

Added: 
    

Modified: 
    llvm/include/llvm/CodeGen/BasicTTIImpl.h
    llvm/test/Transforms/LoopVectorize/AArch64/interleaved_cost.ll

Removed: 
    


################################################################################
diff  --git a/llvm/include/llvm/CodeGen/BasicTTIImpl.h b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
index fcc8202f3bd87..9b116a8c65544 100644
--- a/llvm/include/llvm/CodeGen/BasicTTIImpl.h
+++ b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
@@ -1214,7 +1214,7 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
     //
     // TODO: Note that legalization can turn masked loads/stores into unmasked
     // (legalized) loads/stores. This can be reflected in the cost.
-    if (VecTySize > VecTyLTSize) {
+    if (Cost.isValid() && VecTySize > VecTyLTSize) {
       // The number of loads of a legal type it will take to represent a load
       // of the unlegalized vector type.
       unsigned NumLegalInsts = divideCeil(VecTySize, VecTyLTSize);
@@ -1231,7 +1231,8 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
 
       // Scale the cost of the load by the fraction of legal instructions that
       // will be used.
-      Cost *= UsedInsts.count() / NumLegalInsts;
+      Cost = divideCeil(UsedInsts.count() * Cost.getValue().getValue(),
+                        NumLegalInsts);
     }
 
     // Then plus the cost of interleave operation.

diff  --git a/llvm/test/Transforms/LoopVectorize/AArch64/interleaved_cost.ll b/llvm/test/Transforms/LoopVectorize/AArch64/interleaved_cost.ll
index 54ee8fc6e73fd..2b28a8ecf2129 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/interleaved_cost.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/interleaved_cost.ll
@@ -168,7 +168,7 @@ entry:
 ; gaps.
 ;
 ; VF_2-LABEL: Checking a loop in "i64_factor_8"
-; VF_2:         Found an estimated cost of 6 for VF 2 For instruction: %tmp2 = load i64, i64* %tmp0, align 8
+; VF_2:         Found an estimated cost of 10 for VF 2 For instruction: %tmp2 = load i64, i64* %tmp0, align 8
 ; VF_2-NEXT:    Found an estimated cost of 0 for VF 2 For instruction: %tmp3 = load i64, i64* %tmp1, align 8
 ; VF_2-NEXT:    Found an estimated cost of 7 for VF 2 For instruction: store i64 0, i64* %tmp0, align 8
 ; VF_2-NEXT:    Found an estimated cost of 7 for VF 2 For instruction: store i64 0, i64* %tmp1, align 8


        


More information about the llvm-commits mailing list