[PATCH] D59252: [TTI] getMemcpyCost

Sjoerd Meijer via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 13 09:39:54 PDT 2019


SjoerdMeijer updated this revision to Diff 190431.
SjoerdMeijer edited the summary of this revision.
SjoerdMeijer added a comment.
Herald added a subscriber: javed.absar.

In my previous comment, I described a possible default cost calculation for a memcpy, which approximated the cost by counting the number of load/store pairs. I could probably see this working as a default estimate, because I think that wherever it is used, the result will be that memcpys are considered expensive relative to other instructions, which is probably correct. However, I appreciate that this is an approximation, and that many things are missing here, such as source/destination alignment and the available load/store instructions, which really shows that this is a target-dependent decision. So, I've changed getMemcpyCost to simply return TCC_Expensive, so that targets can override this method and implement their decision making there, as is done for many/most functions here in TTI.

I've added a regression test.
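
To illustrate the trade-off described above, here is a minimal standalone sketch, not the code in this patch. It contrasts a default that always answers "expensive" with a hypothetical target override that counts load/store pairs. The class names, the `MaxAccessBytes` field, and the (byte count, alignment) signature are assumptions made for illustration; the patch itself passes a `const Instruction *`, and the enum only mirrors the values of `TTI::TargetCostConstants`.

```cpp
#include <cstdint>

// Stand-in for TTI::TargetCostConstants (values mirror LLVM's enum).
enum TargetCostConstants { TCC_Free = 0, TCC_Basic = 1, TCC_Expensive = 4 };

// Hypothetical default model: with no target knowledge, every memcpy is
// simply reported as expensive.
struct DefaultCostModel {
  virtual ~DefaultCostModel() = default;
  virtual unsigned getMemcpyCost(uint64_t NumBytes, unsigned Align) const {
    (void)NumBytes;
    (void)Align;
    return TCC_Expensive;
  }
};

// Hypothetical target model: approximate the cost as the number of
// load/store pairs needed, limited by alignment and by the widest access
// the (assumed) target supports.
struct PairCountingCostModel : DefaultCostModel {
  unsigned MaxAccessBytes = 4; // assumption: widest load/store is 4 bytes

  unsigned getMemcpyCost(uint64_t NumBytes, unsigned Align) const override {
    // The usable access width is bounded by both alignment and the target.
    unsigned Width = Align < MaxAccessBytes ? Align : MaxAccessBytes;
    if (Width == 0)
      Width = 1; // treat unknown alignment as byte-aligned
    // Round up so that a trailing partial chunk still costs one pair.
    uint64_t Pairs = (NumBytes + Width - 1) / Width;
    // One load plus one store per pair.
    return static_cast<unsigned>(2 * Pairs);
  }
};
```

For the 36-byte, align-1 copy in the regression test, the default model in this sketch returns 4 (TCC_Expensive), whereas the pair-counting model returns 72; with 4-byte alignment it would return 18. The gap shows why the estimate is better left to each target.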


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59252/new/

https://reviews.llvm.org/D59252

Files:
  include/llvm/Analysis/TargetTransformInfo.h
  include/llvm/Analysis/TargetTransformInfoImpl.h
  test/Analysis/CostModel/ARM/memcpy.ll


Index: test/Analysis/CostModel/ARM/memcpy.ll
===================================================================
--- /dev/null
+++ test/Analysis/CostModel/ARM/memcpy.ll
@@ -0,0 +1,13 @@
+; RUN: opt < %s  -cost-model -analyze -cost-kind=code-size | FileCheck %s
+
+target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
+target triple = "thumbv7m-arm-unknown-eabi"
+
+define void @memcpy(i8* %d, i8* %s, i32 %N) {
+entry:
+; CHECK: cost of 4 for instruction: call void @llvm.memcpy.p0i8.p0i8.i32
+  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %d, i8* align 1 %s, i32 36, i1 false)
+  ret void
+}
+
+declare void @llvm.memcpy.p0i8.p0i8.i32(i8* nocapture writeonly, i8* nocapture readonly, i32, i1) #1
Index: include/llvm/Analysis/TargetTransformInfoImpl.h
===================================================================
--- include/llvm/Analysis/TargetTransformInfoImpl.h
+++ include/llvm/Analysis/TargetTransformInfoImpl.h
@@ -140,15 +140,22 @@
 
   unsigned getInliningThresholdMultiplier() { return 1; }
 
+  unsigned getMemcpyCost(const Instruction *I) {
+    return TTI::TCC_Expensive;
+  }
+
   unsigned getIntrinsicCost(Intrinsic::ID IID, Type *RetTy,
                             ArrayRef<Type *> ParamTys, const User *U) {
     switch (IID) {
     default:
       // Intrinsics rarely (if ever) have normal argument setup constraints.
       // Model them as having a basic instruction cost.
-      // FIXME: This is wrong for libc intrinsics.
       return TTI::TCC_Basic;
 
+    // TODO: other libc intrinsics.
+    case Intrinsic::memcpy:
+      return getMemcpyCost(dyn_cast_or_null<Instruction>(U));
+
     case Intrinsic::annotation:
     case Intrinsic::assume:
     case Intrinsic::sideeffect:
Index: include/llvm/Analysis/TargetTransformInfo.h
===================================================================
--- include/llvm/Analysis/TargetTransformInfo.h
+++ include/llvm/Analysis/TargetTransformInfo.h
@@ -246,6 +246,10 @@
                        ArrayRef<const Value *> Arguments,
                        const User *U = nullptr) const;
 
+  /// \return The expected cost of a memcpy, which could e.g. depend on the
+  /// source/destination type and alignment and the number of bytes copied.
+  int getMemcpyCost(const Instruction *I) const;
+
   /// \return The estimated number of case clusters when lowering \p 'SI'.
   /// \p JTSize Set a jump table size only when \p SI is suitable for a jump
   /// table.
@@ -1053,6 +1057,7 @@
   virtual int getIntrinsicCost(Intrinsic::ID IID, Type *RetTy,
                                ArrayRef<const Value *> Arguments,
                                const User *U) = 0;
+  virtual int getMemcpyCost(const Instruction *I) = 0;
   virtual unsigned getEstimatedNumberOfCaseClusters(const SwitchInst &SI,
                                                     unsigned &JTSize) = 0;
   virtual int
@@ -1267,6 +1272,9 @@
                        const User *U = nullptr) override {
     return Impl.getIntrinsicCost(IID, RetTy, Arguments, U);
   }
+  int getMemcpyCost(const Instruction *I) override {
+    return Impl.getMemcpyCost(I);
+  }
   int getUserCost(const User *U, ArrayRef<const Value *> Operands) override {
     return Impl.getUserCost(U, Operands);
   }

