[all-commits] [llvm/llvm-project] 9f6048: [CostModel] remove cost-kind predicate for ctlz/ct...

RotateRight via All-commits all-commits at lists.llvm.org
Thu Oct 15 10:20:10 PDT 2020


  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 9f6048f83dc2be76ab97172bf1113d9fa9db7c60
      https://github.com/llvm/llvm-project/commit/9f6048f83dc2be76ab97172bf1113d9fa9db7c60
  Author: Sanjay Patel <spatel at rotateright.com>
  Date:   2020-10-15 (Thu, 15 Oct 2020)

  Changed paths:
    M llvm/include/llvm/CodeGen/BasicTTIImpl.h
    M llvm/test/Analysis/CostModel/X86/intrinsic-cost-kinds.ll

  Log Message:
  -----------
  [CostModel] remove cost-kind predicate for ctlz/cttz intrinsics in basic TTI implementation

The cost modeling for intrinsics is a patchwork based on different
expectations from the callers, so it's a mess. I'm hoping to untangle
this to allow canonicalization to the new min/max intrinsics in IR.
The general goal is to remove the cost-kind restriction here in the
basic implementation class. I.e., if some intrinsic has a throughput cost
of 104, assume that it has the same size, latency, and blended costs.
Effectively, an intrinsic with cost N is composed of N simple
instructions. If that's not correct, the target should provide a more
accurate override.
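
A minimal sketch of the idea, not the actual BasicTTIImpl.h code: the
names CostKind, costWithPredicate, costWithoutPredicate, and the
GenericFallback parameter are hypothetical stand-ins chosen for
illustration. It assumes a single table-derived cost per intrinsic and
ignores the separate latency path mentioned in the list below.

```cpp
#include <cassert>

// Hypothetical stand-in for the cost kinds a caller may ask about;
// this is not the LLVM enum.
enum class CostKind { RecipThroughput, Latency, CodeSize, SizeAndLatency };

// With a cost-kind predicate: only throughput queries see the
// table-derived cost. Every other kind gets a generic fallback,
// whose value is an assumption supplied by the caller of this sketch.
int costWithPredicate(CostKind Kind, int TableCost, int GenericFallback) {
  if (Kind != CostKind::RecipThroughput)
    return GenericFallback;
  return TableCost;
}

// Without the predicate: an intrinsic with cost N is treated as N simple
// instructions, so every cost kind reports the same number.
int costWithoutPredicate(CostKind /*Kind*/, int TableCost) {
  return TableCost;
}
```

For example, costWithoutPredicate(CostKind::CodeSize, 104) yields 104,
the same value a throughput query would get, while
costWithPredicate(CostKind::CodeSize, 104, 1) yields the assumed
fallback of 1.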

The x86-64 SSE2 subtarget cost diffs require explanation:

1. The scalar ctlz/cttz costs assume "BSR+XOR+CMOV" or
   "TEST+BSF+CMOV/BRANCH", so they are not cheap.
2. The 128-bit SSE vector width versions assume a cost of 18 or 26
   (no explanation provided in the tables, but this corresponds to a
   bunch of shift/logic/compare operations).
3. The 512-bit vectors in the test file are scaled up by a factor of
   4 from the legal vector width costs.
4. The plain latency cost-kind is not affected by this patch because
   that calculation is diverted before we get to getIntrinsicInstrCost().
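
The scaling in item 3 can be checked with a small worked example. This
is only a sketch: scaledCost is a hypothetical helper, and it assumes
the wide vector is simply split into legal-width pieces that each pay
the legal-width cost.

```cpp
#include <cassert>

// Hypothetical helper, not an LLVM API: scale the cost of a legal-width
// vector operation up to a wider vector by the number of legal-width
// pieces the wide vector splits into (rounded up).
constexpr unsigned scaledCost(unsigned LegalCost, unsigned VectorBits,
                              unsigned LegalBits) {
  return LegalCost * ((VectorBits + LegalBits - 1) / LegalBits);
}

// 512-bit vectors on a 128-bit SSE2 target split into 4 pieces.
static_assert(scaledCost(18, 512, 128) == 72, "18 * 4");
static_assert(scaledCost(26, 512, 128) == 104, "26 * 4");
```

Under that assumption the 128-bit costs of 18 and 26 become 72 and 104
for 512-bit vectors, which is consistent with the cost of 104 used as
the example earlier in this message.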

Differential Revision: https://reviews.llvm.org/D89461



