[all-commits] [llvm/llvm-project] 37646a: [RISCV] Account for LMUL in memory op costs

Philip Reames via All-commits all-commits at lists.llvm.org
Wed Apr 5 07:59:43 PDT 2023


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 37646a2c28fd08f7d7715fd8efc132357ffe0c34
      https://github.com/llvm/llvm-project/commit/37646a2c28fd08f7d7715fd8efc132357ffe0c34
  Author: Philip Reames <preames at rivosinc.com>
  Date:   2023-04-05 (Wed, 05 Apr 2023)

  Changed paths:
    M llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
    M llvm/test/Analysis/CostModel/RISCV/masked_ldst.ll
    M llvm/test/Analysis/CostModel/RISCV/rvv-load-store.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/interleaved-cost.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/strided-accesses.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/zvl32b.ll

  Log Message:
  -----------
  [RISCV] Account for LMUL in memory op costs

Generally, the cost of a memory op will scale with the number of vector registers accessed. Machines might exist which have a narrower memory access width than the vector register width, but machines with a wider memory access width than the vector register width seem unlikely.

I noticed this because we were preferring wide loads + deinterleaves on examples where the cost of a short gather (actually a strided load) would be lower. Touching 8 vector registers instead of doing a 4-element gather is not a good tradeoff.
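
As a rough illustration of the tradeoff described above, here is a toy sketch of a memory op cost that scales with the number of vector registers touched. It is not the code from RISCVTargetTransformInfo.cpp: the helper names (registersTouched, scaledMemOpCost, gatherCost), the unit per-register and per-element costs, and the VLEN of 128 bits are all assumptions made for the example, and the cost of the deinterleave shuffles is ignored.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: how many vector registers a vector of `numElts`
// elements of `eltBits` bits occupies when each register is `vlenBits`
// wide. Anything smaller than one register is rounded up to one.
inline uint64_t registersTouched(uint64_t numElts, uint64_t eltBits,
                                 uint64_t vlenBits) {
  assert(vlenBits > 0 && "register width must be non-zero");
  uint64_t totalBits = numElts * eltBits;
  uint64_t regs = (totalBits + vlenBits - 1) / vlenBits; // ceiling division
  return regs == 0 ? 1 : regs;
}

// Toy cost for a unit-stride load/store: one unit per register touched
// (assumed), so the cost grows with the register group size.
inline uint64_t scaledMemOpCost(uint64_t numElts, uint64_t eltBits,
                                uint64_t vlenBits) {
  return registersTouched(numElts, eltBits, vlenBits);
}

// Toy cost for a gather or strided load: one unit per element (assumed).
inline uint64_t gatherCost(uint64_t numElts) { return numElts; }
```

Under these assumptions, a wide load of 32 x i32 with VLEN=128 touches 8 registers and costs 8, while a 4-element gather costs 4. A flat cost of 1 for the wide load would have hidden that difference and favored the wide load.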

Differential Revision: https://reviews.llvm.org/D147470



