[all-commits] [llvm/llvm-project] 243e58: [CostModel][X86] Improve accuracy of vXi64 MUL cos...

Mon May 24 01:48:56 PDT 2021

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 243e58868176102484c3ff1a338342633ede7361
      https://github.com/llvm/llvm-project/commit/243e58868176102484c3ff1a338342633ede7361
  Author: Simon Pilgrim <llvm-dev at redking.me.uk>
  Date:   2021-05-24 (Mon, 24 May 2021)

  Changed paths:
    M llvm/lib/Target/X86/X86TargetTransformInfo.cpp
    M llvm/test/Analysis/CostModel/X86/arith-fix.ll
    M llvm/test/Analysis/CostModel/X86/arith-overflow.ll
    M llvm/test/Analysis/CostModel/X86/arith.ll
    M llvm/test/Analysis/CostModel/X86/reduce-mul.ll
    M llvm/test/Analysis/CostModel/X86/rem.ll

  Log Message:
  -----------
  [CostModel][X86] Improve accuracy of vXi64 MUL costs on AVX2/AVX512 targets

By llvm-mca analysis, Haswell/Broadwell have the worst v4i64 recip-throughput cost of the AVX2 targets at 6 (vs the currently used cost of 8). Similarly, SkylakeServer (our only AVX512 target model) implements PMULLQ with an average cost of 1.5 (rounded up to 2.0), and the PMULUDQ-sequence (without AVX512DQ) at a cost of 6.
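
For readers unfamiliar with the PMULUDQ-sequence mentioned above, here is a rough scalar C++ sketch of one 64-bit lane of the usual expansion of a 64-bit multiply into 32x32->64 multiplies. The function names are hypothetical, and the instruction annotations in the comments are an assumption about the typical lowering, not the exact sequence that llvm-mca measured for this commit.

```cpp
#include <cstdint>

// Hypothetical model of one 64-bit lane of PMULUDQ: multiply the low
// 32 bits of each operand and keep the full 64-bit product.
static inline uint64_t pmuludq_lane(uint64_t a, uint64_t b) {
  return (a & 0xFFFFFFFFull) * (b & 0xFFFFFFFFull);
}

// Hypothetical helper: low 64 bits of a * b built from three 32x32->64
// multiplies, as done when no native 64-bit vector multiply is available.
uint64_t mul64_via_pmuludq(uint64_t a, uint64_t b) {
  uint64_t a_hi = a >> 32;                 // shift (e.g. PSRLQ)
  uint64_t b_hi = b >> 32;                 // shift (e.g. PSRLQ)
  uint64_t lo_lo = pmuludq_lane(a, b);     // a_lo * b_lo
  uint64_t hi_lo = pmuludq_lane(a_hi, b);  // a_hi * b_lo
  uint64_t lo_hi = pmuludq_lane(a, b_hi);  // a_lo * b_hi
  // The a_hi * b_hi term only affects bits 64 and above, so it is dropped.
  uint64_t cross = (hi_lo + lo_hi) << 32;  // add + shift (e.g. PADDQ, PSLLQ)
  return lo_lo + cross;                    // add (e.g. PADDQ)
}
```

The three multiplies plus the surrounding shifts and adds give a sense of why this path is costed well above a single PMULLQ on AVX512DQ targets.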