[all-commits] [llvm/llvm-project] 090cae: [TTI] Add DemandedElts to getScalarizationOverhead

Simon Pilgrim via All-commits all-commits at lists.llvm.org
Wed Apr 29 04:16:17 PDT 2020


  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 090cae8491279126690b9530ce72b7f8fdb1dc9e
      https://github.com/llvm/llvm-project/commit/090cae8491279126690b9530ce72b7f8fdb1dc9e
  Author: Simon Pilgrim <llvm-dev at redking.me.uk>
  Date:   2020-04-29 (Wed, 29 Apr 2020)

  Changed paths:
    M llvm/include/llvm/Analysis/TargetTransformInfo.h
    M llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
    M llvm/include/llvm/CodeGen/BasicTTIImpl.h
    M llvm/lib/Analysis/TargetTransformInfo.cpp
    M llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.cpp
    M llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.h
    M llvm/lib/Target/X86/X86TargetTransformInfo.cpp
    M llvm/lib/Target/X86/X86TargetTransformInfo.h
    M llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
    M llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
    M llvm/test/Analysis/CostModel/X86/arith-fp.ll
    M llvm/test/Analysis/CostModel/X86/fptosi.ll
    M llvm/test/Analysis/CostModel/X86/fptoui.ll
    M llvm/test/Analysis/CostModel/X86/fround.ll
    M llvm/test/Analysis/CostModel/X86/intrinsic-cost.ll
    M llvm/test/Analysis/CostModel/X86/load_store.ll
    M llvm/test/Analysis/CostModel/X86/masked-intrinsic-cost.ll
    M llvm/test/Analysis/CostModel/X86/sitofp.ll
    M llvm/test/Transforms/LoopVectorize/X86/strided_load_cost.ll
    M llvm/test/Transforms/SLPVectorizer/X86/minimum-sizes.ll
    M llvm/test/Transforms/SLPVectorizer/X86/resched.ll
    M llvm/test/Transforms/SLPVectorizer/X86/vectorize-reorder-reuse.ll

  Log Message:
  -----------
  [TTI] Add DemandedElts to getScalarizationOverhead

The improvements to the x86 vector insert/extract element costs in D74976 resulted in the estimated costs for vector initialization and scalarization increasing higher than should be expected. This is particularly noticeable on pre-SSE4 targets where the available of legal INSERT_VECTOR_ELT ops is more limited.

This patch does 2 things:
1 - it implements X86TTIImpl::getScalarizationOverhead to more accurately represent the typical costs of a ISD::BUILD_VECTOR pattern.
2 - it adds a DemandedElts mask to getScalarizationOverhead to permit the SLP's BoUpSLP::getGatherCost to be rewritten to use it directly instead of accumulating raw vector insertion costs.

This fixes PR45418 where a v4i8 (zext'd to v4i32) was no longer vectorizing.

A future patch should extend X86TTIImpl::getScalarizationOverhead to tweak the EXTRACT_VECTOR_ELT scalarization costs as well.

Reviewed By: @craig.topper

Differential Revision: https://reviews.llvm.org/D78216




More information about the All-commits mailing list