[all-commits] [llvm/llvm-project] 0aff17: [Analysis] Add simple cost model for strict (in-or...
david-arm via All-commits
all-commits at lists.llvm.org
Mon Jul 26 02:26:25 PDT 2021
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 0aff1798b5721d5f95d16f465b99d357012bb8d1
https://github.com/llvm/llvm-project/commit/0aff1798b5721d5f95d16f465b99d357012bb8d1
Author: David Sherwood <david.sherwood at arm.com>
Date: 2021-07-26 (Mon, 26 Jul 2021)
Changed paths:
M llvm/include/llvm/Analysis/TargetTransformInfo.h
M llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
M llvm/include/llvm/CodeGen/BasicTTIImpl.h
M llvm/lib/Analysis/TargetTransformInfo.cpp
M llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
M llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
M llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
M llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h
M llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
M llvm/lib/Target/ARM/ARMTargetTransformInfo.h
M llvm/lib/Target/X86/X86TargetTransformInfo.cpp
M llvm/lib/Target/X86/X86TargetTransformInfo.h
M llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
M llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
A llvm/test/Analysis/CostModel/AArch64/reduce-fadd.ll
M llvm/test/Analysis/CostModel/AArch64/sve-intrinsics.ll
M llvm/test/Analysis/CostModel/X86/intrinsic-cost-kinds.ll
M llvm/test/Analysis/CostModel/X86/reduce-fadd.ll
M llvm/test/Analysis/CostModel/X86/reduce-fmul.ll
A llvm/test/Transforms/LoopVectorize/AArch64/strict-fadd-cost.ll
A llvm/test/Transforms/LoopVectorize/AArch64/sve-strict-fadd-cost.ll
Log Message:
-----------
[Analysis] Add simple cost model for strict (in-order) reductions
I have added a new FastMathFlags parameter to getArithmeticReductionCost
to indicate what type of reduction we are performing:
1. Tree-wise. This is the typical fast-math reduction that involves
continually splitting a vector up into halves and adding each
half together until we get a scalar result. This is the default
behaviour for integers, whereas for floating point we only do this
if reassociation is allowed.
2. Ordered. This now allows us to estimate the cost of performing
a strict vector reduction by treating it as a series of scalar
operations in lane order. This is the case when FP reassociation
is not permitted. For scalable vectors this is more difficult
because at compile time we do not know how many lanes there are,
and so we use the worst case maximum vscale value.
I have also fixed getTypeBasedIntrinsicInstrCost to pass in the
FastMathFlags, which meant fixing up some X86 tests where we always
assumed the vector.reduce.fadd/mul intrinsics were 'fast'.
New tests have been added here:
Analysis/CostModel/AArch64/reduce-fadd.ll
Analysis/CostModel/AArch64/sve-intrinsics.ll
Transforms/LoopVectorize/AArch64/strict-fadd-cost.ll
Transforms/LoopVectorize/AArch64/sve-strict-fadd-cost.ll
Differential Revision: https://reviews.llvm.org/D105432
More information about the All-commits
mailing list