[all-commits] [llvm/llvm-project] 9991ea: [CostModel][AArch64] Make extractelement, with fmu...
Sushant Gokhale via All-commits
all-commits at lists.llvm.org
Tue Nov 12 21:41:10 PST 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 9991ea28fcd308d5bd357358710e5344e26b46e1
https://github.com/llvm/llvm-project/commit/9991ea28fcd308d5bd357358710e5344e26b46e1
Author: Sushant Gokhale <sgokhale at nvidia.com>
Date: 2024-11-13 (Wed, 13 Nov 2024)
Changed paths:
M llvm/include/llvm/Analysis/TargetTransformInfo.h
M llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
M llvm/include/llvm/CodeGen/BasicTTIImpl.h
M llvm/lib/Analysis/TargetTransformInfo.cpp
M llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
M llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
M llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
M llvm/test/Analysis/CostModel/AArch64/extract_float.ll
Log Message:
-----------
[CostModel][AArch64] Make extractelement, with fmul user, free whenev… (#111479)
…er possible
In case of Neon, if there exists extractelement from lane != 0 such that
1. extractelement does not necessitate a move from vector_reg -> GPR
2. extractelement result feeds into fmul
3. Other operand of fmul is a scalar or extractelement from lane 0 or
lane equivalent to 0
then the extractelement can be merged with fmul in the backend and it
incurs no cost.
e.g.
```
define double @foo(<2 x double> %a) {
%1 = extractelement <2 x double> %a, i32 0
%2 = extractelement <2 x double> %a, i32 1
%res = fmul double %1, %2
ret double %res
}
```
`%2` and `%res` can be merged in the backend to generate:
`fmul d0, d0, v0.d[1]`
The change was tested with SPEC FP(C/C++) on Neoverse-v2.
**Compile time impact**: None
**Performance impact**: Observing 1.3-1.7% uplift on lbm benchmark with -flto depending upon the config.
To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications
More information about the All-commits
mailing list