[all-commits] [llvm/llvm-project] 487686: [SDAG][RISCV] Don't promote VP_REDUCE_{FADD, FMUL} ...
Luke Lau via All-commits
all-commits at lists.llvm.org
Thu Oct 3 09:18:06 PDT 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 487686b82e9a20bd090704e04c8cbf7207dd6527
https://github.com/llvm/llvm-project/commit/487686b82e9a20bd090704e04c8cbf7207dd6527
Author: Luke Lau <luke at igalia.com>
Date: 2024-10-04 (Fri, 04 Oct 2024)
Changed paths:
M llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
M llvm/lib/Target/RISCV/RISCVISelLowering.cpp
M llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
M llvm/test/Analysis/CostModel/RISCV/reduce-fadd.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-fp-vp.ll
M llvm/test/CodeGen/RISCV/rvv/vreductions-fp-vp.ll
Log Message:
-----------
[SDAG][RISCV] Don't promote VP_REDUCE_{FADD,FMUL} (#111000)
In https://reviews.llvm.org/D153848, promotion was added for a variety
of f16 ops with zvfhmin, including VP reductions.
However, I don't believe it's correct to promote f16 fadd or fmul
reductions to f32, since the intermediate result of each step needs to
be rounded back to f16.
Today, if we lower @llvm.vp.reduce.fadd.nxv1f16 on RISC-V, we'll get two
different results depending on whether we compiled with +zvfh or
+zvfhmin, for example with a 3-element reduction:
; v9 = [0.1563, 5.97e-8, 0.00006104]
; zvfh
vsetivli x0, 3, e16, m1, ta, ma
vmv.v.i v8, 0
vfredosum.vs v8, v9, v8
vfmv.f.s fa0, v8
; fa0 = 0.1563
; zvfhmin
vsetivli x0, 3, e16, m1, ta, ma
vfwcvt.f.f.v v10, v9
vsetivli x0, 3, e32, m1, ta, ma
vmv.v.i v8, 0
vfredosum.vs v8, v10, v8
vfmv.f.s fa0, v8
fcvt.h.s fa0, fa0
; fa0 = 0.1564
The same divergence happens with reassociative reductions, e.g.
vfredusum.vs, and it also applies to bf16.
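To make the double rounding concrete, here's a minimal scalar sketch
(my own illustration, not part of the patch) that reproduces the
divergence on the host, assuming a compiler with native _Float16
support such as recent Clang:

#include <cstdio>

int main() {
  // The three f16 elements from the example above. The tiny second
  // element is (roughly) the smallest f16 subnormal; it's what tips
  // the final rounding decision.
  _Float16 v[3] = {(_Float16)0.1563, (_Float16)5.97e-8,
                   (_Float16)0.00006104};

  // What +zvfh computes: an ordered f16 reduction, rounding back to
  // f16 after every addition.
  _Float16 Acc16 = 0;
  for (_Float16 X : v)
    Acc16 = Acc16 + X;

  // What the zvfhmin promotion computed: accumulate in f32 and round
  // to f16 only once at the end.
  float Acc32 = 0.0f;
  for (_Float16 X : v)
    Acc32 += (float)X;
  _Float16 Promoted = (_Float16)Acc32;

  // Prints 0x1.4p-3 vs 0x1.408p-3: the promoted result is one f16 ulp
  // higher, because the subnormal element survives in the f32
  // accumulator and breaks what would otherwise be a round-to-even tie.
  printf("f16: %a  promoted: %a\n", (double)Acc16, (double)Promoted);
  return 0;
}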
I couldn't find anything in the LangRef for reductions that suggests
excess precision is allowed. There may be something we can do in Clang
with -fexcess-precision=fast, but I haven't looked into that yet.
I presume the same precision issue occurs with fmul, but not with
fmin/fmax/fminimum/fmaximum, since those only ever select one of the
already-rounded inputs.
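As a sanity check, here's a small sketch (again mine, reusing the
values above) of why min/max-style reductions are safe to promote:
every intermediate value is one of the original f16 inputs, so
truncating back from f32 is always exact.

#include <algorithm>
#include <cstdio>

int main() {
  _Float16 v[3] = {(_Float16)0.1563, (_Float16)5.97e-8,
                   (_Float16)0.00006104};
  // Promoted min reduction: widen to f32, reduce, truncate back.
  float M = (float)v[0];
  for (_Float16 X : v)
    M = std::min(M, (float)X);
  // The result is exactly the smallest input; no rounding occurred.
  printf("%d\n", (_Float16)M == v[1]); // prints 1
  return 0;
}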
I can't think of any way of lowering these other than scalarizing, and
we can't scalarize scalable vectors, so this just removes the promotion
and adjusts the cost model to return an invalid cost. (It looks like we
also don't currently cost fmul reductions, so presumably they already
have an invalid cost?)
I think this should be enough to stop the loop vectorizer or SLP from
emitting these intrinsics.