[PATCH] D156171: [AArch64][GlobalISel] G_FMINNUM and G_FMAXNUM vector lowering
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 25 03:45:17 PDT 2023
dmgreen added inline comments.
================
Comment at: llvm/test/CodeGen/AArch64/fminmax.ll:271
entry:
%c = call <3 x float> @llvm.minnum.v3f32(<3 x float> %a, <3 x float> %b)
ret <3 x float> %c
----------------
tschuett wrote:
> The 3 x f32 seems to fail again, but it should be in reach.
It's just ugliness from buildvectors with undef elements. You can see it moving out of and into the same register:
```
mov s2, v0.s[1]
..
mov v0.s[1], v2.s[0]
```
And it shouldn't be setting v0.s[3], as that value is just undef. All of that can be cleaned up with other combines; they just haven't been implemented yet (and they revolve around what happens to propagated undef elements, so there are some dragons).
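As a rough illustration of the kind of cleanup meant here, the following is a toy Python model, not LLVM code. The function name `simplify_build_vector`, the `(register, lane)` tuple encoding, and the use of `None` for undef are all hypothetical and chosen only for this sketch. It drops lane inserts that would copy a lane back into the place it already occupies, and skips lanes that are undef:

```python
UNDEF = None  # hypothetical stand-in for an undef lane


def simplify_build_vector(dest_reg, lanes):
    """Return the lane inserts still needed to build a vector in dest_reg.

    lanes has one entry per destination lane: either UNDEF or a
    (source_register, source_lane) tuple. The sketch assumes dest_reg
    already holds its own lanes before any insert is performed.
    """
    inserts = []
    for lane_index, source in enumerate(lanes):
        if source is UNDEF:
            # Any value is acceptable in an undef lane, so nothing to insert.
            continue
        if source == (dest_reg, lane_index):
            # The lane already holds this value: moving it out and back in
            # again is redundant.
            continue
        inserts.append((lane_index, source))
    return inserts


# Three defined lanes plus one undef padding lane, built in "v0".
needed = simplify_build_vector(
    "v0", [("v0", 0), ("v0", 1), ("v1", 0), UNDEF]
)
print(needed)
```

In this toy model only the insert into lane 2 survives. The real combines are harder because they have to reason about how undef elements propagate through other instructions.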
================
Comment at: llvm/test/CodeGen/AArch64/fminmax.ll:402
entry:
%c = call <7 x float> @llvm.minnum.v7f32(<7 x float> %a, <7 x float> %b)
ret <7 x float> %c
----------------
tschuett wrote:
> The 7 is ugly for both.
The calling convention here is very odd. It seems to match between the two, at least.
================
Comment at: llvm/test/CodeGen/AArch64/fminmax.ll:1409
entry:
%c = call <16 x half> @llvm.maxnum.v16f16(<16 x half> %a, <16 x half> %b)
ret <16 x half> %c
----------------
tschuett wrote:
> For 16 Gisel seems to be better?!?
The SDAG lowering for fp16 without fullfp16 has always scalarized many instructions, and I don't believe anyone has looked into fixing it, as it's fairly low priority. It's worth getting it right with GISel if we can, though.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D156171/new/
https://reviews.llvm.org/D156171