[PATCH] D156171: [AArch64][GlobalISel] G_FMINNUM and G_FMAXNUM vector lowering

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jul 25 03:45:17 PDT 2023


dmgreen added inline comments.


================
Comment at: llvm/test/CodeGen/AArch64/fminmax.ll:271
 entry:
   %c = call <3 x float> @llvm.minnum.v3f32(<3 x float> %a, <3 x float> %b)
   ret <3 x float> %c
----------------
tschuett wrote:
> The 3 x  f32 seems to fail again, but it should be in reach.
It's just ugliness from buildvectors with undef elements. You can see it moving out of and back into the same register:
```
mov s2, v0.s[1]
..
mov v0.s[1], v2.s[0]
```
And it shouldn't be setting v0.s[3], as that value is just undef. All of that can be cleaned up with other combines; they just haven't been implemented yet (and they revolve around what happens to propagated undef elements, so have some dragons).
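
As a rough IR-level picture of where the undef lane comes from, here is an illustrative sketch only. It is not output from the patch, the function name is made up, and the legalizer itself works on generic MIR rather than IR, so treat it as an analogy under those assumptions:

```llvm
; Hypothetical illustration: widen <3 x float> to <4 x float>, do the
; operation on four lanes, then narrow the result back to three lanes.
declare <4 x float> @llvm.minnum.v4f32(<4 x float>, <4 x float>)

define <3 x float> @min_v3f32_widened(<3 x float> %a, <3 x float> %b) {
entry:
  ; Mask index 3 selects element 0 of the second (undef) operand,
  ; so lane 3 of each widened vector is undef.
  %a4 = shufflevector <3 x float> %a, <3 x float> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
  %b4 = shufflevector <3 x float> %b, <3 x float> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
  %m4 = call <4 x float> @llvm.minnum.v4f32(<4 x float> %a4, <4 x float> %b4)
  ; Only lanes 0..2 are live; whatever ends up in lane 3 is never observed.
  %c = shufflevector <4 x float> %m4, <4 x float> undef, <3 x i32> <i32 0, i32 1, i32 2>
  ret <3 x float> %c
}
```

Since lane 3 is undef on the way in and dropped on the way out, any instruction that explicitly writes it, or that moves a live lane out of a register and straight back into the same lane, is redundant.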


================
Comment at: llvm/test/CodeGen/AArch64/fminmax.ll:402
 entry:
   %c = call <7 x float> @llvm.minnum.v7f32(<7 x float> %a, <7 x float> %b)
   ret <7 x float> %c
----------------
tschuett wrote:
> The 7 is ugly for both.
The calling convention here is very odd. SDAG and GISel at least seem to match each other.


================
Comment at: llvm/test/CodeGen/AArch64/fminmax.ll:1409
 entry:
   %c = call <16 x half> @llvm.maxnum.v16f16(<16 x half> %a, <16 x half> %b)
   ret <16 x half> %c
----------------
tschuett wrote:
> For 16 Gisel seems to be better?!?
SDAG has always scalarized many fp16 instructions when fullfp16 is not available, and I don't believe anyone has looked into fixing it, as it's fairly low priority. It's worth getting it right with GISel if we can, though.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156171/new/

https://reviews.llvm.org/D156171