[PATCH] D155601: [AArch64][GISel] Additional FPExt vector lowering
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sun Jul 23 08:54:05 PDT 2023
dmgreen added a comment.
In D155601#4510788 <https://reviews.llvm.org/D155601#4510788>, @arsenm wrote:
> If you also test bfloat you'll find it miscompiles
Yeah. My understanding is that either the s16 type needs to distinguish between different float representations, or the G_FPEXT operation would need to, either with a different G_ opcode or some sort of flag.
================
Comment at: llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp:545
+ .clampNumElements(0, v4s32, v4s32)
+ .clampNumElements(0, v2s64, v2s64)
+ .scalarize(0);
----------------
tschuett wrote:
> .clampMinNumElements(0, s32, 2)
> .clampMaxNumElements(0, s32, 4)
> .clampMinNumElements(0, s64, 1)
> .clampMaxNumElements(0, s64, 2)
>
> ?
The result type of an fpext needs to be a v4s32 or v2s64, to match a fcvtl. There are no v2f16->v2f32 instructions (unless you count the lower 2 lanes of a v4f16->v4f32 fcvtl, and it would seem better to legalize in the legalize step so we only have to deal with legal operation later, and reuse all the tablegen patterns).
================
Comment at: llvm/test/CodeGen/AArch64/fpext.ll:204
+entry:
+ %c = fpext <3 x half> %a to <3 x float>
+ ret <3 x float> %c
----------------
tschuett wrote:
> arsenm wrote:
> > tschuett wrote:
> > > What is the trick behind the difference?
> > the dag widened the vector and gisel scalarized it
> I missed the *3*.
Yeah the 3 is awkward. This comes from the expansion, adding `undef` lanes that are not yet cleaned up properly. It should be possible to improve it, and we can get back to the same codegen as SDAG
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D155601/new/
https://reviews.llvm.org/D155601
More information about the llvm-commits
mailing list