[PATCH] D99324: [AArch64][SVE] Codegen dup_lane for dup(vector_extract)
JunMa via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 29 04:55:01 PDT 2021
junparser added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:624
+ def : Pat<(nxv4f16 (AArch64dup (f16 (vector_extract (nxv4f16 ZPR:$vec), sve_elm_idx_extdup_h:$index)))),
+ (DUP_ZZI_H ZPR:$vec, sve_elm_idx_extdup_h:$index)>;
+ def : Pat<(nxv2f16 (AArch64dup (f16 (vector_extract (nxv2f16 ZPR:$vec), sve_elm_idx_extdup_h:$index)))),
----------------
paulwalker-arm wrote:
> sdesmalen wrote:
> > This isn't entirely correct, because an nxv4f16 has gaps between the elements. A full nxv8f16 has vscale x 8 elements, so an nxv4f16 has vscale x 4 elements, with vscale x 4 gaps in between, e.g. `<elt0, _, elt1, _, .. >`. That means the element index must be multiplied by 2 in this case (and the one for nxv2f32), and by 4 for the nxv2f16 case.
> While logically true, I think in practice you'd rewrite the pattern so the instruction's element type matches that of the "packed" vector associated with the DAG result's element count (i.e. D for nxv2, S for nxv4).
>
> So in this instance something like:
> ```
> def : Pat<(nxv4f16 (AArch64dup (f16 (vector_extract (nxv4f16 ZPR:$vec), sve_elm_idx_extdup_s:$index)))),
> (DUP_ZZI_S ZPR:$vec, sve_elm_idx_extdup_s:$index)>;
> ```
>
> So in essence all `nxv4` results are considered to be duplicating floats, and all `nxv2` results to be duplicating doubles.
>
> Is it possible to move the patterns into the multiclass for sve_int_perm_dup_i?
This is quite different from what I thought. For nxv4f16, I thought the upper 64 bits would be empty. Where can I find these rules? I haven't seen them documented anywhere.
================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:624
+ def : Pat<(nxv4f16 (AArch64dup (f16 (vector_extract (nxv4f16 ZPR:$vec), sve_elm_idx_extdup_h:$index)))),
+ (DUP_ZZI_H ZPR:$vec, sve_elm_idx_extdup_h:$index)>;
+ def : Pat<(nxv2f16 (AArch64dup (f16 (vector_extract (nxv2f16 ZPR:$vec), sve_elm_idx_extdup_h:$index)))),
----------------
junparser wrote:
> paulwalker-arm wrote:
> > sdesmalen wrote:
> > > This isn't entirely correct, because an nxv4f16 has gaps between the elements. A full nxv8f16 has vscale x 8 elements, so an nxv4f16 has vscale x 4 elements, with vscale x 4 gaps in between, e.g. `<elt0, _, elt1, _, .. >`. That means the element index must be multiplied by 2 in this case (and the one for nxv2f32), and by 4 for the nxv2f16 case.
> > While logically true, I think in practice you'd rewrite the pattern so the instruction's element type matches that of the "packed" vector associated with the DAG result's element count (i.e. D for nxv2, S for nxv4).
> >
> > So in this instance something like:
> > ```
> > def : Pat<(nxv4f16 (AArch64dup (f16 (vector_extract (nxv4f16 ZPR:$vec), sve_elm_idx_extdup_s:$index)))),
> > (DUP_ZZI_S ZPR:$vec, sve_elm_idx_extdup_s:$index)>;
> > ```
> >
> > So in essence all `nxv4` results are considered to be duplicating floats, and all `nxv2` results to be duplicating doubles.
> >
> > Is it possible to move the patterns into the multiclass for sve_int_perm_dup_i?
>
> This is quite different from what I thought. For nxv4f16, I thought the upper 64 bits would be empty. Where can I find these rules? I haven't seen them documented anywhere.
OK, I'll move them to sve_int_perm_dup_i
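
For reference, the unpacked layout discussed in this thread can be modelled with a small Python sketch. It is only a toy model, not LLVM code: it assumes vscale = 1 (one 128-bit granule viewed as eight 16-bit lanes) and that each unpacked element sits in the low bits of its wider container lane. The helper names (`unpacked_h_lanes`, `dup_lane`, `read_unpacked`) are made up for illustration.

```python
# Toy model of one 128-bit SVE granule (vscale = 1) as eight 16-bit (H) lanes.
# Assumption: an unpacked element occupies the low H lane of its container lane.
H_LANES = 8

def unpacked_h_lanes(elts, stride):
    """Place 16-bit elements every `stride` H lanes; None marks a gap."""
    lanes = [None] * H_LANES
    for i, elt in enumerate(elts):
        lanes[i * stride] = elt
    return lanes

def dup_lane(lanes, index, width):
    """Broadcast container lane `index`, where one lane spans `width` H lanes."""
    chunk = lanes[index * width:(index + 1) * width]
    return chunk * (len(lanes) // width)

def read_unpacked(lanes, stride):
    """Read back the elements of an unpacked vector, skipping the gaps."""
    return lanes[::stride]

# nxv4f16: four f16 elements, one per 32-bit (S) container lane.
nxv4f16 = unpacked_h_lanes(["e0", "e1", "e2", "e3"], stride=2)

# H-sized dup with the unscaled index hits a gap, not element 1.
wrong = read_unpacked(dup_lane(nxv4f16, 1, width=1), stride=2)

# H-sized dup with the index multiplied by 2 finds element 1.
scaled = read_unpacked(dup_lane(nxv4f16, 2, width=1), stride=2)

# S-sized dup with the original index also yields element 1 in every slot.
as_s = read_unpacked(dup_lane(nxv4f16, 1, width=2), stride=2)

# nxv2f16: two f16 elements, one per 64-bit (D) container lane.
nxv2f16 = unpacked_h_lanes(["f0", "f1"], stride=4)
as_d = read_unpacked(dup_lane(nxv2f16, 1, width=4), stride=4)
```

In this model, duplicating at the container width (S for nxv4, D for nxv2) lets the pattern reuse the extract index unchanged, which is the rewrite suggested above; keeping an H-sized dup would require scaling the index by 2 or 4.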
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D99324/new/
https://reviews.llvm.org/D99324