[PATCH] D80410: [WIP][SVE] Pass through dup(0) to zero-merging pseudos

Sander de Smalen via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu May 21 15:11:59 PDT 2020


sdesmalen created this revision.
sdesmalen added a reviewer: cameron.mcinally.
Herald added subscribers: llvm-commits, psnobl, rkruppe, hiraditya, tschuett.
Herald added a reviewer: efriedma.
Herald added a project: LLVM.

Hi @cameron.mcinally, I'm just sharing what I tried out today based
off your patch D80260 <https://reviews.llvm.org/D80260>. I'm not really planning to land it, but feel
free to use for reference or discard entirely if you've already been
working on something similar.

It passes the dup(0) to the zero-merging pseudos, similar to what D80260 <https://reviews.llvm.org/D80260>
does for any other mask value.

This patch also highlights a bug that currently exists with the expansion
of the pseudo instructions that merge zeros into the false lanes.

The zero-merging pseudos deliberately have no tied operand constraints,
which gives the register allocator more freedom to use the reverse
instructions (like FSUBR).

The bug shows up when the register allocation of one of the pseudos
ends up as:

  Dst = FSUB_ZERO_S P0, Z0, Z0

The expand pass cannot zero the false lanes of Z0 using MOVPRFX, because
the instruction prefixed by a MOVPRFX must not use the destination
register in any operand position other than the destination. This
would not be valid:

  Z0 = MOVPRFX P0/z, Z0
  Z0 = FSUB_S Z0, P0/m, Z0
                        ^^

At the point of expanding the pseudo, there may not be a spare register
available to expand this into a legal sequence. In D71712 <https://reviews.llvm.org/D71712> we've solved
this by using a 'Conditional Early Clobber' pass that runs during register
allocation and makes sure the destination register is different from
any of the input registers, if the two input registers would otherwise end
up the same. This is a bit fiddly, and it's probably better to build
on the design set out in D80260 <https://reviews.llvm.org/D80260> where the merge value is passed
to the pseudo, so the compiler can decide at the point of pseudo expansion
whether to use the DUP(0) value, or to use the zeroing MOVPRFX.
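
As a rough illustration of that expansion-time choice, here is a small
standalone C++ sketch. It is a toy model, not the code in this patch:
registers are plain strings, element-size suffixes are omitted, and the
names `ZeroingPseudo` and `expandFSubZero` are made up for this example.
The FSUB + SEL fallback is only one possible legal sequence for the
all-same-register case, assumed here for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of: Dst = FSUB_ZERO Pg, Src1, Src2 (+ a DUP(0) operand).
struct ZeroingPseudo {
  std::string Dst, Pg, Src1, Src2;
  std::string Zero; // register holding the DUP(0) merge value
};

// Returns the instruction sequence (as text) chosen for the pseudo.
std::vector<std::string> expandFSubZero(const ZeroingPseudo &P) {
  const bool DstIsSrc1 = P.Dst == P.Src1;
  const bool DstIsSrc2 = P.Dst == P.Src2;

  if (DstIsSrc1 && DstIsSrc2) {
    // MOVPRFX would be illegal: Dst is also the second source operand.
    // Assumed fallback: do the merging FSUB, then select zero into the
    // inactive lanes from the DUP(0) register.
    assert(!P.Zero.empty() && "fallback needs the DUP(0) operand");
    return {"fsub " + P.Dst + ", " + P.Pg + "/m, " + P.Dst + ", " + P.Src2,
            "sel " + P.Dst + ", " + P.Pg + ", " + P.Dst + ", " + P.Zero};
  }

  if (DstIsSrc2) {
    // Dst aliases only the second source: use the reversed instruction so
    // that Dst is the tied operand and Src1 the remaining source.
    return {"movprfx " + P.Dst + ", " + P.Pg + "/z, " + P.Dst,
            "fsubr " + P.Dst + ", " + P.Pg + "/m, " + P.Dst + ", " + P.Src1};
  }

  // Dst != Src2 here, so the zeroing MOVPRFX form is legal.
  return {"movprfx " + P.Dst + ", " + P.Pg + "/z, " + P.Src1,
          "fsub " + P.Dst + ", " + P.Pg + "/m, " + P.Dst + ", " + P.Src2};
}
```

With Dst, Src1 and Src2 all allocated to z0, the sketch avoids MOVPRFX and
uses the DUP(0) register instead. In every other case it keeps the zeroing
MOVPRFX, so the DUP(0) value is not needed at all.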

Given that the DUP IMM instructions have `isReMaterializable` set,
the register allocator hopefully won't try too hard to keep the zero
value in a register.


https://reviews.llvm.org/D80410

Files:
  llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
  llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
  llvm/lib/Target/AArch64/SVEInstrFormats.td
  llvm/test/CodeGen/AArch64/sve-intrinsics-fp-arith-merging.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D80410.265607.patch
Type: text/x-patch
Size: 7043 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200521/cd523098/attachment.bin>

