[PATCH] D69497: [PowerPC] Fix MI peephole optimization for splats

Sun Oct 27 18:18:49 PDT 2019

vddvss created this revision.
vddvss added reviewers: hfinkel, nemanjai, echristo, jsji, stefanp.
vddvss added a project: LLVM.
Herald added subscribers: llvm-commits, shchenz, kbarton, hiraditya.

This patch fixes an issue where the PPC MI peephole optimization pass incorrectly removes a vector swap.

Specifically, the pass can combine a splat/swap into a splat/copy. It uses `TargetRegisterInfo::lookThruCopyLike` to determine whether the operands to the splat are the same. However, the current logic compares the operands only by register number. When both splat operands are ultimately fed from the same physical register, the pass can incorrectly remove a swap if the feeding register for one of the operands has been clobbered, because the two operands then hold different values even though they trace back to the same register.

This patch adds a check to ensure that the registers feeding both operands are defined by the same instruction.
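
To illustrate the difference between the two comparisons, here is a minimal sketch in self-contained C++. It uses a toy instruction list, not LLVM's `MachineInstr` or `MachineRegisterInfo` classes, and it is not the code in this patch. All names (`Instr`, `reachingDef`, `traceFeed`, `sameByRegister`, `sameByDefinition`, `exampleProgram`) are hypothetical, and the toy treats a call simply as a non-copy definition of `$f1`.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy instruction (hypothetical model): defines `dst`; if `copyLike` is set,
// it forwards the value of `src` unchanged (COPY, SUBREG_TO_REG, ...).
struct Instr {
    std::string dst;
    std::string src;
    bool copyLike;
};

using Program = std::vector<Instr>;

// Registers whose names start with '$' are treated as physical registers.
bool isPhysical(const std::string &reg) {
    return !reg.empty() && reg[0] == '$';
}

// Index of the last instruction before position `before` that defines `reg`,
// or -1 if there is none.
int reachingDef(const Program &p, const std::string &reg, int before) {
    for (int i = before - 1; i >= 0; --i)
        if (p[i].dst == reg)
            return i;
    return -1;
}

// Walk backwards through copy-like instructions from a use of `reg` at
// position `usePos`, stopping at a physical register or a non-copy definition.
// Returns the register reached and the index of the instruction defining it.
std::pair<std::string, int> traceFeed(const Program &p, std::string reg,
                                      int usePos) {
    int def = reachingDef(p, reg, usePos);
    while (def >= 0 && p[def].copyLike) {
        int copyPos = def;
        reg = p[copyPos].src;
        def = reachingDef(p, reg, copyPos);
        if (isPhysical(reg))
            break;
    }
    return {reg, def};
}

// Comparison by register name only: the behavior described as buggy above.
bool sameByRegister(const Program &p, const std::string &a,
                    const std::string &b, int usePos) {
    return traceFeed(p, a, usePos).first == traceFeed(p, b, usePos).first;
}

// Comparison that also requires the same defining instruction.
bool sameByDefinition(const Program &p, const std::string &a,
                      const std::string &b, int usePos) {
    auto fa = traceFeed(p, a, usePos);
    auto fb = traceFeed(p, b, usePos);
    return fa.first == fb.first && fa.second == fb.second && fa.second >= 0;
}

// Simplified version of the sequence in this summary; the splat would sit at
// position 10 and read %vec.res.first and %vec.res.second.
Program exampleProgram() {
    return {
        {"%arg", "", false},                    // 0: XVADDDP
        {"$f1", "%arg", true},                  // 1: $f1 = COPY %arg
        {"$f1", "", false},                     // 2: call rint (defines $f1)
        {"%res.first", "$f1", true},            // 3: COPY $f1
        {"%vec.res.first", "%res.first", true}, // 4: SUBREG_TO_REG
        {"%arg.swapped", "", false},            // 5: XXPERMDI
        {"$f1", "%arg.swapped", true},          // 6: $f1 = COPY %arg.swapped
        {"$f1", "", false},                     // 7: call rint (clobbers $f1)
        {"%res.second", "$f1", true},           // 8: COPY $f1
        {"%vec.res.second", "%res.second", true}, // 9: SUBREG_TO_REG
    };
}
```

In this toy model both operands trace back to `$f1`, so `sameByRegister` reports a match, while `sameByDefinition` does not because the two reads of `$f1` are fed by different calls.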

Here is an example in pseudo-MIR of what happens in the test case added in this patch:

Before PPC MI peephole optimization:

  %arg = XVADDDP %0, %1

  $f1 = COPY %arg.sub_64
  call double rint(double)
  %res.first = COPY $f1
  %vec.res.first = SUBREG_TO_REG 1, %res.first, %subreg.sub_64

  %arg.swapped = XXPERMDI %arg, %arg, 2
  $f1 = COPY %arg.swapped.sub_64
  call double rint(double)
  %res.second = COPY $f1

  %vec.res.second = SUBREG_TO_REG 1, %res.second, %subreg.sub_64
  %vec.res.splat = XXPERMDI %vec.res.first, %vec.res.second, 0
  %vec.res = XXPERMDI %vec.res.splat, %vec.res.splat, 2
  ; %vec.res == [ %vec.res.second[0], %vec.res.first[0] ]

After optimization:

  ; ...
  %vec.res.splat = XXPERMDI %vec.res.first, %vec.res.second, 0
  ; lookThruCopyLike(%vec.res.first) == lookThruCopyLike(%vec.res.second) == $f1
  ; so the pass replaces the swap with a copy:
  %vec.res = COPY %vec.res.splat
  ; %vec.res == [ %vec.res.first[0], %vec.res.second[0] ]
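
The lane contents in the two listings can be checked with a small C++ model. This is a sketch under a simplifying assumption: a vector is just a pair of doubles, and `xxpermdi` picks doubleword `(dm >> 1) & 1` from its first operand and doubleword `dm & 1` from its second. The helper names `withSwap` and `withCopy` are hypothetical.

```cpp
#include <array>
#include <cassert>

using V2 = std::array<double, 2>;

// Simplified model of XXPERMDI: the high bit of `dm` selects the doubleword
// taken from `a`, the low bit selects the doubleword taken from `b`.
V2 xxpermdi(const V2 &a, const V2 &b, unsigned dm) {
    return {a[(dm >> 1) & 1], b[dm & 1]};
}

// The sequence before the peephole: merge element 0 of each input, then swap.
V2 withSwap(const V2 &first, const V2 &second) {
    V2 splat = xxpermdi(first, second, 0);
    return xxpermdi(splat, splat, 2);
}

// The sequence after the peephole: the swap has been replaced by a copy.
V2 withCopy(const V2 &first, const V2 &second) {
    return xxpermdi(first, second, 0);
}
```

With `first = {1.0, 0.0}` and `second = {2.0, 0.0}`, `withSwap` yields `{2.0, 1.0}` and `withCopy` yields `{1.0, 2.0}`. The two only agree when element 0 of both inputs holds the same value, which is the condition the pass needs to establish before dropping the swap.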

As best I can tell, this bug has been present since r288152, which added support for lowering certain vector operations to direct moves in the form of a splat.

The patch also drops an optimization that happened to work when the result of the swap is used by a commutative operation; however, handling that case does not seem worth the added complexity.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D69497

Files:
  llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
  llvm/test/CodeGen/PowerPC/mi-peephole-splat.ll
  llvm/test/CodeGen/PowerPC/vector-constrained-fp-intrinsics.ll
