[PATCH] D33690: [PowerPC] Match vec_revb builtins to P9 instructions.

Nemanja Ivanovic via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 30 21:49:41 PDT 2017


nemanjai added a comment.

I am debating internally about suggesting a potential solution for this, but this implementation essentially misses an entire set of complementary shuffle masks. We have a number of instructions that do element-wise reordering and we now have these that do per-element byte reversal. Combining the capabilities covers a lot more shuffles - I am just not positive that these occur enough to warrant the effort.

Here's what I mean:

- We have an instruction that will do a "rotate-left-by-word" operation on a vector (and a way to emit that instructions)
- We now have a "reverse-bytes-within-word-elements" operation
- We don't have a "reverse-bytes-within-each-word-and-rotate-left-by-word", which we can simply do with a 2 instruction sequence now

And of course, the same goes for all other masks we lower to a single instruction. It might be useful for each of them to detect byte-reversal as well. It is non-trivial work, but doesn't sound fundamentally all that hard. Perhaps we should re-design our handling of shuffles at some point and have a robust way to determine what we can lower to a one or two instruction sequence on any Subtarget.



================
Comment at: lib/Target/PowerPC/PPCISelLowering.cpp:1598
 
 // Check that the mask is shuffling N byte elements.
+static bool isNByteElemShuffleMask(ShuffleVectorSDNode *N, unsigned Width,
----------------
This comment is no longer adequate. There is now another parameter which appears to be the "direction and stride". Please elaborate in a comment what this function does and how to use it.
Also, if values other than 1/-1 don't make sense for the new parameter, you can add an assert.


================
Comment at: lib/Target/PowerPC/PPCISelLowering.cpp:1768
+
+  for (int i = 0; i < 16; i += Width) {
+    if (N->getMaskElt(i) != i + Width - 1)
----------------
Doesn't it just suffice at this point to ensure that each element of the shuffle mask has a value less than 16?


================
Comment at: test/CodeGen/PowerPC/vec_revb.ll:1
+; RUN: llc -verify-machineinstrs -mtriple=powerpc64-unknown-linux-gnu -mcpu=pwr9 < %s | \
+; RUN:   FileCheck %s -check-prefix=CHECK-BE
----------------
If you're breaking up the RUN lines, keep them within 80 columns.


https://reviews.llvm.org/D33690





More information about the llvm-commits mailing list