[PATCH] D94467: [PowerPC] Use rldimi/rlwimi instructions to optimize build_vector

Fri Aug 13 03:57:53 PDT 2021

qiucf added inline comments.

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:9038
+  // There're already patterns for v4i32 and v2i64 construction.
+  if (VT == MVT::v16i8 || VT == MVT::v8i16) {
+    int NumElt = VT.getVectorNumElements();
----------------
shchenz wrote:
> out of curiosity, if we already have patterns for v4i32 and v2i64, should we also handle v16i8 and v8i16 there?
It seems we can't write a nested pattern for instructions like `rldimi`, which both read and write their first operand:

```
def : Pat<(v8i16 (build_vector i16:$A, i16:$B, i16:$C, i16:$D,
                               i16:$E, i16:$F, i16:$G, i16:$H)),
          (MTVSRDD
            (RLDIMI
              (RLWIMI8 AnyExts16.C, AnyExts16.D, 16, 0, 15),
              (RLWIMI8 AnyExts16.A, AnyExts16.B, 16, 0, 15), 32, 0),
            (RLDIMI
              (RLWIMI8 AnyExts16.G, AnyExts16.H, 16, 0, 15),
              (RLWIMI8 AnyExts16.E, AnyExts16.F, 16, 0, 15), 32, 0))>;
```

Such patterns would also be complex for `v16i8`, which has sixteen elements to pack.
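
To make the intent of the pattern above easier to follow, here is a rough scalar model in plain C++ of what the nested `RLWIMI8`/`RLDIMI` expression computes for one 64-bit half. It is only a sketch: the helper names (`maskIBM64`, `rlwimi`, `rldimi`, `pack4`) are made up for illustration, the mask helper assumes `mb <= me`, and it models only the insert semantics, not the lowering code in this patch. The first argument of each helper stands for the operand that is both read and written.

```cpp
#include <cassert>
#include <cstdint>

// Mask with bits mb..me set, using IBM numbering (bit 0 is the MSB of the
// 64-bit register). Illustration only: assumes mb <= me (no wrap-around).
static uint64_t maskIBM64(unsigned mb, unsigned me) {
  uint64_t fromMb = ~0ULL >> mb;        // bits mb..63
  uint64_t upToMe = ~0ULL << (63 - me); // bits 0..me
  return fromMb & upToMe;
}

static uint64_t rotl64(uint64_t x, unsigned sh) {
  return sh ? (x << sh) | (x >> (64 - sh)) : x;
}

// Model of rlwimi on a 64-bit register: rotate the low word of rs left by sh,
// then insert it into ra under the mask mb+32..me+32. Bits of ra outside the
// mask are preserved, which is why ra is both an input and the output.
static uint64_t rlwimi(uint64_t ra, uint64_t rs, unsigned sh, unsigned mb,
                       unsigned me) {
  uint32_t lo = static_cast<uint32_t>(rs);
  uint32_t rot = sh ? (lo << sh) | (lo >> (32 - sh)) : lo;
  uint64_t r = (static_cast<uint64_t>(rot) << 32) | rot; // word replicated
  uint64_t m = maskIBM64(mb + 32, me + 32);
  return (r & m) | (ra & ~m);
}

// Model of rldimi: rotate rs left by sh, insert under the mask mb..63-sh.
static uint64_t rldimi(uint64_t ra, uint64_t rs, unsigned sh, unsigned mb) {
  uint64_t r = rotl64(rs, sh);
  uint64_t m = maskIBM64(mb, 63 - sh);
  return (r & m) | (ra & ~m);
}

// One 64-bit half of the v8i16 pattern:
//   (RLDIMI (RLWIMI8 C, D, 16, 0, 15), (RLWIMI8 A, B, 16, 0, 15), 32, 0)
// Inputs are any-extended i16 values, so bits above bit 15 may be garbage.
static uint64_t pack4(uint64_t a, uint64_t b, uint64_t c, uint64_t d) {
  uint64_t cd = rlwimi(c, d, 16, 0, 15); // low word: d in high half, c in low
  uint64_t ab = rlwimi(a, b, 16, 0, 15); // low word: b in high half, a in low
  return rldimi(cd, ab, 32, 0);          // ab's low word becomes the high word
}
```

With this model, `pack4(0x1111, 0x2222, 0x3333, 0x4444)` yields `0x2222111144443333`, and garbage in the upper bits of the any-extended inputs does not leak into the result.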

================
Comment at: llvm/test/CodeGen/PowerPC/pre-inc-disable.ll:343
 ; CHECK-LABEL: test16:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    cmpw r3, r5
----------------
shchenz wrote:
> Why do we eliminate so many instructions in the entry block? Are they moved to the `for.body` block? 
> If so, if `for.body` is a real loop body(for now it is not, maybe we can change the IR to make the `for.body` be a loop body), will this increase the loop size?
Some of this code should be dead. I ran `opt` on the test and the loop is gone; running the current `llc` on that output then removes these instructions as well.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94467/new/

https://reviews.llvm.org/D94467