[all-commits] [llvm/llvm-project] fa8bb2: [RISCV] Optimize vector_shuffles that are interlea...

Thu Jan 20 14:47:16 PST 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: fa8bb224661dfb38cb2a246f7d98dc61fd45602e
      https://github.com/llvm/llvm-project/commit/fa8bb224661dfb38cb2a246f7d98dc61fd45602e
  Author: Craig Topper <craig.topper at sifive.com>
  Date:   2022-01-20 (Thu, 20 Jan 2022)

  Changed paths:
    M llvm/lib/Target/RISCV/RISCVISelLowering.cpp
    M llvm/lib/Target/RISCV/RISCVISelLowering.h
    M llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td
    A llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-interleave.ll
    A llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-interleave.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-shuffles.ll

  Log Message:
  -----------
  [RISCV] Optimize vector_shuffles that are interleaving the lowest elements of two vectors.

RISCV only has a unary shuffle that requires places indices in a
register. For interleaving two vectors this means we need at least
two vrgathers and a vmerge to do a shuffle of two vectors.

This patch teaches shuffle lowering to use a widening addu followed
by a widening vmaccu to implement the interleave. First we extract
the low half of both V1 and V2. Then we implement
(zext(V1) + zext(V2)) + (zext(V2) * zext(2^eltbits - 1)) which
simplifies to (zext(V1) + zext(V2) * 2^eltbits). This further
simplifies to (zext(V1) + zext(V2) << eltbits). Then we bitcast the
result back to the original type splitting the wide elements in half.

We can only do this if we have a type with wider elements available.
Because we're using extends we also have to be careful with fractional
lmuls. Floating point types are supported by bitcasting to/from integer.

The tests test a varied combination of LMULs split across VLEN>=128 and
VLEN>=512 tests. There a few tests with shuffle indices commuted as well
as tests for undef indices. There's one test for a vXi64/vXf64 vector which
we can't optimize, but verifies we don't crash.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D117743