[all-commits] [llvm/llvm-project] 4a3e46: [RISCV] Extend demanded field transform in InsertV...

Thu Jun 16 08:01:53 PDT 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 4a3e46115a7feae955e26b537efd3de07dd88060
      https://github.com/llvm/llvm-project/commit/4a3e46115a7feae955e26b537efd3de07dd88060
  Author: Philip Reames <preames at rivosinc.com>
  Date:   2022-06-16 (Thu, 16 Jun 2022)

  Changed paths:
    M llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
    M llvm/test/CodeGen/RISCV/rvv/extload-truncstore.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-ctlz.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-extload-truncstore.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-conv.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-shuffles.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp2i.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-i2fp.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-exttrunc.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-shuffles.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-splat.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-fp.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-int.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vfwadd.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vfwsub.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vwadd.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vwaddu.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vwmul.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vwmulsu.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vwmulu.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vwsub.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vwsubu.ll
    M llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.mir

  Log Message:
  -----------
  [RISCV] Extend demanded field transform in InsertVSETVLI to VTYPE subfeilds

The motivating case, and the only one actually enabled by this patch, is a load or store followed by another op with the same SEW/LMUL ratio.

As an example, consider:

define void @test1(ptr %in, ptr %out) {
entry:
  %0 = load <8 x i16>, ptr %in, align 2
  %1 = sext <8 x i16> %0 to <8 x i32>
  store <8 x i32> %1, ptr %out, align 4
  ret void
}

Without this patch, we get:

	vsetivli	zero, 8, e16, mf4, ta, mu
	vle16.v	v8, (a0)
	vsetvli	zero, zero, e32, mf2, ta, mu
	vsext.vf2	v9, v8
	vse32.v	v9, (a1)
	ret

Whereas with the patch we get:

	vsetivli	zero, 8, e32, mf2, ta, mu
	vle16.v	v8, (a0)
	vsext.vf2	v9, v8
	vse32.v	v9, (a1)
	ret

We have rewritten the first vsetvli and thus removed the second one.

As is strongly hinted by the code structure and todos, I am planning on communing this with all (or most all?) of the cases from isCompatible used in the forward data flow. This will be done in a series of following changes - some NFC reworks, and some reviewed optimization extensions.

Differential Revision: https://reviews.llvm.org/D127780