[all-commits] [llvm/llvm-project] e93821: [RISCV] Implement getOptimalMemOpType for memcpy/m...

Tue Aug 1 12:15:08 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: e938217f8109e1e1b43978606d459b8e7aeab19e
      https://github.com/llvm/llvm-project/commit/e938217f8109e1e1b43978606d459b8e7aeab19e
  Author: Philip Reames <preames at rivosinc.com>
  Date:   2023-08-01 (Tue, 01 Aug 2023)

  Changed paths:
    M llvm/lib/Target/RISCV/RISCVISelLowering.cpp
    M llvm/lib/Target/RISCV/RISCVISelLowering.h
    M llvm/test/CodeGen/RISCV/rvv/memcpy-inline.ll
    M llvm/test/CodeGen/RISCV/rvv/memset-inline.ll
    M llvm/test/CodeGen/RISCV/rvv/rvv-out-arguments.ll
    M llvm/test/CodeGen/RISCV/rvv/wrong-chain-fixed-load.ll

  Log Message:
  -----------
  [RISCV] Implement getOptimalMemOpType for memcpy/memset lowering

This patch implements the getOptimalMemOpType callback which is used by the generic mem* lowering in SelectionDAG to pick the widest type used. This patch only changes the behavior when vector instructions are available, as the default is reasonable for scalar.

Without this change, we were emitting either XLEN sized stores (for aligned operations) or byte sized stores (for unaligned operations.) Interestingly, the final codegen was nowhere near as bad as that would seem to imply. Generic load combining and store merging kicked in, and frequently (but not always) produced pretty reasonable vector code.

The primary effects of this change are:
* Enable the use of vector operations for memset of non-constant. Our generic store merging logic doesn't know how to merge a broadcast store, and thus we were seeing the generic (and awful) byte expansion lowering for unaligned memset.
* Enable the generic misaligned overlap trick where we write to some of the same bytes twice. The alternative is to either a) use an increasing small sequence of stores for the tail or b) use VL to restrict the vector store. The later is not implemented at this time, so the former is what previously happened. Interestingly, I'm not sure that changing VL (as opposed to the overlap trick) is even obviously profitable here.

Differential Revision: https://reviews.llvm.org/D156249