[all-commits] [llvm/llvm-project] b82be5: [AArch64][SVE] Implement structured load intrinsics

Tue Jun 9 01:52:35 PDT 2020

  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: b82be5db71fbe74f2407c7e38fb5e18fecaf08e4
      https://github.com/llvm/llvm-project/commit/b82be5db71fbe74f2407c7e38fb5e18fecaf08e4
  Author: Cullen Rhodes <cullen.rhodes at arm.com>
  Date:   2020-06-09 (Tue, 09 Jun 2020)

  Changed paths:
    M llvm/include/llvm/IR/IntrinsicsAArch64.td
    M llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
    M llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
    M llvm/lib/Target/AArch64/AArch64ISelLowering.h
    M llvm/test/CodeGen/AArch64/sve-intrinsics-loads.ll

  Log Message:
  -----------
  [AArch64][SVE] Implement structured load intrinsics

Summary:
This patch adds initial support for the following instrinsics:

    * llvm.aarch64.sve.ld2
    * llvm.aarch64.sve.ld3
    * llvm.aarch64.sve.ld4

For loading two, three and four vectors worth of data. Basic codegen is
implemented with reg+reg and reg+imm addressing modes being addressed
in a later patch.

The types returned by these intrinsics have a number of elements that is a
multiple of the elements in a 128-bit vector for a given type and N, where N is
the number of vectors being loaded, i.e. 2, 3 or 4. Thus, for 32-bit elements
the types are:

    LD2 : <vscale x 8 x i32>
    LD3 : <vscale x 12 x i32>
    LD4 : <vscale x 16 x i32>

This is implemented with target-specific intrinsics for each variant that take
the same operands as the IR intrinsic but return N values, where the type of
each value is a full vector, i.e. <vscale x 4 x i32> in the above example.
These values are then concatenated using the standard concat_vector intrinsic
to maintain type legality with the IR.

These intrinsics are intended for use in the Arm C Language
Extension (ACLE).

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D75751