[llvm] [Docs][RISCV] Document RISC-V vector codegen (PR #96740)

Wed Jun 26 09:50:47 PDT 2024

================
@@ -0,0 +1,289 @@
+=========================
+ RISC-V Vector Extension
+=========================
+
+.. contents::
+   :local:
+
+The RISC-V target supports version 1.0 of the `RISC-V Vector Extension (RVV) <https://github.com/riscv/riscv-v-spec/blob/v1.0/v-spec.adoc>`_, but the extension's unique design requires some special handling in the compiler.
+This guide gives an overview of how RVV is modelled in LLVM IR and how the backend generates code for it.
+
+Mapping to LLVM IR types
+========================
+
+RVV adds 32 ``VLEN``-sized registers, where ``VLEN`` is a constant that is unknown to the compiler. To represent ``VLEN``-sized values, the RISC-V backend takes the same approach as AArch64's SVE and uses `scalable vector types <https://llvm.org/docs/LangRef.html#t-vector>`_.
+
+Scalable vector types are of the form ``<vscale x n x ty>``, which indicates a vector whose element count is a multiple of ``n``, with elements of type ``ty``. ``n`` and ``ty`` then end up controlling LMUL and SEW respectively.
+
+LLVM supports only ``ELEN=32`` or ``ELEN=64``, so ``vscale`` is defined as ``VLEN/64`` (see ``RISCV::RVVBitsPerBlock``).
+This makes the LLVM IR types stable between the two ``ELEN`` values considered, i.e., every LLVM IR scalable vector type has exactly one corresponding pair of element type and LMUL, and vice versa.
+
++-------------------+---------------+----------------+------------------+-------------------+-------------------+-------------------+-------------------+
+|                   | LMUL=⅛        | LMUL=¼         | LMUL=½           | LMUL=1            | LMUL=2            | LMUL=4            | LMUL=8            |
++===================+===============+================+==================+===================+===================+===================+===================+
+| i64 (ELEN=64)     | N/A           | N/A            | N/A              | <v x 1 x i64>     | <v x 2 x i64>     | <v x 4 x i64>     | <v x 8 x i64>     |
++-------------------+---------------+----------------+------------------+-------------------+-------------------+-------------------+-------------------+
+| i32               | N/A           | N/A            | <v x 1 x i32>    | <v x 2 x i32>     | <v x 4 x i32>     | <v x 8 x i32>     | <v x 16 x i32>    |
++-------------------+---------------+----------------+------------------+-------------------+-------------------+-------------------+-------------------+
+| i16               | N/A           | <v x 1 x i16>  | <v x 2 x i16>    | <v x 4 x i16>     | <v x 8 x i16>     | <v x 16 x i16>    | <v x 32 x i16>    |
++-------------------+---------------+----------------+------------------+-------------------+-------------------+-------------------+-------------------+
+| i8                | <v x 1 x i8>  | <v x 2 x i8>   | <v x 4 x i8>     | <v x 8 x i8>      | <v x 16 x i8>     | <v x 32 x i8>     | <v x 64 x i8>     |
++-------------------+---------------+----------------+------------------+-------------------+-------------------+-------------------+-------------------+
+| double (ELEN=64)  | N/A           | N/A            | N/A              | <v x 1 x double>  | <v x 2 x double>  | <v x 4 x double>  | <v x 8 x double>  |
++-------------------+---------------+----------------+------------------+-------------------+-------------------+-------------------+-------------------+
+| float             | N/A           | N/A            | <v x 1 x float>  | <v x 2 x float>   | <v x 4 x float>   | <v x 8 x float>   | <v x 16 x float>  |
++-------------------+---------------+----------------+------------------+-------------------+-------------------+-------------------+-------------------+
+| half              | N/A           | <v x 1 x half> | <v x 2 x half>   | <v x 4 x half>    | <v x 8 x half>    | <v x 16 x half>   | <v x 32 x half>   |
++-------------------+---------------+----------------+------------------+-------------------+-------------------+-------------------+-------------------+
+
+(Read ``<v x k x ty>`` as ``<vscale x k x ty>``)
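+
+As a rough sketch of how to read the table, assuming ``RISCV::RVVBitsPerBlock`` is 64, a type ``<vscale x n x ty>`` with element width SEW corresponds to ``LMUL = n * SEW / 64``. The function names below are hypothetical and serve only as illustration:
+
+.. code-block:: llvm
+
+    ; SEW=32, n=4, so LMUL = 4 * 32 / 64 = 2
+    define <vscale x 4 x i32> @add_e32m2(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
+      %c = add <vscale x 4 x i32> %a, %b
+      ret <vscale x 4 x i32> %c
+    }
+
+    ; SEW=8, n=1, so LMUL = 1 * 8 / 64 = 1/8
+    define <vscale x 1 x i8> @add_e8mf8(<vscale x 1 x i8> %a, <vscale x 1 x i8> %b) {
+      %c = add <vscale x 1 x i8> %a, %b
+      ret <vscale x 1 x i8> %c
+    }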
+
+
+Mask vector types
+-----------------
+
+Mask vectors are physically represented as densely packed bits in a vector register.
+They are mapped to the following LLVM IR types:
+
+- ``<vscale x 1 x i1>``
+- ``<vscale x 2 x i1>``
+- ``<vscale x 4 x i1>``
+- ``<vscale x 8 x i1>``
+- ``<vscale x 16 x i1>``
+- ``<vscale x 32 x i1>``
+- ``<vscale x 64 x i1>``
+
+Two types with the same SEW/LMUL ratio share the same mask type. For instance, two comparisons, one under SEW=64, LMUL=2 and the other under SEW=32, LMUL=1, will both produce a ``<vscale x 2 x i1>`` mask.
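+
+The following sketch illustrates this with two comparisons that yield the same mask type. The function names are hypothetical:
+
+.. code-block:: llvm
+
+    ; SEW=64, LMUL=2: the result is a <vscale x 2 x i1> mask
+    define <vscale x 2 x i1> @cmp_e64m2(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b) {
+      %m = icmp eq <vscale x 2 x i64> %a, %b
+      ret <vscale x 2 x i1> %m
+    }
+
+    ; SEW=32, LMUL=1: same SEW/LMUL ratio, so the same mask type
+    define <vscale x 2 x i1> @cmp_e32m1(<vscale x 2 x i32> %a, <vscale x 2 x i32> %b) {
+      %m = icmp eq <vscale x 2 x i32> %a, %b
+      ret <vscale x 2 x i1> %m
+    }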
+
+Representation in LLVM IR
+=========================
+
+Vector instructions can be represented in three main ways in LLVM IR:
+
+1. Regular instructions on both fixed and scalable vector types
+
+   .. code-block:: llvm
+
+       %c = add <vscale x 4 x i32> %a, %b
+
+2. RISC-V vector intrinsics, which mirror the `C intrinsics specification <https://github.com/riscv-non-isa/rvv-intrinsic-doc>`_
+
+   These come in unmasked variants:
+
+   .. code-block:: llvm
+
+       %c = call <vscale x 4 x i32> @llvm.riscv.vadd.nxv4i32.nxv4i32(
+              <vscale x 4 x i32> %passthru,
+              <vscale x 4 x i32> %a,
+              <vscale x 4 x i32> %b,
+              i64 %avl
+            )
+
+   As well as masked variants:
+
+   .. code-block:: llvm
+
+       %c = call <vscale x 4 x i32> @llvm.riscv.vadd.mask.nxv4i32.nxv4i32(
+              <vscale x 4 x i32> %passthru,
+              <vscale x 4 x i32> %a,
+              <vscale x 4 x i32> %b,
----------------
topperc wrote:

This is missing a mask operand
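
For illustration, a sketch of what the masked call might look like with the mask operand added. This assumes the masked intrinsics take their operands in the order passthru, sources, mask, AVL, policy. `%mask` is a placeholder name.

```llvm
%c = call <vscale x 4 x i32> @llvm.riscv.vadd.mask.nxv4i32.nxv4i32(
       <vscale x 4 x i32> %passthru,
       <vscale x 4 x i32> %a,
       <vscale x 4 x i32> %b,
       <vscale x 4 x i1> %mask,  ; assumed position of the mask operand
       i64 %avl,
       i64 0                     ; assumed trailing policy operand (immediate)
     )
```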

https://github.com/llvm/llvm-project/pull/96740