[Mlir-commits] [mlir] [mlir][vector] Update docs for scalable vectors (PR #101842)

Tue Aug 6 02:48:23 PDT 2024

https://github.com/banach-space updated https://github.com/llvm/llvm-project/pull/101842

>From eca6169ae5fd47a841fab38b6775e56d88889e3d Mon Sep 17 00:00:00 2001
From: Andrzej Warzynski <andrzej.warzynski at arm.com>
Date: Sat, 3 Aug 2024 19:47:12 +0100
Subject: [PATCH 1/4] [mlir][vector] Update docs for scalable vectors

Adds a few notes on scalable vectors in the docs for the Vector dialect
and in a few other places. This is mostly "repeating" things from LLVM's
LangRef. Additionally:

* adds a few basic tests with scalable vectors (those should've been
  added a long time ago),
* includes small formatting edits in Vector.md.
---
 mlir/docs/Dialects/Vector.md                  | 62 ++++++++++++-------
 .../Conversion/LLVMCommon/TypeConverter.cpp   |  8 ++-
 mlir/test/Dialect/LLVMIR/types.mlir           |  4 ++
 mlir/test/Target/LLVMIR/llvmir-types.mlir     |  4 ++
 4 files changed, 53 insertions(+), 25 deletions(-)
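
The conversion rule this patch documents (fixed-width arrays wrapping a single 1-D vector, with only the trailing dimension allowed to be scalable) can be sketched as a stand-alone C++ model. This is a minimal illustration under stated assumptions, not the code in `LLVMTypeConverter::convertVectorType`: the helper name `lowerVectorType`, its string-based output, and the LLVM IR spelling of the result are all hypothetical choices made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical model of the lowering rule; it does not use any MLIR API.
// Returns the LLVM IR spelling of the lowered type, or std::nullopt when
// the vector type is not convertible.
std::optional<std::string>
lowerVectorType(const std::vector<int64_t> &shape,
                const std::vector<bool> &scalableDims,
                const std::string &elemType) {
  if (shape.empty() || shape.size() != scalableDims.size())
    return std::nullopt;

  // A scalable non-trailing dim would need "scalable arrays" of vectors,
  // which LLVM does not have, so report failure.
  for (std::size_t i = 0; i + 1 < scalableDims.size(); ++i)
    if (scalableDims[i])
      return std::nullopt;

  // The trailing dim becomes the (possibly scalable) 1-D LLVM vector.
  std::string result = "<";
  if (scalableDims.back())
    result += "vscale x ";
  result += std::to_string(shape.back()) + " x " + elemType + ">";

  // Every leading dim becomes a fixed-width array, innermost first.
  for (std::size_t i = shape.size() - 1; i-- > 0;)
    result = "[" + std::to_string(shape[i]) + " x " + result + "]";
  return result;
}
```

Under this model, `lowerVectorType({4, 8, 128}, {false, false, true}, "float")` yields `[4 x [8 x <vscale x 128 x float>]]`, while a shape with a scalable middle dimension (`{false, true, false}`) yields no value.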

diff --git a/mlir/docs/Dialects/Vector.md b/mlir/docs/Dialects/Vector.md
index 399571b8b68a1..010fe9f942c4a 100644
--- a/mlir/docs/Dialects/Vector.md
+++ b/mlir/docs/Dialects/Vector.md
@@ -74,12 +74,29 @@ following top-down rewrites and conversions:
 ### LLVM level
 
 On CPU, the `n-D` `vector` type currently lowers to `!llvm<array<vector>>`.
-More concretely, `vector<4x8x128xf32>` lowers to `!llvm<[4 x [ 8 x [ 128 x
-float ]]]>`. There are tradeoffs involved related to how one can access
-subvectors and how one uses `llvm.extractelement`, `llvm.insertelement` and
-`llvm.shufflevector`. The section on [LLVM Lowering
-Tradeoffs](#llvm-lowering-tradeoffs) offers a deeper dive into the current
-design choices and tradeoffs.
+More concretely,
+* `vector<4x8x128xf32>` lowers to `!llvm<[4 x [ 8 x < 128
+x float >]]>` (fixed-width vector), and
+* `vector<4x8x[128]xf32>` lowers to `!llvm<[4 x [ 8 x < vscale x 128
+x float >]]>` (scalable vector).
+
+There are tradeoffs involved related to how one can access subvectors and how
+one uses `llvm.extractelement`, `llvm.insertelement` and `llvm.shufflevector`.
+The section on [LLVM Lowering Tradeoffs](#llvm-lowering-tradeoffs) offers a
+deeper dive into the current design choices and tradeoffs.
+
+Note, while LLVM supports vectors of scalable vectors, these are required to be
+fixed-width arrays of 1-D scalable vectors. This means, effectively, that
+scalable vectors with a non-trailing scalable dimension (e.g.
+`vector<4x[8]x128xf32`) are not convertible to LLVM.
+
+Finally, as a brief reminder, MLIR takes similiar view on scalable Vectors as
+LLVM does (c.f. (Vector Type)[https://llvm.org/docs/LangRef.html#vector-type]):
+> The size of a specific scalable vector type is thus constant within IR, even
+> if the exact size in bytes cannot be determined until run time.
+
+Specifically, the size of a scalable vector is not known at compile time, but
+known and fixed at run time
 
 ### Hardware Vector Ops
 
@@ -269,11 +286,6 @@ proposal for now, this assumes LLVM only has built-in support for 1-D vector.
 The relationship with the LLVM Matrix proposal is discussed at the end of this
 document.
 
-MLIR does not currently support dynamic vector sizes (i.e. SVE style) so the
-discussion is limited to static rank and static vector sizes (e.g.
-`vector<4x8x16x32xf32>`). This section discusses operations on vectors in LLVM
-and MLIR.
-
 LLVM instructions are prefixed by the `llvm.` dialect prefix (e.g.
 `llvm.insertvalue`). Such ops operate exclusively on 1-D vectors and aggregates
 following the [LLVM LangRef](https://llvm.org/docs/LangRef.html). MLIR
@@ -287,10 +299,11 @@ Consider a vector of rank n with static sizes `{s_0, ... s_{n-1}}` (i.e. an MLIR
 `vector<s_0x...s_{n-1}xf32>`). Lowering such an `n-D` MLIR vector type to an
 LLVM descriptor can be done by either:
 
-1.  Flattening to a `1-D` vector: `!llvm<"(s_0*...*s_{n-1})xfloat">` in the MLIR
+1.  Nested aggregate type of `1-D` vector:
+    `!llvm."[s_0x[s_1x[...<s_{n-1}xf32>]]]">` in the MLIR LLVM dialect (current
+    lowering in MLIR).
+2.  Flattening to a `1-D` vector: `!llvm<"(s_0*...*s_{n-1})xfloat">` in the MLIR
     LLVM dialect.
-2.  Nested aggregate type of `1-D` vector:
-    `!llvm."[s_0x[s_1x[...<s_{n-1}xf32>]]]">` in the MLIR LLVM dialect.
 3.  A mix of both.
 
 There are multiple tradeoffs involved in choosing one or the other that we
@@ -303,9 +316,11 @@ vector<4x8x16x32xf32> to vector<4x4096xf32>` operation, that flattens the most
 
 The first constraint was already mentioned: LLVM only supports `1-D` `vector`
 types natively. Additional constraints are related to the difference in LLVM
-between vector and aggregate types: `“Aggregate Types are a subset of derived
-types that can contain multiple member types. Arrays and structs are aggregate
-types. Vectors are not considered to be aggregate types.”.`
+between vector and
+[aggregate types](https://llvm.org/docs/LangRef.html#aggregate-types):
+> Aggregate Types are a subset of derived types that can contain multiple
+> member types. Arrays and structs are aggregate types. Vectors are not
+> considered to be aggregate types.
 
 This distinction is also reflected in some of the operations. For `1-D` vectors,
 the operations `llvm.extractelement`, `llvm.insertelement`, and
@@ -314,12 +329,15 @@ vectors with `n>1`, and thus aggregate types at LLVM level, the more restrictive
 operations `llvm.extractvalue` and `llvm.insertvalue` apply, which only accept
 static indices. There is no direct shuffling support for aggregate types.
 
-The next sentence illustrates a recurrent tradeoff, also found in MLIR, between
+The next sentence (cf. LangRef [structure
+type](https://llvm.org/docs/LangRef.html#structure-type)) illustrates a
+recurrent tradeoff, also found in MLIR, between
 “value types” (subject to SSA use-def chains) and “memory types” (subject to
-aliasing and side-effects): `“Structures in memory are accessed using ‘load’ and
-‘store’ by getting a pointer to a field with the llvm.getelementptr instruction.
-Structures in registers are accessed using the llvm.extractvalue and
-llvm.insertvalue instructions.”`
+aliasing and side-effects):
+> Structures in memory are accessed using ‘load’ and ‘store’ by getting a
+> pointer to a field with the llvm.getelementptr instruction. Structures in
+> registers are accessed using the llvm.extractvalue and llvm.insertvalue
+> instructions.
 
 When transposing this to MLIR, `llvm.getelementptr` works on pointers to `n-D`
 vectors in memory. For `n-D`, vectors values that live in registers we can use
diff --git a/mlir/lib/Conversion/LLVMCommon/TypeConverter.cpp b/mlir/lib/Conversion/LLVMCommon/TypeConverter.cpp
index 784deaac5ee65..17be4d91ee054 100644
--- a/mlir/lib/Conversion/LLVMCommon/TypeConverter.cpp
+++ b/mlir/lib/Conversion/LLVMCommon/TypeConverter.cpp
@@ -509,8 +509,8 @@ Type LLVMTypeConverter::convertMemRefToBarePtr(BaseMemRefType type) const {
 ///  * 1-D `vector<axT>` remains as is while,
 ///  * n>1 `vector<ax...xkxT>` convert via an (n-1)-D array type to
 ///    `!llvm.array<ax...array<jxvector<kxT>>>`.
-/// Returns failure for n-D scalable vector types as LLVM does not support
-/// arrays of scalable vectors.
+/// As LLVM supports arrays of scalable vectors, this method will also convert
+/// n-D scalable vectors provided that only the trailing dim is scalable.
 FailureOr<Type> LLVMTypeConverter::convertVectorType(VectorType type) const {
   auto elementType = convertType(type.getElementType());
   if (!elementType)
@@ -521,7 +521,9 @@ FailureOr<Type> LLVMTypeConverter::convertVectorType(VectorType type) const {
                                     type.getScalableDims().back());
   assert(LLVM::isCompatibleVectorType(vectorType) &&
          "expected vector type compatible with the LLVM dialect");
-  // Only the trailing dimension can be scalable.
+  // For n-D vector types for which a _non-trailing_ dim is scalable,
+  // return a failure. Supporting such cases would require LLVM
+  // to support something akin to "scalable arrays" of vectors.
   if (llvm::is_contained(type.getScalableDims().drop_back(), true))
     return failure();
   auto shape = type.getShape();
diff --git a/mlir/test/Dialect/LLVMIR/types.mlir b/mlir/test/Dialect/LLVMIR/types.mlir
index 42d370a5477c2..dcea51f145bcd 100644
--- a/mlir/test/Dialect/LLVMIR/types.mlir
+++ b/mlir/test/Dialect/LLVMIR/types.mlir
@@ -91,6 +91,10 @@ func.func @array() {
   "some.op"() : () -> !llvm.array<10 x ptr<4>>
   // CHECK: !llvm.array<10 x array<4 x f32>>
   "some.op"() : () -> !llvm.array<10 x array<4 x f32>>
+  // CHECK: !llvm.array<10 x array<4 x vector<8xf32>>>
+  "some.op"() : () -> !llvm.array<10 x array<4 x vector<8 x f32>>>
+  // CHECK: !llvm.array<10 x array<4 x vector<[8]xf32>>>
+  "some.op"() : () -> !llvm.array<10 x array<4 x vector<[8] x f32>>>
   return
 }
 
diff --git a/mlir/test/Target/LLVMIR/llvmir-types.mlir b/mlir/test/Target/LLVMIR/llvmir-types.mlir
index 3e533211b0d0c..6e54bb022c077 100644
--- a/mlir/test/Target/LLVMIR/llvmir-types.mlir
+++ b/mlir/test/Target/LLVMIR/llvmir-types.mlir
@@ -99,6 +99,10 @@ llvm.func @return_a8_float() -> !llvm.array<8 x f32>
 llvm.func @return_a10_p_4() -> !llvm.array<10 x ptr<4>>
 // CHECK: declare [10 x [4 x float]] @return_a10_a4_float()
 llvm.func @return_a10_a4_float() -> !llvm.array<10 x array<4 x f32>>
+// CHECK: declare [10 x [4 x <4 x float>]] @return_a10_a4_v4_float()
+llvm.func @return_a10_a4_v4_float() -> !llvm.array<10 x array<4 x vector<4xf32>>>
+// CHECK: declare [10 x [4 x <vscale x 4 x float>]] @return_a10_a4_sv4_float()
+llvm.func @return_a10_a4_sv4_float() -> !llvm.array<10 x array<4 x vector<[4]xf32>>>
 
 //
 // Literal structures.

>From aa4305bbe50628be0825cbcbcd9c7224e1ced911 Mon Sep 17 00:00:00 2001
From: Andrzej Warzynski <andrzej.warzynski at arm.com>
Date: Mon, 5 Aug 2024 10:39:25 +0100
Subject: [PATCH 2/4] fixup! [mlir][vector] Update docs for scalable vectors

Address Cullen's comments
---
 mlir/docs/Dialects/Vector.md        | 16 ++++++++++------
 mlir/test/Dialect/LLVMIR/types.mlir |  4 ++--
 2 files changed, 12 insertions(+), 8 deletions(-)
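
The LangRef wording quoted in this fixup (the total number of elements is a constant multiple, `vscale`, of the specified number of elements) can be made concrete with a small C++ sketch. The struct and function names are hypothetical, and the `vscale` values used are illustrative only; in practice `vscale` is fixed by the hardware and unknown at compile time.

```cpp
#include <cstdint>

// Hypothetical description of a type such as <vscale x 4 x float>:
// 4 is the specified (minimum) element count, 4 bytes per f32 element.
struct ScalableVectorType {
  uint64_t minNumElements;
  uint64_t elementSizeInBytes;
};

// Total element count: a constant multiple (vscale) of the specified count.
constexpr uint64_t numElements(ScalableVectorType type, uint64_t vscale) {
  return vscale * type.minNumElements;
}

// Size in bytes: determined only once the run-time vscale is known.
constexpr uint64_t sizeInBytes(ScalableVectorType type, uint64_t vscale) {
  return numElements(type, vscale) * type.elementSizeInBytes;
}
```

For `<vscale x 4 x float>`, a `vscale` of 1 gives 4 elements (16 bytes) and a `vscale` of 4 gives 16 elements (64 bytes). The type is the same in the IR in both cases.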

diff --git a/mlir/docs/Dialects/Vector.md b/mlir/docs/Dialects/Vector.md
index 010fe9f942c4a..fb25997337361 100644
--- a/mlir/docs/Dialects/Vector.md
+++ b/mlir/docs/Dialects/Vector.md
@@ -85,15 +85,19 @@ one uses `llvm.extractelement`, `llvm.insertelement` and `llvm.shufflevector`.
 The section on [LLVM Lowering Tradeoffs](#llvm-lowering-tradeoffs) offers a
 deeper dive into the current design choices and tradeoffs.
 
-Note, while LLVM supports vectors of scalable vectors, these are required to be
-fixed-width arrays of 1-D scalable vectors. This means, effectively, that
-scalable vectors with a non-trailing scalable dimension (e.g.
-`vector<4x[8]x128xf32`) are not convertible to LLVM.
+Note, while LLVM supports arrays of scalable vectors, these are required to be
+fixed-width arrays of 1-D scalable vectors. This means scalable vectors with a
+non-trailing scalable dimension (e.g. `vector<4x[8]x128xf32`) are not
+convertible to LLVM.
 
 Finally, as a brief reminder, MLIR takes similiar view on scalable Vectors as
 LLVM does (c.f. (Vector Type)[https://llvm.org/docs/LangRef.html#vector-type]):
-> The size of a specific scalable vector type is thus constant within IR, even
-> if the exact size in bytes cannot be determined until run time.
+> For scalable vectors, the total number of elements is a constant multiple
+> (called vscale) of the specified number of elements; vscale is a positive
+> integer that is unknown at compile time and the same hardware-dependent
+> constant for all scalable vectors at run time.  The size of a specific
+> scalable vector type is thus constant within IR, even if the exact size in
+> bytes cannot be determined until run time.
 
 Specifically, the size of a scalable vector is not known at compile time, but
 known and fixed at run time
diff --git a/mlir/test/Dialect/LLVMIR/types.mlir b/mlir/test/Dialect/LLVMIR/types.mlir
index dcea51f145bcd..fd771b6152557 100644
--- a/mlir/test/Dialect/LLVMIR/types.mlir
+++ b/mlir/test/Dialect/LLVMIR/types.mlir
@@ -92,9 +92,9 @@ func.func @array() {
   // CHECK: !llvm.array<10 x array<4 x f32>>
   "some.op"() : () -> !llvm.array<10 x array<4 x f32>>
   // CHECK: !llvm.array<10 x array<4 x vector<8xf32>>>
-  "some.op"() : () -> !llvm.array<10 x array<4 x vector<8 x f32>>>
+  "some.op"() : () -> !llvm.array<10 x array<4 x vector<8xf32>>>
   // CHECK: !llvm.array<10 x array<4 x vector<[8]xf32>>>
-  "some.op"() : () -> !llvm.array<10 x array<4 x vector<[8] x f32>>>
+  "some.op"() : () -> !llvm.array<10 x array<4 x vector<[8]xf32>>>
   return
 }
 

>From 8a880c4dec44746461c4991298838e01961a2347 Mon Sep 17 00:00:00 2001
From: Andrzej Warzynski <andrzej.warzynski at arm.com>
Date: Mon, 5 Aug 2024 11:31:00 +0100
Subject: [PATCH 3/4] fixup! fixup! [mlir][vector] Update docs for scalable
 vectors

Remove a note
---
 mlir/docs/Dialects/Vector.md | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/mlir/docs/Dialects/Vector.md b/mlir/docs/Dialects/Vector.md
index fb25997337361..882e4808d650e 100644
--- a/mlir/docs/Dialects/Vector.md
+++ b/mlir/docs/Dialects/Vector.md
@@ -99,9 +99,6 @@ LLVM does (c.f. (Vector Type)[https://llvm.org/docs/LangRef.html#vector-type]):
 > scalable vector type is thus constant within IR, even if the exact size in
 > bytes cannot be determined until run time.
 
-Specifically, the size of a scalable vector is not known at compile time, but
-known and fixed at run time
-
 ### Hardware Vector Ops
 
 Hardware Vector Ops are implemented as one dialect per target. For internal

>From a280670ee5cd3ab47d253d986c31581dad34f3c7 Mon Sep 17 00:00:00 2001
From: Andrzej Warzynski <andrzej.warzynski at arm.com>
Date: Tue, 6 Aug 2024 10:47:18 +0100
Subject: [PATCH 4/4] fixup! fixup! fixup! [mlir][vector] Update docs for
 scalable vectors

Address Cullen's final nit
---
 mlir/docs/Dialects/Vector.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mlir/docs/Dialects/Vector.md b/mlir/docs/Dialects/Vector.md
index 882e4808d650e..fc5cea331c3c8 100644
--- a/mlir/docs/Dialects/Vector.md
+++ b/mlir/docs/Dialects/Vector.md
@@ -90,8 +90,8 @@ fixed-width arrays of 1-D scalable vectors. This means scalable vectors with a
 non-trailing scalable dimension (e.g. `vector<4x[8]x128xf32`) are not
 convertible to LLVM.
 
-Finally, as a brief reminder, MLIR takes similiar view on scalable Vectors as
-LLVM does (c.f. (Vector Type)[https://llvm.org/docs/LangRef.html#vector-type]):
+Finally, MLIR takes the same view on scalable vectors as LLVM (cf. [Vector
+Type](https://llvm.org/docs/LangRef.html#vector-type)):
 > For scalable vectors, the total number of elements is a constant multiple
 > (called vscale) of the specified number of elements; vscale is a positive
 > integer that is unknown at compile time and the same hardware-dependent