[Mlir-commits] [mlir] [mlir][vector][docs] Document indexed vs. non-indexed arguments (PR #130141)

Mon Mar 17 07:07:28 PDT 2025

https://github.com/banach-space updated https://github.com/llvm/llvm-project/pull/130141

From 7b273b030ed1fca0fea82a19c642b1e43d5716cf Mon Sep 17 00:00:00 2001
From: Andrzej Warzynski <andrzej.warzynski at arm.com>
Date: Thu, 6 Mar 2025 16:43:32 +0000
Subject: [PATCH 1/3] [mlir][vector][docs] Document indexed vs. non-indexed
 arguments

Adds a section to the Vector dialect documentation introducing the
distinction between **indexed** and **non-indexed** arguments for
"read"/"write"-like operations.

The goal is to provide guidance on improving terminology consistency
within the Vector dialect and to establish a point of reference for
future discussions.

Credits to Diego Caballero <dieg0ca6aller0 at gmail.com> for proposing this
terminology in one of the PRs.
---
 mlir/docs/Dialects/Vector.md | 90 ++++++++++++++++++++++++++++++++++++
 1 file changed, 90 insertions(+)

diff --git a/mlir/docs/Dialects/Vector.md b/mlir/docs/Dialects/Vector.md
index ade0068c56fb6..c3abaff4b90f6 100644
--- a/mlir/docs/Dialects/Vector.md
+++ b/mlir/docs/Dialects/Vector.md
@@ -257,6 +257,96 @@ expressing `vector`s in the IR directly and simple pattern-rewrites.
 [EDSC](https://github.com/llvm/llvm-project/blob/main/mlir/docs/EDSC.md)s
 provide a simple way of driving such a notional language directly in C++.
 
+### Taxonomy for "Read"/"Write" Operations
+
+Below is a list of vector dialect operations that move values from an abstract
+**source** to an abstract **destination**, i.e. "read"/"write" operations:
+
+* `vector.load`, `vector.store`, `vector.transfer_read`,
+  `vector.transfer_write`, `vector.gather`, `vector.scatter`,
+  `vector.compressstore`, `vector.expandload`, `vector.maskedload`,
+  `vector.maskedstore`, `vector.extract`, `vector.insert`,
+  `vector.scalable_extract`, `vector.scalable_insert`,
+  `vector.extract_strided_slice`, `vector.insert_strided_slice`.
+
+#### Current Naming in Vector Dialect
+| **Vector Dialect Op**          | **Operand Names**        | **Operand Types**             | **Result Name**  | **Result Type**      |
+|--------------------------------|--------------------------|-------------------------------|------------------|----------------------|
+| `vector.load`                  | `base`                   | `memref`                      | `result`         | `vector`             |
+| `vector.store`                 | `valueToStore`, `base`   | `vector`, `memref`            | -                | -                    |
+| `vector.transfer_read`         | `source`                 | `memref` / `tensor`           | `vector`         | `vector`             |
+| `vector.transfer_write`        | `vector`, `source`       | `vector`, `memref`/ `tensor`  | `result`         | `vector`             |
+| `vector.gather`                | `base`                   | `memref`                      | `result`         | `vector`             |
+| `vector.scatter`               | `valueToStore`, `base`   | `vector`, `memref`            | -                | -                    |
+| `vector.expandload`            | `base`                   | `memref`                      | `result`         | `vector`             |
+| `vector.compressstore`         | `valueToStore`,`base`    | `vector`, `memref`            | -                | -                    |
+| `vector.maskedload`            | `base`                   | `memref`                      | `result`         | `vector`             |
+| `vector.maskedstore`           | `valueToStore`, `base`   | `vector`, `memref`            | -                | -                    |
+| `vector.extract`               | `vector`                 | `vector`                      | `result`         | `scalar` / `vector`  |
+| `vector.insert`                | `source`, `dest`         | `scalar` / `vector`, `vector` | `result`         | `vector`             |
+| `vector.scalable_extract`      | `source`                 | `vector`                      | `res`            | `scalar` / `vector`  |
+| `vector.scalable_insert`       | `source`, `dest`         | `scalar` / `vector`, `vector` | `res`            | `vector`             |
+| `vector.extract_strided_slice` | `vector`                 | `vector`                      | (missing name)   | `vector`             |
+| `vector.insert_strided_slice`  | `source`, `dest`         | `vector`                      | `res`            | `vector`             |
+
+Note that "read" operations take one operand ("from"), whereas "write"
+operations require two ("value-to-store" and "to").
+
+### Observations
+Each "read" operation has a "from" argument, while each "write" operation has a
+"to" and a "value-to-store" operand. However, the naming conventions are
+**inconsistent**, making it difficult to extract common patterns or determine
+operand roles. Here are some inconsistencies:
+
+- `getBase()` in `vector.load` refers to the **"from"** operand (source).
+- `getBase()` in `vector.store` refers to the **"to"** operand (destination).
+- `vector.transfer_read` and `vector.transfer_write` use `getSource()`, which:
+  - **Conflicts** with the `vector.load` / `vector.store` naming pattern.
+  - **Does not clearly indicate** whether the operand represents a **source**
+    or **destination**.
+- `vector.insert` defines `getSource()` and `getDest()`, making the distinction
+  between "to" and "from" operands **clear**. However, its sibling operation,
+  `vector.extract`, only defines `getVector()`, making it unclear whether it
+  represents a **source** or **destination**.
+- `vector.store` uses `getValueToStore()`, whereas
+  `vector.insert_strided_slice` does not.
+
+There is **no consistent way** to identify:
+- `"from"` (read operand)
+- `"to"` (write operand)
+- `"value-to-store"` (written value)
+
+### Indexed vs. Non-Indexed Taxonomy
+A more consistent way to classify "to", "from", and "value-to-store" arguments
+is by determining whether an operand is _indexed_ or _non-indexed_.
+
+#### **Example: `vector.transfer_read` and `vector.transfer_write`**
+```mlir
+                      Indexed Operand
+                             |
+%res = vector.transfer_read %A[%expr1, %expr2, %expr3, %expr4]
+  { permutation_map : (d0,d1,d2,d3) -> (d2,0,d0) } :
+  memref<?x?x?x?xf32>, vector<3x4x5xf32>
+                         \
+                 Non-Indexed Result
+
+      Non-Indexed Operand      Indexed Operand
+                      \       /
+vector.transfer_write %4, %arg1[%c3, %c3]
+  {permutation_map = (d0, d1)->(d0, d1)}
+    : vector<1x1x4x3xf32>, memref<?x?xvector<4x3xf32>>
+```
+
+Using the "indexed" vs. "non-indexed" classification, we can systematically
+differentiate between "to", "from" and "value-to-store" arguments across
+operations:
+* "to" is always _indexed_ for "write" operations, "value-to-store" is
+  _non-indexed_,
+* "from" is always _indexed_ for "read" operations.
+
+In addition, for "read" operations, we can view the "result" as a non-indexed
+argument.
+
 ## Bikeshed Naming Discussion
 
 There are arguments against naming an n-D level of abstraction `vector` because

From 747cb61620d61e642ac703edd2ce4f0eb78d73ef Mon Sep 17 00:00:00 2001
From: Andrzej Warzynski <andrzej.warzynski at arm.com>
Date: Thu, 13 Mar 2025 16:52:47 +0000
Subject: [PATCH 2/3] fixup! [mlir][vector][docs] Document indexed vs.
 non-indexed arguments

Define _indexed_
---
 mlir/docs/Dialects/Vector.md | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mlir/docs/Dialects/Vector.md b/mlir/docs/Dialects/Vector.md
index c3abaff4b90f6..5f34319ea9312 100644
--- a/mlir/docs/Dialects/Vector.md
+++ b/mlir/docs/Dialects/Vector.md
@@ -320,7 +320,12 @@ There is **no consistent way** to identify:
 A more consistent way to classify "to", "from", and "value-to-store" arguments
 is by determining whether an operand is _indexed_ or _non-indexed_.
 
-#### **Example: `vector.transfer_read` and `vector.transfer_write`**
+The distinction is somewhat intuitive, but for clarity:
+- Indexed operands require indices to specify an element/slice (e.g., `%A[0]`).
+- Non-indexed operands do not require indices as they are accessed in their
+  entirety (e.g., `%B`).
+
+Below is an example using: `vector.transfer_read` and `vector.transfer_write`
 ```mlir
                       Indexed Operand
                              |

From e2e97c8ced1d34ba0adcf2293b7896db67e25e63 Mon Sep 17 00:00:00 2001
From: Andrzej Warzynski <andrzej.warzynski at arm.com>
Date: Mon, 17 Mar 2025 14:00:41 +0000
Subject: [PATCH 3/3] fixup! fixup! [mlir][vector][docs] Document indexed vs.
 non-indexed arguments

Move renaming to #131602
---
 mlir/docs/Dialects/Vector.md | 78 +++++++++---------------------------
 1 file changed, 19 insertions(+), 59 deletions(-)

diff --git a/mlir/docs/Dialects/Vector.md b/mlir/docs/Dialects/Vector.md
index 5f34319ea9312..a01c4266eb9c7 100644
--- a/mlir/docs/Dialects/Vector.md
+++ b/mlir/docs/Dialects/Vector.md
@@ -269,63 +269,23 @@ Below is a list of vector dialect operations that move values from an abstract
   `vector.scalable_extract`, `vector.scalable_insert`,
   `vector.extract_strided_slice`, `vector.insert_strided_slice`.
 
-#### Current Naming in Vector Dialect
-| **Vector Dialect Op**          | **Operand Names**        | **Operand Types**             | **Result Name**  | **Result Type**      |
-|--------------------------------|--------------------------|-------------------------------|------------------|----------------------|
-| `vector.load`                  | `base`                   | `memref`                      | `result`         | `vector`             |
-| `vector.store`                 | `valueToStore`, `base`   | `vector`, `memref`            | -                | -                    |
-| `vector.transfer_read`         | `source`                 | `memref` / `tensor`           | `vector`         | `vector`             |
-| `vector.transfer_write`        | `vector`, `source`       | `vector`, `memref`/ `tensor`  | `result`         | `vector`             |
-| `vector.gather`                | `base`                   | `memref`                      | `result`         | `vector`             |
-| `vector.scatter`               | `valueToStore`, `base`   | `vector`, `memref`            | -                | -                    |
-| `vector.expandload`            | `base`                   | `memref`                      | `result`         | `vector`             |
-| `vector.compressstore`         | `valueToStore`,`base`    | `vector`, `memref`            | -                | -                    |
-| `vector.maskedload`            | `base`                   | `memref`                      | `result`         | `vector`             |
-| `vector.maskedstore`           | `valueToStore`, `base`   | `vector`, `memref`            | -                | -                    |
-| `vector.extract`               | `vector`                 | `vector`                      | `result`         | `scalar` / `vector`  |
-| `vector.insert`                | `source`, `dest`         | `scalar` / `vector`, `vector` | `result`         | `vector`             |
-| `vector.scalable_extract`      | `source`                 | `vector`                      | `res`            | `scalar` / `vector`  |
-| `vector.scalable_insert`       | `source`, `dest`         | `scalar` / `vector`, `vector` | `res`            | `vector`             |
-| `vector.extract_strided_slice` | `vector`                 | `vector`                      | (missing name)   | `vector`             |
-| `vector.insert_strided_slice`  | `source`, `dest`         | `vector`                      | `res`            | `vector`             |
-
-Note that "read" operations take one operand ("from"), whereas "write"
-operations require two ("value-to-store" and "to").
-
-### Observations
-Each "read" operation has a "from" argument, while each "write" operation has a
-"to" and a "value-to-store" operand. However, the naming conventions are
-**inconsistent**, making it difficult to extract common patterns or determine
-operand roles. Here are some inconsistencies:
-
-- `getBase()` in `vector.load` refers to the **"from"** operand (source).
-- `getBase()` in `vector.store` refers to the **"to"** operand (destination).
-- `vector.transfer_read` and `vector.transfer_write` use `getSource()`, which:
-  - **Conflicts** with the `vector.load` / `vector.store` naming pattern.
-  - **Does not clearly indicate** whether the operand represents a **source**
-    or **destination**.
-- `vector.insert` defines `getSource()` and `getDest()`, making the distinction
-  between "to" and "from" operands **clear**. However, its sibling operation,
-  `vector.extract`, only defines `getVector()`, making it unclear whether it
-  represents a **source** or **destination**.
-- `vector.store` uses `getValueToStore()`, whereas
-  `vector.insert_strided_slice` does not.
-
-There is **no consistent way** to identify:
-- `"from"` (read operand)
-- `"to"` (write operand)
-- `"value-to-store"` (written value)
+These operations share the following properties:
+- One of their arguments must be a `vector`.
+- The other argument (or result) can be either a `vector` or a scalar.
+
+Our existing taxonomies make it difficult to differentiate between these two arguments.
 
 ### Indexed vs. Non-Indexed Taxonomy
-A more consistent way to classify "to", "from", and "value-to-store" arguments
-is by determining whether an operand is _indexed_ or _non-indexed_.
+A useful way to classify arguments in "read"/"write" operations is to determine
+whether an operand is **indexed** or **non-indexed**.
 
 The distinction is somewhat intuitive, but for clarity:
-- Indexed operands require indices to specify an element/slice (e.g., `%A[0]`).
-- Non-indexed operands do not require indices as they are accessed in their
-  entirety (e.g., `%B`).
+- **Indexed operands** require indices to specify an element or slice
+  (e.g., `%A[0]`).
+- **Non-indexed operands** do not require indices, as they are accessed in
+  their entirety (e.g., `%B`).
 
-Below is an example using: `vector.transfer_read` and `vector.transfer_write`
+Below is an example using `vector.transfer_read` and `vector.transfer_write`:
 ```mlir
                       Indexed Operand
                              |
@@ -342,15 +302,15 @@ vector.transfer_write %4, %arg1[%c3, %c3]
     : vector<1x1x4x3xf32>, memref<?x?xvector<4x3xf32>>
 ```
 
-Using the "indexed" vs. "non-indexed" classification, we can systematically
-differentiate between "to", "from" and "value-to-store" arguments across
+Using the indexed vs. non-indexed classification, we can also systematically
+differentiate between "to", "from", and "value-to-store" arguments across
 operations:
-* "to" is always _indexed_ for "write" operations, "value-to-store" is
-  _non-indexed_,
-* "from" is always _indexed_ for "read" operations.
 
-In addition, for "read" operations, we can view the "result" as a non-indexed
-argument.
+* "to" is always indexed for "write" operations (i.e., always a `vector`), while
+  "value-to-store" is non-indexed (i.e., either a `vector` or a scalar).
+* "from" is always indexed for "read" operations (i.e., a `vector`).
+
+Finally, for "read" operations, the "result" can be viewed as a non-indexed argument.
 
 ## Bikeshed Naming Discussion