[Mlir-commits] [mlir] [mlir][vector][docs] Document indexed vs. non-indexed arguments (PR #130141)

Thu Mar 6 08:49:48 PST 2025

https://github.com/banach-space created https://github.com/llvm/llvm-project/pull/130141

Adds a section to the Vector dialect documentation introducing the
distinction between **indexed** and **non-indexed** arguments for
"read"/"write"-like operations.

The goal is to provide guidance on improving terminology consistency
within the Vector dialect and to establish a point of reference for
future discussions.

Credits to Diego Caballero <dieg0ca6aller0 at gmail.com> for proposing this
terminology in one of the PRs.
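
As a rough illustration of the proposed classification, here is a minimal
sketch using `vector.load`/`vector.store` and `vector.extract`/`vector.insert`.
The SSA names (`%src`, `%dst`, `%vec`, `%i`, `%j`) and the shapes are made up
for this example and are not taken from the patch:

```mlir
// "Read" ops: the "from" operand is indexed, the "to" result is non-indexed.
%v = vector.load %src[%i, %j] : memref<100x100xf32>, vector<8xf32>
%e = vector.extract %vec[3] : vector<8x16xf32> from vector<4x8x16xf32>

// "Write" ops: the "to" operand is indexed, the "from" operand is non-indexed.
vector.store %v, %dst[%i, %j] : memref<100x100xf32>, vector<8xf32>
%r = vector.insert %e, %vec[3] : vector<8x16xf32> into vector<4x8x16xf32>
```

In each case the operand followed by `[...]` is the indexed one, regardless of
whether the op calls it `base`, `source`, `dest` or `vector`.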


From 977611122feac1229a385ad3320b1e7202767b85 Mon Sep 17 00:00:00 2001
From: Andrzej Warzynski <andrzej.warzynski at arm.com>
Date: Thu, 6 Mar 2025 16:43:32 +0000
Subject: [PATCH] [mlir][vector][docs] Document indexed vs. non-indexed
 arguments

Adds a section to the Vector dialect documentation introducing the
distinction between **indexed** and **non-indexed** arguments for
"read"/"write"-like operations.

The goal is to provide guidance on improving terminology consistency
within the Vector dialect and to establish a point of reference for
future discussions.

Credits to Diego Caballero <dieg0ca6aller0 at gmail.com> for proposing this
terminology in one of the PRs.
---
 mlir/docs/Dialects/Vector.md | 77 ++++++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)

diff --git a/mlir/docs/Dialects/Vector.md b/mlir/docs/Dialects/Vector.md
index ade0068c56fb6..62807ee97a3cd 100644
--- a/mlir/docs/Dialects/Vector.md
+++ b/mlir/docs/Dialects/Vector.md
@@ -257,6 +257,83 @@ expressing `vector`s in the IR directly and simple pattern-rewrites.
 [EDSC](https://github.com/llvm/llvm-project/blob/main/mlir/docs/EDSC.md)s
 provide a simple way of driving such a notional language directly in C++.
 
+### Taxonomy for "To" and "From" Operands for "Read"/"Write" Ops
+
+Below is a list of vector dialect operations that move values from an abstract
+**source** to an abstract **destination** (i.e., "read"/"write" operations):
+
+* `vector.load`, `vector.store`, `vector.extract`, `vector.insert`,
+  `vector.transfer_read`, `vector.transfer_write`, `vector.gather`,
+  `vector.scatter`, `vector.compressstore`, `vector.expandload`,
+  `vector.maskedload`, `vector.maskedstore`.
+
+For consistency, let's define:
+- **"Source"** as where values are _read from_.
+- **"Destination"** as where values are _written to_.
+
+#### Current Naming in Vector Dialect
+| **Vector Dialect Op**          | **Source Operand Names**                                 | **Source Operand Types**                  | **Result Names** | **Result Types**           |
+|--------------------------------|----------------------------------------------------------|-------------------------------------------|------------------|----------------------------|
+| `vector.load`                  | `base`, `indices`                                        | `memref`, `index`                         | `result`         | `vector`                   |
+| `vector.store`                 | `base`, `indices`, `valueToStore`                        | `memref`, `index`, `vector`               | -                | -                          |
+| `vector.extract`               | `vector`, `dynamic_position` + `static_position`         | `vector`, `index` + `DenseI64ArrayAttr`   | `result`         | `scalar` / `subvector`     |
+| `vector.insert`                | `source`, `dest`, `dynamic_position` + `static_position` | `scalar` / `subvector`, `vector`, `index` | `result`         | `vector`                   |
+| `vector.transfer_read`         | `source`, `indices`, `padding`                           | `memref`, `index`, `scalar`               | `vector`         | `vector`                   |
+| `vector.transfer_write`        | `source`, `indices`, `vector`                            | `memref`, `index`, `vector`               | `result`         | `vector`                   |
+| `vector.gather`                | `base`, `indices`                                        | `memref`, `index`                         | `result`         | `vector`                   |
+| `vector.scatter`               | `base`, `indices`, `valueToStore`                        | `memref`, `index`, `vector`               | -                | -                          |
+| `vector.compressstore`         | `base`, `indices`, `mask`, `valueToStore`                | `memref`, `index`, `mask`, `vector`       | -                | -                          |
+| `vector.expandload`            | `base`, `indices`, `mask`                                | `memref`, `index`, `mask`                 | `result`         | `vector`                   |
+| `vector.maskedload`            | `base`, `indices`, `mask`                                | `memref`, `index`, `mask`                 | `result`         | `vector`                   |
+| `vector.maskedstore`           | `base`, `indices`, `valueToStore`, `mask`                | `memref`, `index`, `vector`, `mask`       | -                | -                          |
+| `vector.scalable_extract`      | `vector`, `pos`                                          | `vector`, `index`                         | `res`            | `scalar` / `subvector`     |
+| `vector.scalable_insert`       | `source`, `dest`, `pos`                                  | `scalar` / `subvector`, `vector`, `index` | `res`            | `vector`                   |
+| `vector.extract_strided_slice` | `vector`, `offsets`, `sizes`, `strides`                  | `vector`, `index`, `index`, `index`       | (missing name)   | `subvector`                |
+| `vector.insert_strided_slice`  | `source`, `dest`, `offsets`, `strides`                   | `subvector`, `vector`, `index`, `index`   | `res`            | `vector`                   |
+
+#### Observations
+Although each of these operations has a **"source" and "destination"**, their
+operand naming conventions are not always consistent, making it difficult
+to extract common patterns and to identify which operand plays which role:
+* `getBase()` for `vector.load` and `vector.store` (impossible to tell whether
+  the corresponding argument is a "destination" or "source" argument),
+* `getBase()` vs. `getSource()` (inconsistency between `vector.store` and
+  `vector.transfer_write`),
+* `getSource()` + `getDest()` for `vector.insert_strided_slice` vs `getVector()`
+  for `vector.extract_strided_slice` (inconsistency between "sibling" Ops).
+
+Some names should be **unified**, but that is a separate issue:
+* `getVector()` vs `getValueToStore()` (`vector.transfer_write` and
+  `vector.store`, respectively).
+
+#### Indexed vs. Non-Indexed Taxonomy
+One way to define a **consistent distinction** between "to" and "from"
+arguments is based on whether an operand is **indexed** or **non-indexed**.
+
+For example, consider `vector.transfer_read` and `vector.transfer_write`:
+
+```mlir
+                Indexed Operand
+                      |
+vector.transfer_read %A[%expr1, %expr2, %expr3, %expr4]
+  { permutation_map = (d0,d1,d2,d3) -> (d2,0,d0) } :
+  memref<?x?x?x?xf32>, vector<3x4x5xf32>
+                            |
+                     Non-Indexed Result
+
+      Non-Indexed Operand      Indexed Operand
+                      \       /
+vector.transfer_write %4, %arg1[%c3, %c3]
+  {permutation_map = (d0, d1)->(d0, d1)}
+    : vector<1x1x4x3xf32>, memref<?x?xvector<4x3xf32>>
+```
+
+Using the "indexed" vs. "non-indexed" classification, we can systematically
+differentiate between "to" and "from" arguments across operations:
+* "to" is always indexed for "write" operations, "from" is always non-indexed,
+* "from" is always indexed for "read" operations, "to" is always non-indexed.
+
+
 ## Bikeshed Naming Discussion
 
 There are arguments against naming an n-D level of abstraction `vector` because