[Mlir-commits] [mlir] [mlir][vector] Improve vector.gather description (PR #153278)

Thu Aug 14 07:51:17 PDT 2025

================
@@ -2058,39 +2058,52 @@ def Vector_GatherOp :
     Results<(outs AnyVectorOfNonZeroRank:$result)> {
 
   let summary = [{
-    gathers elements from memory or ranked tensor into a vector as defined by an
-    index vector and a mask vector
+    Gathers elements from memory or ranked tensor into a vector as defined by an
+    index vector and a mask vector.
   }];
 
   let description = [{
     The gather operation returns an n-D vector whose elements are either loaded
-    from memory or ranked tensor, or taken from a pass-through vector, depending
+    from a k-D memref or tensor, or taken from an n-D pass-through vector, depending
     on the values of an n-D mask vector.
-    If a mask bit is set, the corresponding result element is defined by the base
-    with indices and the n-D index vector (each index is a 1-D offset on the base).
-    Otherwise, the corresponding element is taken from the n-D pass-through vector.
-    Informally the semantics are:
+
+    If a mask bit is set, the corresponding result element is taken from `base`
+    at an index defined by k indices and n-D `index_vec`. Otherwise, the element
+    is taken from the pass-through vector. As an example, suppose that `base` is
+    3-D and the result is 2-D:
+
+    ```mlir
+    func.func @gather_3D_to_2D(
+        %base: memref<?x10x?xf32>, %i0: index, %i1: index, %i2: index,
+        %index_vec: vector<2x3xi32>, %mask: vector<2x3xi1>,
+        %fall_thru: vector<2x3xf32>) -> vector<2x3xf32> {
+            %result = vector.gather %base[%i0, %i1, %i2]
+                                   [%index_vec], %mask, %fall_thru : [...]
+            return %result : vector<2x3xf32>
+    }
     ```
-    result[0] := if mask[0] then base[index[0]] else pass_thru[0]
-    result[1] := if mask[1] then base[index[1]] else pass_thru[1]
-    etc.
+
+    The indexing semantics are then,
+
+    ```
+    result[i,j] := if mask[i,j] then base[i0, i1, i2 + index_vec[i,j]]
+                   else pass_thru[i,j]
----------------
banach-space wrote:

This could also be written as:
```
    result[i,j] := if mask[i,j] then base[i0, i1, i2] + index_vec[i,j]
                   else pass_thru[i,j]
```

As in, `base[i0, i1, i2]` provides the base address and then `index_vec[i,j]` is the "element" index, similarly to how pointer arithmetic works in C.
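
To make the comparison concrete, here is a minimal Python sketch of the two readings. It assumes a contiguous, row-major buffer (strided memrefs and tensors are not modelled), and the helper names `linear_offset`, `gather_innermost` and `gather_pointer` are hypothetical, not part of MLIR:

```python
def linear_offset(shape, idx):
    """Row-major linear offset of a multi-dimensional index (assumed layout)."""
    off = 0
    for dim, i in zip(shape, idx):
        off = off * dim + i
    return off


def gather_innermost(flat, shape, offsets, index_vec, mask, pass_thru):
    """Reading 1: base[i0, i1, i2 + index_vec[i][j]]."""
    out = []
    for i, row in enumerate(index_vec):
        out_row = []
        for j, delta in enumerate(row):
            if mask[i][j]:
                # Only the innermost index is adjusted.
                idx = list(offsets[:-1]) + [offsets[-1] + delta]
                out_row.append(flat[linear_offset(shape, idx)])
            else:
                out_row.append(pass_thru[i][j])
        out.append(out_row)
    return out


def gather_pointer(flat, shape, offsets, index_vec, mask, pass_thru):
    """Reading 2: (address of base[i0, i1, i2]) + index_vec[i][j]."""
    base_addr = linear_offset(shape, offsets)
    out = []
    for i, row in enumerate(index_vec):
        out_row = []
        for j, delta in enumerate(row):
            if mask[i][j]:
                # Element offset from the base address, as in C pointer arithmetic.
                out_row.append(flat[base_addr + delta])
            else:
                out_row.append(pass_thru[i][j])
        out.append(out_row)
    return out


# Illustrative data: a 2x3x4 buffer holding its own linear offsets.
shape = (2, 3, 4)
flat = list(range(2 * 3 * 4))
offsets = (1, 1, 1)
index_vec = [[0, 1, 2], [2, 1, 0]]
mask = [[True, False, True], [True, True, False]]
pass_thru = [[-1, -1, -1], [-1, -1, -1]]

a = gather_innermost(flat, shape, offsets, index_vec, mask, pass_thru)
b = gather_pointer(flat, shape, offsets, index_vec, mask, pass_thru)
```

In this sketch the two readings return the same values as long as `i2 + index_vec[i,j]` stays within the innermost dimension. They would only diverge if that sum ran past the innermost extent, where the pointer-style reading moves on to the next row.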

I wanted to bring it up to make sure that our interpretations are consistent. If that's the case, then I would consider rephrasing:
>     The index into `base` only varies in the innermost ((k-1)-th) dimension.
(which assumes one interpretation) as
>     The index vector defines the indices from the base address as defined by the offsets.

This is a bit tricky/nuanced though, as tensors have no notion of a "base address" 😅

Taking a step back, we should probably rename the [input arguments](https://github.com/llvm/llvm-project/blob/5ccc734fa0355f971f8f515457a0bece33ab6642/mlir/include/mlir/Dialect/Vector/IR/VectorOps.td#L2053-L2058) as:
* `index` -> `offsets`
* `index_vec` -> `indices`

Have you thought about it?

https://github.com/llvm/llvm-project/pull/153278