[Mlir-commits] [mlir] [MLIR] VectorEmulateNarrowType to support loading of unaligned vectors (PR #113411)

Wed Oct 23 14:42:31 PDT 2024

================
@@ -294,35 +312,67 @@ struct ConvertVectorLoad final : OpConversionPattern<vector::LoadOp> {
     // %1 = vector.load %0[%linear_index] : memref<6xi8>, vector<2xi8>
     // %2 = vector.bitcast %1 : vector<2xi8> to vector<4xi4>
     //
-    // TODO: Currently, only the even number of elements loading is supported.
-    // To deal with the odd number of elements, one has to extract the
-    // subvector at the proper offset after bit-casting.
+    // There are cases where the number of elements to load is not byte-aligned,
+    // for example:
+    //
+    // %1 = vector.load %0[%c1, %c0] : memref<3x3xi2>, vector<3xi2>
+    //
+    // we will have to load extra bytes and extract the exact slice in between.
+    //
+    // %1 = vector.load %0[%c2] : memref<3xi8>, vector<2xi8>
+    // %2 = vector.bitcast %1 : vector<2xi8> to vector<8xi2>
+    // %3 = vector.extract_strided_slice %1 {offsets = [2], sizes = [3], strides
+    // = [1]}
+    //        : vector<8xi2> to vector<3xi2>
+    //
+    // TODO: Currently the extract_strided_slice's attributes must be known at
+    // compile time as they must be constants.
----------------
lialan wrote:

@hanhanW The issue is not the size: it is the subvector's offset relative to the emulated vector that is not necessarily known at compile time. Consider a `memref<3x3xi2>` from which we want to load this:
```
vector.load %0[%var, %c0] : memref<3x3xi2>, vector<3xi2>
```
If `%var == 0`, we emulate the load by loading the first byte (`i8`) and extracting its first 6 bits as a `vector<3xi2>`.

If `%var == 2`, the vector we want to load occupies bits 12 through 17 of the memref (the half-open range [12, 18)). To load it, we need to load 2 bytes of the memref (the 2nd and 3rd bytes) and extract bits 4 through 9 (the half-open range [4, 10)) of those 2 bytes to form a `vector<3xi2>`.

So the offset of the subvector depends on the index and can be different each time.
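
To make the arithmetic concrete, here is a minimal sketch of how the byte range and the in-vector offset follow from the linearized element index. `EmulatedSlice` and `computeSlice` are hypothetical names for illustration only, not code from the PR, and the sketch assumes a row-major layout and an element width that divides 8.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the PR): where a narrow-type vector sits
// inside the i8 container elements used to emulate it.
struct EmulatedSlice {
  int64_t firstByte;  // index of the first i8 element to load
  int64_t numBytes;   // number of i8 elements to load
  int64_t elemOffset; // offset, in narrow elements, of the wanted subvector
};

// linearIndex: row-major index of the first narrow element to load
//              (for memref<3x3xi2> and indices [%var, %c0] this is %var * 3).
// numElems:    number of narrow elements to load (3 for vector<3xi2>).
// elemBits:    bit width of the narrow element type (2 for i2).
EmulatedSlice computeSlice(int64_t linearIndex, int64_t numElems,
                           int64_t elemBits) {
  assert(elemBits > 0 && 8 % elemBits == 0 && "element width must divide 8");
  const int64_t scale = 8 / elemBits;             // narrow elements per byte
  const int64_t firstByte = linearIndex / scale;  // round down to a byte
  const int64_t elemOffset = linearIndex % scale; // leftover inside that byte
  // Round up so the loaded bytes cover elemOffset + numElems elements.
  const int64_t numBytes = (elemOffset + numElems + scale - 1) / scale;
  return {firstByte, numBytes, elemOffset};
}
```

With `%var == 0` this gives one byte starting at byte 0 with element offset 0. With `%var == 2` it gives two bytes starting at byte 1 with element offset 2, i.e. bit 4 of the loaded bytes. Because `elemOffset` depends on the runtime value of `%var`, it cannot be folded into a constant `offsets` attribute.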


https://github.com/llvm/llvm-project/pull/113411