[Mlir-commits] [mlir] [mlir][vector] Add support for unrolling vector.bitcast ops. (PR #94064)

Mon Jun 3 07:23:49 PDT 2024

================
@@ -0,0 +1,94 @@
+//===- LowerVectorBitCast.cpp - Lower 'vector.bitcast' operation ----------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file implements target-independent rewrites and utilities to lower the
+// 'vector.bitcast' operation.
+//
+//===----------------------------------------------------------------------===//
+
+#include "mlir/Dialect/Vector/IR/VectorOps.h"
+#include "mlir/Dialect/Vector/Transforms/LoweringPatterns.h"
+#include "mlir/Dialect/Vector/Utils/VectorUtils.h"
+#include "mlir/IR/BuiltinTypes.h"
+#include "mlir/IR/PatternMatch.h"
+#include "mlir/Support/LogicalResult.h"
+
+#define DEBUG_TYPE "vector-bitcast-lowering"
+
+using namespace mlir;
+using namespace mlir::vector;
+
+namespace {
+
+/// A one-shot unrolling of vector.bitcast to the `targetRank`.
+///
+/// Example:
+///
+///   vector.bitcast %a, %b : vector<1x2x3x4xi64> to vector<1x2x3x8xi32>
+///
+/// Would be unrolled to:
+///
+/// %result = arith.constant dense<0> : vector<1x2x3x8xi32>
+/// %0 = vector.extract %a[0, 0, 0]                 ─┐
+///        : vector<4xi64> from vector<1x2x3x4xi64>  |
+/// %1 = vector.bitcast %0                           | - Repeated 6x for
+///        : vector<4xi64> to vector<8xi32>          |   all leading positions
+/// %2 = vector.insert %1, %result [0, 0, 0]         |
+///        : vector<8xi64> into vector<1x2x3x8xi32> ─┘
----------------
bjacob wrote:

Since this pattern is added to `VectorToLLVM` with default `targetRank = 1`, isn't this potentially a pessimization for programs that use `vector.bitcast` with shapes where the innermost dimension's size in bits is smaller than the target SIMD vector size in bits?

For example, looking at the example given in this comment, this causes vectors to be tiled into 256-bit vectors. When the target prefers 512-bit vectors, could this cause codegen of neighbouring ops to use 256-bit vectors instead of 512-bit vectors?

This example is relatively mild in this sense, since the inner dimension's size of 256 bits is still fairly large, but what if the inner dimension were smaller still, as in `vector<4x3x2x1xi32>`, where a rank-1 tile holds only 32 bits?

Given this, I would expect vector unrolling patterns to default to targeting a certain vector size in bits, rather than a certain rank. So you could say that the target size in bits is the target's SIMD vector size, e.g. 512 bits, and keep enough inner dimensions in the vector tile to achieve that size. If a dimension is non-power-of-two, though, it's OK to stop there, as non-power-of-two vectors would have to be broken into smaller power-of-two tiles at codegen anyway.
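
A rough sketch of the heuristic described above, in standalone C++ with no MLIR APIs. The names `pickTargetRank`, `elemBits` and `targetBits` are hypothetical, and two details are assumptions rather than anything the PR specifies: the dimension that crosses the target size is kept whole, and a non-power-of-two dimension is kept but ends the walk.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not an MLIR API.
static bool isPowerOfTwo(int64_t x) { return x > 0 && (x & (x - 1)) == 0; }

// Hypothetical sketch: returns how many innermost dimensions of `shape` to
// keep in each unrolled tile, given the element width and the preferred SIMD
// width, both in bits.
int64_t pickTargetRank(const std::vector<int64_t> &shape, int64_t elemBits,
                       int64_t targetBits) {
  assert(elemBits > 0 && targetBits > 0 && "bit widths must be positive");
  int64_t rank = 0;
  int64_t tileBits = elemBits;
  // Walk from the innermost dimension outwards.
  for (auto it = shape.rbegin(); it != shape.rend(); ++it) {
    tileBits *= *it;
    ++rank;
    // Assumption: stop once the tile reaches the target size, or right after
    // absorbing a non-power-of-two dimension.
    if (tileBits >= targetBits || !isPowerOfTwo(*it))
      break;
  }
  return rank;
}
```

With a 512-bit target, this gives rank 2 for `vector<1x2x3x8xi32>` (the `3x8` tile) and rank 3 for `vector<4x3x2x1xi32>` (the `3x2x1` tile, stopping at the non-power-of-two dimension). With a 256-bit target, the first shape stays at rank 1, matching what the pattern does today.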

https://github.com/llvm/llvm-project/pull/94064