[Mlir-commits] [mlir] [mlir][tosa] Work around GCC bug in tosa-to-tensor (PR #91521)

Wed May 8 12:11:21 PDT 2024

https://github.com/sabauma created https://github.com/llvm/llvm-project/pull/91521

GCC 12 and 13 generate incorrect code for a pattern in the tosa-to-tensor pass responsible for lowering tosa.reshape. This results in the tosa.reshape lowering producing IR which fails to verify. I've narrowed down the set of cmake flags needed to reproduce the issue to this:

    cmake -G Ninja ../llvm \
      -DLLVM_ENABLE_PROJECTS="mlir" \
      -DLLVM_TARGETS_TO_BUILD=host \
      -DLLVM_ENABLE_PROJECTS=mlir \
      -DCMAKE_BUILD_TYPE="Release" \
      -DCMAKE_CXX_FLAGS_RELEASE="-O2" \
      -DCMAKE_CXX_FLAGS="-O2" \
      -DCMAKE_CXX_COMPILER=g++ \
      -DCMAKE_C_COMPILER=gcc

This is the failing test case:

    func.func @fails_in_gcc_12(%arg0: tensor<?xf32>) -> tensor<1x1x1x?xf32> {
      %0 = tosa.reshape %arg0 {new_shape = array<i64: 1, 1, 1, -1>} : (tensor<?xf32>) -> tensor<1x1x1x?xf32>
      return %0 : tensor<1x1x1x?xf32>
    }

This should lower to a tensor.expand_shape operation like so:

    func.func @foo(%arg0: tensor<?xf32>) -> tensor<1x1x1x?xf32> {
      %c0 = arith.constant 0 : index
      %dim = tensor.dim %arg0, %c0 : tensor<?xf32>
      %c1 = arith.constant 1 : index
      %expanded = tensor.expand_shape %arg0 [[0, 1, 2, 3]] output_shape [1, 1, 1, %dim] : tensor<?xf32> into tensor<1x1x1x?xf32>
      return %expanded : tensor<1x1x1x?xf32>
    }

Under GCC 12/13 with the above cmake configuration, the tensor.expand_shape looks like this

    %2 = "tensor.expand_shape"(%arg0) <{reassociation = [[0, 1, 2, 3]], static_output_shape = array<i64>}> : (tensor<?xf32>) -> tensor<?x1x1x?xf32>

The key difference is the computed output type of `tensor<?x1x1x?xf32>` rather than the expected `tensor<1x1x1x?xf32>`. This expand_shape fails to verify with this error message:

    error: 'tensor.expand_shape' op expected number of static shape dims to be equal to the output rank (4) but found 0 inputs instead

The problematic code is calculating the intermediate shape of the generated tensor.expand_shape operation in the
expand_shape/collapse_shape sequence that implements tosa.reshape.

    // Compute result shape
    bool resultIsStatic = true;
    auto resultShape = llvm::map_to_vector(newShape, [&](int64_t size) {
      // Omitted

      // If we do not know the total size of the tensor, keep this dimension
      // dynamic in the result shape.
      if (!inputIsStatic) {
        resultIsStatic = false;
        return ShapedType::kDynamic;
      }
    });

    if (resultIsStatic) {
      // do something
      return;
    }

    // do something else
    return;

The failure point seems to be the update of the resultIsStatic variable in the lambda body. The assignment of false is not propagated to the use in the if-statement, resulting in the branch being taken when it should not.

I've found several modification to the code that gets around the bug. The version I settled on is one which makes the logic a little more obvious.

>From 103b3fbcefdc91181504122e428a7c4d2525b79b Mon Sep 17 00:00:00 2001
From: Spenser Bauman <sbauman at mathworks.com>
Date: Wed, 8 May 2024 14:14:26 -0400
Subject: [PATCH] [mlir][tosa] Work around GCC bug in tosa-to-tensor

GCC 12 and 13 generate incorrect code for a pattern in the
tosa-to-tensor pass responsible for lowering tosa.reshape.
This results in the tosa.reshape lowering producing IR which fails to
verify. I've narrowed down the set of cmake flags needed to reproduce
the issue to this:

    cmake -G Ninja ../llvm \
      -DLLVM_ENABLE_PROJECTS="mlir" \
      -DLLVM_TARGETS_TO_BUILD=host \
      -DLLVM_ENABLE_PROJECTS=mlir \
      -DCMAKE_BUILD_TYPE="Release" \
      -DCMAKE_CXX_FLAGS_RELEASE="-O2" \
      -DCMAKE_CXX_FLAGS="-O2" \
      -DCMAKE_CXX_COMPILER=g++ \
      -DCMAKE_C_COMPILER=gcc

This is the failing test case:

    func.func @fails_in_gcc_12(%arg0: tensor<?xf32>) -> tensor<1x1x1x?xf32> {
      %0 = tosa.reshape %arg0 {new_shape = array<i64: 1, 1, 1, -1>} : (tensor<?xf32>) -> tensor<1x1x1x?xf32>
      return %0 : tensor<1x1x1x?xf32>
    }

This should correctly lower to a single tensor.expand_shape operation
like so:

    func.func @foo(%arg0: tensor<?xf32>) -> tensor<1x1x1x?xf32> {
      %c0 = arith.constant 0 : index
      %dim = tensor.dim %arg0, %c0 : tensor<?xf32>
      %c1 = arith.constant 1 : index
      %expanded = tensor.expand_shape %arg0 [[0, 1, 2, 3]] output_shape [1, 1, 1, %dim] : tensor<?xf32> into tensor<1x1x1x?xf32>
      return %expanded : tensor<1x1x1x?xf32>
    }

Under GCC 12/13 with the above cmake configuration, the tensor.expand_shape
looks like this

    %2 = "tensor.expand_shape"(%arg0) <{reassociation = [[0, 1, 2, 3]], static_output_shape = array<i64>}> : (tensor<?xf32>) -> tensor<?x1x1x?xf32>

This expand_shape fails to verify with this error message:

    error: 'tensor.expand_shape' op expected number of static shape dims to be equal to the output rank (4) but found 0 inputs instead

The problematic code is calculating the intermediate shape of the
generated tensor.expand_shape operation in the
expand_shape/collapse_shape sequence that implements tosa.reshape.

    // Compute result shape
    bool resultIsStatic = true;
    auto resultShape = llvm::map_to_vector(newShape, [&](int64_t size) {
      // Omitted

      // If we do not know the total size of the tensor, keep this dimension
      // dynamic in the result shape.
      if (!inputIsStatic) {
        resultIsStatic = false;
        return ShapedType::kDynamic;
      }
    });

    if (resultIsStatic) {
      // do something
      return;
    }

    // do something else
    return;

The failure point seems to be the update of the resultIsStatic variable
in the lambda body. The assignment of false is not propagated to the use
in the if-statement, resulting in the branch being taken when it should
not.

I've found several modification to the code that gets around the bug.
The version I settled on is one which makes the logic a little more
obvious.
---
 mlir/lib/Conversion/TosaToTensor/TosaToTensor.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mlir/lib/Conversion/TosaToTensor/TosaToTensor.cpp b/mlir/lib/Conversion/TosaToTensor/TosaToTensor.cpp
index cd6da35582469..488c24e90f5a0 100644
--- a/mlir/lib/Conversion/TosaToTensor/TosaToTensor.cpp
+++ b/mlir/lib/Conversion/TosaToTensor/TosaToTensor.cpp
@@ -84,6 +84,7 @@ TensorType inferReshapeExpandedType(TensorType inputType,
     return totalSize / totalSizeNoPlaceholder;
   });
 
+
   // A syntactic restriction in 'tensor.expand_shape' forbids a dynamically
   // shaped input from being reshaped into a statically shaped result. We may
   // simply turn the first result dimension dynamic to address this.