[Mlir-commits] [mlir] [mlir][tosa] Fix integer bilinear (quantized) tosa.resize lowering to use floordivsi (PR #193821)

Thu Apr 23 11:57:44 PDT 2026

llvmbot wrote:



@llvm/pr-subscribers-mlir-tosa

@llvm/pr-subscribers-mlir-linalg

Author: Henry Wu (HanchengWu)

<details>
<summary>Changes</summary>

## Background

`tosa.resize` in bilinear integer (quantized) mode lowers to a `linalg.generic`
body that, for each output pixel, computes a corresponding input coordinate and
blends the four neighboring input pixels. The mapping is:

```
val   = out_coord * scale_d + offset
index = val / scale_n          // integer part — which input pixel to start from
delta = val - index * scale_n  // fractional part, scaled to [0, scale_n)
```

`delta` is the interpolation weight toward the next pixel. The bilinear formula
(integer path) is:

```
topAcc    = pixel[y0,x0] * (scale_x - dx) + pixel[y0,x1] * dx
bottomAcc = pixel[y1,x0] * (scale_x - dx) + pixel[y1,x1] * dx
result    = topAcc * (scale_y - dy) + bottomAcc * dy
```

For this to be a valid convex combination (interpolation, not extrapolation),
`dx` and `dy` must be in `[0, scale_n]`.
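
The blend above can be sketched in plain C++. This is only an illustration, not the MLIR lowering: `blendBilinearInt` and `isConvexWeight` are hypothetical helper names, and the sketch assumes a single shared `scaleX`/`scaleY` per axis as in the formula above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when delta is a valid interpolation weight,
// i.e. 0 <= delta <= scale.
bool isConvexWeight(int32_t delta, int32_t scale) {
  return delta >= 0 && delta <= scale;
}

// Hypothetical helper mirroring the integer bilinear formula above.
// The result is scaled by scaleX * scaleY (no final normalization).
int32_t blendBilinearInt(int32_t y0x0, int32_t y0x1, int32_t y1x0,
                         int32_t y1x1, int32_t dx, int32_t dy,
                         int32_t scaleX, int32_t scaleY) {
  assert(scaleX > 0 && scaleY > 0);
  int32_t topAcc = y0x0 * (scaleX - dx) + y0x1 * dx;
  int32_t bottomAcc = y1x0 * (scaleX - dx) + y1x1 * dx;
  return topAcc * (scaleY - dy) + bottomAcc * dy;
}
```

With in-range weights the result stays within `[min, max] * scaleX * scaleY` of the four inputs; with a negative weight such as `dx = dy = -1` it can exceed that range, which is the extrapolation discussed below.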

The pixel indices `y0`, `y1`, `x0`, `x1` are computed by `getClampedIdxs`:

```cpp
y0 = clamp(iy,   0, H-1)
y1 = clamp(iy+1, 0, H-1)
x0 = clamp(ix,   0, W-1)   // same clamping along the width W
x1 = clamp(ix+1, 0, W-1)
```
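
As a rough model of this clamping (a sketch only; `clampedNeighbors` is a hypothetical stand-in and does not reproduce the actual `getClampedIdxs` signature):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical stand-in for the clamping step: returns the pair
// (clamp(index, 0, size-1), clamp(index+1, 0, size-1)).
std::pair<int32_t, int32_t> clampedNeighbors(int32_t index, int32_t size) {
  assert(size > 0);
  int32_t lo = std::max<int32_t>(0, std::min<int32_t>(index, size - 1));
  int32_t hi = std::max<int32_t>(0, std::min<int32_t>(index + 1, size - 1));
  return {lo, hi};
}
```

For a dimension of size 2, `clampedNeighbors(-1, 2)` yields `(0, 0)` and `clampedNeighbors(0, 2)` yields `(0, 1)`, so an index of `-1` makes both neighbors collapse onto the boundary pixel.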

---

## The Bug

The integer path uses `DivSIOp` (truncation toward zero):

```cpp
// getIndexAndDeltaInt — mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
index = arith::DivSIOp::create(b, val, scaleN);   // truncates toward zero
delta = arith::MulIOp::create(b, index, scaleN);
delta = arith::SubIOp::create(b, val, delta);      // = val - (val/scaleN)*scaleN
```

When `val < 0` (which happens at boundary output pixels when `offset` is
negative), `DivSIOp` truncates toward zero instead of toward -∞. This produces
a negative remainder, i.e. a negative `delta`, which causes extrapolation.

Note: the code comment on line 2058 already says `// ix = floor(x / scale_n)`,
but the code uses truncation — this mismatch is the root cause of the bug.

The float path (`getIndexAndDeltaFp`) uses `FloorDivSIOp` and is unaffected:
with floor division, `r = val - floor(val/scaleN)*scaleN` is always in
`[0, scaleN-1]`.
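
The difference between the two division modes can be checked with a small C++ sketch. C++ `/` on signed integers truncates toward zero, so it is used here as a stand-in for `arith.divsi`; the floor variant is written by hand. `IndexDelta`, `indexDeltaTrunc`, and `indexDeltaFloor` are illustrative names, and a positive `scaleN` is assumed.

```cpp
#include <cassert>
#include <cstdint>

struct IndexDelta {
  int32_t index;
  int32_t delta;
};

// Truncating division (rounds toward zero), as with arith.divsi.
IndexDelta indexDeltaTrunc(int32_t val, int32_t scaleN) {
  assert(scaleN > 0);
  int32_t index = val / scaleN;
  return {index, val - index * scaleN};
}

// Floor division (rounds toward -inf), as with arith.floordivsi.
IndexDelta indexDeltaFloor(int32_t val, int32_t scaleN) {
  assert(scaleN > 0);
  int32_t index = val / scaleN;
  if (val % scaleN < 0)
    --index; // step down when truncation rounded a negative quotient up
  return {index, val - index * scaleN};
}
```

For `val = -1` and `scaleN = 4`, the truncating version gives `index = 0, delta = -1`, while the floor version gives `index = -1, delta = 3`. For non-negative `val` the two agree.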

---

## Concrete Example

**Setup:** 2×2 input upsampled to 4×4, `scale=[4,2,4,2]`, `offset=[-1,-1]`

- `scale_y_n=4`, `scale_y_d=2`, `scale_x_n=4`, `scale_x_d=2`
- `offset_y=-1`, `offset_x=-1`
- Input: `tensor<1x2x2x1xi8>` with `input[0,0,0,0]=100`, all others `0`

**At output pixel (0,0):**

```
val  = 0 * 2 + (-1) = -1
```

### Before the fix (`DivSIOp`, buggy)

```
iy  = DivSIOp(-1, 4) = 0     // truncates -0.25 toward zero
dy  = -1 - 0*4 = -1          // OUT OF RANGE: should be in [0, 4]

y0 = clamp(0,   0, 1) = 0
y1 = clamp(0+1, 0, 1) = 1    // different rows

ix  = DivSIOp(-1, 4) = 0
dx  = -1                      // same issue

x0 = clamp(0,   0, 1) = 0
x1 = clamp(0+1, 0, 1) = 1

// pixels:
y0x0 = input[0,0,0,0] = 100
y0x1 = input[0,0,1,0] = 0
y1x0 = input[0,1,0,0] = 0
y1x1 = input[0,1,1,0] = 0

topAcc    = 100*(4-(-1)) + 0*(-1) = 500   // EXTRAPOLATION
bottomAcc = 0*(4-(-1))   + 0*(-1) = 0
result    = 500*(4-(-1)) + 0*(-1) = 2500  // WRONG
```


### With the fix (`FloorDivSIOp`)

```
iy  = FloorDivSIOp(-1, 4) = -1   // floors -0.25 toward -∞
dy  = -1 - (-1)*4 = 3            // naturally in [0, scale_n-1]

y0 = clamp(-1,  0, 1) = 0
y1 = clamp(-1+1, 0, 1) = 0      // SAME row — both snap to boundary

dx  = 3 (same by symmetry)

// all four neighbors collapse to the same boundary pixel:
y0x0 = y0x1 = y1x0 = y1x1 = input[0,0,0,0] = 100

topAcc    = 100*(4-3) + 100*3 = 400
bottomAcc = 100*(4-3) + 100*3 = 400
result    = 400*(4-3) + 400*3 = 1600  // correct
```

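Because `y0=y1=0`, both rows read the same boundary pixel: `getClampedIdxs`
enforces boundary replication, so the value of `dy` no longer affects the
result. The same holds for `dx`, since `x0=x1=0`.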
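
The two traces can be reproduced with a short standalone C++ model of the example. This is a sketch under the setup stated above (2×2 input, `scale_n=4`, `scale_d=2`, `offset=-1` on both axes); `resizePixel`, `divide`, and `clampIdx` are hypothetical names, and the code models only the arithmetic, not the generated `linalg.generic` body.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

using Image2x2 = std::array<std::array<int32_t, 2>, 2>;

// Truncating division by default; floor division when useFloor is set.
int32_t divide(int32_t val, int32_t scaleN, bool useFloor) {
  assert(scaleN > 0);
  int32_t q = val / scaleN;
  if (useFloor && val % scaleN < 0)
    --q;
  return q;
}

int32_t clampIdx(int32_t i, int32_t size) {
  return std::max<int32_t>(0, std::min<int32_t>(i, size - 1));
}

// Computes one (unnormalized) output value for the example setup.
int32_t resizePixel(const Image2x2 &in, int32_t outY, int32_t outX,
                    bool useFloor) {
  const int32_t scaleN = 4, scaleD = 2, offset = -1, size = 2;
  int32_t valY = outY * scaleD + offset;
  int32_t valX = outX * scaleD + offset;
  int32_t iy = divide(valY, scaleN, useFloor);
  int32_t ix = divide(valX, scaleN, useFloor);
  int32_t dy = valY - iy * scaleN;
  int32_t dx = valX - ix * scaleN;
  int32_t y0 = clampIdx(iy, size), y1 = clampIdx(iy + 1, size);
  int32_t x0 = clampIdx(ix, size), x1 = clampIdx(ix + 1, size);
  int32_t topAcc = in[y0][x0] * (scaleN - dx) + in[y0][x1] * dx;
  int32_t bottomAcc = in[y1][x0] * (scaleN - dx) + in[y1][x1] * dx;
  return topAcc * (scaleN - dy) + bottomAcc * dy;
}
```

With `in[0][0] = 100` and all other entries `0`, `resizePixel(in, 0, 0, false)` returns `2500` and `resizePixel(in, 0, 0, true)` returns `1600`, matching the traces above. At output pixel (1,1), where `val = 1` is non-negative, both modes return `900`.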

## Semantic Analysis of the fix

- `iy=-1` correctly signals "this position is before the first input pixel."
- `getClampedIdxs` does its intended job: both `y0` and `y1` snap to the
  boundary pixel, enforcing replication explicitly.
- `dy=3` **appears** valid (it is in `[0, scale_n-1]`) but is semantically
  meaningless: the true position is `-0.25`, which is outside the image, so
  there is no "3/4 toward the next pixel" to interpolate toward. It is harmless
  only because `y0=y1`.
- The same analysis applies to `dx=3` by symmetry.
- The change fixes the root cause (the wrong division op; the code now matches
  the existing comment and mirrors the float path), but `delta` still carries a
  misleading value at out-of-bounds positions.


---
Full diff: https://github.com/llvm/llvm-project/pull/193821.diff


2 Files Affected:

- (modified) mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp (+1-1) 
- (modified) mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir (+5-4) 


``````````diff

diff --git a/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp b/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
index 11b3aabcbfeb4..e9c9e17fe6274 100644
--- a/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
+++ b/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
@@ -2059,7 +2059,7 @@ class GenericResizeConverter : public OpRewritePattern<tosa::ResizeOp> {
         //  dx = x - ix * scale_n;
         Value val = arith::MulIOp::create(b, in, scaleD);
         val = arith::AddIOp::create(b, val, offset);
-        index = arith::DivSIOp::create(b, val, scaleN);
+        index = arith::FloorDivSIOp::create(b, val, scaleN);
         delta = arith::MulIOp::create(b, index, scaleN);
         delta = arith::SubIOp::create(b, val, delta);
       };
diff --git a/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir b/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir
index 4900476b25dc5..2959cf59e953a 100644
--- a/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir
+++ b/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir
@@ -218,13 +218,13 @@ func.func @resize_nearest_int(%arg0: tensor<1x15x13x1xi8>) -> () {
 
   // CHECK: %[[TEMP_Y:.*]] = arith.muli %[[Y]], %[[SCALE_Y_D]]
   // CHECK: %[[Y:.*]] = arith.addi %[[TEMP_Y]], %[[OFFSET_Y]]
-  // CHECK: %[[I_Y:.*]] = arith.divsi %[[Y]], %[[SCALE_Y_N]]
+  // CHECK: %[[I_Y:.*]] = arith.floordivsi %[[Y]], %[[SCALE_Y_N]]
   // CHECK: %[[TEMP_Y:.*]] = arith.muli %[[I_Y]], %[[SCALE_Y_N]]
   // CHECK: %[[D_Y:.*]] = arith.subi %[[Y]], %[[TEMP_Y]]
 
   // CHECK: %[[TEMP_X:.*]] = arith.muli %[[X]], %[[SCALE_X_D]]
   // CHECK: %[[X:.*]] = arith.addi %[[TEMP_X]], %[[OFFSET_X]]
-  // CHECK: %[[I_X:.*]] = arith.divsi %[[X]], %[[SCALE_X_N]]
+  // CHECK: %[[I_X:.*]] = arith.floordivsi %[[X]], %[[SCALE_X_N]]
   // CHECK: %[[TEMP_X:.*]] = arith.muli %[[I_X]], %[[SCALE_X_N]]
   // CHECK: %[[D_X:.*]] = arith.subi %[[X]], %[[TEMP_X]]
 
@@ -285,13 +285,13 @@ func.func @resize_bilinear_int(%arg0: tensor<1x19x20x1xi8>) {
 
   // CHECK: %[[TEMP_Y:.*]] = arith.muli %[[Y]], %[[SCALE_Y_D]]
   // CHECK: %[[Y:.*]] = arith.addi %[[TEMP_Y]], %[[OFFSET_Y]]
-  // CHECK: %[[I_Y:.*]] = arith.divsi %[[Y]], %[[SCALE_Y_N]]
+  // CHECK: %[[I_Y:.*]] = arith.floordivsi %[[Y]], %[[SCALE_Y_N]]
   // CHECK: %[[TEMP_Y:.*]] = arith.muli %[[I_Y]], %[[SCALE_Y_N]]
   // CHECK: %[[D_Y:.*]] = arith.subi %[[Y]], %[[TEMP_Y]]
 
   // CHECK: %[[TEMP_X:.*]] = arith.muli %[[X]], %[[SCALE_X_D]]
   // CHECK: %[[X:.*]] = arith.addi %[[TEMP_X]], %[[OFFSET_X]]
-  // CHECK: %[[I_X:.*]] = arith.divsi %[[X]], %[[SCALE_X_N]]
+  // CHECK: %[[I_X:.*]] = arith.floordivsi %[[X]], %[[SCALE_X_N]]
   // CHECK: %[[TEMP_X:.*]] = arith.muli %[[I_X]], %[[SCALE_X_N]]
   // CHECK: %[[D_X:.*]] = arith.subi %[[X]], %[[TEMP_X]]
 
@@ -605,3 +605,4 @@ func.func @skip_interpolate_bilinear_f32(%arg0 : tensor<3x1x2x7xf32>) -> tensor<
   // CHECK:  return %[[GENERIC]]
   return %resize : tensor<3x1x4x7xf32>
 }
+

``````````

</details>


https://github.com/llvm/llvm-project/pull/193821