[llvm] LAA: generalize strides over unequal type sizes (PR #108088)

Graham Hunter via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 3 06:48:21 PDT 2024


================
@@ -8,15 +8,21 @@ declare void @llvm.assume(i1)
 define void @different_non_constant_strides_known_backward(ptr %A) {
 ; CHECK-LABEL: 'different_non_constant_strides_known_backward'
 ; CHECK-NEXT:    loop:
-; CHECK-NEXT:      Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
-; CHECK-NEXT:  Unknown data dependence.
+; CHECK-NEXT:      Memory dependences are safe with run-time checks
----------------
huntergr-arm wrote:

I'm not sure the resulting analysis is correct for this loop now. Forcing the target instruction cost in LoopVec results in vectorization with a VF of 4 (at least for NEON) with the following RT checks:

```
vector.memcheck:                                  ; preds = %entry
  %scevgep = getelementptr i8, ptr %A, i64 2044
  %scevgep1 = getelementptr i8, ptr %A, i64 1024
  %bound0 = icmp ult ptr %A, %scevgep1
  %bound1 = icmp ult ptr %A, %scevgep
  %found.conflict = and i1 %bound0, %bound1
  br i1 %found.conflict, label %scalar.ph, label %vector.ph
```

If I'm right, we're safe in this instance because the check should always report a conflict (unless the scevgeps wrap): `%A` should be lower than both `%A+1024` and `%A+2044`, so we always branch to `scalar.ph` and never enter the vectorized code. So we've just bloated our binary a bit without a payoff, but should still get the right result.
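A minimal Python sketch of the `vector.memcheck` block above, assuming 64-bit pointers with wrapping address arithmetic. The function name and the sample addresses are hypothetical and only illustrate when `%found.conflict` can be false.

```python
MASK64 = (1 << 64) - 1  # assumption: 64-bit pointers, address arithmetic wraps


def memcheck_found_conflict(a: int) -> bool:
    """Model of %found.conflict for a base address %A given as an integer."""
    scevgep = (a + 2044) & MASK64   # %scevgep  = %A + 2044
    scevgep1 = (a + 1024) & MASK64  # %scevgep1 = %A + 1024
    bound0 = a < scevgep1           # icmp ult ptr %A, %scevgep1
    bound1 = a < scevgep            # icmp ult ptr %A, %scevgep
    return bound0 and bound1        # True means: branch to scalar.ph


# Ordinary address: both comparisons hold, so the scalar loop is taken.
ordinary = memcheck_found_conflict(0x1000)

# Hypothetical address close to the top of the address space:
# %A + 1024 wraps to 0, so bound0 is false and the vector loop is entered.
wrapping = memcheck_found_conflict((1 << 64) - 1024)
```

Under this model, `ordinary` is true and `wrapping` is false, so the vector loop is only reachable when at least one of the two additions wraps.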

If at least one scevgep wraps and we enter the vectorized code, then I think we'll get a miscompare with a VF above 2, since the vector loop will read `%A+2` in the same vector iteration that writes it, whereas the scalar loop writes that location one iteration before reading it:
Scalar iteration 1:
read `%A+0`, write `%A+0`
Scalar iteration 2:
read `%A+1`, write `%A+2`
Scalar iteration 3, hazard with static VF>2:
read `%A+2`, write `%A+4`
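The hazard can be reproduced with a small Python simulation. This is a sketch under assumptions: the loop body `A[2*i] = A[i] + 5` is a hypothetical stand-in that only matches the access pattern listed above (read element `i`, write element `2*i`), and the vector loop is modeled as loading all lanes of a vector iteration before storing any of them.

```python
def scalar_loop(a, n):
    """Reference semantics: one load and one store per iteration."""
    a = list(a)
    for i in range(n):
        a[2 * i] = a[i] + 5  # hypothetical body: read A[i], write A[2*i]
    return a


def vector_loop(a, n, vf):
    """Model of a vectorized loop: load VF lanes first, then store VF lanes."""
    a = list(a)
    for base in range(0, n, vf):
        lanes = range(base, min(base + vf, n))
        loaded = [a[i] for i in lanes]      # wide load of A[base .. base+vf-1]
        for i, value in zip(lanes, loaded):
            a[2 * i] = value + 5            # scattered store to A[2*i]
    return a


initial = list(range(16))
expected = scalar_loop(initial, 8)
with_vf2 = vector_loop(initial, 8, 2)
with_vf4 = vector_loop(initial, 8, 4)

# Lane 2 of the first VF=4 vector iteration reads A[2] before lane 1 has
# stored to it, so A[4] is computed from the stale value.
same_with_vf2 = with_vf2 == expected
same_with_vf4 = with_vf4 == expected
```

In this model `same_with_vf2` is true and `same_with_vf4` is false, which is consistent with the hazard appearing only for a VF above 2.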

Please let me know if I've gotten mixed up here.

https://github.com/llvm/llvm-project/pull/108088

