[llvm] feat: fix big endian shuffle vector miscompile (PR #68673)

David Green via llvm-commits llvm-commits at lists.llvm.org
Sat Nov 18 12:11:48 PST 2023


================
@@ -0,0 +1,15 @@
+; RUN: llc < %s -mtriple=aarch64_be | FileCheck %s
----------------
davemgreen wrote:

Can you:
 - Add an LE RUN line as well, using --check-prefix=CHECKLE for it and --check-prefix=CHECKBE for the BE one
 - Autogenerate the check lines using the update_llc_test_checks script.
 - Perhaps add a simpler test that shows the same issue? The loads and the freeze can probably be removed; it appears to be the second shuffle that shows the problem.
 - Maybe something like this (although it doesn't actually show the issue, because the zip args are both the same):
```
define <4 x i16> @exttop(<16 x i8> %tmp1, <16 x i8> %tmp2) {
; CHECKLE-LABEL: exttop:
; CHECKLE:       // %bb.0:
; CHECKLE-NEXT:    ext v0.16b, v1.16b, v1.16b, #8
; CHECKLE-NEXT:    zip2 v0.8b, v0.8b, v0.8b
; CHECKLE-NEXT:    bic v0.4h, #255, lsl #8
; CHECKLE-NEXT:    ret
;
; CHECKBE-LABEL: exttop:
; CHECKBE:       // %bb.0:
; CHECKBE-NEXT:    rev64 v0.16b, v1.16b
; CHECKBE-NEXT:    ext v0.16b, v0.16b, v0.16b, #8
; CHECKBE-NEXT:    ext v0.16b, v0.16b, v0.16b, #8
; CHECKBE-NEXT:    zip2 v0.8b, v0.8b, v0.8b
; CHECKBE-NEXT:    bic v0.4h, #255, lsl #8
; CHECKBE-NEXT:    rev64 v0.4h, v0.4h
; CHECKBE-NEXT:    ret
  %tmp4 = shufflevector <16 x i8> %tmp2, <16 x i8> undef, <4 x i32> <i32 12, i32 13, i32 14, i32 15>
  %tmp6 = zext <4 x i8> %tmp4 to <4 x i16>
  ret <4 x i16> %tmp6
}
```
This one does:
```
define <4 x i16> @exttop2(<16 x i8> %tmp1, <16 x i8> %tmp2) {
; CHECKLE-LABEL: exttop2:
; CHECKLE:       // %bb.0:
; CHECKLE-NEXT:    ext v0.16b, v1.16b, v1.16b, #8
; CHECKLE-NEXT:    zip2 v1.8b, v1.8b, v0.8b
; CHECKLE-NEXT:    zip2 v0.8b, v0.8b, v0.8b
; CHECKLE-NEXT:    add v0.4h, v1.4h, v0.4h
; CHECKLE-NEXT:    bic v0.4h, #255, lsl #8
; CHECKLE-NEXT:    ret
;
; CHECKBE-LABEL: exttop2:
; CHECKBE:       // %bb.0:
; CHECKBE-NEXT:    rev64 v0.16b, v1.16b
; CHECKBE-NEXT:    ext v0.16b, v0.16b, v0.16b, #8
; CHECKBE-NEXT:    ext v1.16b, v0.16b, v0.16b, #8
; CHECKBE-NEXT:    zip2 v0.8b, v0.8b, v0.8b
; CHECKBE-NEXT:    zip2 v1.8b, v1.8b, v0.8b
; CHECKBE-NEXT:    add v0.4h, v0.4h, v1.4h
; CHECKBE-NEXT:    bic v0.4h, #255, lsl #8
; CHECKBE-NEXT:    rev64 v0.4h, v0.4h
; CHECKBE-NEXT:    ret
  %tmp3 = shufflevector <16 x i8> %tmp2, <16 x i8> undef, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
  %tmp4 = shufflevector <16 x i8> %tmp2, <16 x i8> undef, <4 x i32> <i32 12, i32 13, i32 14, i32 15>
  %tmp5 = add <4 x i8> %tmp3, %tmp4
  %tmp6 = zext <4 x i8> %tmp5 to <4 x i16>
  ret <4 x i16> %tmp6
}
```
 - I would also pick a different name or add it to one of the existing test files.
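
For the first two points, the top of the test file might look roughly like the sketch below. The `aarch64` triple for the LE run is an assumption (anything little-endian would do), and the NOTE line is the header the script writes itself when run as `llvm/utils/update_llc_test_checks.py <test file>`, so it would not be typed by hand:
```
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; LE run: the triple here is an assumption, chosen to mirror the BE one.
; RUN: llc < %s -mtriple=aarch64 | FileCheck %s --check-prefix=CHECKLE
; BE run: same triple as the existing RUN line, now with its own prefix.
; RUN: llc < %s -mtriple=aarch64_be | FileCheck %s --check-prefix=CHECKBE
```
With separate prefixes the script emits a CHECKLE and a CHECKBE block per function, as in the two examples above.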

https://github.com/llvm/llvm-project/pull/68673

