[llvm] [AMDGPU] Vectorize more 16 bit shuffles (PR #90648)
Nuno Lopes via llvm-commits
llvm-commits at lists.llvm.org
Wed May 22 09:13:34 PDT 2024
nunoplopes wrote:
Alive2 complains about this patch:
```llvm
define <2 x i16> @uadd_sat_v9i16_combine_vi16(<9 x i16> %arg0, <9 x i16> %arg1) {
bb:
%arg0.1 = extractelement <9 x i16> undef, i64 7
%arg0.2 = extractelement <9 x i16> %arg0, i64 8
%arg1.1 = extractelement <9 x i16> %arg1, i64 7
%arg1.2 = extractelement <9 x i16> %arg1, i64 8
%add.1 = uadd_sat i16 %arg0.1, %arg1.1
%add.2 = uadd_sat i16 %arg0.2, %arg1.2
%ins.1 = insertelement <2 x i16> undef, i16 %add.1, i64 0
%ins.2 = insertelement <2 x i16> %ins.1, i16 %add.2, i64 1
ret <2 x i16> %ins.2
}
=>
define <2 x i16> @uadd_sat_v9i16_combine_vi16(<9 x i16> %arg0, <9 x i16> %arg1) {
bb:
%#0 = shufflevector <9 x i16> %arg0, <9 x i16> poison, 4294967295, 8
%#1 = shufflevector <9 x i16> %arg1, <9 x i16> poison, 7, 8
%#2 = uadd_sat <2 x i16> %#0, %#1
ret <2 x i16> %#2
}
Transformation doesn't verify! (unsound)
ERROR: Target is more poisonous than source
Example:
<9 x i16> %arg0 = < poison, poison, poison, poison, poison, poison, poison, poison, poison >
<9 x i16> %arg1 = < poison, poison, poison, poison, poison, poison, poison, #x0000 (0) [based on undef value], poison >
Source:
i16 %arg0.1 = #x0000 (0) [based on undef value]
i16 %arg0.2 = poison
i16 %arg1.1 = #x0000 (0)
i16 %arg1.2 = poison
i16 %add.1 = #x0000 (0)
i16 %add.2 = poison
<2 x i16> %ins.1 = < #x0000 (0), #x0000 (0) >
<2 x i16> %ins.2 = < #x0000 (0), poison >
Target:
<2 x i16> %#0 = < poison, poison >
<2 x i16> %#1 = < #x0000 (0), poison >
<2 x i16> %#2 = < poison, poison >
Source value: < #x0000 (0), poison >
Target value: < poison, poison >
```
TL;DR: don't match on `insertelement undef`, only on `insertelement poison`.
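
For readers unfamiliar with the "Target is more poisonous than source" error, here is a minimal sketch of the per-lane refinement rule behind it. It is a deliberately simplified model, not Alive2's actual implementation: `POISON`, `UNDEF`, `lane_refines`, and `vector_refines` are hypothetical names introduced for illustration, and `undef` is treated as a single token rather than a set of possible values.

```python
# Simplified, illustrative model of lane-wise refinement (not Alive2's code).
POISON = "poison"
UNDEF = "undef"


def lane_refines(src, tgt):
    """Return True if target lane `tgt` is an allowed refinement of source lane `src`."""
    if src == POISON:
        return True   # a poison lane may be replaced by anything
    if tgt == POISON:
        return False  # a non-poison lane must never become poison
    if src == UNDEF:
        return True   # undef may stay undef or be fixed to any non-poison value
    return src == tgt  # concrete values must be preserved exactly


def vector_refines(src, tgt):
    """A vector refines another only if every lane does."""
    return len(src) == len(tgt) and all(
        lane_refines(s, t) for s, t in zip(src, tgt)
    )


# Final lane values taken from the Alive2 counterexample above.
source = [0, POISON]
target = [POISON, POISON]
print(vector_refines(source, target))  # False: lane 0 went from 0 to poison
```

Under this model, replacing an `undef` lane with `poison` is rejected for the same reason: `lane_refines(UNDEF, POISON)` is false, whereas going from `poison` to any value is always allowed.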
https://github.com/llvm/llvm-project/pull/90648