[llvm] [AMDGPU] Vectorize more 16 bit shuffles (PR #90648)

Nuno Lopes via llvm-commits llvm-commits at lists.llvm.org
Wed May 22 09:13:34 PDT 2024


nunoplopes wrote:

Alive2 complains about this patch:
```llvm
define <2 x i16> @uadd_sat_v9i16_combine_vi16(<9 x i16> %arg0, <9 x i16> %arg1) {
bb:
  %arg0.1 = extractelement <9 x i16> undef, i64 7
  %arg0.2 = extractelement <9 x i16> %arg0, i64 8
  %arg1.1 = extractelement <9 x i16> %arg1, i64 7
  %arg1.2 = extractelement <9 x i16> %arg1, i64 8
  %add.1 = uadd_sat i16 %arg0.1, %arg1.1
  %add.2 = uadd_sat i16 %arg0.2, %arg1.2
  %ins.1 = insertelement <2 x i16> undef, i16 %add.1, i64 0
  %ins.2 = insertelement <2 x i16> %ins.1, i16 %add.2, i64 1
  ret <2 x i16> %ins.2
}
=>
define <2 x i16> @uadd_sat_v9i16_combine_vi16(<9 x i16> %arg0, <9 x i16> %arg1) {
bb:
  %#0 = shufflevector <9 x i16> %arg0, <9 x i16> poison, 4294967295, 8
  %#1 = shufflevector <9 x i16> %arg1, <9 x i16> poison, 7, 8
  %#2 = uadd_sat <2 x i16> %#0, %#1
  ret <2 x i16> %#2
}
Transformation doesn't verify! (unsound)
ERROR: Target is more poisonous than source

Example:
<9 x i16> %arg0 = < poison, poison, poison, poison, poison, poison, poison, poison, poison >
<9 x i16> %arg1 = < poison, poison, poison, poison, poison, poison, poison, #x0000 (0)  [based on undef value], poison >

Source:
i16 %arg0.1 = #x0000 (0)        [based on undef value]
i16 %arg0.2 = poison
i16 %arg1.1 = #x0000 (0)
i16 %arg1.2 = poison
i16 %add.1 = #x0000 (0)
i16 %add.2 = poison
<2 x i16> %ins.1 = < #x0000 (0), #x0000 (0) >
<2 x i16> %ins.2 = < #x0000 (0), poison >

Target:
<2 x i16> %#0 = < poison, poison >
<2 x i16> %#1 = < #x0000 (0), poison >
<2 x i16> %#2 = < poison, poison >
Source value: < #x0000 (0), poison >
Target value: < poison, poison >
```

TL;DR: don't match on insertelement undef, only on insertelement poison.
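
To spell out the counterexample: in the source, lane 0 is computed from an undef value, which may be chosen as 0, so the lane can hold a defined result. In the target, the shuffle mask entry 4294967295 (that is, -1) makes the lane poison, and turning undef into poison is not a refinement. As a rough sketch, here is a hypothetical variant of the source where every undef is replaced by poison. The function name is made up for illustration, the saturating add is written as the `llvm.uadd.sat` intrinsic rather than in Alive2's printed form, and I have not run this through Alive2.

```llvm
declare i16 @llvm.uadd.sat.i16(i16, i16)

; Hypothetical variant of the reported source: undef replaced by poison.
define <2 x i16> @uadd_sat_poison_variant(<9 x i16> %arg0, <9 x i16> %arg1) {
bb:
  ; Element 7 is taken from a poison vector, so %arg0.1 is poison, not undef.
  %arg0.1 = extractelement <9 x i16> poison, i64 7
  %arg0.2 = extractelement <9 x i16> %arg0, i64 8
  %arg1.1 = extractelement <9 x i16> %arg1, i64 7
  %arg1.2 = extractelement <9 x i16> %arg1, i64 8
  ; A poison operand makes %add.1 poison in the source as well.
  %add.1 = call i16 @llvm.uadd.sat.i16(i16 %arg0.1, i16 %arg1.1)
  %add.2 = call i16 @llvm.uadd.sat.i16(i16 %arg0.2, i16 %arg1.2)
  ; The base vector is poison rather than undef.
  %ins.1 = insertelement <2 x i16> poison, i16 %add.1, i64 0
  %ins.2 = insertelement <2 x i16> %ins.1, i16 %add.2, i64 1
  ret <2 x i16> %ins.2
}
```

With this input, lane 0 of the source is already poison, so a vectorized form with a poison lane 0 should no longer be more poisonous than the source.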

https://github.com/llvm/llvm-project/pull/90648
