[PATCH] D129073: [AMDGPU] Combine s_or_saveexec, s_xor instructions.
Sebastian Neubauer via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jul 8 02:40:32 PDT 2022
sebastian-ne added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp:735-736
+// Replace occurrences of
+// s_or_saveexec s_i, s_i
+// s_xor exec, exec, s_i
+// with
----------------
tsymalla wrote:
> sebastian-ne wrote:
> > I guess this also works if the input register is not the same as the output register?
> > ```
> > s_or_saveexec s_o, s_i
> > s_xor exec, exec, s_o
> > ```
> No, I don't think so.
> If EXEC = 0b111000, s0 = 0b001001, s1 = 0b010010, then after
> s_or_saveexec s0, s1
> s_xor exec, exec, s0
>
> EXEC = 0b010010
>
> while for s_andn2_saveexec s1, s0: EXEC = s_andn2_saveexec s1, s1 = 0b100100, and for s_andn2_saveexec s0, s1: EXEC = s_andn2_saveexec s0, s0 = 0b000001.
> The same goes if you change the order of the operands, so DST of s_or_saveexec must be equal to SRC0 of s_or_saveexec, and thus needs to be both DST and SRC0 of s_andn2_saveexec. Please correct me if I'm wrong.
Not quite sure if I’m following, but isn’t it
```
=> EXEC = 0b111000, s0 = 0b001001, s1 = 0b010010
s_or_saveexec s0, s1
=> EXEC = 0b111010, s0 = 0b111000, s1 = 0b010010
s_xor exec, exec, s0
=> EXEC = 0b000010, s0 = 0b111000, s1 = 0b010010
```
and
```
=> EXEC = 0b111000, s0 = 0b001001, s1 = 0b010010
s_andn2_saveexec s0, s1
=> EXEC = 0b000010, s0 = 0b111000, s1 = 0b010010
(EXEC = ~EXEC & s1 = 0b000111 & 0b010010 = 0b000010)
```
?
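One way to check the two traces above is to brute-force the comparison over small masks. The following is only a sketch in Python, not part of the patch: the helper names are hypothetical, the masks are 6 bits wide to match the example (the hardware uses 32 or 64 bits), SCC side effects are ignored, and the semantics are assumed to be `dst = EXEC; EXEC = src | EXEC` for s_or_saveexec and `dst = EXEC; EXEC = src & ~EXEC` for s_andn2_saveexec, as used in the traces.

```python
MASK = 0b111111  # 6-bit masks, matching the example; an assumption for brevity


def s_or_saveexec(exec_mask, src):
    """Assumed semantics: dst = EXEC; EXEC = src | EXEC. Returns (new_exec, dst)."""
    return (src | exec_mask) & MASK, exec_mask


def s_xor(a, b):
    """Bitwise XOR, truncated to the mask width."""
    return (a ^ b) & MASK


def s_andn2_saveexec(exec_mask, src):
    """Assumed semantics: dst = EXEC; EXEC = src & ~EXEC. Returns (new_exec, dst)."""
    return (src & ~exec_mask) & MASK, exec_mask


def or_then_xor(exec_mask, src):
    """Model `s_or_saveexec s_o, s_i` followed by `s_xor exec, exec, s_o`."""
    new_exec, dst = s_or_saveexec(exec_mask, src)
    return s_xor(new_exec, dst), dst


def sequences_agree():
    """Compare (EXEC, dst) of both forms for every 6-bit EXEC/src pair."""
    return all(
        or_then_xor(e, s) == s_andn2_saveexec(e, s)
        for e in range(MASK + 1)
        for s in range(MASK + 1)
    )


if __name__ == "__main__":
    # Values from the example: EXEC = 0b111000, s1 = 0b010010.
    exec_after, saved = or_then_xor(0b111000, 0b010010)
    print(f"or+xor: EXEC = {exec_after:#08b}, s0 = {saved:#08b}")
    print("all 6-bit inputs agree:", sequences_agree())
```

The model returns the destination value separately from the source, so it does not depend on whether the input and output registers are the same; under these assumptions that is exactly the case being asked about.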
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D129073/new/
https://reviews.llvm.org/D129073