[PATCH] D129073: [AMDGPU] Combine s_or_saveexec, s_xor instructions.
Thomas Symalla via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jul 8 02:28:53 PDT 2022
tsymalla marked 2 inline comments as done.
tsymalla added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp:735-736
+// Replace occurences of
+// s_or_saveexec s_i, s_i
+// s_xor exec, exec, s_i
+// with
----------------
sebastian-ne wrote:
> I guess this also works if the input register is not the same as the output register?
> ```
> s_or_saveexec s_o, s_i
> s_xor exec, exec, s_o
> ```
No, I don't think so.
If EXEC = 0b111000, s0 = 0b001001, s1 = 0b010010, then after
s_or_saveexec s0, s1
s_xor exec, exec, s0
EXEC = 0b010010
while for s_andn2_saveexec s1, s0: EXEC = s_andn2_saveexec s1, s1 = 0b100100, and for s_andn2_saveexec s0, s1: EXEC = s_andn2_saveexec s0, s0 = 0b000001.
The same goes if you change the order of the operands, so DST of s_or_saveexec must be equal to SRC0 of s_or_saveexec, and thus needs to be both DST and SRC0 of s_andn2_saveexec. Please correct me if I'm wrong.
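
To make the arithmetic above easier to re-check, here is a small Python sketch that models the instructions on a 6-bit toy mask. It assumes the semantics as I read them from the ISA manual (s_or_saveexec: DST = EXEC, then EXEC = SRC0 | EXEC; s_andn2_saveexec: DST = EXEC, then EXEC = SRC0 & ~EXEC). The function names and the 6-bit width are made up for illustration and are not part of the patch.

```python
MASK = 0b111111  # 6-bit toy wavefront, matching the example values (assumption)


def s_or_saveexec(exec_mask, src):
    """Assumed semantics: DST = EXEC; EXEC = SRC0 | EXEC. Returns (dst, new_exec)."""
    return exec_mask, (src | exec_mask) & MASK


def s_xor(a, b):
    """Plain bitwise XOR, truncated to the toy mask width."""
    return (a ^ b) & MASK


def s_andn2_saveexec(exec_mask, src):
    """Assumed semantics: DST = EXEC; EXEC = SRC0 & ~EXEC. Returns (dst, new_exec)."""
    return exec_mask, src & ~exec_mask & MASK


def or_saveexec_then_xor(exec_mask, src):
    """Model of: s_or_saveexec s_o, s_i ; s_xor exec, exec, s_o."""
    dst, exec_mask = s_or_saveexec(exec_mask, src)
    exec_mask = s_xor(exec_mask, dst)
    return dst, exec_mask


EXEC, S1 = 0b111000, 0b010010
pair = or_saveexec_then_xor(EXEC, S1)   # s_or_saveexec s0, s1 ; s_xor exec, exec, s0
single = s_andn2_saveexec(EXEC, S1)     # s_andn2_saveexec s0, s1
print(f"or+xor: s0={pair[0]:#08b} EXEC={pair[1]:#08b}")
print(f"andn2 : s0={single[0]:#08b} EXEC={single[1]:#08b}")
```

Under these assumed semantics the model gives EXEC = 0b000010 for both sequences, which does not match the hand-computed values above. Either the model or the hand calculation is off, so the example is worth re-checking against the ISA manual.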
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D129073/new/
https://reviews.llvm.org/D129073