[PATCH] D129073: [AMDGPU] Combine s_or_saveexec, s_xor instructions.

Thomas Symalla via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jul 8 02:28:53 PDT 2022


tsymalla marked 2 inline comments as done.
tsymalla added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp:735-736
+// Replace occurences of
+// s_or_saveexec s_i, s_i
+// s_xor exec, exec, s_i
+// with
----------------
sebastian-ne wrote:
> I guess this also works if the input register is not the same as the output register?
> ```
> s_or_saveexec s_o, s_i
> s_xor exec, exec, s_o
> ```
No, I don't think so.
If EXEC = 0b111000, s0 = 0b001001, s1 = 0b010010, then after
s_or_saveexec s0, s1
s_xor exec, exec, s0

EXEC  = 0b010010

while s_andn2_saveexec s1, s0 gives the same EXEC as s_andn2_saveexec s1, s1 (0b100100), and s_andn2_saveexec s0, s1 gives the same EXEC as s_andn2_saveexec s0, s0 (0b000001).
The same holds if you swap the operands, so the DST of s_or_saveexec must be equal to its SRC0, and that register then needs to be both DST and SRC0 of s_andn2_saveexec. Please correct me if I'm wrong.
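
One way to sanity-check cases like the one above is a small bit-level model of the three instructions. The sketch below assumes the semantics described in the AMD ISA manuals (s_or_saveexec: DST = EXEC, then EXEC = SRC0 | EXEC; s_andn2_saveexec: DST = EXEC, then EXEC = SRC0 & ~EXEC). The State struct, the helper names and the kInit constant are made up for illustration and are not part of the patch.

```cpp
#include <cstdint>

// Hypothetical model of the scalar state touched by the sequences:
// the EXEC mask plus two scalar registers s0 and s1.
struct State {
  std::uint64_t exec;
  std::uint64_t s[2];
};

// Values from the example: EXEC = 0b111000, s0 = 0b001001, s1 = 0b010010.
constexpr State kInit{0b111000, {0b001001, 0b010010}};

// s_or_saveexec s<dst>, s<src>
// Assumed semantics: DST = EXEC; EXEC = SRC0 | EXEC.
constexpr State orSaveexec(State st, int dst, int src) {
  const std::uint64_t oldExec = st.exec;
  const std::uint64_t srcVal = st.s[src]; // read before DST is overwritten
  st.s[dst] = oldExec;
  st.exec = srcVal | oldExec;
  return st;
}

// s_xor exec, exec, s<src>
constexpr State xorExec(State st, int src) {
  st.exec ^= st.s[src];
  return st;
}

// s_andn2_saveexec s<dst>, s<src>
// Assumed semantics: DST = EXEC; EXEC = SRC0 & ~EXEC.
constexpr State andn2Saveexec(State st, int dst, int src) {
  const std::uint64_t oldExec = st.exec;
  const std::uint64_t srcVal = st.s[src]; // read before DST is overwritten
  st.s[dst] = oldExec;
  st.exec = srcVal & ~oldExec;
  return st;
}
```

Under this model, xorExec(orSaveexec(kInit, 0, 1), 0) and andn2Saveexec(kInit, 0, 1) both end with EXEC = 0b000010 and s0 = 0b111000. That differs from the hand-computed values above, so those are worth rechecking against the ISA manual before relying on either result.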


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129073/new/

https://reviews.llvm.org/D129073


