[llvm] [InstCombine] Fold Xor with or disjoint (PR #105992)

Sun Aug 25 13:09:05 PDT 2024

================
@@ -0,0 +1,32 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt < %s -passes=instcombine -S | FileCheck %s
+
+define i32 @fold_xor_with_disjoint_or(i32 %a, i1 %c) {
+; CHECK-LABEL: define i32 @fold_xor_with_disjoint_or(
+; CHECK-SAME: i32 [[A:%.*]], i1 [[C:%.*]]) {
+; CHECK-NEXT:    [[SHL:%.*]] = shl i32 [[A]], 4
+; CHECK-NEXT:    [[TMP1:%.*]] = select i1 [[C]], i32 4, i32 0
+; CHECK-NEXT:    [[XOR:%.*]] = or disjoint i32 [[TMP1]], [[SHL]]
+; CHECK-NEXT:    ret i32 [[XOR]]
+;
+  %s = select i1 %c, i32 0, i32 4
+  %shl = shl i32 %a, 4
+  %or = or disjoint i32 %s, %shl
+  %xor = xor i32 %or, 4
+  ret i32 %xor
+}
+
+define <2 x i32> @fold_xor_with_disjoint_or_vec(<2 x i32> %a, i1 %c) {
+; CHECK-LABEL: define <2 x i32> @fold_xor_with_disjoint_or_vec(
+; CHECK-SAME: <2 x i32> [[A:%.*]], i1 [[C:%.*]]) {
+; CHECK-NEXT:    [[SHL:%.*]] = shl <2 x i32> [[A]], <i32 4, i32 4>
+; CHECK-NEXT:    [[TMP1:%.*]] = select i1 [[C]], <2 x i32> <i32 4, i32 4>, <2 x i32> zeroinitializer
+; CHECK-NEXT:    [[XOR:%.*]] = or disjoint <2 x i32> [[TMP1]], [[SHL]]
+; CHECK-NEXT:    ret <2 x i32> [[XOR]]
+;
+  %s = select i1 %c, <2 x i32> <i32 0, i32 0>, <2 x i32> <i32 4, i32 4>
+  %shl = shl <2 x i32> %a, <i32 4, i32 4>
+  %or = or disjoint <2 x i32> %s, %shl
+  %xor = xor <2 x i32> %or, <i32 4, i32 4>
+  ret <2 x i32> %xor
+}
----------------
elhewaty wrote:

Also, we need test cases for commutative operations (`or` and `xor`) to see if the optimization works when the LHS and RHS are shuffled.
Hmm, I think we only need this for `or`, since it has a non-constant operand `A`. We don't need it for `xor`, because canonicalization always places constants on the right-hand side.
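
A commuted `or` test could look roughly like the sketch below. The function name `@fold_xor_with_disjoint_or_commuted` is hypothetical, and this assumes InstCombine keeps the operand order as written, since both operands are instructions. The CHECK lines are left out on purpose; they would be regenerated with utils/update_test_checks.py.

```llvm
; Hypothetical commuted variant of @fold_xor_with_disjoint_or:
; the operands of `or disjoint` are swapped, so the select is now the RHS.
define i32 @fold_xor_with_disjoint_or_commuted(i32 %a, i1 %c) {
  %s = select i1 %c, i32 0, i32 4
  %shl = shl i32 %a, 4
  %or = or disjoint i32 %shl, %s
  %xor = xor i32 %or, 4
  ret i32 %xor
}
```

If the fold matches both operand orders, the regenerated checks should show the same `select` plus `or disjoint` output as the non-commuted test.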

https://github.com/llvm/llvm-project/pull/105992