[llvm] [PowerPC] Fix inefficient code for __builtin_ppc_test_data_class (PR #181420)
zhijian lin via llvm-commits
llvm-commits at lists.llvm.org
Wed Feb 18 08:51:57 PST 2026
================
@@ -17198,6 +17214,173 @@ static SDValue DAGCombineAddc(SDNode *N,
return SDValue();
}
+/// Optimize zero-extension of setcc when the compared value is known to be 0
+/// or 1.
+///
+/// Pattern: zext(setcc(Value, 0, seteq/setne)) where Value is 0 or 1
+/// -> zext(xor(Value, 1)) for seteq
+/// -> zext(Value) for setne
+///
+/// This optimization avoids the i32 -> i1 -> i32/i64 conversion sequence
+/// by keeping the value in its original i32 type throughout.
+///
+/// Example:
+/// Before: zext(setcc(test_data_class(...), 0, seteq))
+/// // test_data_class returns 0 or 1 in i32
+/// // setcc converts i32 -> i1
+/// // zext converts i1 -> i64
+/// After: zext(xor(test_data_class(...), 1))
+/// // Stays in i32, then extends to i64
+///
+/// This is beneficial because:
+/// 1. Eliminates the setcc instruction
+/// 2. Avoids i32 -> i1 truncation
+/// 3. Keeps computation in native integer width
----------------
diggerlin wrote:
I think the detailed doc comment makes it easier for a reviewer to understand what the function is for, and I do not think a detailed comment does any harm.
> Any reason why you are doing this for here but not for combineXorSelectCC()
This function handles the fold
zext(setcc(test_data_class(...), 0, seteq)) --> zext(xor(test_data_class(...), 1))
for both Power10 and pre-Power10 targets. On Power10, the code does not go through
combineXorSelectCC().
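
As a sanity check on the fold itself, here is a minimal standalone sketch in plain C++ (not LLVM code) of why the rewrite is sound when the value is known to be 0 or 1. The helper names are hypothetical and only model the arithmetic of the DAG nodes; they are not functions from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical models of the DAG patterns; V stands in for an i32 value
// such as the result of test_data_class, assumed to be 0 or 1.

// zext(setcc(V, 0, seteq)): compare against zero, then widen to 64 bits.
inline uint64_t zextSetccEq(uint32_t V) { return static_cast<uint64_t>(V == 0); }

// zext(setcc(V, 0, setne)): compare against zero, then widen to 64 bits.
inline uint64_t zextSetccNe(uint32_t V) { return static_cast<uint64_t>(V != 0); }

// zext(xor(V, 1)): proposed replacement for the seteq case.
inline uint64_t zextXor(uint32_t V) { return static_cast<uint64_t>(V ^ 1u); }

// zext(V): proposed replacement for the setne case.
inline uint64_t zextValue(uint32_t V) { return static_cast<uint64_t>(V); }

// Returns true if both rewrites agree with the original patterns for every
// value allowed by the precondition (V is 0 or 1).
inline bool foldIsSound() {
  for (uint32_t V = 0; V <= 1; ++V) {
    if (zextSetccEq(V) != zextXor(V))
      return false;
    if (zextSetccNe(V) != zextValue(V))
      return false;
  }
  return true;
}
```

The two forms diverge as soon as the value can be something other than 0 or 1 (for V = 2 the setcc form gives 0 while the xor form gives 3), which is why the combine has to establish the 0-or-1 property before rewriting.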
https://github.com/llvm/llvm-project/pull/181420