[llvm] [PowerPC] Fix inefficient code for __builtin_ppc_test_data_class (PR #181420)

zhijian lin via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 18 08:51:57 PST 2026


================
@@ -17198,6 +17214,173 @@ static SDValue DAGCombineAddc(SDNode *N,
   return SDValue();
 }
 
+/// Optimize zero-extension of setcc when the compared value is known to be 0
+/// or 1.
+///
+/// Pattern: zext(setcc(Value, 0, seteq/setne)) where Value is 0 or 1
+///   -> zext(xor(Value, 1))  for seteq
+///   -> zext(Value)          for setne
+///
+/// This optimization avoids the i32 -> i1 -> i32/i64 conversion sequence
+/// by keeping the value in its original i32 type throughout.
+///
+/// Example:
+///   Before: zext(setcc(test_data_class(...), 0, seteq))
+///           // test_data_class returns 0 or 1 in i32
+///           // setcc converts i32 -> i1
+///           // zext converts i1 -> i64
+///   After:  zext(xor(test_data_class(...), 1))
+///           // Stays in i32, then extends to i64
+///
+/// This is beneficial because:
+/// 1. Eliminates the setcc instruction
+/// 2. Avoids i32 -> i1 truncation
+/// 3. Keeps computation in native integer width
----------------
diggerlin wrote:

I think the detailed doc comment makes it easier for reviewers to understand what the function is for, and I do not see any harm in documenting it in detail.

> Any reason why you are doing this for here but not for combineXorSelectCC()

This function handles the transformation
zext(setcc(test_data_class(...), 0, seteq)) --> zext(xor(test_data_class(...), 1))
for both Power10 and pre-Power10 targets.

On Power10, this case does not reach combineXorSelectCC().
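
To illustrate why the rewrite is safe, here is a minimal standalone sketch of the arithmetic behind it. It is not code from the patch: the helper names (zextSetccEq, zextXorOne, zextSetccNe, zextValue, rewritesAgreeOnBoolRange) are hypothetical, and plain integers stand in for DAG nodes. It only assumes what the doc comment states, namely that the compared value is 0 or 1 in i32.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of zext(setcc(Value, 0, seteq)): i32 -> i1 -> i64.
inline uint64_t zextSetccEq(uint32_t Value) {
  bool Cond = (Value == 0);           // setcc produces an i1
  return static_cast<uint64_t>(Cond); // zext widens the i1 to i64
}

// Hypothetical model of the rewritten form zext(xor(Value, 1)).
inline uint64_t zextXorOne(uint32_t Value) {
  return static_cast<uint64_t>(Value ^ 1u);
}

// Hypothetical model of zext(setcc(Value, 0, setne)).
inline uint64_t zextSetccNe(uint32_t Value) {
  bool Cond = (Value != 0);
  return static_cast<uint64_t>(Cond);
}

// Hypothetical model of the rewritten form zext(Value).
inline uint64_t zextValue(uint32_t Value) {
  return static_cast<uint64_t>(Value);
}

// Returns true when both rewrites match the original forms for every
// value in {0, 1}, which is the precondition stated in the doc comment.
inline bool rewritesAgreeOnBoolRange() {
  for (uint32_t V = 0; V <= 1; ++V) {
    if (zextSetccEq(V) != zextXorOne(V))
      return false;
    if (zextSetccNe(V) != zextValue(V))
      return false;
  }
  return true;
}
```

The 0-or-1 precondition matters: for an input such as 2, the setcc form yields 0 while the xor form yields 3, so the rewrite only holds when the value is known to be 0 or 1.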

https://github.com/llvm/llvm-project/pull/181420

