[llvm] [PowerPC] Replace vspltisw+vadduwm instructions with xxleqv+vsubuwm for adding the vector {1, 1, 1, 1} (PR #160882)

Fri Sep 26 06:35:10 PDT 2025

llvmbot wrote:




@llvm/pr-subscribers-backend-powerpc

Author: None (Himadhith)

<details>
<summary>Changes</summary>

This patch exploits the fact that a vector of -1s is **cheaper** to generate than a vector of 1s, and uses it to optimize the current lowering of **`A + vector {1, 1, 1, 1}`**.

In the optimized version, `vspltisw` (4 cycles) is replaced with `xxleqv` (2 cycles), and the `vadduwm` add becomes a `vsubuwm` subtract, using the following identity:
`A - (-1) = A + 1`.
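
As a sanity check on the identity, here is a minimal scalar model in C++ of the two sequences. It is an illustrative sketch only: the helper names (`add_modulo`, `sub_modulo`, `all_ones`, `splat`, `check_identity`) are hypothetical and not part of the patch. It assumes `vadduwm`/`vsubuwm` behave as lane-wise modulo-2^32 add/subtract on four 32-bit words, and that `xxleqv` applied to a register and itself yields all-ones bits (`~(x ^ x)`), i.e. -1 in every lane.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Four 32-bit lanes, standing in for a v4i32 value.
using V4 = std::array<std::uint32_t, 4>;

// Lane-wise modulo-2^32 add (models vadduwm under the stated assumption).
V4 add_modulo(const V4 &a, const V4 &b) {
  V4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = static_cast<std::uint32_t>(a[i] + b[i]);
  return r;
}

// Lane-wise modulo-2^32 subtract (models vsubuwm under the stated assumption).
V4 sub_modulo(const V4 &a, const V4 &b) {
  V4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = static_cast<std::uint32_t>(a[i] - b[i]);
  return r;
}

// Same value in every lane (models a splat such as vspltisw 1).
V4 splat(std::uint32_t v) { return V4{v, v, v, v}; }

// All bits set in every lane: ~(x ^ x) for any x (models xxleqv x, x, x).
V4 all_ones() {
  const std::uint32_t x = 0; // any value works, since x ^ x == 0
  return splat(~(x ^ x));
}

// Asserts that A + {1,1,1,1} and A - {-1,-1,-1,-1} agree lane by lane.
void check_identity(const V4 &a) {
  assert(add_modulo(a, splat(1)) == sub_modulo(a, all_ones()));
}
```

Because the arithmetic is modular, the two forms agree even at the wrap-around boundaries, for example for lanes holding `0x7FFFFFFF` or `0xFFFFFFFF`.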

---
Full diff: https://github.com/llvm/llvm-project/pull/160882.diff


1 File Affected:

- (modified) llvm/lib/Target/PowerPC/PPCInstrVSX.td (+4) 


``````````diff

diff --git a/llvm/lib/Target/PowerPC/PPCInstrVSX.td b/llvm/lib/Target/PowerPC/PPCInstrVSX.td
index 4e5165bfcda55..dc850d2470cfd 100644
--- a/llvm/lib/Target/PowerPC/PPCInstrVSX.td
+++ b/llvm/lib/Target/PowerPC/PPCInstrVSX.td
@@ -3627,6 +3627,10 @@ def : Pat<(v4i32 (build_vector immSExt5NonZero:$A, immSExt5NonZero:$A,
                                immSExt5NonZero:$A, immSExt5NonZero:$A)),
           (v4i32 (VSPLTISW imm:$A))>;
 
+// Optimise for vector of 1s addition operation
+def : Pat<(add v4i32:$A, (build_vector (i32 1), (i32 1), (i32 1), (i32 1))),
+          (VSUBUWM $A, (v4i32 (COPY_TO_REGCLASS (XXLEQVOnes), VSRC)))>;
+
 // Splat loads.
 def : Pat<(v8i16 (PPCldsplat ForceXForm:$A)),
           (v8i16 (VSPLTHs 3, (MTVSRWZ (LHZX ForceXForm:$A))))>;

``````````

</details>


https://github.com/llvm/llvm-project/pull/160882