[clang] [llvm] [RegisterCoalescer] Improve register allocation for return values by limiting rematerialization (PR #163047)

guan jian via cfe-commits cfe-commits at lists.llvm.org
Sun Oct 19 08:18:25 PDT 2025


================
@@ -202,13 +202,13 @@ define { <4 x i8>, <4 x i1> } @always_usub_const_vector() nounwind {
 ; SSE-LABEL: always_usub_const_vector:
 ; SSE:       # %bb.0:
 ; SSE-NEXT:    pcmpeqd %xmm0, %xmm0
-; SSE-NEXT:    pcmpeqd %xmm1, %xmm1
+; SSE-NEXT:    movdqa %xmm0, %xmm1
----------------
rez5427 wrote:

Is there a way to match a zero idiom? I found a case in `llvm/test/CodeGen/X86/bfloat.ll` where the output changes like this:
```
diff --git a/llvm/test/CodeGen/X86/bfloat.ll b/llvm/test/CodeGen/X86/bfloat.ll
index 684e2921b789..170774b3612a 100644
--- a/llvm/test/CodeGen/X86/bfloat.ll
+++ b/llvm/test/CodeGen/X86/bfloat.ll
@@ -815,9 +815,9 @@ define <32 x bfloat> @pr63017() {
 ; SSE2-LABEL: pr63017:
 ; SSE2:       # %bb.0:
 ; SSE2-NEXT:    xorps %xmm0, %xmm0
-; SSE2-NEXT:    xorps %xmm1, %xmm1
-; SSE2-NEXT:    xorps %xmm2, %xmm2
-; SSE2-NEXT:    xorps %xmm3, %xmm3
+; SSE2-NEXT:    movaps %xmm0, %xmm1
+; SSE2-NEXT:    movaps %xmm0, %xmm2
+; SSE2-NEXT:    movaps %xmm0, %xmm3
 ; SSE2-NEXT:    retq
```
I think this is definitely a regression. I printed the MIR before the register coalescer:
```
0B	bb.0 (%ir-block.0):
16B	  %0:vr128 = V_SET0
32B	  $xmm0 = COPY %0:vr128
48B	  $xmm1 = COPY %0:vr128
64B	  $xmm2 = COPY %0:vr128
80B	  $xmm3 = COPY %0:vr128
96B	  RET 0, killed $xmm0, killed $xmm1, killed $xmm2, killed $xmm3
```
We need to match this V_SET0, but I don't know how. It appears to be an x86-specific instruction.

```
  virtual bool isZeroIdiom(const MachineInstr *MI, APInt &Mask) const {
    return false;
  }
```
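To make the idea concrete, here is a toy, standalone C++ sketch of how such a hook could steer the decision. It does not use any LLVM API: `Inst`, `Opcode`, `TargetHooks`, `ToyX86Hooks`, and `lowerReturnCopies` are all hypothetical names made up for illustration, and only the opcode name `V_SET0` comes from the MIR dump above. The assumption is that a target override reports zero idioms, and that only those are re-emitted per return register while everything else is copied from the first register.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy opcodes; only the name V_SET0 is taken from the MIR above.
enum class Opcode { V_SET0, V_SETALLONES, LOAD, COPY };

// Hypothetical, minimal stand-in for a machine instruction.
struct Inst {
  Opcode Op;
  std::string Dst; // register being defined
  std::string Src; // source register for COPY, empty otherwise
};

// Hypothetical target hook interface: the default knows no zero idioms.
struct TargetHooks {
  virtual ~TargetHooks() = default;
  virtual bool isZeroIdiom(const Inst &) const { return false; }
};

// Hypothetical x86-like override that recognizes V_SET0 as a zero idiom.
struct ToyX86Hooks : TargetHooks {
  bool isZeroIdiom(const Inst &I) const override {
    return I.Op == Opcode::V_SET0;
  }
};

// Materialize Def into every register in PhysRegs. The first register always
// gets the defining instruction. Each later register gets a fresh copy of the
// defining instruction if it is a zero idiom, and a register COPY otherwise.
std::vector<Inst> lowerReturnCopies(const TargetHooks &TII, const Inst &Def,
                                    const std::vector<std::string> &PhysRegs) {
  std::vector<Inst> Out;
  if (PhysRegs.empty())
    return Out;
  Out.push_back({Def.Op, PhysRegs[0], ""});
  for (std::size_t I = 1; I < PhysRegs.size(); ++I) {
    if (TII.isZeroIdiom(Def))
      Out.push_back({Def.Op, PhysRegs[I], ""}); // rematerialize the idiom
    else
      Out.push_back({Opcode::COPY, PhysRegs[I], PhysRegs[0]});
  }
  return Out;
}
```

With `ToyX86Hooks`, a `V_SET0` definition returned in four registers is re-emitted four times, which corresponds to the four `xorps` lines in the old test output. With the default hook, the later registers are filled by copies instead.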


https://github.com/llvm/llvm-project/pull/163047

