[clang] [llvm] [RegisterCoalescer] Improve register allocation for return values by limiting rematerialization (PR #163047)
guan jian via cfe-commits
cfe-commits at lists.llvm.org
Sun Oct 19 08:18:25 PDT 2025
================
@@ -202,13 +202,13 @@ define { <4 x i8>, <4 x i1> } @always_usub_const_vector() nounwind {
; SSE-LABEL: always_usub_const_vector:
; SSE: # %bb.0:
; SSE-NEXT: pcmpeqd %xmm0, %xmm0
-; SSE-NEXT: pcmpeqd %xmm1, %xmm1
+; SSE-NEXT: movdqa %xmm0, %xmm1
----------------
rez5427 wrote:
Is there a way to match a zero idiom? I found a case in `llvm/test/CodeGen/X86/bfloat.ll` where the output changes like this:
```
diff --git a/llvm/test/CodeGen/X86/bfloat.ll b/llvm/test/CodeGen/X86/bfloat.ll
index 684e2921b789..170774b3612a 100644
--- a/llvm/test/CodeGen/X86/bfloat.ll
+++ b/llvm/test/CodeGen/X86/bfloat.ll
@@ -815,9 +815,9 @@ define <32 x bfloat> @pr63017() {
; SSE2-LABEL: pr63017:
; SSE2: # %bb.0:
; SSE2-NEXT: xorps %xmm0, %xmm0
-; SSE2-NEXT: xorps %xmm1, %xmm1
-; SSE2-NEXT: xorps %xmm2, %xmm2
-; SSE2-NEXT: xorps %xmm3, %xmm3
+; SSE2-NEXT: movaps %xmm0, %xmm1
+; SSE2-NEXT: movaps %xmm0, %xmm2
+; SSE2-NEXT: movaps %xmm0, %xmm3
; SSE2-NEXT: retq
```
I think this is definitely a regression. I printed the MIR before the register coalescer:
```
0B bb.0 (%ir-block.0):
16B %0:vr128 = V_SET0
32B $xmm0 = COPY %0:vr128
48B $xmm1 = COPY %0:vr128
64B $xmm2 = COPY %0:vr128
80B $xmm3 = COPY %0:vr128
96B RET 0, killed $xmm0, killed $xmm1, killed $xmm2, killed $xmm3
```
We need to match this `V_SET0`, but I don't know how. It seems to be an x86-specific instruction.
```
virtual bool isZeroIdiom(const MachineInstr *MI, APInt &Mask) const {
return false;
}
```
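To make the intent concrete, here is a toy, self-contained sketch of the decision such a hook could drive. Everything in it is hypothetical: `ToyOpcode`, `ToyDef`, `isZeroIdiomToy`, and `planCopies` are illustration-only names, not LLVM classes or the code in this PR. The assumption is that a zero idiom stays rematerialized for every destination register, while any other cheap def is computed once and then copied.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not LLVM types.
enum class ToyOpcode { ZeroingPseudo, OtherCheapDef };

struct ToyDef {
  ToyOpcode Op;
};

// Toy counterpart of the hook quoted above: a target would recognise
// its own zeroing pseudo-instruction here.
bool isZeroIdiomToy(const ToyDef &D) {
  return D.Op == ToyOpcode::ZeroingPseudo;
}

// For each destination register, decide whether to re-emit the defining
// instruction ("remat") or to copy from the first destination ("copy").
std::vector<std::string> planCopies(const ToyDef &D,
                                    const std::vector<std::string> &Dests) {
  std::vector<std::string> Plan;
  for (std::size_t I = 0; I < Dests.size(); ++I) {
    // The first destination always needs the real definition; later ones
    // are rematerialized only when the def is a zero idiom.
    if (I == 0 || isZeroIdiomToy(D))
      Plan.push_back("remat " + Dests[I]);
    else
      Plan.push_back("copy " + Dests[0] + " -> " + Dests[I]);
  }
  return Plan;
}
```

With a zeroing def and destinations `xmm0` to `xmm3`, this plan keeps four independent definitions, which corresponds to the original `xorps` sequence in `bfloat.ll`. With any other def it produces one definition followed by copies.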
https://github.com/llvm/llvm-project/pull/163047