[llvm] [SelectionDAG] Treat CopyFromReg as freezing the value (PR #85932)

Thu Mar 21 02:37:29 PDT 2024

================
@@ -202,13 +202,27 @@ define <4 x i32> @freeze_add_vec(<4 x i32> %a0) nounwind {
 define <4 x i32> @freeze_add_vec_undef(<4 x i32> %a0) nounwind {
 ; X86-LABEL: freeze_add_vec_undef:
 ; X86:       # %bb.0:
+; X86-NEXT:    pushl %ebp
+; X86-NEXT:    movl %esp, %ebp
+; X86-NEXT:    andl $-16, %esp
+; X86-NEXT:    subl $32, %esp
+; X86-NEXT:    movl %eax, {{[0-9]+}}(%esp)
+; X86-NEXT:    movl $3, {{[0-9]+}}(%esp)
+; X86-NEXT:    movl $2, {{[0-9]+}}(%esp)
+; X86-NEXT:    movl $1, (%esp)
+; X86-NEXT:    paddd (%esp), %xmm0
 ; X86-NEXT:    paddd {{\.?LCPI[0-9]+_[0-9]+}}, %xmm0
-; X86-NEXT:    paddd {{\.?LCPI[0-9]+_[0-9]+}}, %xmm0
+; X86-NEXT:    movl %ebp, %esp
+; X86-NEXT:    popl %ebp
----------------
bjope wrote:

I think the more general problem here is that we now fold freeze over BUILD_VECTOR more often.
Right here, with this patch, we get
```
        t2: v4i32,ch = CopyFromReg t0, Register:v4i32 %0
          t18: i32 = freeze undef:i32
        t19: v4i32 = BUILD_VECTOR Constant:i32<1>, Constant:i32<2>, Constant:i32<3>, t18
      t8: v4i32 = add t2, t19

```
instead of
```
        t2: v4i32,ch = CopyFromReg t0, Register:v4i32 %0
          t7: v4i32 = BUILD_VECTOR Constant:i32<1>, Constant:i32<2>, Constant:i32<3>, undef:i32
        t8: v4i32 = add t2, t7
      t9: v4i32 = freeze t8
```

And then I guess there might be several places where we handle undef operands of a BUILD_VECTOR, but do not handle "isFreezeUndef" operands equally well. For example, isBuildVectorOfConstantSDNodes would fail when there is a frozen undef operand, while it would return true if some operands are just undef.
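
To make the asymmetry concrete, here is a minimal toy model, not the actual SelectionDAG code. All types and function names in it (`ToyOperand`, `toyIsBuildVectorOfConstants`, and so on) are hypothetical stand-ins; it only mimics the behaviour described above, where a plain undef operand is tolerated but a frozen undef operand is not.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for BUILD_VECTOR operands. These are hypothetical types for
// illustration only, not the real SDNode/SDValue classes.
enum class OpKind { Constant, Undef, FreezeUndef, Other };

struct ToyOperand {
  OpKind kind;
  int value; // only meaningful when kind == OpKind::Constant
};

using ToyBuildVector = std::vector<ToyOperand>;

// Models the behaviour described above: plain undef operands are skipped,
// but any other non-constant operand (including a frozen undef) fails.
bool toyIsBuildVectorOfConstants(const ToyBuildVector &ops) {
  for (const ToyOperand &op : ops) {
    if (op.kind == OpKind::Undef)
      continue;
    if (op.kind != OpKind::Constant)
      return false;
  }
  return true;
}

// Hypothetical relaxed variant that also skips frozen undef operands.
// Whether this is safe depends on what each caller does with the result.
bool toyIsBuildVectorOfConstantsOrFrozenUndef(const ToyBuildVector &ops) {
  for (const ToyOperand &op : ops) {
    if (op.kind == OpKind::Undef || op.kind == OpKind::FreezeUndef)
      continue;
    if (op.kind != OpKind::Constant)
      return false;
  }
  return true;
}

// Shape of t7 in the dump without the patch: <1, 2, 3, undef>.
ToyBuildVector makeT7() {
  return {{OpKind::Constant, 1},
          {OpKind::Constant, 2},
          {OpKind::Constant, 3},
          {OpKind::Undef, 0}};
}

// Shape of t19 in the dump with the patch: <1, 2, 3, freeze undef>.
ToyBuildVector makeT19() {
  return {{OpKind::Constant, 1},
          {OpKind::Constant, 2},
          {OpKind::Constant, 3},
          {OpKind::FreezeUndef, 0}};
}

// Checks the two shapes against both predicates.
void checkDumps() {
  assert(toyIsBuildVectorOfConstants(makeT7()));
  assert(!toyIsBuildVectorOfConstants(makeT19()));
  assert(toyIsBuildVectorOfConstantsOrFrozenUndef(makeT19()));
}
```

In this model the strict predicate accepts the `t7` shape but rejects the `t19` shape, which is the kind of mismatch that could keep later combines from firing. The relaxed variant is only a sketch of one possible direction; each real caller would need to be audited separately.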

Fixing all the regressions that would pop up after merging https://github.com/llvm/llvm-project/pull/84924 seems really complicated (and a lot of work), but considering that different users have suffered from miscompiles, I think we might need to accept some regressions ("slow but correct" is often better than "fast but wrong", IMHO).

https://github.com/llvm/llvm-project/pull/85932