[PATCH] D52109: [TwoAddressInstructionPass] Don't update SrcRegMap for copies inserted for tied register constraint when the src isn't killed
Craig Topper via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 21 21:08:42 PST 2019
craig.topper marked an inline comment as done.
craig.topper added inline comments.
================
Comment at: test/CodeGen/X86/psubus.ll:1435
+; SSE2-NEXT: punpckhwd {{.*#+}} xmm3 = xmm3[4],xmm0[4],xmm3[5],xmm0[5],xmm3[6],xmm0[6],xmm3[7],xmm0[7]
+; SSE2-NEXT: movdqa {{.*#+}} xmm0 = [2147483648,2147483648,2147483648,2147483648]
; SSE2-NEXT: movdqa %xmm2, %xmm6
----------------
RKSimon wrote:
> This is a pity - why did the zero vector reuse a live register?
This is the machine code just after register coalescing, with the new code from this patch applied.
```
16B %2:vr128 = COPY $xmm2
32B %1:vr128 = COPY $xmm1
48B %11:vr128 = COPY $xmm0
64B %3:vr128 = V_SET0
80B %17:vr128 = COPY %11:vr128
96B %17:vr128 = PUNPCKLWDrr %17:vr128(tied-def 0), %3:vr128
128B %11:vr128 = PUNPCKHWDrr %11:vr128(tied-def 0), %3:vr128
144B %15:vr128 = MOVAPSrm $rip, 1, $noreg, %const.0, $noreg :: (load 16 from constant-pool)
160B %7:vr128 = COPY %2:vr128
176B %7:vr128 = PXORrr %7:vr128(tied-def 0), %15:vr128
192B %9:vr128 = COPY %11:vr128
208B %9:vr128 = PORrr %9:vr128(tied-def 0), %15:vr128
240B %9:vr128 = PCMPGTDrr %9:vr128(tied-def 0), %7:vr128
272B %11:vr128 = PANDrr %11:vr128(tied-def 0), %9:vr128
304B %9:vr128 = PANDNrr %9:vr128(tied-def 0), %2:vr128
336B %9:vr128 = PORrr %9:vr128(tied-def 0), %11:vr128
352B %13:vr128 = COPY %1:vr128
368B %13:vr128 = PXORrr %13:vr128(tied-def 0), %15:vr128
400B %15:vr128 = PORrr %15:vr128(tied-def 0), %17:vr128
432B %15:vr128 = PCMPGTDrr %15:vr128(tied-def 0), %13:vr128
464B %17:vr128 = PANDrr %17:vr128(tied-def 0), %15:vr128
496B %15:vr128 = PANDNrr %15:vr128(tied-def 0), %1:vr128
528B %15:vr128 = PORrr %15:vr128(tied-def 0), %17:vr128
560B %15:vr128 = PSUBDrr %15:vr128(tied-def 0), %1:vr128
592B %9:vr128 = PSUBDrr %9:vr128(tied-def 0), %2:vr128
624B %9:vr128 = PSLLDri %9:vr128(tied-def 0), 16
656B %9:vr128 = PSRADri %9:vr128(tied-def 0), 16
688B %15:vr128 = PSLLDri %15:vr128(tied-def 0), 16
720B %15:vr128 = PSRADri %15:vr128(tied-def 0), 16
752B %15:vr128 = PACKSSDWrr %15:vr128(tied-def 0), %9:vr128
768B $xmm0 = COPY %15:vr128
```
Given this order of instructions, if xmm0 weren't available for %15 to use, the COPY on the last line (768B) would have become a real move instead. So we'd just be trading the copy at 80B for the copy at 768B.
The original code commuted the PORrr at 400B, which made %17 the register used by the last copy instead.
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D52109/new/
https://reviews.llvm.org/D52109