[PATCH] D89578: [CostModel] Return TCC_Expensive for non-speculatable ctlz/cttz.
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 19 14:08:47 PDT 2020
spatel added a comment.
This might be restoring the old behavior, but I want to make sure we are OK with the potential regressions. If not, we should adjust the x86 cost model first.
The SLP v4i32 diff with zero-is-undef set would be something like this, and it's hard to justify IMO:
bsfl src32(%rip), %eax
bsfl src32+4(%rip), %ecx
bsfl src32+8(%rip), %edx
bsfl src32+12(%rip), %esi
movl %eax, dst32(%rip)
movl %ecx, dst32+4(%rip)
movl %edx, dst32+8(%rip)
movl %esi, dst32+12(%rip)
retq
vs.
movdqa src32(%rip), %xmm0
pcmpeqd %xmm1, %xmm1
paddd %xmm0, %xmm1
pandn %xmm1, %xmm0
movdqa %xmm0, %xmm1
psrlw $1, %xmm1
pand .LCPI0_0(%rip), %xmm1
psubb %xmm1, %xmm0
movdqa .LCPI0_1(%rip), %xmm1 # xmm1 = [51,51,51,51,51,51,51,51,51,51,51,51,51,51,51,51]
movdqa %xmm0, %xmm2
pand %xmm1, %xmm2
psrlw $2, %xmm0
pand %xmm1, %xmm0
paddb %xmm2, %xmm0
movdqa %xmm0, %xmm1
psrlw $4, %xmm1
paddb %xmm0, %xmm1
pand .LCPI0_2(%rip), %xmm1
pxor %xmm0, %xmm0
movdqa %xmm1, %xmm2
punpckhdq %xmm0, %xmm2 # xmm2 = xmm2[2],xmm0[2],xmm2[3],xmm0[3]
psadbw %xmm0, %xmm2
punpckldq %xmm0, %xmm1 # xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1]
psadbw %xmm0, %xmm1
packuswb %xmm2, %xmm1
movdqa %xmm1, dst32(%rip)
retq
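For readers following the second listing: it appears to compute the trailing-zero count per lane as popcount(~x & (x - 1)), with the popcount done by the usual byte-wise bit-slicing and a final psadbw to sum the bytes. Below is a minimal scalar sketch of one 32-bit lane, not the actual lowering code. The function names are made up for illustration, and the contents of .LCPI0_0 and .LCPI0_2 are not shown in the listing, so the 0x55 and 0x0F byte masks are assumptions (only the 51 = 0x33 mask is visible).

```python
MASK32 = 0xFFFFFFFF


def popcount32_swar(v):
    """Bit-sliced popcount of one 32-bit lane (hypothetical helper name)."""
    v &= MASK32
    # psrlw $1 / pand / psubb: 2-bit partial sums (0x55 mask is assumed).
    v = v - ((v >> 1) & 0x55555555)
    # pand / psrlw $2 / pand / paddb: 4-bit partial sums (0x33 = 51, as in the listing).
    v = (v & 0x33333333) + ((v >> 2) & 0x33333333)
    # psrlw $4 / paddb / pand: one count per byte (0x0F mask is assumed).
    v = (v + (v >> 4)) & 0x0F0F0F0F
    # psadbw against zero: horizontal sum of the per-byte counts.
    return sum((v >> shift) & 0xFF for shift in (0, 8, 16, 24))


def cttz32_via_popcount(x):
    """Trailing-zero count of one 32-bit lane via popcount(~x & (x - 1))."""
    x &= MASK32
    below = (x - 1) & MASK32            # pcmpeqd + paddd: x + (-1)
    return popcount32_swar(~x & below)  # pandn: ~x & (x - 1)


if __name__ == "__main__":
    for value in (1, 8, 0x80000000, 0):
        print(hex(value), cttz32_via_popcount(value))
```

Note that this form also yields 32 for a zero input, so the vector sequence gains nothing from zero-is-undef, whereas the scalar bsfl version depends on it.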
================
Comment at: llvm/test/CodeGen/X86/dagcombine-select.ll:438-441
+; NOBMI-NEXT: testl %edi, %edi
+; NOBMI-NEXT: je .LBB26_2
+; NOBMI-NEXT: # %bb.1: # %select.false.sink
+; NOBMI-NEXT: bsfl %edi, %eax
----------------
I think we would view this as a regression based on https://bugs.llvm.org/PR46203 / 2328cab16ccd8f17fee782c29fb844662c089fbb
Do we need to adjust the isCheapToSpeculateXXXX APIs to acknowledge the zero-is-undef parameter of the intrinsic?
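To make the question concrete, here is a small model of why the flag matters on a target without BMI. It is only a sketch: bsf32 and cttz32 are hypothetical names, and raising an exception is just a stand-in for the undefined BSF result on a zero source. The value 32 on the zero path comes from the intrinsic's defined-at-zero semantics (the bit width), not from the quoted CHECK lines.

```python
def bsf32(x):
    """Model of x86 BSF on a 32-bit value: index of the lowest set bit."""
    x &= 0xFFFFFFFF
    if x == 0:
        # Stand-in for "destination is undefined when the source is zero".
        raise ValueError("BSF result is undefined for a zero source")
    return (x & -x).bit_length() - 1


def cttz32(x, zero_is_undef):
    """Model of a 32-bit cttz lowered with BSF only (no TZCNT)."""
    if zero_is_undef:
        # The caller promises x != 0, so a bare BSF is enough: no branch.
        return bsf32(x)
    # Otherwise zero must be handled explicitly, which is the
    # test/je guard in front of the bsfl in the quoted diff.
    x &= 0xFFFFFFFF
    return 32 if x == 0 else bsf32(x)
```

Under this model the zero-is-undef form is a single instruction and the defined-at-zero form needs the guard, which is the distinction a flag-aware speculation hook could take into account.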
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D89578/new/
https://reviews.llvm.org/D89578