[PATCH] D77152: [SelectionDAG] Better legalization for FSHL and FSHR

Mon Aug 24 01:16:03 PDT 2020

foad added inline comments.

================
Comment at: llvm/test/CodeGen/X86/vector-fshl-512.ll:672
+; AVX512F-NEXT:    vpand {{.*}}(%rip), %xmm2, %xmm3
+; AVX512F-NEXT:    vpmovzxwq {{.*#+}} xmm3 = xmm3[0],zero,zero,zero,xmm3[1],zero,zero,zero
+; AVX512F-NEXT:    vextracti64x4 $1, %zmm0, %ymm4
----------------
xbolva00 wrote:
> Is this a better sequence? Looks longer.
I agree it's longer, but my knowledge of AVX-512 is limited. It's using this expansion of fshl: (X << (Z % BW)) | ((Y >> 1) >> (BW - 1 - (Z % BW))),
but it looks like the "Y >> 1 >> expr" part is generating messy code. Compared to the old code:
```
vpsrlw %xmm4, %ymm5, %ymm5
vpsrlw %xmm4, %ymm1, %ymm1
```
I would expect to see just a couple more instructions here:
```
vpsrlw $1, %ymm5, %ymm5
vpsrlw %xmm4, %ymm5, %ymm5
vpsrlw $1, %ymm1, %ymm1
vpsrlw %xmm4, %ymm1, %ymm1
```
which would be offset by not needing the vpcmpeqw/vpternlogq at the end.
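
For reference, here is a minimal sketch that checks the expansion quoted above against a straightforward funnel-shift-left model. It is only an illustration: the function names, the choice of Python, and BW = 16 (picked to match the word-sized vpsrlw lanes) are assumptions, not code from the patch. It shows that both shift amounts stay in the range [0, BW - 1] even when Z % BW is zero, which is, as far as I can tell, why the compare-and-select at the end is no longer needed.
```python
BW = 16               # assumed element width in bits (word lanes)
MASK = (1 << BW) - 1  # keeps values within one BW-bit element


def fshl_reference(x, y, z):
    """Model of fshl: the high BW bits of the 2*BW-bit value (x:y) << (z % BW)."""
    s = z % BW
    concat = (x << BW) | y
    return ((concat << s) >> BW) & MASK


def fshl_expanded(x, y, z):
    """The expansion from the comment: X << (Z % BW) | Y >> 1 >> (BW - 1 - (Z % BW))."""
    s = z % BW
    hi = (x << s) & MASK           # shift amount is in [0, BW - 1]
    lo = (y >> 1) >> (BW - 1 - s)  # second shift amount is also in [0, BW - 1]
    return hi | lo


if __name__ == "__main__":
    # Compare the two forms over a few element values and every shift amount,
    # including amounts >= BW and the Z % BW == 0 case.
    samples = (0x0000, 0x0001, 0x1234, 0x8000, 0xABCD, 0xFFFF)
    for x in samples:
        for y in samples:
            for z in range(3 * BW):
                assert fshl_expanded(x, y, z) == fshl_reference(x, y, z)
    print(hex(fshl_expanded(0x1234, 0xABCD, 4)))  # prints 0x234a
```
When Z % BW is zero, the "Y >> 1 >> (BW - 1)" part shifts every bit of Y out, so the result is just X without ever shifting by BW.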

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77152/new/

https://reviews.llvm.org/D77152