[llvm] [X86] Use GFNI for vXi8 shifts/rotates (PR #89115)
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Mon Apr 22 14:08:00 PDT 2024
================
@@ -98,34 +72,25 @@ define <16 x i8> @splatconstant_ashr_v16i8(<16 x i8> %a) nounwind {
define <32 x i8> @splatconstant_shl_v32i8(<32 x i8> %a) nounwind {
; GFNISSE-LABEL: splatconstant_shl_v32i8:
; GFNISSE: # %bb.0:
-; GFNISSE-NEXT: psllw $6, %xmm0
-; GFNISSE-NEXT: movdqa {{.*#+}} xmm2 = [192,192,192,192,192,192,192,192,192,192,192,192,192,192,192,192]
-; GFNISSE-NEXT: pand %xmm2, %xmm0
-; GFNISSE-NEXT: psllw $6, %xmm1
-; GFNISSE-NEXT: pand %xmm2, %xmm1
+; GFNISSE-NEXT: pmovsxwq {{.*#+}} xmm2 = [258,258]
+; GFNISSE-NEXT: gf2p8affineqb $0, %xmm2, %xmm0
+; GFNISSE-NEXT: gf2p8affineqb $0, %xmm2, %xmm1
; GFNISSE-NEXT: retq
;
; GFNIAVX1-LABEL: splatconstant_shl_v32i8:
; GFNIAVX1: # %bb.0:
-; GFNIAVX1-NEXT: vextractf128 $1, %ymm0, %xmm1
-; GFNIAVX1-NEXT: vpsllw $6, %xmm1, %xmm1
-; GFNIAVX1-NEXT: vbroadcastss {{.*#+}} xmm2 = [192,192,192,192,192,192,192,192,192,192,192,192,192,192,192,192]
-; GFNIAVX1-NEXT: vpand %xmm2, %xmm1, %xmm1
-; GFNIAVX1-NEXT: vpsllw $6, %xmm0, %xmm0
-; GFNIAVX1-NEXT: vpand %xmm2, %xmm0, %xmm0
-; GFNIAVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0
+; GFNIAVX1-NEXT: vgf2p8affineqb $0, {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %ymm0, %ymm0
; GFNIAVX1-NEXT: retq
;
; GFNIAVX2-LABEL: splatconstant_shl_v32i8:
; GFNIAVX2: # %bb.0:
-; GFNIAVX2-NEXT: vpsllw $6, %ymm0, %ymm0
-; GFNIAVX2-NEXT: vpand {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %ymm0, %ymm0
----------------
RKSimon wrote:
This one I think we can accept: shift + bitop + (folded) vector load is worse than gf2p8affineqb + broadcast load:
https://llvm.godbolt.org/z/nbndPjMTv
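
For reference (not part of the patch), a minimal C sketch of why the 258 (0x0102) qword constant in the new output encodes "shift every byte left by 6" for gf2p8affineqb. The helper name shl6_bytes is made up for illustration; the intrinsic is the standard GFNI one from immintrin.h and the snippet assumes a compiler with -mgfni:

// gf2p8affineqb computes, per source byte x and per result bit i:
//   dst.bit[i] = parity(matrix.byte[7-i] & x) ^ imm8.bit[i]
// For x << 6 only result bits 7 and 6 are live:
//   bit 7 = x.bit[1]  ->  matrix.byte[0] = 0x02
//   bit 6 = x.bit[0]  ->  matrix.byte[1] = 0x01
// so the matrix qword is 0x0102 == 258, splatted to every qword lane.
#include <immintrin.h>
#include <stdio.h>

static __m128i shl6_bytes(__m128i x) {         // hypothetical helper name
  const __m128i m = _mm_set1_epi64x(0x0102);   // 258 per qword, as in the test
  return _mm_gf2p8affine_epi64_epi8(x, m, 0);
}

int main(void) {
  unsigned char out[16];
  _mm_storeu_si128((__m128i *)out, shl6_bytes(_mm_set1_epi8(0x03)));
  printf("0x%02x\n", out[0]);                  // prints 0xc0, i.e. 0x03 << 6
  return 0;
}

Because the matrix is a single splatted constant, the AVX paths above can fold it straight from memory into vgf2p8affineqb, which is the broadcast-load advantage the comment refers to.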
https://github.com/llvm/llvm-project/pull/89115