[llvm-bugs] [Bug 40869] New: [X86] Poor broadcast folding from ext/trunc loads
via llvm-bugs
llvm-bugs at lists.llvm.org
Tue Feb 26 09:17:11 PST 2019
https://bugs.llvm.org/show_bug.cgi?id=40869
Bug ID: 40869
Summary: [X86] Poor broadcast folding from ext/trunc loads
Product: libraries
Version: trunk
Hardware: PC
OS: Windows NT
Status: NEW
Severity: enhancement
Priority: P
Component: Backend: X86
Assignee: unassignedbugs at nondot.org
Reporter: llvm-dev at redking.me.uk
CC: craig.topper at gmail.com, llvm-bugs at lists.llvm.org,
llvm-dev at redking.me.uk, spatel+llvm at rotateright.com
e.g. (from vector-shuffle-512-v32.ll)
llc < %s -mtriple=x86_64-apple-darwin -mcpu=skx
define <32 x i16> @insert_dup_elt1_mem_v32i16_i32(i32* %ptr) #0 {
; KNL-LABEL: insert_dup_elt1_mem_v32i16_i32:
; KNL: ## %bb.0:
; KNL-NEXT: vpbroadcastw 2(%rdi), %ymm0
; KNL-NEXT: vmovdqa %ymm0, %ymm1
; KNL-NEXT: retq
;
; SKX-LABEL: insert_dup_elt1_mem_v32i16_i32:
; SKX: ## %bb.0:
; SKX-NEXT: movzwl 2(%rdi), %eax
; SKX-NEXT: vpbroadcastw %eax, %zmm0
; SKX-NEXT: retq
%tmp = load i32, i32* %ptr, align 4
%tmp1 = insertelement <4 x i32> zeroinitializer, i32 %tmp, i32 0
%tmp2 = bitcast <4 x i32> %tmp1 to <8 x i16>
%tmp3 = shufflevector <8 x i16> %tmp2, <8 x i16> undef, <32 x i32> <i32 1,
i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1,
i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1,
i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1>
ret <32 x i16> %tmp3
}
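Notice how the KNL version (which uses an AVX2 ymm broadcast) folds the load
into vpbroadcastw, while the SKX (AVX512BWVL) version fails to fold it: it
loads the element with movzwl and then broadcasts from the GPR.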
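
For reference, the fold is legal because on little-endian x86 element 1 of
the bitcast <8 x i16> is simply the 16 bits at byte offset 2 of the loaded
i32, which is why the KNL output can read 2(%rdi) directly. A minimal Python
sketch of that equivalence follows; it only models the IR semantics, and the
helper names are hypothetical, not anything in LLVM:

```python
import struct

def broadcast_via_i32_load(mem, lanes=32):
    """Model the IR: load i32, insert into a zero vector, bitcast, splat element 1."""
    # load i32 from offset 0 (little-endian, as on x86)
    (word,) = struct.unpack_from("<I", mem, 0)
    # insertelement into zeroinitializer <4 x i32>, then bitcast to <8 x i16>
    vec_i32 = [word, 0, 0, 0]
    vec_i16 = list(struct.unpack("<8H", struct.pack("<4I", *vec_i32)))
    # shufflevector whose mask is all 1s: every lane takes element 1
    return [vec_i16[1]] * lanes

def broadcast_via_i16_load(mem, lanes=32):
    """Model the folded form: read 16 bits at byte offset 2 and splat them."""
    (half,) = struct.unpack_from("<H", mem, 2)
    return [half] * lanes

# Arbitrary sample memory contents for the 4 bytes behind %ptr.
sample = bytes([0x11, 0x22, 0x33, 0x44])
assert broadcast_via_i32_load(sample) == broadcast_via_i16_load(sample)
```

Both helpers return the same 32 lanes for any 4-byte input, so a narrowed
16-bit broadcast load is a valid replacement for the wider load plus shuffle.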