[all-commits] [llvm/llvm-project] fdbc30: [X86][DAGCombiner][SelectionDAG] - Fold Zext Build...
Rohit Aggarwal via All-commits
all-commits at lists.llvm.org
Tue May 6 00:25:32 PDT 2025
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: fdbc30a383973d89d738283e733ba0db98df6a77
https://github.com/llvm/llvm-project/commit/fdbc30a383973d89d738283e733ba0db98df6a77
Author: Rohit Aggarwal <44664450+rohitaggarwal007 at users.noreply.github.com>
Date: 2025-05-06 (Tue, 06 May 2025)
Changed paths:
M llvm/lib/Target/X86/X86ISelLowering.cpp
M llvm/test/CodeGen/X86/buildvec-widen-dotproduct.ll
Log Message:
-----------
[X86][DAGCombiner][SelectionDAG] - Fold Zext Build Vector to Bitcast of widen Build Vector (#135010)
I am working on a problem in which a kernel is SLP vectorized and lead
to generation of insertelements followed by zext. On lowering, the
assembly looks like below:
vmovd %r9d, %xmm0
vpinsrb $1, (%rdi,%rsi), %xmm0, %xmm0
vpinsrb $2, (%rdi,%rsi,2), %xmm0, %xmm0
vpinsrb $3, (%rdi,%rcx), %xmm0, %xmm0
vpinsrb $4, (%rdi,%rsi,4), %xmm0, %xmm0
vpinsrb $5, (%rdi,%rax), %xmm0, %xmm0
vpinsrb $6, (%rdi,%rcx,2), %xmm0, %xmm0
vpinsrb $7, (%rdi,%r8), %xmm0, %xmm0
vpmovzxbw %xmm0, %xmm0 # xmm0 =
xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero
vpmaddwd (%rdx), %xmm0, %xmm0
After all the insrb, xmm0 looks like
xmm0=xmm0[0],xmm0[1],xmm0[2],xmm0[3],xmm0[4],xmm0[5],xmm0[6],xmm0[7],zero,zero,zero,zero,zero,zero,zero,zero
Here vpmovzxbw perform the extension of i8 to i16. But it is expensive
operation and I want to remove it.
Optimization
Place the value in correct location while inserting so that zext can be
avoid.
While lowering, we can write a custom lowerOperation for
zero_extend_vector_inreg opcode. We can override the current default
operation with my custom in the legalization step.
The changes proposed are state below:
vpinsrb $2, (%rdi,%rsi), %xmm0, %xmm0
vpinsrb $4, (%rdi,%rsi,2), %xmm0, %xmm0
vpinsrb $6, (%rdi,%rcx), %xmm0, %xmm0
vpinsrb $8, (%rdi,%rsi,4), %xmm0, %xmm0
vpinsrb $a, (%rdi,%rax), %xmm0, %xmm0
vpinsrb $c, (%rdi,%rcx,2), %xmm0, %xmm0
vpinsrb $e, (%rdi,%r8), %xmm0, %xmm0 # xmm0 =
xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero
vpmaddwd (%rdx), %xmm0, %xmm0
More details in the discourse topic
[https://discourse.llvm.org/t/improve-the-gathering-of-the-elements-so-that-unwanted-ext-operations-can-be-avoided/85443](url)
---------
Co-authored-by: Rohit Aggarwal <Rohit.Aggarwal at amd.com>
To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications
More information about the All-commits
mailing list