[PATCH] D120193: [X86][SSE] Attempt to lower vec_reduce_add patterns with PSADBW for zero-extended vXi8 sources
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Feb 19 13:14:08 PST 2022
RKSimon created this revision.
RKSimon added reviewers: pengfei, craig.topper, spatel, lebedev.ri.
Herald added a subscriber: hiraditya.
RKSimon requested review of this revision.
Herald added a project: LLVM.
For i16/i32/i64 vectors, if the upper bits are known to be zero, then we can try to truncate to vXi8 (if it's worth it) and perform the reduction as a PSADBW against zero, which adds and zero-extends each v8i8 subvector to an i64 sum; those partial sums can then be reduced together.
This addresses some of the PR42674 test cases where the source data was vXi8 but had been extended to match a wider unsigned integer accumulator.
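As an illustration only (this is not code from the patch), the PSADBW idea can be sketched with SSE2 intrinsics. PSADBW against an all-zero operand sums each group of 8 unsigned bytes into a 64-bit lane, and the two lane sums are then added together. The helper names sum16_u8_psadbw and sum16_u8_scalar are hypothetical, and the sketch assumes an x86 target with SSE2 and an input of exactly 16 bytes.

```cpp
#include <cstdint>
#include <emmintrin.h> // SSE2 intrinsics (assumes an x86 target with SSE2)

// Hypothetical helper, not part of the patch: sum 16 unsigned bytes via PSADBW.
inline uint32_t sum16_u8_psadbw(const uint8_t *p) {
  // Unaligned load of 16 bytes.
  __m128i bytes = _mm_loadu_si128(reinterpret_cast<const __m128i *>(p));
  // PSADBW against zero: |b - 0| == b for unsigned bytes, so each 64-bit
  // lane now holds the sum of its 8 source bytes (at most 8 * 255 = 2040).
  __m128i sad = _mm_sad_epu8(bytes, _mm_setzero_si128());
  // Move the high 64-bit lane down and add it to the low lane.
  __m128i hi = _mm_unpackhi_epi64(sad, sad);
  __m128i total = _mm_add_epi64(sad, hi);
  // The total is at most 4080, so the low 32 bits hold the full result.
  return static_cast<uint32_t>(_mm_cvtsi128_si32(total));
}

// Scalar reference: zero-extend each byte and accumulate in a wider type,
// which is the shape of source pattern the description refers to.
inline uint32_t sum16_u8_scalar(const uint8_t *p) {
  uint32_t sum = 0;
  for (int i = 0; i < 16; ++i)
    sum += static_cast<uint32_t>(p[i]);
  return sum;
}
```

Both helpers should return the same value for any 16-byte input; the PSADBW form replaces the per-element zero-extend and wide adds with a single instruction plus one final lane add.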
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D120193
Files:
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/vector-reduce-add-mask.ll
llvm/test/CodeGen/X86/vector-reduce-add-zext.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D120193.410103.patch
Type: text/x-patch
Size: 119363 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220219/34d4c847/attachment-0001.bin>