[PATCH] D120193: [X86][SSE] Attempt to lower vec_reduce_add patterns with PSADBW for zero-extended vXi8 sources

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Feb 19 13:14:08 PST 2022


RKSimon created this revision.
RKSimon added reviewers: pengfei, craig.topper, spatel, lebedev.ri.
Herald added a subscriber: hiraditya.
RKSimon requested review of this revision.
Herald added a project: LLVM.

For i16/32/64 vectors, if the upper bits are known to be zero, then we can try to truncate to vXi8 (if it's worth it) and perform this as a PSADBW against zero to add+zext each v8i8 subvector to an i64 sum, which we can then reduce together.
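
As a rough illustration of the idea (not the code in the patch), here is a scalar C++ model of the reduction. The helper names psadbw_lane, reduce_add_via_psadbw and reduce_add_reference are hypothetical and exist only for this sketch. It assumes the i32 elements already have their upper 24 bits clear and that the element count is a multiple of 8, and it models each 64-bit PSADBW lane as the sum of absolute differences of 8 unsigned bytes against a zero operand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Scalar model of one 64-bit lane of PSADBW: the sum of |a[i] - b[i]| over
// 8 unsigned bytes, zero-extended to 64 bits.
uint64_t psadbw_lane(const uint8_t *a, const uint8_t *b) {
  uint64_t sum = 0;
  for (int i = 0; i < 8; ++i)
    sum += static_cast<uint64_t>(std::abs(int(a[i]) - int(b[i])));
  return sum;
}

// Hypothetical helper: reduce-add of i32 elements whose upper bits are known
// to be zero. Each group of 8 elements is truncated to bytes and summed with
// one modelled PSADBW lane against zero; the lane sums are then added up.
uint64_t reduce_add_via_psadbw(const std::vector<uint32_t> &v) {
  assert(v.size() % 8 == 0 && "sketch assumes whole 8-byte lanes");
  const uint8_t zero[8] = {};
  uint64_t total = 0;
  for (std::size_t i = 0; i < v.size(); i += 8) {
    uint8_t bytes[8];
    for (std::size_t j = 0; j < 8; ++j) {
      assert(v[i + j] <= 0xFF && "upper bits must be known zero");
      bytes[j] = static_cast<uint8_t>(v[i + j]); // lossless truncation
    }
    total += psadbw_lane(bytes, zero); // |x - 0| == x for unsigned bytes
  }
  return total;
}

// Plain element-by-element sum, used only to cross-check the sketch.
uint64_t reduce_add_reference(const std::vector<uint32_t> &v) {
  uint64_t total = 0;
  for (uint32_t x : v)
    total += x;
  return total;
}
```

Because every byte is compared against zero, the absolute difference is just the byte value, so each lane yields the zero-extended sum of its 8 bytes. The real lowering works on SelectionDAG nodes and has to decide whether the truncation is profitable; none of that is modelled here.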

This addresses some of the PR42674 test cases where the source data was vXi8 but had been extended to match a wider unsigned integer accumulator.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D120193

Files:
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/test/CodeGen/X86/vector-reduce-add-mask.ll
  llvm/test/CodeGen/X86/vector-reduce-add-zext.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D120193.410103.patch
Type: text/x-patch
Size: 119363 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220219/34d4c847/attachment-0001.bin>

