[PATCH] D22814: [X86][SSE] Optimize the truncation of vector comparison results with PACKSS
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 26 09:03:55 PDT 2016
RKSimon created this revision.
RKSimon added reviewers: mkuper, delena, ab, spatel, andreadb.
RKSimon added a subscriber: llvm-commits.
RKSimon set the repository for this revision to rL LLVM.
We currently default to using either generic shuffles or MASK+PACKUS/PACKSS to truncate all integer vectors. For vector comparisons, we know that every element of the result is either all ones or all zeros, so it can be truncated efficiently by using PACKSS directly to repeatedly halve the size of each element.
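To illustrate why signed saturation is enough for comparison results, here is a minimal scalar sketch in C++. It models one PACKSS step from i16 to i8; the helper names sat_i16_to_i8 and packss_w_to_b are hypothetical and are not taken from the patch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Signed saturation of one i16 value to i8, as PACKSS does per element.
// (Hypothetical helper, for illustration only.)
int8_t sat_i16_to_i8(int16_t v) {
  if (v < -128) return -128;
  if (v > 127) return 127;
  return static_cast<int8_t>(v);
}

// Scalar model of a 128-bit PACKSS(i16 -> i8): the narrowed elements of `a`
// fill the low half of the result and those of `b` fill the high half.
std::array<int8_t, 16> packss_w_to_b(const std::array<int16_t, 8>& a,
                                     const std::array<int16_t, 8>& b) {
  std::array<int8_t, 16> r{};
  for (std::size_t i = 0; i < 8; ++i) {
    r[i] = sat_i16_to_i8(a[i]);
    r[i + 8] = sat_i16_to_i8(b[i]);
  }
  return r;
}
```

Both -1 and 0 lie inside the i8 range, so saturation leaves them unchanged and no masking step is needed first. An arbitrary value such as 300 would saturate to 127 rather than truncate to its low byte, which is why this shortcut only applies to all-ones/all-zeros inputs.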
Because the input values are limited to -1 or 0, we don't need to account for the vector element size, so for simplicity we use the PACKSS(vXi16,vXi16) implementation in all cases. Additionally, for AVX2 PACKSS of 256-bit data we must perform a PERMQ shuffle to put the data back into the correct order. I did investigate performing a single shuffle after all the PACKSS calls, but the need to cross 128-bit lanes makes this difficult to achieve efficiently.
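The two points above can be sketched with a byte-level C++ model of a 256-bit register. This is an illustration under stated assumptions, not the patch's code: the names V256, mask_v8i32, packsswb256, permq and lane_set are hypothetical, and the selector 0xD8 (qword order 0,2,1,3) is the value this model needs, not one quoted from the patch.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Raw bytes of a 256-bit register, lowest byte first.
using V256 = std::array<uint8_t, 32>;

// Build a v8i32 comparison result: each element is all ones or all zeros.
V256 mask_v8i32(const std::array<bool, 8>& m) {
  V256 r{};
  for (int i = 0; i < 8; ++i)
    for (int k = 0; k < 4; ++k)
      r[i * 4 + k] = m[i] ? 0xFF : 0x00;
  return r;
}

// Signed-saturate the i16 stored at bytes [k, k+1] down to one byte.
uint8_t sat_byte(const V256& v, int k) {
  int w = static_cast<int16_t>(static_cast<uint16_t>(v[k] | (v[k + 1] << 8)));
  if (w < -128) w = -128;
  if (w > 127) w = 127;
  return static_cast<uint8_t>(w);
}

// Model of a 256-bit PACKSS(i16 -> i8): each 128-bit lane is packed on its
// own, so the result qwords are [a.lane0, b.lane0, a.lane1, b.lane1].
V256 packsswb256(const V256& a, const V256& b) {
  V256 r{};
  for (int lane = 0; lane < 2; ++lane)
    for (int i = 0; i < 8; ++i) {
      r[lane * 16 + i] = sat_byte(a, lane * 16 + i * 2);
      r[lane * 16 + 8 + i] = sat_byte(b, lane * 16 + i * 2);
    }
  return r;
}

// Model of a qword permute: bits [2q+1:2q] of imm select the source qword
// for result qword q.
V256 permq(const V256& v, unsigned imm) {
  assert(imm < 256);
  V256 r{};
  for (int q = 0; q < 4; ++q) {
    int src = (imm >> (2 * q)) & 3;
    for (int k = 0; k < 8; ++k)
      r[q * 8 + k] = v[src * 8 + k];
  }
  return r;
}

// True if i16 element i of the result is an all-ones mask.
bool lane_set(const V256& v, int i) {
  return v[i * 2] == 0xFF && v[i * 2 + 1] == 0xFF;
}
```

In this model, an all-ones i32 element is just two adjacent all-ones i16 elements, so the i16 pack turns it into one all-ones i16 element without knowing the original element size. Calling permq(packsswb256(a, b), 0xD8) then restores source order, since the per-lane pack leaves the middle two qwords swapped.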
We avoid performing this on AVX512, as it should have better alternative truncation instructions. Is this right?
Repository:
rL LLVM
https://reviews.llvm.org/D22814
Files:
lib/Target/X86/X86ISelLowering.cpp
test/CodeGen/X86/setcc-lowering.ll
test/CodeGen/X86/vector-compare-results.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D22814.65535.patch
Type: text/x-patch
Size: 78922 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160726/124604e9/attachment.bin>