[PATCH] D44585: [AMDGPU] Scalarize when scalar code cheaper than vector code.
Farhana Aleen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 16 14:19:22 PDT 2018
FarhanaAleen created this revision.
FarhanaAleen added a reviewer: arsenm.
FarhanaAleen created this object with visibility "All Users".
Herald added subscribers: t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, kzhuravl.
Vector code following shuffles can generate more instructions than scalar code, which optimizes away the shuffles most of the time.
Here is an example of the vector pattern:
vec2 = shuffle();
add = vadd vec1, vec2
res = extract_vector_elt(add, idx)
Depending on the shuffle mask, the shuffle itself can need 1-3 instructions.
For this kind of pattern, scalar code needs no more instructions than the vector code, and often fewer.
Scalar code:
elt1 = extract_vector_elt(vec1, idx)
elt2 = extract_vector_elt(vec2, idx)
res = add elt1, elt2
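The equivalence of the two patterns can be sketched in plain C++. This is only an illustrative model, not code from the patch: the names Vec4, Mask4, shuffle, vectorForm and scalarForm are hypothetical, and it assumes a 4-lane, single-source shuffle. It shows why the scalar form needs no shuffle at all: only the one source lane selected by the mask is ever read.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 4-lane integer vector and shuffle mask (not LLVM types).
using Vec4 = std::array<int, 4>;
using Mask4 = std::array<std::size_t, 4>;

// Assumed single-source shuffle: lane i of the result is src[mask[i]].
Vec4 shuffle(const Vec4 &src, const Mask4 &mask) {
  Vec4 out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    assert(mask[i] < src.size() && "mask index out of range");
    out[i] = src[mask[i]];
  }
  return out;
}

// Vector form: shuffle, add every lane, then extract a single lane.
int vectorForm(const Vec4 &vec1, const Vec4 &src, const Mask4 &mask,
               std::size_t idx) {
  Vec4 vec2 = shuffle(src, mask);
  Vec4 add{};
  for (std::size_t i = 0; i < add.size(); ++i)
    add[i] = vec1[i] + vec2[i];
  return add[idx];
}

// Scalar form: extract the two needed elements, then do one scalar add.
// The shuffle collapses into a direct read of src[mask[idx]].
int scalarForm(const Vec4 &vec1, const Vec4 &src, const Mask4 &mask,
               std::size_t idx) {
  int elt1 = vec1[idx];
  int elt2 = src[mask[idx]];
  return elt1 + elt2;
}
```

Both functions return the same value for every lane index, but the scalar form touches two elements and performs one add, while the vector form materializes the whole shuffled vector first.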
https://reviews.llvm.org/D44585
Files:
lib/Target/AMDGPU/SIISelLowering.cpp
test/CodeGen/AMDGPU/scalarize.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D44585.138767.patch
Type: text/x-patch
Size: 5312 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180316/4d3c2f57/attachment.bin>