[PATCH][X86] Add target specific combine rules to fold SSE/AVX/AVX2 blend intrinsics.
alexr at leftfield.org
Thu May 15 13:45:35 PDT 2014
After r208664, most of this patch is dead code, since Clang no longer generates these intrinsics except for the blendv variants. We have to keep the intrinsics as they are until the next release because they may exist in IR, but we should remove the unused ones when possible.
Also, it seems to me that this kind of folding belongs in InstCombine, since it enables other optimizations and does not generate new shuffles. The only benefit of doing it in SelectionDAG is that we avoid carrying code for X86 intrinsics in non-X86 builds of LLVM. In this case the code is small, so I'd still favor InstCombine.
> On May 15, 2014, at 8:26 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
> Thanks Nadav!
> Committed revision 208895.
>> On Thu, May 15, 2014 at 4:04 PM, Nadav Rotem <nrotem at apple.com> wrote:
>> LGTM. Thanks Andrea.
>>> On May 15, 2014, at 7:41, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
>>> This patch teaches the x86 backend how to fold SSE4.1/AVX/AVX2 blend
>>> intrinsics in the following trivial cases:
>>> 1) fold (blend A, A, Mask) -> A;
>>> 2) fold (blend A, B, <allZeros>) -> A;
>>> 3) fold (blend A, B, <allOnes>) -> B;
>>> Added two new tests to verify that the new folding rules work for all
>>> the optimized blend intrinsics.
>>> Please let me know if ok to submit.
>>> Andrea Di Biagio
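For readers of the archive, the three folds listed above can be sketched outside LLVM as follows. This is only an illustrative model, not the committed code: the names `Vec4`, `Mask4`, `blend`, `Fold`, and `foldBlend` are hypothetical, the vector is fixed at four lanes, and the mask is modeled as one boolean per lane (a set lane selects from B), which glosses over the difference between the immediate-mask and blendv forms.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Hypothetical 4-lane model of a blend; not LLVM types.
using Vec4 = std::array<int, 4>;
using Mask4 = std::array<bool, 4>;

// Reference semantics assumed here: lane I of the result comes from B
// when Mask[I] is set, and from A otherwise.
Vec4 blend(const Vec4 &A, const Vec4 &B, const Mask4 &Mask) {
  Vec4 R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = Mask[I] ? B[I] : A[I];
  return R;
}

// Outcome of the fold: keep the blend, or replace it with one operand.
enum class Fold { None, UseA, UseB };

// SameOperands stands in for "both operands are the same value".
Fold foldBlend(bool SameOperands, const Mask4 &Mask) {
  // 1) blend A, A, Mask -> A
  if (SameOperands)
    return Fold::UseA;
  // 2) blend A, B, <allZeros> -> A
  if (std::none_of(Mask.begin(), Mask.end(), [](bool Bit) { return Bit; }))
    return Fold::UseA;
  // 3) blend A, B, <allOnes> -> B
  if (std::all_of(Mask.begin(), Mask.end(), [](bool Bit) { return Bit; }))
    return Fold::UseB;
  // Mixed mask with distinct operands: nothing to fold.
  return Fold::None;
}
```

In each case where `foldBlend` returns `UseA` or `UseB`, `blend` yields exactly that operand, which is why the intrinsic call can be dropped without emitting a new shuffle.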