[PATCH] D87861: [instcombine][x86] Converted pdep/pext with shifted mask to simple arithmetic
Philip Reames via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Sep 17 15:20:51 PDT 2020
reames created this revision.
reames added reviewers: anna, craig.topper.
Herald added subscribers: dantrushin, bollu, hiraditya, mcrosier.
Herald added a project: LLVM.
reames requested review of this revision.
If the mask of a pdep or pext instruction is a shifted mask (i.e. one contiguous block of ones), we need at most one and and one shift to represent the operation without the intrinsic. On all platforms I know of, this is faster than the pdep/pext.
The cost modelling for multiple contiguous blocks might be worth exploring in a follow-up, but it's not relevant for my current use case. It would almost certainly be a win on AMD processors, where these instructions are very slow.
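For readers without the patch at hand, the equivalence described above can be sketched in plain C++. This is not the code from the patch: the helper names (pdep_ref, pext_ref, is_shifted_mask, pdep_shifted, pext_shifted) are hypothetical, and the reference loops are only a software model of the 32-bit instructions, assuming the usual BMI2 semantics.

```cpp
#include <cassert>
#include <cstdint>

// Software model of 32-bit PDEP: scatter the low bits of src to the
// positions of the set bits in mask, lowest position first.
uint32_t pdep_ref(uint32_t src, uint32_t mask) {
  uint32_t result = 0;
  for (uint32_t bit = 1; mask != 0; bit <<= 1) {
    uint32_t lowest = mask & (~mask + 1); // isolate lowest set mask bit
    if (src & bit)
      result |= lowest;
    mask &= mask - 1; // clear lowest set mask bit
  }
  return result;
}

// Software model of 32-bit PEXT: gather the bits of src selected by
// mask into the low bits of the result.
uint32_t pext_ref(uint32_t src, uint32_t mask) {
  uint32_t result = 0;
  for (uint32_t bit = 1; mask != 0; bit <<= 1) {
    uint32_t lowest = mask & (~mask + 1);
    if (src & lowest)
      result |= bit;
    mask &= mask - 1;
  }
  return result;
}

// True if mask is non-zero and its set bits form one contiguous block.
bool is_shifted_mask(uint32_t mask) {
  if (mask == 0)
    return false;
  uint32_t filled = (mask - 1) | mask; // fill the zeros below the block
  return ((filled + 1) & filled) == 0; // all ones from bit 0 upward
}

// Number of zero bits below the lowest set bit (mask must be non-zero).
unsigned trailing_zeros(uint32_t mask) {
  unsigned n = 0;
  while ((mask & 1u) == 0) {
    mask >>= 1;
    ++n;
  }
  return n;
}

// PDEP with a shifted mask: one shift, then one and.
uint32_t pdep_shifted(uint32_t src, uint32_t mask) {
  assert(is_shifted_mask(mask));
  return (src << trailing_zeros(mask)) & mask;
}

// PEXT with a shifted mask: one and, then one shift.
uint32_t pext_shifted(uint32_t src, uint32_t mask) {
  assert(is_shifted_mask(mask));
  return (src & mask) >> trailing_zeros(mask);
}
```

When the block of ones starts at bit 0, the shift amount is zero and only the and remains, which is why the description says "at most" one and and one shift.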
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D87861
Files:
llvm/lib/Target/X86/X86InstCombineIntrinsic.cpp
llvm/test/Transforms/InstCombine/X86/x86-bmi-tbm.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D87861.292636.patch
Type: text/x-patch
Size: 4043 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200917/453ca198/attachment.bin>