[PATCH] D154422: [AMDGPU] Add new BFI intrinsic
Thomas Symalla via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 4 02:58:59 PDT 2023
tsymalla added a comment.
In D154422#4470693 <https://reviews.llvm.org/D154422#4470693>, @arsenm wrote:
> We’ve specifically avoided adding intrinsics for easy to match instructions. This needs a semantic justification over just emitting the expanded bit sequence. It’s a huge amount of work teaching every part of the compiler the equivalent bit optimizations.
Unfortunately, BFI instructions are not so easy to match in LLVM. Several issues currently prevent us from generating nested v_bfi instructions (e.g. one BFI serving as the base of another), which in turn means we generate far fewer v_bfi instructions than we could. I have been working on that for a while in https://reviews.llvm.org/D136432, and it is not easy to get it working properly. From my tests, there seems to be no real codegen advantage in trying to match the and/or patterns, so I thought that emitting the intrinsic directly would be sufficient and an improvement over the current state. This should not replace the few existing BFI ISel patterns but rather serve as a way to teach the middle-end to generate BFI instructions.
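For illustration, here is a minimal sketch of the bit sequences under discussion. It assumes the usual bitfield-insert semantics for v_bfi_b32, D = (S0 & S1) | (~S0 & S2). The helper names (bfi, bfi_xor, nested_bfi, nested_expanded) are hypothetical and are not taken from the patch. The nested case shows the expanded and/or chain a matcher would have to recognize as two BFIs.

```cpp
#include <cstdint>

// Assumed reference semantics of a 32-bit bitfield insert:
// take bits from `src` where `mask` is set, and from `base` elsewhere.
constexpr uint32_t bfi(uint32_t mask, uint32_t src, uint32_t base) {
    return (mask & src) | (~mask & base);
}

// Equivalent xor-based form of the same operation. A matcher has to
// recognize this shape as well, since optimizations may canonicalize
// the and/or form into it.
constexpr uint32_t bfi_xor(uint32_t mask, uint32_t src, uint32_t base) {
    return ((src ^ base) & mask) ^ base;
}

// Nested case: the result of one BFI is the base of another.
constexpr uint32_t nested_bfi(uint32_t m0, uint32_t a,
                              uint32_t m1, uint32_t b, uint32_t c) {
    return bfi(m0, a, bfi(m1, b, c));
}

// The same computation written as the expanded bit sequence.
constexpr uint32_t nested_expanded(uint32_t m0, uint32_t a,
                                   uint32_t m1, uint32_t b, uint32_t c) {
    return (m0 & a) | (~m0 & ((m1 & b) | (~m1 & c)));
}

// Compile-time checks that the forms agree on sample inputs.
static_assert(bfi(0x0000FFFFu, 0x12345678u, 0xABCDEF01u) ==
              bfi_xor(0x0000FFFFu, 0x12345678u, 0xABCDEF01u));
static_assert(nested_bfi(0xFFu, 0x11111111u, 0xFF00u, 0x22222222u, 0x33333333u) ==
              nested_expanded(0xFFu, 0x11111111u, 0xFF00u, 0x22222222u, 0x33333333u));
```

Once the middle-end simplifies or reassociates the expanded form, for example when masks are constants, the result may no longer have the shape a selection pattern expects, which is one plausible reason nested cases are hard to match.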
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D154422/new/
https://reviews.llvm.org/D154422