[PATCH] D154422: [AMDGPU] Add new BFI intrinsic

Tue Jul 4 02:58:59 PDT 2023

tsymalla added a comment.

In D154422#4470693 <https://reviews.llvm.org/D154422#4470693>, @arsenm wrote:

> We’ve specifically avoided adding intrinsics for easy to match instructions. This needs a semantic justification over just emitting the expanded bit sequence. It’s a huge amount of work teaching every part of the compiler the equivalent bit optimizations.

Unfortunately, BFI instructions are not that easy to match in LLVM. Some issues currently prevent us from generating nested v_bfi instructions (e.g. one BFI serving as the base of another), so we end up generating far fewer v_bfi instructions than we could. I have been working on that for a while in https://reviews.llvm.org/D136432, and it is not easy to get it working properly. From my tests, there seems to be no real codegen advantage in trying to match the and/or patterns, so I thought that emitting the intrinsic directly would be sufficient and an improvement over the current state. This should not replace the few existing BFI ISel patterns but rather serve as a way to teach the middle-end to generate BFI instructions.
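For context, here is a rough sketch of the bit pattern in question, written as plain C++ rather than IR. It assumes the usual bitfield-insert semantics, (mask & a) | (~mask & b). The function names and operand order are illustrative only and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper modelling a single bitfield insert:
// take bits from `a` where `mask` is set, and from `b` elsewhere.
constexpr uint32_t bfi(uint32_t mask, uint32_t a, uint32_t b) {
  return (mask & a) | (~mask & b);
}

// An equivalent xor-based form of the same operation. Canonicalization
// between forms like these is one reason the pattern can be hard to
// recognize once the middle-end has rewritten it.
constexpr uint32_t bfi_xor_form(uint32_t mask, uint32_t a, uint32_t b) {
  return ((a ^ b) & mask) ^ b;
}

// Nested case: one BFI used as the base operand of another.
constexpr uint32_t nested_bfi(uint32_t m1, uint32_t x,
                              uint32_t m2, uint32_t y, uint32_t z) {
  return bfi(m1, x, bfi(m2, y, z));
}

// The same nested computation as a flat and/or sequence, which is
// roughly what a pattern matcher would have to recognize.
constexpr uint32_t nested_expanded(uint32_t m1, uint32_t x,
                                   uint32_t m2, uint32_t y, uint32_t z) {
  return (m1 & x) | (~m1 & ((m2 & y) | (~m2 & z)));
}

// Small self-check of the identities above.
inline void check_bfi_sketch() {
  assert(bfi(0x0000FFFFu, 0x12345678u, 0xABCDEF01u) == 0xABCD5678u);
  assert(bfi_xor_form(0x0000FFFFu, 0x12345678u, 0xABCDEF01u) == 0xABCD5678u);
  assert(nested_bfi(0xFFu, 0x11111111u, 0xFF00u, 0x22222222u, 0x33333333u) ==
         nested_expanded(0xFFu, 0x11111111u, 0xFF00u, 0x22222222u, 0x33333333u));
}
```

In the nested case the low byte comes from x, the next byte from y, and the rest from z, which ideally would lower to two v_bfi instructions.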

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154422/new/

https://reviews.llvm.org/D154422