[PATCH] D136432: [AMDGPU] Combine BFI instructions.

Nicolai Hähnle via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 24 03:34:32 PDT 2022


nhaehnle added a comment.

In D136432#3874692 <https://reviews.llvm.org/D136432#3874692>, @foad wrote:

>> Hmm... creating a plain binary tree is simple enough. Maybe we *should* do that as long as we're here...
>
> I would suggest not doing that. It is not clearly better - it is a tradeoff between latency and register pressure. And I don't think SelectionDAG tries to balance trees in simpler cases, like just nested ANDs or nested ORs.

The situation is _slightly_ different because BFI generation doesn't let you consider subtrees in isolation. If we want code that is optimal in instruction count for `((x1 & c1) | (x2 & c2)) | ((x3 & c3) | (x4 & c4))`, we can't select the two subtrees separately: taken alone, each one would fail the checks on the constants.
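To make the point about the constants concrete, here is a rough model on plain integers. It is only a sketch, not code from the patch: `bfi` follows the V_BFI_B32 definition `(S0 & S1) | (~S0 & S2)`, `isBfiPair` is an assumed stand-in for the constant check (the two masks must be exact complements), and the byte-lane masks `C1`..`C4` are hypothetical values chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Bitfield insert modeled after V_BFI_B32: take bits of `a` where `mask` is
// set and bits of `b` everywhere else.
constexpr uint32_t bfi(uint32_t mask, uint32_t a, uint32_t b) {
  return (a & mask) | (b & ~mask);
}

// ASSUMPTION: stand-in for the combine's constant check. A lone
// (a & ca) | (b & cb) is treated as one BFI only if ca and cb are complements.
constexpr bool isBfiPair(uint32_t ca, uint32_t cb) {
  return ca == static_cast<uint32_t>(~cb);
}

// Hypothetical masks: four disjoint byte lanes that together cover all 32 bits.
constexpr uint32_t C1 = 0x000000FFu;
constexpr uint32_t C2 = 0x0000FF00u;
constexpr uint32_t C3 = 0x00FF0000u;
constexpr uint32_t C4 = 0xFF000000u;

// The original AND/OR tree: 4 ANDs and 3 ORs.
constexpr uint32_t reference(uint32_t x1, uint32_t x2, uint32_t x3, uint32_t x4) {
  return ((x1 & C1) | (x2 & C2)) | ((x3 & C3) | (x4 & C4));
}

// The same value as a chain of 3 BFIs. C4 never appears explicitly: it is
// whatever C1, C2 and C3 leave uncovered.
constexpr uint32_t bfiChain(uint32_t x1, uint32_t x2, uint32_t x3, uint32_t x4) {
  return bfi(C1, x1, bfi(C2, x2, bfi(C3, x3, x4)));
}

// Neither subtree passes the pairwise check on its own ...
static_assert(!isBfiPair(C1, C2), "left subtree is not a BFI in isolation");
static_assert(!isBfiPair(C3, C4), "right subtree is not a BFI in isolation");
// ... but the four masks together are disjoint and cover every bit.
static_assert((C1 | C2 | C3 | C4) == 0xFFFFFFFFu, "masks cover all bits");

// Small runtime check that the chain matches the AND/OR tree.
inline void checkChainMatchesReference() {
  assert(bfiChain(0x11111111u, 0x22222222u, 0x33333333u, 0x44444444u) ==
         reference(0x11111111u, 0x22222222u, 0x33333333u, 0x44444444u));
}
```

With these masks the whole expression needs three BFIs, while selecting `(x1 & c1) | (x2 & c2)` alone would have to fall back to separate ANDs and an OR.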

That said, it's a good point that we can't make an optimal decision here anyway, so it may be best to go with the simplest thing, which is a chain of BFIs.
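For comparison, the two shapes can be modeled on plain integers as follows. This is an illustrative sketch only, assuming the four masks are pairwise disjoint and cover all bits; `bfi`, `chain` and `tree` are hypothetical helper names, not anything in the patch. Both forms use three BFIs. The chain has a dependency depth of three, while the tree has a depth of two but keeps two intermediate results live at once, which is the latency versus register pressure tradeoff mentioned above.

```cpp
#include <cstdint>

// Bitfield insert modeled after V_BFI_B32: (mask & a) | (~mask & b).
constexpr uint32_t bfi(uint32_t mask, uint32_t a, uint32_t b) {
  return (a & mask) | (b & ~mask);
}

// Chain: each BFI consumes the result of the previous one (depth 3).
// ASSUMPTION: c1, c2, c3 are pairwise disjoint; x4 fills the remaining bits.
constexpr uint32_t chain(uint32_t c1, uint32_t c2, uint32_t c3,
                         uint32_t x1, uint32_t x2, uint32_t x3, uint32_t x4) {
  return bfi(c1, x1, bfi(c2, x2, bfi(c3, x3, x4)));
}

// Binary tree: the two inner BFIs are independent of each other (depth 2),
// and the root selects between them with the folded constant c1 | c2.
constexpr uint32_t tree(uint32_t c1, uint32_t c2, uint32_t c3,
                        uint32_t x1, uint32_t x2, uint32_t x3, uint32_t x4) {
  return bfi(c1 | c2, bfi(c1, x1, x2), bfi(c3, x3, x4));
}

// Compile-time spot check with hypothetical byte-lane masks.
static_assert(chain(0x000000FFu, 0x0000FF00u, 0x00FF0000u,
                    0x11111111u, 0x22222222u, 0x33333333u, 0x44444444u) ==
              tree(0x000000FFu, 0x0000FF00u, 0x00FF0000u,
                   0x11111111u, 0x22222222u, 0x33333333u, 0x44444444u),
              "chain and tree agree for disjoint, covering masks");
```

Since neither shape wins on both axes, the chain is the simpler one to emit.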


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136432/new/

https://reviews.llvm.org/D136432


