[PATCH] D136432: [AMDGPU] Combine BFI instructions.
Nicolai Hähnle via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 24 03:34:32 PDT 2022
nhaehnle added a comment.
In D136432#3874692 <https://reviews.llvm.org/D136432#3874692>, @foad wrote:
>> Hmm... creating a plain binary tree is simple enough. Maybe we *should* do that as long as we're here...
>
> I would suggest not doing that. It is not clearly better - it is a tradeoff between latency and register pressure. And I don't think SelectionDAG tries to balance trees in simpler cases, like just nested ANDs or nested ORs.
The situation is _slightly_ different because BFI generation doesn't allow you to consider subtrees in isolation. If we want to generate optimal code, in terms of number of instructions, for `((x1 & c1) | (x2 & c2)) | ((x3 & c3) | (x4 & c4))`, we can't select the two subtrees separately, since each one on its own would fail the checks on the constants.
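To illustrate the point about the constants, here is a minimal sketch. It assumes the check for a single BFI is that the two masks are exact complements; the patch's real check may differ. `isSingleBfiPair` and the byte-sized masks `C1`..`C4` are hypothetical names and values chosen for illustration only.

```cpp
#include <cstdint>

// Assumed stand-in for the constant check: (x & c1) | (y & c2) can become
// one BFI only if the two masks are exact complements of each other.
constexpr bool isSingleBfiPair(uint32_t c1, uint32_t c2) {
  return c2 == ~c1;
}

// Hypothetical masks: four disjoint bytes that together cover all 32 bits.
constexpr uint32_t C1 = 0x000000FFu;
constexpr uint32_t C2 = 0x0000FF00u;
constexpr uint32_t C3 = 0x00FF0000u;
constexpr uint32_t C4 = 0xFF000000u;

// Each subtree alone covers only 16 bits, so it fails the complement check.
constexpr bool leftSubtreeOk = isSingleBfiPair(C1, C2);
constexpr bool rightSubtreeOk = isSingleBfiPair(C3, C4);

// Only the whole expression has masks that are disjoint and cover every bit.
constexpr bool wholeTreeCovers = (C1 | C2 | C3 | C4) == 0xFFFFFFFFu;
constexpr bool wholeTreeDisjoint =
    (C1 & C2) == 0 && ((C1 | C2) & C3) == 0 && ((C1 | C2 | C3) & C4) == 0;
```

With these masks neither `(x1 & c1) | (x2 & c2)` nor `(x3 & c3) | (x4 & c4)` qualifies by itself, even though the full four-way expression does.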
That said, it's a good point that we can't make an optimal decision here anyway, so it may be best to just go with the simplest thing, which is a chain of BFIs.
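For reference, a sketch of what such a chain computes. It models BFI as `(mask & a) | (~mask & b)`, which matches my understanding of `V_BFI_B32`, and assumes the four masks are pairwise disjoint and together cover all bits. The helper names `bfi`, `orOfAnds` and `bfiChain` are illustrative and not taken from the patch.

```cpp
#include <cstdint>

// Bitfield insert: bits of `a` where `mask` is set, bits of `b` elsewhere.
constexpr uint32_t bfi(uint32_t mask, uint32_t a, uint32_t b) {
  return (mask & a) | (~mask & b);
}

// The original expression: four ANDs and three ORs.
constexpr uint32_t orOfAnds(uint32_t x1, uint32_t x2, uint32_t x3, uint32_t x4,
                            uint32_t c1, uint32_t c2, uint32_t c3,
                            uint32_t c4) {
  return ((x1 & c1) | (x2 & c2)) | ((x3 & c3) | (x4 & c4));
}

// The same value as a chain of three BFIs. c4 is implicit: it must equal
// ~(c1 | c2 | c3), and c1, c2, c3 must be pairwise disjoint.
constexpr uint32_t bfiChain(uint32_t x1, uint32_t x2, uint32_t x3, uint32_t x4,
                            uint32_t c1, uint32_t c2, uint32_t c3) {
  return bfi(c1, x1, bfi(c2, x2, bfi(c3, x3, x4)));
}
```

The chain uses three instructions instead of seven, but each BFI depends on the previous one, which is the latency side of the tradeoff mentioned in the quoted comment.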
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D136432/new/
https://reviews.llvm.org/D136432