[PATCH] D127831: BasicBlockUtils: Add a new way for CreateControlFlowHub()
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Oct 29 02:52:10 PDT 2022
foad added a comment.
In D127831#3893757 <https://reviews.llvm.org/D127831#3893757>, @sameerds wrote:
> In D127831#3891852 <https://reviews.llvm.org/D127831#3891852>, @foad wrote:
>
>> Right, I understand that using an i32 instead of an i1 means that you need extra icmp instructions in the IR, but does it actually cause more instructions in the final generated code (for AMDGPU)? If the i1 was stored in a general purpose register then you would need a cmp //anyway// to use it to control a conditional branch.
>
> I would expect that the i1 values get packed into an sreg, and then used in cbranch_vccz? That wouldn't need a cmp to actually compare the i1 value.
OK, if they are divergent and get packed into an SGPR with one bit per lane then that does sound much better than using one VGPR per value. And the threshold of 32 makes some sense, since it's roughly equal to the wavefront size.
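As a rough illustration of the one-bit-per-lane packing discussed above, here is a toy model in Python. It is not AMDGPU codegen: `WAVE_SIZE`, `pack_lane_mask` and `all_lanes_false` are made-up names, and a 32-lane wavefront is assumed. It only shows that a divergent i1 collapses into a single wave-wide mask, and that a "branch if the mask is zero" test (in the spirit of cbranch_vccz) needs no per-lane compare.

```python
WAVE_SIZE = 32  # assumption: a 32-lane wavefront


def pack_lane_mask(lane_bits):
    """Pack one boolean per lane into a single integer; bit i holds lane i."""
    if len(lane_bits) != WAVE_SIZE:
        raise ValueError("expected exactly one value per lane")
    mask = 0
    for lane, bit in enumerate(lane_bits):
        if bit:
            mask |= 1 << lane
    return mask


def all_lanes_false(mask):
    """Model of a branch-if-mask-is-zero test on the packed value."""
    return mask == 0


# Example: only lanes 0 and 3 have the predicate set.
lanes = [False] * WAVE_SIZE
lanes[0] = lanes[3] = True
mask = pack_lane_mask(lanes)
print(hex(mask))              # 0x9
print(all_lanes_false(mask))  # False
```

In this model each divergent i1 costs one 32-bit mask for the whole wave, whereas keeping the same value as an i32 would cost one 32-bit slot per lane.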
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D127831/new/
https://reviews.llvm.org/D127831