[llvm] [CodeGen] Use 128bits for LaneBitmask. (PR #111157)
Sander de Smalen via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 4 09:34:24 PDT 2024
sdesmalen-arm wrote:
> > I think this is mostly because defining register tuples (2x, 3x and 4x) replicates the regunits. When I define the top bits and do some post-processing of the table in AArch64GenRegisterInfo.inc, I get the following lane masks:
>
> This doesn't sound right. AMDGPU nearly exclusively uses register tuples, and we get one regunit per lane (well, one for each 16-bit half of each lane).
I'm not sure what AMDGPU does differently, or how it encodes the information more efficiently. Are there any lane masks in the table I shared above that you believe use unnecessary regunits?
https://github.com/llvm/llvm-project/pull/111157