[PATCH] D64726: AMDGPU/GlobalISel: Fix not constraining result reg of copies to VCC

Nicolai Hähnle via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 22 13:04:01 PDT 2019


nhaehnle added a comment.

One more thought:

> An alternative strategy I've been considering is to disallow SGPR/VGPR s1 values entirely, and to always require these to use SCC/VCC producers for them (which would be a lot simpler). Any s1 use would then get a COPY which will turn into S_CMP/V_CMP.

We really want to keep divergent booleans as lane masks, and operations on lane masks are quite tricky to select correctly in the normal instruction selection flow when the values cross basic block boundaries. Without s1, we'd first have to lower everything as VGPR s32 and then run an analysis that pattern-matches to essentially recover the s1 values again so they can be optimized into lane masks, which seems a pretty roundabout and complex approach. So I do think there's a legitimate role for s1 values early in the backend. With this particular issue, we've unfortunately run into some tough questions about what s1 means in terms of its representation in an s32 SGPR.
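To make the two representations concrete, here is a rough host-side sketch (not backend code, and nothing from the patch itself). It assumes a 64-lane wave, and all names in it (kWaveSize, PerLaneBool, LaneMask, and the helper functions) are made up for illustration. A divergent boolean is modelled either as one 32-bit value per lane, roughly what a VGPR s32 would hold, or as a single 64-bit mask with one bit per lane, roughly what a VCC-style lane mask holds.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption for this sketch: a 64-lane wave.
constexpr std::size_t kWaveSize = 64;

// "VGPR s32" style: every lane carries its own 32-bit 0/1 value.
using PerLaneBool = std::array<std::uint32_t, kWaveSize>;

// Lane-mask style: bit i of one 64-bit scalar is lane i's boolean.
using LaneMask = std::uint64_t;

// Pack per-lane booleans into a lane mask (only bit 0 of each lane counts).
LaneMask toLaneMask(const PerLaneBool &V) {
  LaneMask M = 0;
  for (std::size_t Lane = 0; Lane < kWaveSize; ++Lane)
    if (V[Lane] & 1u)
      M |= LaneMask{1} << Lane;
  return M;
}

// Unpack a lane mask back into one 0/1 value per lane.
PerLaneBool toPerLane(LaneMask M) {
  PerLaneBool V{};
  for (std::size_t Lane = 0; Lane < kWaveSize; ++Lane)
    V[Lane] = static_cast<std::uint32_t>((M >> Lane) & 1u);
  return V;
}

// AND of two divergent booleans as lane masks: one scalar bitwise op.
LaneMask andMask(LaneMask A, LaneMask B) { return A & B; }

// The same AND on the per-lane form: one operation in every lane.
PerLaneBool andPerLane(const PerLaneBool &A, const PerLaneBool &B) {
  PerLaneBool R{};
  for (std::size_t Lane = 0; Lane < kWaveSize; ++Lane)
    R[Lane] = A[Lane] & B[Lane] & 1u;
  return R;
}
```

Both forms compute the same result, but the lane-mask form does it with a single scalar operation. Recovering that form after everything has already been lowered to per-lane s32 values is the pattern-matching step described above.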


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64726/new/

https://reviews.llvm.org/D64726




