[PATCH] D99507: [amdgpu] Add a pass to avoid jump into blocks with 0 exec mask.

Nicolai Hähnle via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 1 15:12:33 PDT 2021


nhaehnle requested changes to this revision.
nhaehnle added a comment.
This revision now requires changes to proceed.

> I think this requires a lot more thought.

+1

What I'd like to know: why are we reloading a lane mask via V_READFIRSTLANE in the first place? I would expect one of two types of reload:

1. Load from a fixed lane of a VGPR using V_READLANE.
2. Load directly from memory using an SMEM load instruction.

Both types of reload should work just fine with exec=0.
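
To illustrate why the choice of reload instruction matters, here is a minimal C++ sketch of a toy wave64 model. It is not LLVM code and not part of the patch. All names (`VGPR`, `writeActiveLanes`, `writeLane`, `readLane`, `readFirstLane`) are hypothetical, and the instruction semantics are simplified assumptions: V_READLANE reads an explicitly chosen lane without consulting exec, and V_READFIRSTLANE picks the lowest active lane and is assumed to fall back to lane 0 when exec=0.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Toy model of one 32-bit VGPR in a wave64: one value per lane.
constexpr unsigned WaveSize = 64;
using VGPR = std::array<std::uint32_t, WaveSize>;

// Assumed model of an ordinary per-lane VALU write (e.g. a move from an
// SGPR): only lanes whose bit is set in Exec are written.
void writeActiveLanes(VGPR &V, std::uint64_t Exec, std::uint32_t Value) {
  for (unsigned Lane = 0; Lane < WaveSize; ++Lane)
    if ((Exec >> Lane) & 1)
      V[Lane] = Value;
}

// Assumed model of a write to one fixed lane, independent of exec.
void writeLane(VGPR &V, unsigned Lane, std::uint32_t Value) {
  assert(Lane < WaveSize && "lane index out of range");
  V[Lane] = Value;
}

// Assumed model of V_READLANE: the lane is chosen explicitly, so the
// result does not depend on exec at all.
std::uint32_t readLane(const VGPR &V, unsigned Lane) {
  assert(Lane < WaveSize && "lane index out of range");
  return V[Lane];
}

// Assumed model of V_READFIRSTLANE: read the lowest active lane, falling
// back to lane 0 when no lane is active (assumption, see lead-in).
std::uint32_t readFirstLane(const VGPR &V, std::uint64_t Exec) {
  unsigned Lane = 0;
  if (Exec != 0)
    while (((Exec >> Lane) & 1) == 0)
      ++Lane;
  return V[Lane];
}
```

In this model, a value stored with `writeActiveLanes` while only lane 5 is active is recovered by `readFirstLane` only while lane 5 is still the first active lane. With exec=0 the read falls back to lane 0, which was never written. A value stored with `writeLane` into a fixed lane and reloaded with `readLane` from that same lane comes back unchanged whatever exec is.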

Keeping a lane mask in a VGPR is fundamentally a nonsensical thing to do because it clashes with the whole theory of how different types of data (uniform vs. divergent) are represented in AMDGPU's implementation of SIMT. So I'd really rather we fix that instead of adding yet another hack onto the existing pile of hacks. At the very least, we need to understand this better.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99507/new/

https://reviews.llvm.org/D99507


