[PATCH] D69661: [AMDGPU] Fix vccz after v_readlane/v_readfirstlane to vcc_lo/hi

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 31 08:56:00 PDT 2019


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1406
+
+    if (ST->getGeneration() <= AMDGPUSubtarget::GFX9) {
+      // Up to gfx9, writes to vcc_lo and vcc_hi don't update vccz.
----------------
foad wrote:
> arsenm wrote:
> > I thought we already had a vccz bug subtarget feature check? Either way I want to limit getGeneration checks and keep them in a subtarget check
> Yes we have the hasReadVCCZBug feature mentioned a few lines above, but as I understand it that is a different and very specific bug.
> 
> I'm happy to add a new feature for this if necessary but I'm not sure what to call it. I'm not even sure if it's a bug or a feature. The best explanation I've heard so far is that it's just an artifact of the way vccz is implemented in the hardware, and that it doesn't affect gfx10 because the hardware implementation changed to support wave32. So maybe it should be under a "hasWave32Support" feature?
It doesn't need to be a full subtarget feature, but I do prefer to keep the getGeneration checks consolidated inside the subtarget. I'm not familiar with this problem and not sure what to call it either


================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1409
+      // Writes to vcc will fix it. Only examine explicit defs.
+      for (auto &Op : Inst.defs()) {
+        switch (Op.getReg()) {
----------------
foad wrote:
> arsenm wrote:
> > This won't catch the implicit def in inline asm for example
> I specifically wanted to avoid treating this instruction (from the test case) as a write to vcc, despite its implicit-def:
> ```
> $vcc_hi = V_READFIRSTLANE_B32 killed $vgpr0, implicit $exec, implicit-def $vcc
> ```
> What kind of inline asm are you thinking of?
Any inline asm that touches vcc will appear as an implicit-def. You can't know without context that an implicit-def isn't really modifying the register. I'm guessing this one is to model the super-register def?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69661/new/

https://reviews.llvm.org/D69661





More information about the llvm-commits mailing list