[PATCH] D69661: [AMDGPU] Fix vccz after v_readlane/v_readfirstlane to vcc_lo/hi

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 31 08:18:41 PDT 2019


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1406
+
+    if (ST->getGeneration() <= AMDGPUSubtarget::GFX9) {
+      // Up to gfx9, writes to vcc_lo and vcc_hi don't update vccz.
----------------
I thought we already had a vccz bug subtarget feature check? Either way, I want to limit getGeneration checks and keep them inside a subtarget check.
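
To illustrate the shape being asked for, here is a rough, self-contained sketch. It uses a mock subtarget rather than the real AMDGPU classes, and the predicate name partialVCCWritesUpdateVCCZ is only a placeholder assumed for this example. The generation cutoff is taken from the comment in the patch ("up to gfx9").

```cpp
#include <cassert>

// Hypothetical stand-in for the AMDGPU subtarget; not the real LLVM class.
enum class Generation { SouthernIslands, SeaIslands, VolcanicIslands, GFX9, GFX10 };

class MockSubtarget {
  Generation Gen;

public:
  explicit constexpr MockSubtarget(Generation G) : Gen(G) {}
  constexpr Generation getGeneration() const { return Gen; }

  // Placeholder predicate name (an assumption for this sketch). It keeps the
  // generation comparison in one place, so passes ask about the behavior
  // instead of comparing generations themselves.
  constexpr bool partialVCCWritesUpdateVCCZ() const {
    return Gen > Generation::GFX9;
  }
};

// Pass-side code: no getGeneration() call here, only the subtarget predicate.
constexpr bool needsVCCZFixupAfterPartialWrite(const MockSubtarget &ST) {
  return !ST.partialVCCWritesUpdateVCCZ();
}

// Usage: per the patch comment, gfx9 needs the fixup and later targets do not.
inline void exampleUsage() {
  assert(needsVCCZFixupAfterPartialWrite(MockSubtarget(Generation::GFX9)));
  assert(!needsVCCZFixupAfterPartialWrite(MockSubtarget(Generation::GFX10)));
}
```

With this shape, a later change to the affected generations would only touch the subtarget predicate, not every pass that depends on it.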


================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1409
+      // Writes to vcc will fix it. Only examine explicit defs.
+      for (auto &Op : Inst.defs()) {
+        switch (Op.getReg()) {
----------------
This won't catch the implicit def in inline asm, for example.
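
A minimal sketch of the gap, using simplified mock types rather than the real MachineInstr/MachineOperand classes (all names below are hypothetical). It contrasts a scan limited to explicit defs with one that walks every operand, on an inline-asm-like instruction whose only def of vcc_lo is implicit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins for MachineOperand / MachineInstr.
enum Reg { NoReg, VCC, VCC_LO, VCC_HI, SGPR0 };

struct MockOperand {
  Reg R;
  bool IsDef;
  bool IsImplicit;
};

struct MockInstr {
  std::vector<MockOperand> Operands; // explicit defs come first in this model
  std::size_t NumExplicitDefs;
};

inline bool isVCCPart(Reg R) { return R == VCC_LO || R == VCC_HI; }

// Mirrors the loop in the patch: only the explicit defs are examined.
inline bool writesVCCPartExplicitOnly(const MockInstr &MI) {
  for (std::size_t I = 0; I < MI.NumExplicitDefs; ++I)
    if (isVCCPart(MI.Operands[I].R))
      return true;
  return false;
}

// Alternative: walk every operand so implicit defs are seen as well.
inline bool writesVCCPartAnyDef(const MockInstr &MI) {
  for (const MockOperand &Op : MI.Operands)
    if (Op.IsDef && isVCCPart(Op.R))
      return true;
  return false;
}

// Builds an inline-asm-like instruction: no explicit defs, one implicit def.
inline MockInstr makeInlineAsmLike() {
  MockInstr MI;
  MI.Operands.push_back({VCC_LO, /*IsDef=*/true, /*IsImplicit=*/true});
  MI.NumExplicitDefs = 0;
  return MI;
}

// Usage: the explicit-only scan misses the write; the full scan finds it.
inline void exampleUsage() {
  MockInstr MI = makeInlineAsmLike();
  assert(!writesVCCPartExplicitOnly(MI));
  assert(writesVCCPartAnyDef(MI));
}
```

In the real pass, the equivalent would be to iterate over all operands (or use a register-modification query) instead of only Inst.defs(). Which approach fits best is left to the patch author.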


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69661/new/

https://reviews.llvm.org/D69661




