[PATCH] D69661: [AMDGPU] Fix vccz after v_readlane/v_readfirstlane to vcc_lo/hi
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 31 08:18:41 PDT 2019
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1406
+
+ if (ST->getGeneration() <= AMDGPUSubtarget::GFX9) {
+ // Up to gfx9, writes to vcc_lo and vcc_hi don't update vccz.
----------------
I thought we already had a subtarget feature check for the vccz bug? Either way, I want to limit getGeneration checks and keep them behind a subtarget check.
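As a rough illustration of the request above, the generation comparison can live in one named predicate on the subtarget, so the pass asks about the hardware behavior instead of the generation. This is only a sketch with mock types: MockSubtarget, partialVccWritesUpdateVccz and mayNeedVcczFixup are hypothetical names invented for this example, not the real AMDGPU subtarget API.

```cpp
#include <cassert>

// Simplified stand-ins for illustration; these are NOT the real LLVM classes.
enum class Generation { GFX8, GFX9, GFX10 };

struct MockSubtarget {
  Generation Gen;

  Generation getGeneration() const { return Gen; }

  // Hypothetical predicate name. It encodes the comment from the patch:
  // up to gfx9, writes to vcc_lo and vcc_hi don't update vccz.
  bool partialVccWritesUpdateVccz() const {
    return getGeneration() > Generation::GFX9;
  }
};

// Pass-side code queries the behavior and never compares generations itself.
bool mayNeedVcczFixup(const MockSubtarget &ST) {
  return !ST.partialVccWritesUpdateVccz();
}

// Usage sketch: gfx9 takes the fixup path, gfx10 does not.
inline void exampleSubtargetUsage() {
  assert(mayNeedVcczFixup(MockSubtarget{Generation::GFX9}));
  assert(!mayNeedVcczFixup(MockSubtarget{Generation::GFX10}));
}
```

Keeping the comparison in one place means a later generation only needs the predicate updated, not every pass that depends on the behavior.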
================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1409
+ // Writes to vcc will fix it. Only examine explicit defs.
+ for (auto &Op : Inst.defs()) {
+ switch (Op.getReg()) {
----------------
This won't catch an implicit def, for example the one in inline asm.
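To make the concern concrete, here is a small mock of the difference between scanning only explicit defs and scanning every def. The types below (MockOperand, MockInst) and both helper functions are hypothetical simplifications for this example, not the real MachineInstr or MachineOperand interfaces, and the inline asm instruction is modeled simply as one implicit def of vcc_lo.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-ins for illustration; NOT the real LLVM classes.
enum Reg { VCC, VCC_LO, VCC_HI, SGPR0 };

struct MockOperand {
  Reg R;
  bool IsDef;
  bool IsImplicit;
};

struct MockInst {
  std::vector<MockOperand> Ops;
};

inline bool isVccPart(Reg R) { return R == VCC_LO || R == VCC_HI; }

// Mirrors a loop over explicit defs only: implicit defs are skipped.
bool writesVccPartExplicitOnly(const MockInst &I) {
  for (const MockOperand &Op : I.Ops)
    if (Op.IsDef && !Op.IsImplicit && isVccPart(Op.R))
      return true;
  return false;
}

// Considers every def operand, explicit or implicit.
bool writesVccPartAnyDef(const MockInst &I) {
  for (const MockOperand &Op : I.Ops)
    if (Op.IsDef && isVccPart(Op.R))
      return true;
  return false;
}

// Usage sketch: an instruction whose only vcc_lo write is an implicit def.
inline void exampleImplicitDefUsage() {
  MockInst Asm;
  Asm.Ops.push_back({VCC_LO, /*IsDef=*/true, /*IsImplicit=*/true});
  assert(!writesVccPartExplicitOnly(Asm)); // missed by the explicit-only scan
  assert(writesVccPartAnyDef(Asm));        // found when all defs are scanned
}
```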
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D69661/new/
https://reviews.llvm.org/D69661