[PATCH] D69661: [AMDGPU] Fix vccz after v_readlane/v_readfirstlane to vcc_lo/hi
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 31 09:42:37 PDT 2019
foad marked 2 inline comments as done.
foad added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1409
+ // Writes to vcc will fix it. Only examine explicit defs.
+ for (auto &Op : Inst.defs()) {
+ switch (Op.getReg()) {
----------------
arsenm wrote:
> foad wrote:
> > arsenm wrote:
> > > This won't catch the implicit def in inline asm for example
> > I specifically wanted to avoid treating this instruction (from the test case) as a write to vcc, despite its implicit-def:
> > ```
> > $vcc_hi = V_READFIRSTLANE_B32 killed $vgpr0, implicit $exec, implicit-def $vcc
> > ```
> > What kind of inline asm are you thinking of?
> Any inline asm that touches vcc will appear as an implicit-def. You can't know without context that an implicit-def isn't really modifying the register. I'm guessing this one is to model the super-register def?
I don't know exactly why the readfirstlane has an implicit def of vcc. I still think the most conservative fix is to only look for explicit defs of vcc_lo/vcc_hi, which is what my patch does.
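To make the explicit-vs-implicit distinction concrete, here is a minimal standalone sketch. It is a toy model, not the code from the patch: `Operand`, `Instr`, `hasExplicitDef`, `hasAnyDef` and `makeReadFirstLaneExample` are hypothetical names invented for illustration, and registers are compared by plain name rather than through LLVM's register classes or sub/super-register overlap rules.

```cpp
#include <string>
#include <vector>

// Toy stand-in for a machine operand. This is NOT llvm::MachineOperand;
// it only records the three properties discussed in this thread.
struct Operand {
  std::string Reg;  // register name, e.g. "vcc_hi"
  bool IsDef;       // true if the operand is written
  bool IsImplicit;  // true for implicit / implicit-def operands
};

// Toy stand-in for a machine instruction (hypothetical, for illustration).
struct Instr {
  std::string Opcode;
  std::vector<Operand> Operands;
};

// True only if Reg is written through an explicit def operand.
// Implicit-defs are deliberately ignored.
bool hasExplicitDef(const Instr &I, const std::string &Reg) {
  for (const Operand &Op : I.Operands)
    if (Op.IsDef && !Op.IsImplicit && Op.Reg == Reg)
      return true;
  return false;
}

// True if Reg is written through any def operand, explicit or implicit.
// Exact name match only; no sub/super-register overlap is modelled.
bool hasAnyDef(const Instr &I, const std::string &Reg) {
  for (const Operand &Op : I.Operands)
    if (Op.IsDef && Op.Reg == Reg)
      return true;
  return false;
}

// Models the instruction from the test case quoted above:
//   $vcc_hi = V_READFIRSTLANE_B32 killed $vgpr0, implicit $exec, implicit-def $vcc
Instr makeReadFirstLaneExample() {
  Instr I;
  I.Opcode = "V_READFIRSTLANE_B32";
  I.Operands.push_back({"vcc_hi", /*IsDef=*/true, /*IsImplicit=*/false});
  I.Operands.push_back({"vgpr0", /*IsDef=*/false, /*IsImplicit=*/false});
  I.Operands.push_back({"exec", /*IsDef=*/false, /*IsImplicit=*/true});
  I.Operands.push_back({"vcc", /*IsDef=*/true, /*IsImplicit=*/true});
  return I;
}
```

In this model, a scan restricted to explicit defs sees the readfirstlane as a write to vcc_hi only, while a scan that includes implicit defs would also report a write to vcc. The same restriction would skip an inline asm whose vcc write appears only as an implicit-def, which is the case arsenm raises.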
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D69661/new/
https://reviews.llvm.org/D69661