[PATCH] D63709: [AMDGPU] Add peephole to optimize MOV
Piotr Sobczak via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 25 01:24:50 PDT 2019
piotr marked an inline comment as done.
piotr added inline comments.
================
Comment at: lib/Target/AMDGPU/SIShrinkInstructions.cpp:682
+ if (Src.isImm() &&
+ TargetRegisterInfo::isPhysicalRegister(MI.getOperand(0).getReg())) {
+
----------------
rampitec wrote:
> piotr wrote:
> > rampitec wrote:
> > > Even if the block has only one predecessor, that does not mean this is the reaching def of the physreg. The predecessor may (and likely will) have a different exec mask. It might work with scalar moves, but not with vector moves.
> > Perhaps I misunderstood your comment, but if the block has only one predecessor then it is executed for a subset of threads that already executed the predecessor, so for that subset we're sure that the previous mov happened?
> Consider this:
>
>
> ```
> a:
> v_mov_b32 v0, 1
> s_and_savexec_b64 s[0:1], vcc
> ; mask branch %c
> b:
> v_mov_b32 v0, 2
> c:
> s_or_b32 exec, exec, s[0:1]
> use v0
> ```
>
> You will have a divergent v0 value at the use even though block c has only one predecessor.
Thanks Stas. Good point, I will need to check for the modification of the exec mask between the movs.
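
A rough sketch of what that check could look like, written as a standalone C++ model rather than as the actual patch. The `Inst` struct and `isRedundantMov` are hypothetical names used only for illustration; they stand in for the real machine-instruction types and do not reflect the LLVM API. The sketch assumes the scan stays within a single block and conservatively gives up on any write to exec, even for scalar moves where the exec mask would not matter.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a machine instruction.
struct Inst {
  std::string Opcode;  // mnemonic, for readability only
  std::string DefReg;  // register written by the instruction, empty if none
  long Imm;            // immediate value (meaningful only when IsMovImm)
  bool IsMovImm;       // true for "mov reg, imm"
  bool WritesExec;     // true if the instruction modifies the exec mask
};

// Returns true if the mov at Block[Idx] repeats an earlier mov of the same
// immediate into the same register, with no write to exec and no other
// definition of that register in between. Anything uncertain yields false.
bool isRedundantMov(const std::vector<Inst> &Block, std::size_t Idx) {
  assert(Idx < Block.size() && "index out of range");
  const Inst &Cur = Block[Idx];
  if (!Cur.IsMovImm)
    return false;

  // Walk backwards from the instruction just before Cur.
  for (std::size_t I = Idx; I-- > 0;) {
    const Inst &Prev = Block[I];
    // A changed exec mask means the earlier mov may have run for a
    // different set of lanes, so it is not a reaching def for all of them.
    if (Prev.WritesExec)
      return false;
    if (Prev.DefReg == Cur.DefReg)
      return Prev.IsMovImm && Prev.Imm == Cur.Imm;
  }
  // No earlier def found in this block: stay conservative.
  return false;
}
```

With the sequence from the example above flattened into one block (`v_mov_b32 v0, 1`, then `s_and_savexec_b64`, then a second `v_mov_b32 v0, 1`), the second mov is not reported as redundant because the exec write sits between the two.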
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D63709/new/
https://reviews.llvm.org/D63709