[PATCH] D63709: [AMDGPU] Add peephole to optimize MOV

Piotr Sobczak via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 25 01:24:50 PDT 2019


piotr marked an inline comment as done.
piotr added inline comments.


================
Comment at: lib/Target/AMDGPU/SIShrinkInstructions.cpp:682
+        if (Src.isImm() &&
+            TargetRegisterInfo::isPhysicalRegister(MI.getOperand(0).getReg())) {
+
----------------
rampitec wrote:
> piotr wrote:
> > rampitec wrote:
> > > Even if the block has only one predecessor, that does not mean this is the reaching def of the physreg. The predecessor may (and likely will) have a different exec mask. It might work with scalar moves, but not with vector moves.
> > Perhaps I misunderstood your comment, but if the block has only one predecessor then it is executed for a subset of threads that already executed the predecessor, so for that subset we're sure that the previous mov happened?
> Consider this:
> 
> 
> ```
> a:
>   v_mov_b32 v0, 1
>   s_and_savexec_b64 s[0:1], vcc
>   ; mask branch %c
> b:
>   v_mov_b32 v0, 2
> c:
>   s_or_b64 exec, exec, s[0:1]
>   use v0
> ```
> 
> You will have a divergent v0 value at the use even though block c has only one predecessor.
Thanks Stas. Good point, I will need to check for the modification of the exec mask between the movs.
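
To make the example above concrete, here is a minimal sketch that replays the a/b/c sequence on a toy 4-lane wave. It is not LLVM code: `vMov`, `runExample`, `VReg`, and the 4-lane width are hypothetical names and simplifications chosen for illustration, and the mask branch is modelled as a plain skip of block b when no lane is active.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Toy model of a 4-lane wave (assumption: real waves are wider).
constexpr unsigned NumLanes = 4;
using VReg = std::array<int, NumLanes>;

// A vector move writes only the lanes that are enabled in the exec mask.
void vMov(VReg &Dst, int Imm, uint32_t Exec) {
  for (unsigned Lane = 0; Lane < NumLanes; ++Lane)
    if (Exec & (1u << Lane))
      Dst[Lane] = Imm;
}

// Replays the a/b/c sequence from the review comment for a given vcc
// and returns the per-lane value of v0 seen at "use v0".
VReg runExample(uint32_t Vcc) {
  assert(Vcc < (1u << NumLanes) && "vcc must fit in the toy wave");
  VReg V0{};
  uint32_t Exec = (1u << NumLanes) - 1; // all lanes active on entry

  // a:
  vMov(V0, 1, Exec);     // v_mov_b32 v0, 1
  uint32_t Saved = Exec; // s_and_savexec_b64: save exec ...
  Exec &= Vcc;           // ... then exec = exec & vcc

  // b: skipped entirely when no lane is active (mask branch to c)
  if (Exec != 0)
    vMov(V0, 2, Exec);   // v_mov_b32 v0, 2

  // c:
  Exec |= Saved;         // s_or_b64 exec, exec, s[0:1]
  (void)Exec;            // exec is restored, but v0 keeps what each lane wrote
  return V0;
}
```

In this model, calling `runExample` with a vcc that enables only some lanes returns a v0 holding 2 in those lanes and 1 in the rest, so the second mov cannot be treated as the sole definition reaching the use once exec has changed between the two movs.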


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63709/new/

https://reviews.llvm.org/D63709




