[PATCH] D63709: [AMDGPU] Add peephole to optimize MOV

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 24 09:40:22 PDT 2019


rampitec added inline comments.


================
Comment at: lib/Target/AMDGPU/SIShrinkInstructions.cpp:682
+        if (Src.isImm() &&
+            TargetRegisterInfo::isPhysicalRegister(MI.getOperand(0).getReg())) {
+
----------------
piotr wrote:
> rampitec wrote:
> > Even if the block has only one predecessor, that does not mean this is the reaching def of the physreg. The predecessor may (and likely will) have a different exec mask. It might work with scalar moves, but not with vector moves.
> Perhaps I misunderstood your comment, but if the block has only one predecessor then it is executed for a subset of threads that already executed the predecessor, so for that subset we're sure that the previous mov happened?
Consider this:


```
a:
  v_mov_b32 v0, 1
  s_and_savexec_b64 s[0:1], vcc
  ; mask branch %c
b:
  v_mov_b32 v0, 2
c:
  s_or_b64 exec, exec, s[0:1]
  use v0
```

You will have a divergent v0 value at the use even though block c has only one predecessor.
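
To make the lane-level effect concrete, here is a minimal C++ sketch that models the snippet above. It is only an illustration under simplifying assumptions: the `Wave` struct, `vMov`, `runExample`, and the 4-lane wave size are hypothetical names and choices made for this example, not anything from the patch or from LLVM.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumption: a 4-lane wave keeps the model small; real waves are wider.
constexpr unsigned NumLanes = 4;

struct Wave {
  uint64_t Exec;                     // one bit per active lane
  std::array<uint32_t, NumLanes> V0; // per-lane contents of v0
};

// Models "v_mov_b32 v0, Imm": only lanes enabled in exec are written.
void vMov(Wave &W, uint32_t Imm) {
  for (unsigned L = 0; L < NumLanes; ++L)
    if (W.Exec & (1ull << L))
      W.V0[L] = Imm;
}

// Runs blocks a, b, c for a given vcc and returns v0 as seen at the use.
std::array<uint32_t, NumLanes> runExample(uint64_t Vcc) {
  Wave W{0xF, {}}; // all four lanes active, v0 zeroed

  // a: write 1 to all active lanes, then save exec and narrow it by vcc.
  vMov(W, 1);
  uint64_t Saved = W.Exec;
  W.Exec &= Vcc;

  // b: executed only if some lane is still active (the mask branch skips it
  // otherwise); writes 2 to the lanes that remain enabled.
  if (W.Exec != 0)
    vMov(W, 2);

  // c: restore the saved lanes; v0 is read with the full mask again.
  W.Exec |= Saved;
  return W.V0;
}
```

With `runExample(0x5)`, lanes 0 and 2 hold 2 while lanes 1 and 3 still hold 1, so neither `v_mov_b32` alone is the reaching definition of v0 for every lane at the use.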


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63709/new/

https://reviews.llvm.org/D63709




