[PATCH] D55444: AMDGPU: Fix DPP combiner

Wed Dec 12 10:25:01 PST 2018

cwabbott added inline comments.

================
Comment at: lib/Target/AMDGPU/GCNDPPCombine.cpp:261
+  case AMDGPU::V_ADD_I32_e32:
+  case AMDGPU::V_AND_B32_e32:
+  case AMDGPU::V_SUBREV_U32_e32:
----------------
I think you have AND and OR mixed up... the identity for AND should be -1, and OR should be 0, right?

================
Comment at: lib/Target/AMDGPU/GCNDPPCombine.cpp:265
+    if (OldOpndValue.getImm() == 0)
+      return OldOpndVGPR;
+    break;
----------------
If the old operand is the identity in the original mov, then in the transformed instruction the old operand should be the un-swizzled source, not the identity again. I think this is the reason the add tests to still fail on radv, since it generates something like:

```
v_mov_b32_e32 v5, 0 
s_nop 1
v_add_u32_dpp v5, vcc, v4, v4  row_shr:4 row_mask:0xf bank_mask:0xe 
```

from

```
%88 = call i32 @llvm.amdgcn.update.dpp.i32(i32 0, i32 %87, i32 276, i32 15, i32 14, i1 false) #2
  %89 = add i32 %87, %88
```

when what we really want is

```
v_add_u32_dpp v4, vcc, v4, v4  row_shr:4 row_mask:0xf bank_mask:0xe
```

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55444/new/

https://reviews.llvm.org/D55444