[PATCH] D52019: [AMDGPU] Divergence driven instruction selection. Part 1.

Wed Sep 19 14:56:18 PDT 2018

rampitec added inline comments.

================
Comment at: lib/Target/AMDGPU/VOP2Instructions.td:364
+let isConvergent = 1, Uses = []<Register> in {
+def V_READLANE_B32 : VOP2_Pseudo<"v_readlane_b32", VOP_READLANE,
+  [(set i32:$vdst, (int_amdgcn_readlane i32:$src0, i32:$src1))], "">;
----------------
alex-t wrote:
> rampitec wrote:
> > Is there a reason to move it from its original place? This seems to grow the patch unnecessary and makes review harder.
> In fact the reason was that Tablegen gen-asm-matcher fails with the empty prefix strings
> 
> in a def like this the empty string "" overrides the default VOP2_Pseido "prefix" argument that is prefix = "_e32"
> VOP2_Pseudo <"v_madmk_f32", VOP_MADMK_F32, [], "">
> 
> As a result asm-matcher-gen complained about mnemonic alias with the same string.
> 
> For some miracle reason this expressed only with Predicates[] explicitly set.
> I don't know how it is related. Some another one Tablegen black magic.
> 
> I moved this 4 defs back but removed empty prefix "". Since nothing failed after this I conclude that the empty prefix string was a covered bug.
I think it was supposed to suppress printing of _e32 suffix in a command like:

```
llvm-mc -arch=amdgcn -mcpu=fiji <<< 'v_madmk_f32 v0, v0, 1.0, v0'
```
Although we have tests for it in MC/Disassembler/AMDGPU/, so as long as they pass it should be fine.

================
Comment at: lib/Target/AMDGPU/VOP2Instructions.td:515
+      (getDivergentFrag<Op>.ret i64:$src0, i64:$src1),
+      (REG_SEQUENCE VReg_64,
+        (Inst  
----------------
alex-t wrote:
> rampitec wrote:
> > Why is this reg_sequence and extract_subregs are needed? Why not just use i64 operands?
> How can I use i64 operands with V_AND_B32 that operates i32?
OK, thanks.

================
Comment at: test/CodeGen/AMDGPU/llvm.amdgcn.update.dpp.ll:20
 ; VI-NOOPT: v_mov_b32_e32 v{{[0-9]+}}, 0
-; VI-NOOPT: v_mov_b32_e32 [[REG:v[0-9]+]], v{{[0-9]+}}
+; VI-NOOPT0: v_mov_b32_e32 [[REG:v[0-9]+]], v{{[0-9]+}}
 ; VI-NEXT: s_nop 0
----------------
This must be a leftover from some experiments, there is no such check.

================
Comment at: test/CodeGen/AMDGPU/shift-i128.ll:7
 ; GCN:       ; %bb.0:
-; GCN-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GCN-NEXT:    v_lshl_b64 v[5:6], v[0:1], v4
-; GCN-NEXT:    v_lshl_b64 v[7:8], v[2:3], v4
-; GCN-NEXT:    v_sub_i32_e32 v9, vcc, 64, v4
-; GCN-NEXT:    v_subrev_i32_e32 v11, vcc, 64, v4
-; GCN-NEXT:    v_cmp_gt_u32_e32 vcc, 64, v4
-; GCN-NEXT:    v_cndmask_b32_e32 v6, 0, v6, vcc
-; GCN-NEXT:    v_cndmask_b32_e32 v5, 0, v5, vcc
-; GCN-NEXT:    v_lshr_b64 v[9:10], v[0:1], v9
-; GCN-NEXT:    v_lshl_b64 v[0:1], v[0:1], v11
-; GCN-NEXT:    v_or_b32_e32 v7, v7, v9
-; GCN-NEXT:    v_or_b32_e32 v8, v8, v10
-; GCN-NEXT:    v_cndmask_b32_e32 v1, v1, v8, vcc
-; GCN-NEXT:    v_cndmask_b32_e32 v0, v0, v7, vcc
-; GCN-NEXT:    v_cmp_eq_u32_e32 vcc, 0, v4
-; GCN-NEXT:    v_cndmask_b32_e32 v3, v1, v3, vcc
-; GCN-NEXT:    v_cndmask_b32_e32 v2, v0, v2, vcc
-; GCN-NEXT:    v_mov_b32_e32 v0, v5
-; GCN-NEXT:    v_mov_b32_e32 v1, v6
-; GCN-NEXT:    s_setpc_b64 s[30:31]
+; GCN-NEXT:	s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GCN-NEXT:	v_lshl_b64 v[5:6], v[2:3], v4
----------------
Was it done with update_llc_test_checks.py? It is strange to see indents changed. I suspect the file was really changed manually instead.

https://reviews.llvm.org/D52019