[PATCH] D52019: [AMDGPU] Divergence driven instruction selection. Part 1.

Thu Sep 13 10:38:19 PDT 2018

rampitec added inline comments.

================
Comment at: lib/Target/AMDGPU/SIInstrInfo.td:1762

+class VOP_PAT_GEN <VOPProfile p, bits<2> mode=2> : VOPProfile <p.ArgVT> {
+  let NeedPatGen = mode;
----------------
You can create enum for the mode in SIInstrInfo.td. Look at SIEncodingFamily as an example.
Then I might be missing something, but I do not see anywhere a value 1 used for mode.

================
Comment at: lib/Target/AMDGPU/VOP2Instructions.td:364
+let isConvergent = 1, Uses = []<Register> in {
+def V_READLANE_B32 : VOP2_Pseudo<"v_readlane_b32", VOP_READLANE,
+  [(set i32:$vdst, (int_amdgcn_readlane i32:$src0, i32:$src1))], "">;
----------------
Is there a reason to move it from its original place? This seems to grow the patch unnecessary and makes review harder.

================
Comment at: lib/Target/AMDGPU/VOP2Instructions.td:424

+defm V_ADD_I32 : VOP2bInst <"v_add_i32", VOP2b_I32_I1_I32_I32, null_frag, "v_add_i32", 1>;
+defm V_SUB_I32 : VOP2bInst <"v_sub_i32", VOP2b_I32_I1_I32_I32, null_frag, "v_sub_i32", 1>;
----------------
It would be nice to swap these instructions' blocks back to minimize the diff. Order change is not needed for this patch.

================
Comment at: lib/Target/AMDGPU/VOP2Instructions.td:502
+
+def : GCNPat<
+    (AMDGPUsube i32:$src0, i32:$src1, i1:$src2),
----------------
Again, yhou probably do not need to move adde and sube patterns.

================
Comment at: lib/Target/AMDGPU/VOP2Instructions.td:515
+      (getDivergentFrag<Op>.ret i64:$src0, i64:$src1),
+      (REG_SEQUENCE VReg_64,
+        (Inst  
----------------
Why is this reg_sequence and extract_subregs are needed? Why not just use i64 operands?

================
Comment at: lib/Target/AMDGPU/VOPInstructions.td:576
+class OpOrNull<SDPatternOperator Op, VOPProfile P> {
+  SDPatternOperator ret = !if(!eq(P.NeedPatGen, 1), null_frag, Op);
+}
----------------
Where do you set NeedPatGen = 1?

================
Comment at: test/CodeGen/AMDGPU/insert_vector_elt.v2i16.ll:450
+; GFX89-DAG: s_movk_i32 [[K:s[0-9]+]], 0x3e7
+; CI-DAG: s_movk_i32 [[K:s[0-9]+]], 0x3e7

----------------
It is the same, isn't it? Why not to keep it GCN-DAG?

================
Comment at: test/CodeGen/AMDGPU/insert_vector_elt.v2i16.ll:479
+; GFX89-DAG: s_movk_i32 [[K:s[0-9]+]], 0x1234
+; CI-DAG: s_movk_i32  [[K:s[0-9]+]], 0x1234

----------------
Same here.

Repository:
  rL LLVM

https://reviews.llvm.org/D52019