[PATCH] D38906: AMDGPU/SI: Implement d16 support buffer_load_format and tbuffer_load_format intrinsics

Mon Oct 16 14:18:40 PDT 2017

arsenm added inline comments.

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:204

+  if (!Subtarget->hasVOP3PInsts()) // So v2f16 is not legal.
+    setOperationAction(ISD::INTRINSIC_W_CHAIN, MVT::v2f16, Custom);
----------------
This also applies to VI, but it doesn't really matter if it is legal or not. you can still set it to custom

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:3230
+  case ISD::INTRINSIC_W_CHAIN: {
+    SDValue Op = SDValue(N, 0);
+    EVT ResultT = Op.getValueType();
----------------
Should put this into a new function

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:3232-3234
+    // Should we do assert here?
+    if (ResultT != MVT::v2f16 && ResultT != MVT::v4f16)
+      return;
----------------
This should not be an assert, but should also probably handle v3f16

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:3258-3261
+      MachineMemOperand *MMO = MF.getMachineMemOperand(
+                                   MachinePointerInfo(),
+                                   MachineMemOperand::MOLoad,
+                                   VT.getStoreSize(), VT.getStoreSize());
----------------
This should be moved into getTgtMemIntrinsic so the MMO already exists

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:3277-3280
+      MachineMemOperand *MMO = MF.getMachineMemOperand(
+                                  MachinePointerInfo(MFI->getBufferPSV()),
+                                  MachineMemOperand::MOLoad,
+                                  VT.getStoreSize(), VT.getStoreSize());
----------------
Ditto

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:3284
+                                          DL, VTList, Ops, IntVT, MMO);
+    } else return; // Should we assert here?
+
----------------
Return on different line

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:3286
+
+    // Cast back yo the original result type, and pu in "Results" list.
+    Results.push_back(DAG.getNode(ISD::BITCAST, DL, ResultT, Res));
----------------
Typo yo, pu

================
Comment at: test/CodeGen/AMDGPU/llvm.amdgcn.buffer.load.d16.ll:1
+;RUN: llc < %s -march=amdgcn -mcpu=tonga -verify-machineinstrs | FileCheck -check-prefix=D16 %s
+;RUN: llc < %s -march=amdgcn -mcpu=gfx901 -verify-machineinstrs | FileCheck -check-prefix=D16 %s
----------------
GCN as check prefix. Also should have a VI check line? Is this the one where the output register layout can differ on some gfx8 variant?

================
Comment at: test/CodeGen/AMDGPU/llvm.amdgcn.tbuffer.load.d16.ll:14-15
+
+;D16-LABEL: {{^}}tbuffer_load_d16_xy:
+;D16: tbuffer_load_format_d16_xy
+;D16: s_waitcnt
----------------
Should check operands

https://reviews.llvm.org/D38906