[PATCH] D38906: AMDGPU/SI: Implement d16 support buffer_load_format and tbuffer_load_format intrinsics

Tue Oct 24 13:05:44 PDT 2017

cfang added inline comments.

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:204

+  if (!Subtarget->hasVOP3PInsts()) // So v2f16 is not legal.
+    setOperationAction(ISD::INTRINSIC_W_CHAIN, MVT::v2f16, Custom);
----------------
arsenm wrote:
> This also applies to VI, but it doesn't really matter if it is legal or not. you can still set it to custom
OK

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:3232-3234
+    // Should we do assert here?
+    if (ResultT != MVT::v2f16 && ResultT != MVT::v4f16)
+      return;
----------------
arsenm wrote:
> This should not be an assert, but should also probably handle v3f16
v3f16 as TODO.

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:3258-3261
+      MachineMemOperand *MMO = MF.getMachineMemOperand(
+                                   MachinePointerInfo(),
+                                   MachineMemOperand::MOLoad,
+                                   VT.getStoreSize(), VT.getStoreSize());
----------------
arsenm wrote:
> cfang wrote:
> > arsenm wrote:
> > > This should be moved into getTgtMemIntrinsic so the MMO already exists
> > I could not understand your intention here. Can you be more specific how to move this to "bool SITargetLowering::getTgtMemIntrinsic"? 
> > And how to get MMO?  Thanks.
> If you implement getTgtMemIntrinsic it will add the same MMO to the intrinsic. You then just need to forward it to the new node. This is still suboptimal since we should really be using the buffer PseudoSourceValue instead of a null MachinePointerInfo
Implemented getTgtMemIntrinsic, and made appropriate changes. Thanks.

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:3286
+
+    // Cast back yo the original result type, and pu in "Results" list.
+    Results.push_back(DAG.getNode(ISD::BITCAST, DL, ResultT, Res));
----------------
arsenm wrote:
> Typo yo, pu
Thanks.

================
Comment at: test/CodeGen/AMDGPU/llvm.amdgcn.buffer.load.d16.ll:1
+;RUN: llc < %s -march=amdgcn -mcpu=tonga -verify-machineinstrs | FileCheck -check-prefix=D16 %s
+;RUN: llc < %s -march=amdgcn -mcpu=gfx901 -verify-machineinstrs | FileCheck -check-prefix=D16 %s
----------------
arsenm wrote:
> GCN as check prefix. Also should have a VI check line? Is this the one where the output register layout can differ on some gfx8 variant?
I am not sure. But the ISA for the instructions are exactly the same for both VI and gfx9. So I didn't add VI check.

https://reviews.llvm.org/D38906