[PATCH] D36171: AMDGPU: Use direct struct returns

Tue Aug 8 14:25:27 PDT 2017

arsenm added inline comments.

================
Comment at: lib/CodeGen/TargetInfo.cpp:7386
+  bool isHomogeneousAggregateBaseType(QualType Ty) const override;
+  bool isHomogeneousAggregateSmallEnough(const Type *Base,
+                                         uint64_t Members) const override;
----------------
yaxunl wrote:
> arsenm wrote:
> > yaxunl wrote:
> > > Please add descriptions for the above newly added functions.
> > I prefer not to put descriptions on overrides since they will just be out of date with the declaration
> Please add descriptions for the non-override functions and data members above.
I've added them to the body

================
Comment at: lib/CodeGen/TargetInfo.cpp:7571
+
+      // XXX: Should this be i64 instead, and should the limit increase?
+      llvm::Type *I32Ty = llvm::Type::getInt32Ty(getVMContext());
----------------
b-sumner wrote:
> What we do here depends on NumRegsLeft when the block is entered and NumRegs.  If NumRegsLeft >= NumRegs then we just need 2 adjacent registers.  If NumRegsLeft == 1 and NumRegs == 2, then do we pass the low half in a register and the upper half in memory, or all of it in memory?  Anyway, I think NumRegsLeft shouldn't be updated until we know it's OK, and then we don't need the min().
It's all one or the other. Whether it's passed in memory or not is really determined in codegen based on the actual register limit (which is also higher than the 16 used here, at least for now). Here selects whether to use byval or not. The ABI is slightly different whether it's passed as byval or as too many registers. I'm not sure it ever really makes sense to use byval yet, so I wasn't trying to be very precise here.

https://reviews.llvm.org/D36171