[PATCH] D105072: [GISel] Support llvm.memcpy.inline

Tue Jun 29 13:22:38 PDT 2021

arsenm added inline comments.

================
Comment at: llvm/include/llvm/IR/IntrinsicInst.h:895

-/// This class wraps the llvm.memcpy intrinsic.
+/// This class wraps the llvm.memcpy and llvm.memcpy.inline intrinsics.
 class MemCpyInst : public MemTransferInst {
----------------
jroelofs wrote:
> paquette wrote:
> > Looks like this is used in a few passes too (e.g. the MemCpyOptimizer pass). Does this change impact those passes at all?
> Yes. Maybe I ought to put that in its own diff.
Yes, this should be its own patch. I started doing this a few months ago but forgot about it

================
Comment at: llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp:1357

-  unsigned Limit = TLI.getMaxStoresPerMemcpy(OptSize);
+  unsigned Limit = IsInlined ? std::numeric_limits<unsigned>::max()
+                             : TLI.getMaxStoresPerMemcpy(OptSize);
----------------
jroelofs wrote:
> paquette wrote:
> > Probably worth a comment for folks not familiar with the code.
> > 
> > It'd probably clearer to be able to call a `inlineMemCpy` function here and just return true, but I'm not sure if that's easy right now.
> Good point. I'll give that a try.
Probably should use uint64_t?

================
Comment at: llvm/test/CodeGen/AArch64/GlobalISel/debug-loc-legalize-tail-call.mir:39
     %2:_(s64) = G_IMPLICIT_DEF debug-location !DILocation(line: 0, scope: !6)
-    G_MEMSET %0(p0), %1(s8), %2(s64), 1, debug-location !11 :: (store 1)
     DBG_VALUE 0, 0, !9, !DIExpression(), debug-location !12
----------------
aemerson wrote:
> jroelofs wrote:
> > aemerson wrote:
> > > If only memcpy has an inline variant then we shouldn't be generating G_MEMSETs with this extra argument?
> > I was on the fence about whether to let them all have the same number of arguments or not, since there are several places where code is shared between different opcodes and having them have different operand counts would lead to lots of special case logic.
> > 
> > To strike a balance, maybe it should only be memcpy+memmove instead?
> Why add to memmove though?
> 
> Another way to implement this is to just introduce a new G_MEMCPY_INLINE opcode, then we don't have to add another argument anywhere.
I also think this should be a separate opcode.

There's also the difference that memcpy.inline uses an immarg size argument (which I think is entirely pointless and should be removed, but that's a different battle)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105072/new/

https://reviews.llvm.org/D105072