[PATCH] D126903: [clang] Add support for __builtin_memset_inline

Fri Jun 10 05:14:12 PDT 2022

gchatelet added inline comments.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7334
   if (TSI) {
     SDValue Result = TSI->EmitTargetCodeForMemset(
         *this, dl, Chain, Dst, Src, Size, Alignment, isVol, DstPtrInfo);
----------------
courbet wrote:
> There is nothing preveting a target from emitting a call to `memset` here when `AlwaysInline` is `true`. Actually, `X86SelectionDAGInfo` does just that (this very patch is adding `/* AlwaysInline */ false,` to the `getMemset` call that handles the trailing bytes). It happens that because trailing bytes are typically small and therefore inline. it happens to work, but this should be verified somehow (or, maybe easier,  `AlwaysInline` should be passed to `EmitTargetCodeForMemset` so it can do the right thing).
I think this  is OK now, I had to tweak the ARM selection dag as well.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7344
+    assert(ConstantSize && "AlwaysInline requires a constant size!");
+    return getMemsetStores(*this, dl, Chain, Dst, Src,
+                           ConstantSize->getZExtValue(), Alignment, isVol, true,
----------------
courbet wrote:
> Not that this is not strictly required to return a valid `SDValue`: Even with an "infinite" `Limit` in `TLI.findOptimalMemOpLowering`, a target that overrides this function could decide to just return false. I'm not sure what we want to do in this case. 
> So I think we should document `findOptimalMemOpLowering` to mention that the default implementation always returns a valid memop lowering if `Limit` is `UINT_MAX` and that target that decide to not provide a memop lowering *must* emit a valid one in `EmitTargetCodeForMemset`. Another option would be to call the generic `TargetLowering::findOptimalMemOpLowering` when the target declines to generate eithe ra memop lowering or target-specific code.
This one is more tricky. I've added a comment on the function and bypassed problematic cases but this is far from ideal.
The greedy nature of the algorithm makes it hard to understand the appropriate behavior.

I think this is worth stepping back and redesigning the abstraction in a subsequent patch.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126903/new/

https://reviews.llvm.org/D126903