[llvm] [PreISelIntrinsicLowering] Produce a memset_pattern16 libcall for llvm.experimental.memset.pattern when available (PR #120420)

Alex Bradbury via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 15 01:51:36 PST 2025


================
@@ -322,7 +376,41 @@ bool PreISelIntrinsicLowering::expandMemIntrinsicUses(Function &F) const {
     }
     case Intrinsic::experimental_memset_pattern: {
       auto *Memset = cast<MemSetPatternInst>(Inst);
-      expandMemSetPatternAsLoop(Memset);
+      const TargetLibraryInfo &TLI = LookupTLI(*Memset->getFunction());
+      if (Constant *PatternValue = getMemSetPattern16Value(Memset, TLI)) {
+        // FIXME: There is currently no profitability calculation for emitting
+        // the libcall vs expanding the memset.pattern directly.
+        IRBuilder<> Builder(Inst);
+        Module *M = Memset->getModule();
+        const DataLayout &DL = Memset->getDataLayout();
+
+        StringRef FuncName = "memset_pattern16";
+        FunctionCallee MSP = getOrInsertLibFunc(
+            M, TLI, LibFunc_memset_pattern16, Builder.getVoidTy(),
+            Memset->getRawDest()->getType(), Builder.getPtrTy(),
+            Memset->getLength()->getType());
+        inferNonMandatoryLibFuncAttrs(M, FuncName, TLI);
+
+        // Otherwise we should form a memset_pattern16.  PatternValue is known
+        // to be an constant array of 16-bytes. Put the value into a mergable
+        // global.
+        GlobalVariable *GV = new GlobalVariable(
+            *M, PatternValue->getType(), true, GlobalValue::PrivateLinkage,
+            PatternValue, ".memset_pattern");
+        GV->setUnnamedAddr(
+            GlobalValue::UnnamedAddr::Global); // Ok to merge these.
+        GV->setAlignment(Align(16));
----------------
asb wrote:

I'm inheriting this from LoopIdiomRecognize, and my preference would be to keep aspects like this untouched in the initial switchover before later looking to relax. I've put a TODO to re-evaluate. I would guess it was chosen on the assumption that a naturally aligned i128 is most efficiently accessed.

https://github.com/llvm/llvm-project/pull/120420


More information about the llvm-commits mailing list