[llvm] [BPF] Handle certain mem intrinsic functions with addr-space arguments (PR #160025)

via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 7 12:47:04 PDT 2025


================
@@ -493,21 +559,69 @@ bool BPFCheckAndAdjustIR::insertASpaceCasts(Module &M) {
   for (Function &F : M) {
     DenseMap<Value *, Value *> CastsCache;
     for (BasicBlock &BB : F) {
-      for (Instruction &I : BB) {
+      for (Instruction &I : llvm::make_early_inc_range(BB)) {
         unsigned PtrOpNum;
 
-        if (auto *LD = dyn_cast<LoadInst>(&I))
+        if (auto *LD = dyn_cast<LoadInst>(&I)) {
           PtrOpNum = LD->getPointerOperandIndex();
-        else if (auto *ST = dyn_cast<StoreInst>(&I))
+          aspaceWrapOperand(CastsCache, &I, PtrOpNum);
+          continue;
+        }
+        if (auto *ST = dyn_cast<StoreInst>(&I)) {
           PtrOpNum = ST->getPointerOperandIndex();
-        else if (auto *CmpXchg = dyn_cast<AtomicCmpXchgInst>(&I))
+          aspaceWrapOperand(CastsCache, &I, PtrOpNum);
+          continue;
+        }
+        if (auto *CmpXchg = dyn_cast<AtomicCmpXchgInst>(&I)) {
           PtrOpNum = CmpXchg->getPointerOperandIndex();
-        else if (auto *RMW = dyn_cast<AtomicRMWInst>(&I))
+          aspaceWrapOperand(CastsCache, &I, PtrOpNum);
+          continue;
+        }
+        if (auto *RMW = dyn_cast<AtomicRMWInst>(&I)) {
           PtrOpNum = RMW->getPointerOperandIndex();
-        else
+          aspaceWrapOperand(CastsCache, &I, PtrOpNum);
           continue;
+        }
+
+        auto *CI = dyn_cast<CallInst>(&I);
+        if (!CI)
+          continue;
+
+        Function *Callee = CI->getCalledFunction();
+        if (!Callee || !Callee->isIntrinsic())
+          continue;
+
+        // Check memset/memcpy/memmove
+        Intrinsic::ID ID = Callee->getIntrinsicID();
+        bool IsSet = ID == Intrinsic::memset;
+        bool IsCpy = ID == Intrinsic::memcpy;
+        bool IsMove = ID == Intrinsic::memmove;
----------------
yonghong-song wrote:

I asked ChatGPT the following question:
```
In llvm, what kind of C code can generate the following intrinsic: Intrinsic::experimental_memset_pattern ?
```
The following is the answer:

====================
Short version: call Apple’s memset_pattern{4,8,16} from C on a Darwin target (macOS/iOS). Clang recognizes these and lowers them to the LLVM IR intrinsic llvm.experimental.memset.pattern (which can then be expanded efficiently).

Minimal example (macOS / iOS targets)

// clang -O2 -target x86_64-apple-macos14 -S -emit-llvm ex.c -o ex.ll
#include <string.h>

void fill16(void *dst, size_t n) {
    unsigned char pat[16] =
        {0,1,2,3,4,5,6,7, 8,9,10,11,12,13,14,15};
    memset_pattern16(dst, pat, n);
}

void fill8(void *dst, size_t n) {
    unsigned char pat[8] = {1,2,3,4,5,6,7,8};
    memset_pattern8(dst, pat, n);
}

void fill4(void *dst, size_t n) {
    unsigned char pat[4] = {0xAA,0xBB,0xCC,0xDD};
    memset_pattern4(dst, pat, n);
}

In the emitted IR you’ll see calls like:

call void @llvm.experimental.memset.pattern.p0.i64(
  ptr %dst, ptr %pat, i64 %n, i1 false)

Notes

These memset_pattern{4,8,16} functions are Apple libc extensions. On non-Darwin targets, Clang won’t lower them to the intrinsic—either you’ll get a normal library call or a loop.

There isn’t a portable C standard function that maps to llvm.experimental.memset.pattern.

The intrinsic allows repeating multi-byte patterns (4/8/16). Plain llvm.memset only repeats a single byte.

If you’re not on macOS/iOS but still want the intrinsic (for experimentation), compile with a Darwin target triple as shown above.

====================

I tried compiling the above example. It compiles successfully with the following compiler:

Apple clang version 17.0.0 (clang-1700.0.13.5)
Target: arm64-apple-darwin24.6.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

But it fails to compile for a Linux x86 target:
```
$ clang -O2 -S -emit-llvm ex.c -o ex.ll
ex.c:7:5: error: call to undeclared function 'memset_pattern16'; ISO C99 and later do not support
      implicit function declarations [-Wimplicit-function-declaration]
    7 |     memset_pattern16(dst, pat, n);
      |     ^
ex.c:12:5: error: call to undeclared function 'memset_pattern8'; ISO C99 and later do not support
      implicit function declarations [-Wimplicit-function-declaration]
   12 |     memset_pattern8(dst, pat, n);
      |     ^
ex.c:17:5: error: call to undeclared function 'memset_pattern4'; ISO C99 and later do not support
      implicit function declarations [-Wimplicit-function-declaration]
   17 |     memset_pattern4(dst, pat, n);
      |     ^
3 errors generated.
```
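For readers on a non-Darwin system, here is a minimal portable sketch of what a 4-byte pattern fill does. The name `fill_pattern4` is hypothetical. This is not Apple's implementation, and it assumes a trailing partial copy of the pattern is written when the length is not a multiple of 4.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical portable stand-in for a 4-byte pattern fill.
 * Writes n bytes to dst by repeating the 4 bytes at pattern4.
 * Assumption: when n is not a multiple of 4, the last copy of the
 * pattern is truncated. */
static void fill_pattern4(void *dst, const void *pattern4, size_t n) {
    assert(dst != NULL || n == 0);
    assert(pattern4 != NULL || n == 0);

    unsigned char *out = dst;
    const unsigned char *pat = pattern4;

    for (size_t i = 0; i < n; i++)
        out[i] = pat[i % 4];
}
```

Unlike plain memset, which repeats a single byte, this repeats a multi-byte unit, which is the distinction the quoted answer draws between llvm.memset and llvm.experimental.memset.pattern.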

Unfortunately, the Apple compiler on my Mac is too old to generate llvm.experimental.memset.pattern. I suspect the latest clang (with an Apple target) would generate it. The following is the related code in LoopIdiomRecognize.cpp:

```
  if (SplatValue) {
    NewCall = Builder.CreateMemSet(BasePtr, SplatValue, MemsetArg,
                                   MaybeAlign(StoreAlignment),
                                   /*isVolatile=*/false, AATags);
  } else if (ForceMemsetPatternIntrinsic ||
             isLibFuncEmittable(M, TLI, LibFunc_memset_pattern16)) {
    assert(isa<SCEVConstant>(StoreSizeSCEV) && "Expected constant store size");

    NewCall = Builder.CreateIntrinsic(
        Intrinsic::experimental_memset_pattern,
        {DestInt8PtrTy, PatternValue->getType(), IntIdxTy},
        {BasePtr, PatternValue, MemsetArg,
         ConstantInt::getFalse(M->getContext())});
    if (StoreAlignment)
      cast<MemSetPatternInst>(NewCall)->setDestAlignment(*StoreAlignment);
    NewCall->setAAMetadata(AATags);
  } else {
    // Neither a memset, nor memset_pattern16
    return Changed;
  }
```
ForceMemsetPatternIntrinsic is an internal (hidden) flag that defaults to false:
```
static cl::opt<bool> ForceMemsetPatternIntrinsic(
    "loop-idiom-force-memset-pattern-intrinsic",
    cl::desc("Use memset.pattern intrinsic whenever possible"), cl::init(false),
    cl::Hidden);
```
So, unless the hidden flag is set, the memset_pattern16 library function must be emittable for LoopIdiomRecognize to generate Intrinsic::experimental_memset_pattern, and memset_pattern16 is only available for Apple targets.
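As a rough illustration of the two branches in the quoted LoopIdiomRecognize code, the sketch below shows one store loop whose value is a byte splat (a memset candidate) and one whose value is not (a memset.pattern candidate). The helper `is_byte_splat32` and both function names are hypothetical, and the helper is only a simplified model of the splat check, not LLVM's code. Whether either loop is actually rewritten depends on the target, the optimization level, and the LLVM version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model (an assumption, not LLVM's implementation):
 * returns 1 when all four bytes of v are equal, so a store of v
 * could be expressed as a single-byte memset. */
static int is_byte_splat32(uint32_t v) {
    uint32_t b = v & 0xFFu;
    return v == b * 0x01010101u;
}

/* Every stored byte is 0x01: corresponds to the SplatValue branch,
 * where the loop may become llvm.memset. */
static void fill_splat(uint32_t *dst, size_t n) {
    assert(dst != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        dst[i] = 0x01010101u;
}

/* The stored bytes differ: not a memset candidate. Per the quoted code,
 * this can become the memset.pattern intrinsic only when memset_pattern16
 * is emittable for the target or the hidden flag
 * -loop-idiom-force-memset-pattern-intrinsic is set. */
static void fill_pattern(uint32_t *dst, size_t n) {
    assert(dst != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        dst[i] = 0x01020304u;
}
```

Compiling such loops with the `clang -O2 -S -emit-llvm` command used above and inspecting the IR is one way to see which branch a given target takes.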

So I will skip experimental_memset_pattern for now.

https://github.com/llvm/llvm-project/pull/160025

