[llvm] [BPF] Handle certain mem intrinsic functions with addr-space arguments (PR #160025)
via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 7 12:47:04 PDT 2025
================
@@ -493,21 +559,69 @@ bool BPFCheckAndAdjustIR::insertASpaceCasts(Module &M) {
   for (Function &F : M) {
     DenseMap<Value *, Value *> CastsCache;
     for (BasicBlock &BB : F) {
-      for (Instruction &I : BB) {
+      for (Instruction &I : llvm::make_early_inc_range(BB)) {
         unsigned PtrOpNum;
-        if (auto *LD = dyn_cast<LoadInst>(&I))
+        if (auto *LD = dyn_cast<LoadInst>(&I)) {
           PtrOpNum = LD->getPointerOperandIndex();
-        else if (auto *ST = dyn_cast<StoreInst>(&I))
+          aspaceWrapOperand(CastsCache, &I, PtrOpNum);
+          continue;
+        }
+        if (auto *ST = dyn_cast<StoreInst>(&I)) {
           PtrOpNum = ST->getPointerOperandIndex();
-        else if (auto *CmpXchg = dyn_cast<AtomicCmpXchgInst>(&I))
+          aspaceWrapOperand(CastsCache, &I, PtrOpNum);
+          continue;
+        }
+        if (auto *CmpXchg = dyn_cast<AtomicCmpXchgInst>(&I)) {
           PtrOpNum = CmpXchg->getPointerOperandIndex();
-        else if (auto *RMW = dyn_cast<AtomicRMWInst>(&I))
+          aspaceWrapOperand(CastsCache, &I, PtrOpNum);
+          continue;
+        }
+        if (auto *RMW = dyn_cast<AtomicRMWInst>(&I)) {
           PtrOpNum = RMW->getPointerOperandIndex();
-        else
+          aspaceWrapOperand(CastsCache, &I, PtrOpNum);
           continue;
+        }
+
+        auto *CI = dyn_cast<CallInst>(&I);
+        if (!CI)
+          continue;
+
+        Function *Callee = CI->getCalledFunction();
+        if (!Callee || !Callee->isIntrinsic())
+          continue;
+
+        // Check memset/memcpy/memmove
+        Intrinsic::ID ID = Callee->getIntrinsicID();
+        bool IsSet = ID == Intrinsic::memset;
+        bool IsCpy = ID == Intrinsic::memcpy;
+        bool IsMove = ID == Intrinsic::memmove;
----------------
yonghong-song wrote:
I asked ChatGPT the following question:
```
In llvm, what kind of C code can generate the following intrinsic: Intrinsic::experimental_memset_pattern ?
```
The following is the answer:
====================
Short version: call Apple’s memset_pattern{4,8,16} from C on a Darwin target (macOS/iOS). Clang recognizes these and lowers them to the LLVM IR intrinsic llvm.experimental.memset.pattern (which can then be expanded efficiently).
Minimal example (macOS / iOS targets)
// clang -O2 -target x86_64-apple-macos14 -S -emit-llvm ex.c -o ex.ll
#include <string.h>

void fill16(void *dst, size_t n) {
    unsigned char pat[16] =
        {0,1,2,3,4,5,6,7, 8,9,10,11,12,13,14,15};
    memset_pattern16(dst, pat, n);
}

void fill8(void *dst, size_t n) {
    unsigned char pat[8] = {1,2,3,4,5,6,7,8};
    memset_pattern8(dst, pat, n);
}

void fill4(void *dst, size_t n) {
    unsigned char pat[4] = {0xAA,0xBB,0xCC,0xDD};
    memset_pattern4(dst, pat, n);
}
In the emitted IR you’ll see calls like:
call void @llvm.experimental.memset.pattern.p0.i64(
ptr %dst, ptr %pat, i64 %n, i1 false)
Notes
These memset_pattern{4,8,16} functions are Apple libc extensions. On non-Darwin targets, Clang won’t lower them to the intrinsic—either you’ll get a normal library call or a loop.
There isn’t a portable C standard function that maps to llvm.experimental.memset.pattern.
The intrinsic allows repeating multi-byte patterns (4/8/16). Plain llvm.memset only repeats a single byte.
If you’re not on macOS/iOS but still want the intrinsic (for experimentation), compile with a Darwin target triple as shown above.
====================
I actually tried to compile with the above example. It compiled successfully with the following compiler:
Apple clang version 17.0.0 (clang-1700.0.13.5)
Target: arm64-apple-darwin24.6.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
But it fails to compile for a Linux x86 target:
```
$ clang -O2 -S -emit-llvm ex.c -o ex.ll
ex.c:7:5: error: call to undeclared function 'memset_pattern16'; ISO C99 and later do not support
implicit function declarations [-Wimplicit-function-declaration]
7 | memset_pattern16(dst, pat, n);
| ^
ex.c:12:5: error: call to undeclared function 'memset_pattern8'; ISO C99 and later do not support
implicit function declarations [-Wimplicit-function-declaration]
12 | memset_pattern8(dst, pat, n);
| ^
ex.c:17:5: error: call to undeclared function 'memset_pattern4'; ISO C99 and later do not support
implicit function declarations [-Wimplicit-function-declaration]
17 | memset_pattern4(dst, pat, n);
| ^
3 errors generated.
```
Unfortunately, the Apple compiler on my Mac is too old to generate llvm.experimental.memset.pattern. I suspect the latest clang (with an Apple target) would generate it. The following is the related code in LoopIdiomRecognize.cpp:
```
if (SplatValue) {
  NewCall = Builder.CreateMemSet(BasePtr, SplatValue, MemsetArg,
                                 MaybeAlign(StoreAlignment),
                                 /*isVolatile=*/false, AATags);
} else if (ForceMemsetPatternIntrinsic ||
           isLibFuncEmittable(M, TLI, LibFunc_memset_pattern16)) {
  assert(isa<SCEVConstant>(StoreSizeSCEV) && "Expected constant store size");
  NewCall = Builder.CreateIntrinsic(
      Intrinsic::experimental_memset_pattern,
      {DestInt8PtrTy, PatternValue->getType(), IntIdxTy},
      {BasePtr, PatternValue, MemsetArg,
       ConstantInt::getFalse(M->getContext())});
  if (StoreAlignment)
    cast<MemSetPatternInst>(NewCall)->setDestAlignment(*StoreAlignment);
  NewCall->setAAMetadata(AATags);
} else {
  // Neither a memset, nor memset_pattern16
  return Changed;
}
```
ForceMemsetPatternIntrinsic is a hidden internal flag that defaults to false:
```
static cl::opt<bool> ForceMemsetPatternIntrinsic(
    "loop-idiom-force-memset-pattern-intrinsic",
    cl::desc("Use memset.pattern intrinsic whenever possible"), cl::init(false),
    cl::Hidden);
```
So unless that flag is set, the memset_pattern16 library function must be emittable for Intrinsic::experimental_memset_pattern to be generated, and memset_pattern16 is only available on Apple targets.
So I will skip experimental_memset_pattern for now.
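For reference, here is a rough sketch of the kind of source loop that reaches the `else if` branch quoted above. It is not taken from the PR: `fill_pattern` is a hypothetical name, and whether the intrinsic is actually emitted is an assumption that depends on the LLVM version, the optimization level, and the target (or the hidden flag).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example, not from the PR. The stored value is loop-invariant,
 * but its four bytes differ, so it is not a single-byte splat and cannot be
 * expressed as a plain memset. */
void fill_pattern(uint32_t *dst, size_t n) {
  for (size_t i = 0; i < n; ++i)
    dst[i] = 0x01020304u;
}
```

On a non-Apple target, passing the hidden option through `-mllvm -loop-idiom-force-memset-pattern-intrinsic` should in principle force the same path for experimentation, though I have not verified that here.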
https://github.com/llvm/llvm-project/pull/160025