[llvm] r373333 - [InstCombine] sprintf(dest, "%s", str) -> memccpy(dest, str, 0, MAX)

Peter Smith via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 2 08:16:17 PDT 2019


Thanks for the response, apologies for missing the revert earlier.

May I ask that you copy Nick Desaulniers (on CC:) on the review for
your next change that introduces memccpy. The linux kernel does use
snprintf and I'm sure they'd want some warning that their clang builds
are likely to break.
My understanding (https://clang.llvm.org/compatibility.html) is that
clang defaults to C11, I can understand making the transformation if
the standard is set to C20, however I'm not sure whether we should be
doing it when the standard is C11. Anyway one for the review.

Peter

On Wed, 2 Oct 2019 at 16:07, Dávid Bolvanský <david.bolvansky at gmail.com> wrote:
>
> This was reverted a bit later. I mismeasured it and thought it is faster than memcpy and strlen solution.
> Anyway, snprintf -> memccpy is worth it (30% faster). I have a patch for it, I will land it later. There are more oportunities to exploit memccpy for efficient string concatenation/copying. (You can find redhat’s article somewhere about it).
>
> It is a part of C20. The solution for the kernel is either use -fno-builtin-memccpy or implement it.
>
> The kernel
>
> Odoslané z iPhonu
>
> > Dňa 2. 10. 2019 o 15:25 užívateľ Peter Smith <peter.smith at linaro.org> napísal:
> >
> > This change has caused a regression in the AArch64 Linux kernel build
> > (Linaro CI), I'll forward the details from the CI system of how to
> > reproduce it directly via email
> >
> > The errors look to be from the linux kernel module tool modpost. These
> > are undefined symbols coming from linux kernel modules.
> > ERROR: "memccpy" [sound/pci/hda/snd-hda-codec-ca0132.ko] undefined!
> > ERROR: "memccpy" [drivers/infiniband/hw/mlx4/mlx4_ib.ko] undefined!
> > ERROR: "memccpy" [drivers/target/iscsi/iscsi_target_mod.ko] undefined!
> > ERROR: "memccpy" [drivers/nvdimm/libnvdimm.ko] undefined!
> > ERROR: "memccpy" [drivers/staging/rtl8192e/rtllib.ko] undefined!
> > ERROR: "memccpy" [drivers/i2c/busses/i2c-jz4780.ko] undefined!
> > ERROR: "memccpy" [drivers/usb/mtu3/mtu3.ko] undefined!
> > ERROR: "memccpy" [drivers/net/usb/hso.ko] undefined!
> > ERROR: "memccpy" [drivers/net/wireless/ath/ath9k/ath9k_htc.ko] undefined!
> > ERROR: "memccpy" [drivers/net/ethernet/apm/xgene-v2/xgene-enet-v2.ko] undefined!
> > ERROR: "memccpy" [drivers/net/ethernet/apm/xgene/xgene-enet.ko] undefined!
> > ERROR: "memccpy" [drivers/net/ethernet/qlogic/qed/qed.ko] undefined!
> > ERROR: "memccpy" [drivers/net/ethernet/mellanox/mlx4/mlx4_core.ko] undefined!
> > ERROR: "memccpy" [drivers/net/ethernet/cavium/liquidio/liquidio_vf.ko]
> > undefined!
> > ERROR: "memccpy" [drivers/net/ethernet/cavium/liquidio/liquidio.ko] undefined!
> > ERROR: "memccpy" [drivers/net/ethernet/brocade/bna/bna.ko] undefined!
> > ERROR: "memccpy" [drivers/scsi/mpt3sas/mpt3sas.ko] undefined!
> > ERROR: "memccpy" [drivers/scsi/bfa/bfa.ko] undefined!
> >
> > I believe that memccpy is not in the C Standard so it is likely the
> > Linux kernel does not provide its own definition. Is there a reason to
> > use a non-standard memccpy in the transformation unless we can
> > guarantee that the environment will support it?
> >
> > Can you take a look?
> >
> > Peter
> >
> >> On Tue, 1 Oct 2019 at 14:01, David Bolvansky via llvm-commits
> >> <llvm-commits at lists.llvm.org> wrote:
> >>
> >> Author: xbolva00
> >> Date: Tue Oct  1 06:03:10 2019
> >> New Revision: 373333
> >>
> >> URL: http://llvm.org/viewvc/llvm-project?rev=373333&view=rev
> >> Log:
> >> [InstCombine] sprintf(dest, "%s", str) -> memccpy(dest, str, 0, MAX)
> >>
> >> Modified:
> >>    llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp
> >>    llvm/trunk/test/Transforms/InstCombine/2010-05-30-memcpy-Struct.ll
> >>    llvm/trunk/test/Transforms/InstCombine/sprintf-1.ll
> >>
> >> Modified: llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp
> >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp?rev=373333&r1=373332&r2=373333&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp (original)
> >> +++ llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp Tue Oct  1 06:03:10 2019
> >> @@ -2415,9 +2415,11 @@ Value *LibCallSimplifier::optimizePrintF
> >> }
> >>
> >> Value *LibCallSimplifier::optimizeSPrintFString(CallInst *CI, IRBuilder<> &B) {
> >> +  Value *Dst = CI->getArgOperand(0);
> >> +  Value *FmtStr = CI->getArgOperand(1);
> >>   // Check for a fixed format string.
> >>   StringRef FormatStr;
> >> -  if (!getConstantStringInfo(CI->getArgOperand(1), FormatStr))
> >> +  if (!getConstantStringInfo(FmtStr, FormatStr))
> >>     return nullptr;
> >>
> >>   // If we just have a format string (nothing else crazy) transform it.
> >> @@ -2428,9 +2430,10 @@ Value *LibCallSimplifier::optimizeSPrint
> >>       return nullptr; // we found a format specifier, bail out.
> >>
> >>     // sprintf(str, fmt) -> llvm.memcpy(align 1 str, align 1 fmt, strlen(fmt)+1)
> >> -    B.CreateMemCpy(CI->getArgOperand(0), 1, CI->getArgOperand(1), 1,
> >> -                   ConstantInt::get(DL.getIntPtrType(CI->getContext()),
> >> -                                    FormatStr.size() + 1)); // Copy the null byte.
> >> +    B.CreateMemCpy(
> >> +        Dst, 1, FmtStr, 1,
> >> +        ConstantInt::get(DL.getIntPtrType(CI->getContext()),
> >> +                         FormatStr.size() + 1)); // Copy the null byte.
> >>     return ConstantInt::get(CI->getType(), FormatStr.size());
> >>   }
> >>
> >> @@ -2440,13 +2443,14 @@ Value *LibCallSimplifier::optimizeSPrint
> >>       CI->getNumArgOperands() < 3)
> >>     return nullptr;
> >>
> >> +  Value *Str = CI->getArgOperand(2);
> >>   // Decode the second character of the format string.
> >>   if (FormatStr[1] == 'c') {
> >>     // sprintf(dst, "%c", chr) --> *(i8*)dst = chr; *((i8*)dst+1) = 0
> >> -    if (!CI->getArgOperand(2)->getType()->isIntegerTy())
> >> +    if (!Str->getType()->isIntegerTy())
> >>       return nullptr;
> >> -    Value *V = B.CreateTrunc(CI->getArgOperand(2), B.getInt8Ty(), "char");
> >> -    Value *Ptr = castToCStr(CI->getArgOperand(0), B);
> >> +    Value *V = B.CreateTrunc(Str, B.getInt8Ty(), "char");
> >> +    Value *Ptr = castToCStr(Dst, B);
> >>     B.CreateStore(V, Ptr);
> >>     Ptr = B.CreateGEP(B.getInt8Ty(), Ptr, B.getInt32(1), "nul");
> >>     B.CreateStore(B.getInt8(0), Ptr);
> >> @@ -2455,17 +2459,30 @@ Value *LibCallSimplifier::optimizeSPrint
> >>   }
> >>
> >>   if (FormatStr[1] == 's') {
> >> -    // sprintf(dest, "%s", str) -> llvm.memcpy(align 1 dest, align 1 str,
> >> -    // strlen(str)+1)
> >> -    if (!CI->getArgOperand(2)->getType()->isPointerTy())
> >> +    if (!Str->getType()->isPointerTy())
> >>       return nullptr;
> >> +
> >> +    // sprintf(dest, "%s", str) -> memccpy(dest, str, 0, MAX)
> >> +    if (CI->use_empty()) {
> >> +      Value *S = (Dst->getType() == Str->getType())
> >> +                     ? Str
> >> +                     : B.CreateBitCast(Str, Dst->getType());
> >> +      unsigned SizeTyBitwidth =
> >> +          DL.getIntPtrType(CI->getContext())->getPrimitiveSizeInBits();
> >> +      Value *NewSize = ConstantInt::get(B.getIntNTy(SizeTyBitwidth),
> >> +                                        APInt::getMaxValue(SizeTyBitwidth));
> >> +      emitMemCCpy(Dst, S, B.getInt32('\0'), NewSize, B, TLI);
> >> +      return Dst;
> >> +    }
> >>
> >> -    Value *Len = emitStrLen(CI->getArgOperand(2), B, DL, TLI);
> >> +    // sprintf(dest, "%s", str) -> llvm.memcpy(align 1 dest, align 1 str,
> >> +    // strlen(str)+1)
> >> +    Value *Len = emitStrLen(Str, B, DL, TLI);
> >>     if (!Len)
> >>       return nullptr;
> >>     Value *IncLen =
> >>         B.CreateAdd(Len, ConstantInt::get(Len->getType(), 1), "leninc");
> >> -    B.CreateMemCpy(CI->getArgOperand(0), 1, CI->getArgOperand(2), 1, IncLen);
> >> +    B.CreateMemCpy(Dst, 1, Str, 1, IncLen);
> >>
> >>     // The sprintf result is the unincremented number of bytes in the string.
> >>     return B.CreateIntCast(Len, CI->getType(), false);
> >>
> >> Modified: llvm/trunk/test/Transforms/InstCombine/2010-05-30-memcpy-Struct.ll
> >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/2010-05-30-memcpy-Struct.ll?rev=373333&r1=373332&r2=373333&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/test/Transforms/InstCombine/2010-05-30-memcpy-Struct.ll (original)
> >> +++ llvm/trunk/test/Transforms/InstCombine/2010-05-30-memcpy-Struct.ll Tue Oct  1 06:03:10 2019
> >> @@ -1,3 +1,4 @@
> >> +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
> >> ; RUN: opt -instcombine -S < %s | FileCheck %s
> >> ; PR7265
> >>
> >> @@ -9,10 +10,14 @@ target triple = "x86_64-unknown-linux-gn
> >> @.str = private constant [3 x i8] c"%s\00"
> >>
> >> define void @CopyEventArg(%union.anon* %ev) nounwind {
> >> +; CHECK-LABEL: @CopyEventArg(
> >> +; CHECK-NEXT:  entry:
> >> +; CHECK-NEXT:    [[TMP0:%.*]] = bitcast %union.anon* [[EV:%.*]] to i8*
> >> +; CHECK-NEXT:    [[MEMCCPY:%.*]] = call i8* @memccpy(i8* undef, i8* [[TMP0]], i32 0, i64 -1)
> >> +; CHECK-NEXT:    ret void
> >> +;
> >> entry:
> >>   %call = call i32 (i8*, i8*, ...) @sprintf(i8* undef, i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.str, i64 0, i64 0), %union.anon* %ev) nounwind
> >> -; CHECK: bitcast %union.anon* %ev to i8*
> >> -; CHECK: call void @llvm.memcpy.p0i8.p0i8.i64
> >>   ret void
> >> }
> >>
> >>
> >> Modified: llvm/trunk/test/Transforms/InstCombine/sprintf-1.ll
> >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/sprintf-1.ll?rev=373333&r1=373332&r2=373333&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/test/Transforms/InstCombine/sprintf-1.ll (original)
> >> +++ llvm/trunk/test/Transforms/InstCombine/sprintf-1.ll Tue Oct  1 06:03:10 2019
> >> @@ -85,15 +85,11 @@ define void @test_simplify4(i8* %dst) {
> >>
> >> define void @test_simplify5(i8* %dst, i8* %str) {
> >> ; CHECK-LABEL: @test_simplify5(
> >> -; CHECK-NEXT:    [[STRLEN:%.*]] = call i32 @strlen(i8* nonnull dereferenceable(1) [[STR:%.*]])
> >> -; CHECK-NEXT:    [[LENINC:%.*]] = add i32 [[STRLEN]], 1
> >> -; CHECK-NEXT:    call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 [[DST:%.*]], i8* align 1 [[STR]], i32 [[LENINC]], i1 false)
> >> +; CHECK-NEXT:    [[MEMCCPY:%.*]] = call i8* @memccpy(i8* [[DST:%.*]], i8* [[STR:%.*]], i32 0, i32 -1)
> >> ; CHECK-NEXT:    ret void
> >> ;
> >> ; CHECK-IPRINTF-LABEL: @test_simplify5(
> >> -; CHECK-IPRINTF-NEXT:    [[STRLEN:%.*]] = call i32 @strlen(i8* nonnull dereferenceable(1) [[STR:%.*]])
> >> -; CHECK-IPRINTF-NEXT:    [[LENINC:%.*]] = add i32 [[STRLEN]], 1
> >> -; CHECK-IPRINTF-NEXT:    call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 [[DST:%.*]], i8* align 1 [[STR]], i32 [[LENINC]], i1 false)
> >> +; CHECK-IPRINTF-NEXT:    [[MEMCCPY:%.*]] = call i8* @memccpy(i8* [[DST:%.*]], i8* [[STR:%.*]], i32 0, i32 -1)
> >> ; CHECK-IPRINTF-NEXT:    ret void
> >> ;
> >>   %fmt = getelementptr [3 x i8], [3 x i8]* @percent_s, i32 0, i32 0
> >>
> >>
> >> _______________________________________________
> >> llvm-commits mailing list
> >> llvm-commits at lists.llvm.org
> >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits


More information about the llvm-commits mailing list