[llvm] r311333 - [LibCallSimplifier] try harder to fold memcmp with constant arguments

Sanjay Patel via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 21 13:25:41 PDT 2017


Hi Adrian -

Sorry - I broke lots of bots with that one.

I reverted r311333 at:
https://reviews.llvm.org/rL311340

And recommitted with a fix at:
https://reviews.llvm.org/rL311366

Please let me know if you still see problems with the new version.


On Mon, Aug 21, 2017 at 2:19 PM, Adrian Prantl <aprantl at apple.com> wrote:

> Here's a link to the failure in case you want to reproduce this:
>
> http://green.lab.llvm.org/green/job/perf_darwin_x86_
> O3flto/12409/consoleFull#2864721549ba4694-19c4-4d7e-bec5-911270d8a58c
>
> -- adrian
>
> On Aug 21, 2017, at 1:16 PM, Adrian Prantl via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
>
> The green dragon LNT bot has started to timeout between r311333 and
> r311336.
> I logged into the machine and attached lldb to a clang job that has been
> running for more than 60min.
>
> Could you please take a look and/or revert this commit?
>
> thanks,
> Adrian
>
>
> green-dragon-06:~ buildslave$ lldb -p  33286
> (lldb) process attach --pid 33286
> Process 33286 stopped
> * thread #1: tid = 0xac2ec5, 0x000000010d76788f clang-6.0`llvm::
> ConstantFoldLoadFromConstPtr(llvm::Constant*, llvm::Type*,
> llvm::DataLayout const&) + 927, queue = 'com.apple.main-thread', stop
> reason = signal SIGSTOP
>    frame #0: 0x000000010d76788f clang-6.0`llvm::
> ConstantFoldLoadFromConstPtr(llvm::Constant*, llvm::Type*,
> llvm::DataLayout const&) + 927
> clang-6.0`llvm::ConstantFoldLoadFromConstPtr:
> ->  0x10d76788f <+927>: testb  $0x1, 0x50(%rbx)
>    0x10d767893 <+931>: je     0x10d767e06               ; <+2326>
>    0x10d767899 <+937>: movq   %rbx, %rdi
>    0x10d76789c <+940>: callq  0x10dcc6690               ;
> llvm::GlobalValue::isDeclaration() const
>
> Executable module set to "/Users/buildslave/jenkins/
> workspace/perf_darwin_x86_O3flto/host-compiler/bin/clang-6.0".
> Architecture set to: x86_64-apple-macosx.
> (lldb) bt
> * thread #1: tid = 0xac2ec5, 0x000000010d76788f clang-6.0`llvm::
> ConstantFoldLoadFromConstPtr(llvm::Constant*, llvm::Type*,
> llvm::DataLayout const&) + 927, queue = 'com.apple.main-thread', stop
> reason = signal SIGSTOP
>  * frame #0: 0x000000010d76788f clang-6.0`llvm::
> ConstantFoldLoadFromConstPtr(llvm::Constant*, llvm::Type*,
> llvm::DataLayout const&) + 927
>    frame #1: 0x00007feb2db8e2d8
>    frame #2: 0x000000010e35e06f clang-6.0`llvm::LibCallSimplifier::
> optimizeMemCmp(llvm::CallInst*, llvm::IRBuilder<llvm::ConstantFolder,
> llvm::IRBuilderDefaultInserter>&) + 1119
>    frame #3: 0x000000010e365e7a clang-6.0`llvm::LibCallSimplifier::
> optimizeStringMemoryLibCall(llvm::CallInst*, llvm::IRBuilder<llvm::ConstantFolder,
> llvm::IRBuilderDefaultInserter>&) + 266
>    frame #4: 0x000000010e3670b8 clang-6.0`llvm::LibCallSimplifier::optimizeCall(llvm::CallInst*)
> + 1080
>    frame #5: 0x000000010de61193 clang-6.0`llvm::InstCombiner::
> tryOptimizeCall(llvm::CallInst*) + 99
>    frame #6: 0x000000010de5c759 clang-6.0`llvm::InstCombiner::visitCallSite(llvm::CallSite)
> + 2761
>    frame #7: 0x000000010de57aa6 clang-6.0`llvm::InstCombiner::visitCallInst(llvm::CallInst&)
> + 55574
>    frame #8: 0x000000010de1ef16 clang-6.0`llvm::InstCombiner::run() + 3222
>    frame #9: 0x000000010de20be9 clang-6.0`combineInstructionsOverFunction(llvm::Function&,
> llvm::InstCombineWorklist&, llvm::AAResults*, llvm::AssumptionCache&,
> llvm::TargetLibraryInfo&, llvm::DominatorTree&, llvm::OptimizationRemarkEmitter&,
> bool, llvm::LoopInfo*) + 4201
>    frame #10: 0x000000010de21207 clang-6.0`llvm::
> InstructionCombiningPass::runOnFunction(llvm::Function&) + 631
>    frame #11: 0x000000010dcfa353 clang-6.0`llvm::FPPassManager:
> :runOnFunction(llvm::Function&) + 547
>    frame #12: 0x000000010dcfa5b3 clang-6.0`llvm::FPPassManager::runOnModule(llvm::Module&)
> + 51
>    frame #13: 0x000000010dcfaa3e clang-6.0`llvm::legacy::
> PassManagerImpl::run(llvm::Module&) + 766
>    frame #14: 0x000000010e49e2b4 clang-6.0`clang::
> EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions
> const&, clang::CodeGenOptions const&, clang::TargetOptions const&,
> clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*,
> clang::BackendAction, std::__1::unique_ptr<llvm::raw_pwrite_stream,
> std::__1::default_delete<llvm::raw_pwrite_stream> >) + 15172
>    frame #15: 0x000000010e6d5ea3 clang-6.0`clang::BackendConsumer::
> HandleTranslationUnit(clang::ASTContext&) + 947
>    frame #16: 0x000000010f0ad595 clang-6.0`clang::ParseAST(clang::Sema&,
> bool, bool) + 469
>    frame #17: 0x000000010e944f2c clang-6.0`clang::FrontendAction::Execute()
> + 76
>    frame #18: 0x000000010e901021 clang-6.0`clang::CompilerInstance::
> ExecuteAction(clang::FrontendAction&) + 1217
>    frame #19: 0x000000010e9a9695 clang-6.0`clang::
> ExecuteCompilerInvocation(clang::CompilerInstance*) + 4949
>    frame #20: 0x000000010cad1492 clang-6.0`cc1_main(llvm::ArrayRef<char
> const*>, char const*, void*) + 1394
>    frame #21: 0x000000010cacfb43 clang-6.0`main + 11939
>    frame #22: 0x00007fff903cc5ad libdyld.dylib`start + 1
>    frame #23: 0x00007fff903cc5ad libdyld.dylib`start + 1
>
> On Aug 21, 2017, at 6:55 AM, Sanjay Patel via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
>
> Author: spatel
> Date: Mon Aug 21 06:55:49 2017
> New Revision: 311333
>
> URL: http://llvm.org/viewvc/llvm-project?rev=311333&view=rev
> Log:
> [LibCallSimplifier] try harder to fold memcmp with constant arguments
>
> Try to fold:
> memcmp(X, C, ConstantLength) == 0 --> load X == *C
>
> Without this change, we're unnecessarily checking the alignment of the
> constant data,
> so we miss the transform in the first 2 tests in the patch.
>
> I noted this shortcoming of LibCallSimpifier in one of the recent CGP
> memcmp expansion
> patches. This doesn't help the example in:
> https://bugs.llvm.org/show_bug.cgi?id=34032#c13
> ...directly, but it's worth short-circuiting more of these simple cases
> since we're
> already trying to do that.
>
> The benefit of transforming to load+cmp is that existing IR
> analysis/transforms may
> further simplify that code. For example, if the load of the variable is
> common to
> multiple memcmp calls, CSE can remove the duplicate instructions.
>
> Differential Revision: https://reviews.llvm.org/D36922
>
> Added:
>   llvm/trunk/test/Transforms/InstCombine/memcmp-constant-fold.ll
> Modified:
>   llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp
>
> Modified: llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/
> SimplifyLibCalls.cpp?rev=311333&r1=311332&r2=311333&view=diff
> ============================================================
> ==================
> --- llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp (original)
> +++ llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp Mon Aug 21
> 06:55:49 2017
> @@ -18,6 +18,7 @@
> #include "llvm/ADT/SmallString.h"
> #include "llvm/ADT/StringMap.h"
> #include "llvm/ADT/Triple.h"
> +#include "llvm/Analysis/ConstantFolding.h"
> #include "llvm/Analysis/OptimizationDiagnosticInfo.h"
> #include "llvm/Analysis/TargetLibraryInfo.h"
> #include "llvm/Analysis/ValueTracking.h"
> @@ -751,29 +752,44 @@ Value *LibCallSimplifier::optimizeMemCmp
>  }
>
>  // memcmp(S1,S2,N/8)==0 -> (*(intN_t*)S1 != *(intN_t*)S2)==0
> +  // TODO: The case where both inputs are constants does not need to be
> limited
> +  // to legal integers or equality comparison. See block below this.
>  if (DL.isLegalInteger(Len * 8) && isOnlyUsedInZeroEqualityComparison(CI))
> {
> -
>    IntegerType *IntType = IntegerType::get(CI->getContext(), Len * 8);
>    unsigned PrefAlignment = DL.getPrefTypeAlignment(IntType);
>
> -    if (getKnownAlignment(LHS, DL, CI) >= PrefAlignment &&
> -        getKnownAlignment(RHS, DL, CI) >= PrefAlignment) {
> +    // First, see if we can fold either argument to a constant.
> +    Value *LHSV = nullptr;
> +    if (auto *LHSC = dyn_cast<Constant>(LHS)) {
> +      LHSC = ConstantExpr::getBitCast(LHSC, IntType->getPointerTo());
> +      LHSV = ConstantFoldLoadFromConstPtr(LHSC, IntType, DL);
> +    }
> +    Value *RHSV = nullptr;
> +    if (auto *RHSC = dyn_cast<Constant>(RHS)) {
> +      RHSC = ConstantExpr::getBitCast(RHSC, IntType->getPointerTo());
> +      RHSV = ConstantFoldLoadFromConstPtr(RHSC, IntType, DL);
> +    }
>
> +    // Don't generate unaligned loads. If either source is constant data,
> +    // alignment doesn't matter for that source because there is no load.
> +    if (!LHSV && getKnownAlignment(LHS, DL, CI) >= PrefAlignment) {
>      Type *LHSPtrTy =
>          IntType->getPointerTo(LHS->getType()->getPointerAddressSpace());
> +      LHSV = B.CreateLoad(B.CreateBitCast(LHS, LHSPtrTy), "lhsv");
> +    }
> +
> +    if (!RHSV && getKnownAlignment(RHS, DL, CI) >= PrefAlignment) {
>      Type *RHSPtrTy =
>          IntType->getPointerTo(RHS->getType()->getPointerAddressSpace());
> +      RHSV = B.CreateLoad(B.CreateBitCast(RHS, RHSPtrTy), "rhsv");
> +    }
>
> -      Value *LHSV =
> -          B.CreateLoad(B.CreateBitCast(LHS, LHSPtrTy, "lhsc"), "lhsv");
> -      Value *RHSV =
> -          B.CreateLoad(B.CreateBitCast(RHS, RHSPtrTy, "rhsc"), "rhsv");
> -
> +    if (LHSV && RHSV)
>      return B.CreateZExt(B.CreateICmpNE(LHSV, RHSV), CI->getType(),
> "memcmp");
> -    }
>  }
>
> -  // Constant folding: memcmp(x, y, l) -> cnst (all arguments are
> constant)
> +  // Constant folding: memcmp(x, y, Len) -> constant (all arguments are
> const).
> +  // TODO: This is limited to i8 arrays.
>  StringRef LHSStr, RHSStr;
>  if (getConstantStringInfo(LHS, LHSStr) &&
>      getConstantStringInfo(RHS, RHSStr)) {
>
> Added: llvm/trunk/test/Transforms/InstCombine/memcmp-constant-fold.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/
> Transforms/InstCombine/memcmp-constant-fold.ll?rev=311333&view=auto
> ============================================================
> ==================
> --- llvm/trunk/test/Transforms/InstCombine/memcmp-constant-fold.ll (added)
> +++ llvm/trunk/test/Transforms/InstCombine/memcmp-constant-fold.ll Mon
> Aug 21 06:55:49 2017
> @@ -0,0 +1,65 @@
> +; RUN: opt < %s -instcombine -S -data-layout=e-n32 | FileCheck %s
> --check-prefix=ALL --check-prefix=LE
> +; RUN: opt < %s -instcombine -S -data-layout=E-n32 | FileCheck %s
> --check-prefix=ALL --check-prefix=BE
> +
> +declare i32 @memcmp(i8*, i8*, i64)
> +
> +; The alignment of this constant does not matter. We constant fold the
> load.
> +
> + at charbuf = private unnamed_addr constant [4 x i8] [i8 0, i8 0, i8 0, i8
> 1], align 1
> +
> +define i1 @memcmp_4bytes_unaligned_constant_i8(i8* align 4 %x) {
> +; LE-LABEL: @memcmp_4bytes_unaligned_constant_i8(
> +; LE-NEXT:    [[TMP1:%.*]] = bitcast i8* %x to i32*
> +; LE-NEXT:    [[LHSV:%.*]] = load i32, i32* [[TMP1]], align 4
> +; LE-NEXT:    [[TMP2:%.*]] = icmp eq i32 [[LHSV]], 16777216
> +; LE-NEXT:    ret i1 [[TMP2]]
> +;
> +; BE-LABEL: @memcmp_4bytes_unaligned_constant_i8(
> +; BE-NEXT:    [[TMP1:%.*]] = bitcast i8* %x to i32*
> +; BE-NEXT:    [[LHSV:%.*]] = load i32, i32* [[TMP1]], align 4
> +; BE-NEXT:    [[TMP2:%.*]] = icmp eq i32 [[LHSV]], 1
> +; BE-NEXT:    ret i1 [[TMP2]]
> +;
> +  %call = tail call i32 @memcmp(i8* %x, i8* getelementptr inbounds ([4 x
> i8], [4 x i8]* @charbuf, i64 0, i64 0), i64 4)
> +  %cmpeq0 = icmp eq i32 %call, 0
> +  ret i1 %cmpeq0
> +}
> +
> +; We still don't care about alignment of the constant. We are not limited
> to constant folding only i8 arrays.
> +; It doesn't matter if the constant operand is the first operand to the
> memcmp.
> +
> + at intbuf_unaligned = private unnamed_addr constant [4 x i16] [i16 1, i16
> 2, i16 3, i16 4], align 1
> +
> +define i1 @memcmp_4bytes_unaligned_constant_i16(i8* align 4 %x) {
> +; LE-LABEL: @memcmp_4bytes_unaligned_constant_i16(
> +; LE-NEXT:    [[TMP1:%.*]] = bitcast i8* %x to i32*
> +; LE-NEXT:    [[RHSV:%.*]] = load i32, i32* [[TMP1]], align 4
> +; LE-NEXT:    [[TMP2:%.*]] = icmp eq i32 [[RHSV]], 131073
> +; LE-NEXT:    ret i1 [[TMP2]]
> +;
> +; BE-LABEL: @memcmp_4bytes_unaligned_constant_i16(
> +; BE-NEXT:    [[TMP1:%.*]] = bitcast i8* %x to i32*
> +; BE-NEXT:    [[RHSV:%.*]] = load i32, i32* [[TMP1]], align 4
> +; BE-NEXT:    [[TMP2:%.*]] = icmp eq i32 [[RHSV]], 65538
> +; BE-NEXT:    ret i1 [[TMP2]]
> +;
> +  %call = tail call i32 @memcmp(i8* bitcast (i16* getelementptr inbounds
> ([4 x i16], [4 x i16]* @intbuf_unaligned, i64 0, i64 0) to i8*), i8* %x,
> i64 4)
> +  %cmpeq0 = icmp eq i32 %call, 0
> +  ret i1 %cmpeq0
> +}
> +
> +; TODO: Any memcmp where all arguments are constants should be constant
> folded. Currently, we only handle i8 array constants.
> +
> + at intbuf = private unnamed_addr constant [2 x i32] [i32 0, i32 1], align 4
> +
> +define i1 @memcmp_3bytes_aligned_constant_i32(i8* align 4 %x) {
> +; ALL-LABEL: @memcmp_3bytes_aligned_constant_i32(
> +; ALL-NEXT:    [[CALL:%.*]] = tail call i32 @memcmp(i8* bitcast (i32*
> getelementptr inbounds ([2 x i32], [2 x i32]* @intbuf, i64 0, i64 1) to
> i8*), i8* bitcast ([2 x i32]* @intbuf to i8*), i64 3)
> +; ALL-NEXT:    [[CMPEQ0:%.*]] = icmp eq i32 [[CALL]], 0
> +; ALL-NEXT:    ret i1 [[CMPEQ0]]
> +;
> +  %call = tail call i32 @memcmp(i8* bitcast (i32* getelementptr inbounds
> ([2 x i32], [2 x i32]* @intbuf, i64 0, i64 1) to i8*), i8* bitcast (i32*
> getelementptr inbounds ([2 x i32], [2 x i32]* @intbuf, i64 0, i64 0) to
> i8*), i64 3)
> +  %cmpeq0 = icmp eq i32 %call, 0
> +  ret i1 %cmpeq0
> +}
> +
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170821/46794442/attachment.html>


More information about the llvm-commits mailing list