[llvm] r314435 - [JumpThreading] Preserve DT and LVI across the pass

Michael Zolotukhin via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 12 10:16:46 PDT 2017


> On Oct 11, 2017, at 4:20 PM, Jakub (Kuba) Kuderski <kubakuderski at gmail.com> wrote:
> 
> Does fullLTO use a different pass manager configuration, or where does the difference come from?
Yes, LTO uses a different pass pipeline. Please see lib/Transforms/IPO/PassManagerBuilder.cpp for details. The difference might come from different inlining decisions, for example.

Michael

> 
> On Wed, Oct 11, 2017 at 5:53 PM, Michael Zolotukhin <mzolotukhin at apple.com <mailto:mzolotukhin at apple.com>> wrote:
> 
>> On Oct 11, 2017, at 2:40 PM, Brian M. Rzycki <brzycki at gmail.com <mailto:brzycki at gmail.com>> wrote:
>> 
>> I did more testing on the DT/LVI patch for JumpThreading and I am unable to reproduce the 10% sqlite3 regression Michael Zolotukhin reported.
> Could you please try building full sqlite3 test with -flto? The 10% regression was detected on an O3-LTO bot, while in my example (just O3) I also saw just 5%. That’s what you’re probably seeing now (plus-minus differences in our configurations and noise).
> 
> Thanks for the testing!
> 
> Michael
> 
>> 
>> There are two compilers: upstream and dt. The upstream compiler has the
>> following patch as its HEAD:
>> $ git log HEAD
>> commit c491ccc7bbddf8c04885fd3ab90f90cc9c319d33 (HEAD -> master)
>> Author: Ayal Zaks <ayal.zaks at intel.com <mailto:ayal.zaks at intel.com>>
>> Date:   Thu Oct 5 15:45:14 2017 +0000
>> 
>>     [LV] Fix PR34743 - handle casts that sink after interleaved loads
>> 
>>     When ignoring a load that participates in an interleaved group, make
>>         sure to move a cast that needs to sink after it.
>> 
>>     Testcase derived from reproducer of PR34743.
>> 
>>     Differential Revision: https://reviews.llvm.org/D38338 <https://reviews.llvm.org/D38338>
>> 
>> The dt compiler is based on the exact same patch as upstream, only with
>> https://reviews.llvm.org/D38558 <https://reviews.llvm.org/D38558> applied.
>> 
>> The sqlite version:
>> $ cat sqlite/VERSION
>> 3.20.1
>> 
>> The host is an X86 server-class CPU:
>> processor       : 30
>> vendor_id       : GenuineIntel
>> cpu family      : 6
>> model           : 79
>> model name      : Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz
>> stepping        : 1
>> 
>> Here are the compiler runs:
>> 
>> # upstream compiler, best of 3 runs
>> $ time taskset -c 30 /work/brzycki/upstream/install/bin/clang  -DNDEBUG -O3 -DSTDC_HEADERS=1  -DSQLITE_OMIT_LOAD_EXTENSION=1 -DSQLITE_THREADSAFE=0 -o sqlite3.c.o -c sqlite3.c
>>  
>> real    7m57.157s
>> user    7m56.472s
>> sys     0m0.560s
>>  
>> # dt patched compiler, best of 3 runs
>> $ time taskset -c 30 /work/brzycki/dt/install/bin/clang  -DNDEBUG -O3 -DSTDC_HEADERS=1  -DSQLITE_OMIT_LOAD_EXTENSION=1 -DSQLITE_THREADSAFE=0 -o sqlite3.c.o -c sqlite3.c
>>  
>> real    8m14.753s
>> user    8m13.932s
>> sys     0m0.496s
>>  
>> That’s a difference of 4.3%. I'm unsure why Arm64 is showing a further regression of 6%. I've also done some native Aarch64 Linux testing on a Firefly ARM72 device and see a difference very close to this number. If anyone is interested I can post this data too.
>> 
>> I plan on re-running these tests when https://reviews.llvm.org/D38802 <https://reviews.llvm.org/D38802> lands to see if it further reduces the overhead or not.
>> 
>> On Wed, Oct 4, 2017 at 3:12 PM, Brian M. Rzycki <brzycki at gmail.com <mailto:brzycki at gmail.com>> wrote:
>> I have just pushed two patches into Phabricator:
>> Take two of the DT/LVI patch:
>> https://reviews.llvm.org/D38558 <https://reviews.llvm.org/D38558>
>> 
>> A patch to place DT/LVI preservation under a flag:
>> 
>> https://reviews.llvm.org/D38559 <https://reviews.llvm.org/D38559>
>> 
>> Jakub, I too am planning on doing some profiling later this week. I would like to collaborate to minimize duplicated efforts.
>> 
>> 
>> On Wed, Oct 4, 2017 at 12:01 PM, Jakub (Kuba) Kuderski via llvm-commits <llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>> wrote:
>> I will try to profile the sqlite compilation later this week. It is not easy to come up with some benchmarks for the batch updater, and because of that it hasn't really been optimized yet. JumpThreading seems to be the first pass to put significant pressure on the incremental updater, so it'd be valuable to learn how much faster we can get there.
>> 
>> On Wed, Oct 4, 2017 at 12:47 PM, Michael Zolotukhin <mzolotukhin at apple.com <mailto:mzolotukhin at apple.com>> wrote:
>> 
>>> On Oct 4, 2017, at 9:25 AM, Daniel Berlin <dberlin at dberlin.org <mailto:dberlin at dberlin.org>> wrote:
>>> 
>>> +jakub for real.
>>> 
>>> On Wed, Oct 4, 2017 at 9:21 AM, Daniel Berlin <dberlin at dberlin.org <mailto:dberlin at dberlin.org>> wrote:
>>> +jakub
>>> 
>>> I don't think preservation should be controlled by a flag.  That seems like not a great path forward.
>>> Either we should make it fast enough, or we should give up :)
>>> 
>>> Do you have any profiles on this?
>> I don’t, but it seems pretty clear what’s going on there - we do an extra work (preserving DT) for no benefits (we’re recomputing it a couple of passes later anyway). The pass after which it’s recomputed is SimplifyCFG, and I think it would be a much bigger endeavor to teach it to account for DT - so putting it under a flag for JumpThreading could help doing this work incrementally.
>> 
>> Michael
>> 
>>> 
>>> 
>>> 
>>> On Sat, Sep 30, 2017 at 12:19 PM, Michael Zolotukhin via llvm-commits <llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>> wrote:
>>> For the record, we also detected significant compile-time increases caused by this patch. It caused up to 10% slow downs on sqlite3 and some other tests from CTMark/LLVM testsuite (on O3/LTO and other optsets).
>>> 
>>> Reproducer:
>>> 
>>> time clang-r314435/bin/clang  -DNDEBUG -O3 -arch arm64 -isysroot $SYSROOT -DSTDC_HEADERS=1  -DSQLITE_OMIT_LOAD_EXTENSION=1 -DSQLITE_THREADSAFE=0  -w -o sqlite3.c.o -c $TESTSUITE/CTMark/sqlite3/sqlite3.c
>>> real    0m20.235s
>>> user    0m20.130s
>>> sys     0m0.091s
>>> 
>>> time clang-r314432/bin/clang  -DNDEBUG -O3 -arch arm64 -isysroot $SYSROOT -DSTDC_HEADERS=1  -DSQLITE_OMIT_LOAD_EXTENSION=1 -DSQLITE_THREADSAFE=0  -w -o sqlite3.c.o -c $TESTSUITE/CTMark/sqlite3/sqlite3.c
>>> real    0m19.254s
>>> user    0m19.154s
>>> sys     0m0.094s
>>> 
>>> JumpThreading started to consume much longer time trying to preserve DT, but DT was still being recomputed couple of passes later. In general, I agree that we should preserve DT and in the end it should be beneficial for compile time too, but can we put it under a flag for now until we can actually benefit from it?
>>> 
>>> Thanks,
>>> Michael
>>> 
>>> 
>>>> On Sep 30, 2017, at 5:02 AM, Daniel Jasper via llvm-commits <llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>> wrote:
>>>> 
>>>> I found other places where this Clang was running into this segfault. Reverted for now in r314589.
>>>> 
>>>> On Fri, Sep 29, 2017 at 10:19 PM, Friedman, Eli via llvm-commits <llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>> wrote:
>>>> On 9/28/2017 10:24 AM, Evandro Menezes via llvm-commits wrote:
>>>>> Author: evandro
>>>>> Date: Thu Sep 28 10:24:40 2017
>>>>> New Revision: 314435
>>>>> 
>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=314435&view=rev <http://llvm.org/viewvc/llvm-project?rev=314435&view=rev>
>>>>> Log:
>>>>> [JumpThreading] Preserve DT and LVI across the pass
>>>>> 
>>>>> JumpThreading now preserves dominance and lazy value information across the
>>>>> entire pass.  The pass manager is also informed of this preservation with
>>>>> the goal of DT and LVI being recalculated fewer times overall during
>>>>> compilation.
>>>>> 
>>>>> This change prepares JumpThreading for enhanced opportunities; particularly
>>>>> those across loop boundaries.
>>>>> 
>>>>> Patch by: Brian Rzycki <b.rzycki at samsung.com> <mailto:b.rzycki at samsung.com>,
>>>>>           Sebastian Pop <s.pop at samsung.com> <mailto:s.pop at samsung.com>
>>>>> 
>>>>> Differential revision: https://reviews.llvm.org/D37528 <https://reviews.llvm.org/D37528>
>>>> 
>>>> It looks like this is causing an assertion failure on the polly aosp buildbot (http://lab.llvm.org:8011/builders/aosp-O3-polly-before-vectorizer-unprofitable/builds/266/steps/build-aosp/logs/stdio <http://lab.llvm.org:8011/builders/aosp-O3-polly-before-vectorizer-unprofitable/builds/266/steps/build-aosp/logs/stdio>):
>>>> clang: /var/lib/buildbot/slaves/hexagon-build-03/aosp/llvm.src/include/llvm/Support/GenericDomTreeConstruction.h:906: static void llvm::DomTreeBuilder::SemiNCAInfo<llvm::DominatorTreeBase<llvm::BasicBlock, false> >::DeleteEdge(DomTreeT &, const BatchUpdatePtr, const NodePtr, const NodePtr) [DomTreeT = llvm::DominatorTreeBase<llvm::BasicBlock, false>]: Assertion `!IsSuccessor(To, From) && "Deleted edge still exists in the CFG!"' failed.
>>>> #0 0x00000000014f6fc4 PrintStackTraceSignalHandler(void*) (llvm.inst/bin/clang+0x14f6fc4)
>>>> #1 0x00000000014f72d6 SignalHandler(int) (llvm.inst/bin/clang+0x14f72d6)
>>>> #2 0x00007f694afddd10 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x10d10)
>>>> #3 0x00007f6949bbb267 gsignal /build/buildd/glibc-2.21/signal/../sysdeps/unix/sysv/linux/raise.c:55:0
>>>> #4 0x00007f6949bbceca abort /build/buildd/glibc-2.21/stdlib/abort.c:91:0
>>>> #5 0x00007f6949bb403d __assert_fail_base /build/buildd/glibc-2.21/assert/assert.c:92:0
>>>> #6 0x00007f6949bb40f2 (/lib/x86_64-linux-gnu/libc.so.6+0x2e0f2)
>>>> #7 0x000000000104a1dc (llvm.inst/bin/clang+0x104a1dc)
>>>> #8 0x0000000001357fd9 llvm::JumpThreadingPass::ProcessBlock(llvm::BasicBlock*) (llvm.inst/bin/clang+0x1357fd9)
>>>> #9 0x0000000001356e6e llvm::JumpThreadingPass::runImpl(llvm::Function&, llvm::TargetLibraryInfo*, llvm::LazyValueInfo*, llvm::AAResults*, llvm::DominatorTree*, bool, std::unique_ptr<llvm::BlockFrequencyInfo, std::default_delete<llvm::BlockFrequencyInfo> >, std::unique_ptr<llvm::BranchProbabilityInfo, std::default_delete<llvm::BranchProbabilityInfo> >) (llvm.inst/bin/clang+0x1356e6e)
>>>> #10 0x0000000001363c49 (anonymous namespace)::JumpThreading::runOnFunction(llvm::Function&) (llvm.inst/bin/clang+0x1363c49)
>>>> #11 0x00000000010a53c4 llvm::FPPassManager::runOnFunction(llvm::Function&) (llvm.inst/bin/clang+0x10a53c4)
>>>> #12 0x00000000031f7376 (anonymous namespace)::CGPassManager::runOnModule(llvm::Module&) (llvm.inst/bin/clang+0x31f7376)
>>>> #13 0x00000000010a5b05 llvm::legacy::PassManagerImpl::run(llvm::Module&) (llvm.inst/bin/clang+0x10a5b05)
>>>> #14 0x0000000001688beb clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) (llvm.inst/bin/clang+0x1688beb)
>>>> #15 0x0000000001ed7672 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (llvm.inst/bin/clang+0x1ed7672)
>>>> #16 0x0000000002195235 clang::ParseAST(clang::Sema&, bool, bool) (llvm.inst/bin/clang+0x2195235)
>>>> #17 0x0000000001a75628 clang::FrontendAction::Execute() (llvm.inst/bin/clang+0x1a75628)
>>>> #18 0x0000000001a37181 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (llvm.inst/bin/clang+0x1a37181)
>>>> #19 0x0000000001b06501 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (llvm.inst/bin/clang+0x1b06501)
>>>> #20 0x000000000080e433 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (llvm.inst/bin/clang+0x80e433)
>>>> #21 0x000000000080c997 main (llvm.inst/bin/clang+0x80c997)
>>>> #22 0x00007f6949ba6a40 __libc_start_main /build/buildd/glibc-2.21/csu/libc-start.c:323:0
>>>> #23 0x0000000000809479 _start (llvm.inst/bin/clang+0x809479)
>>>> I'll try to come up with a reduced testcase this afternoon.
>>>> 
>>>> -Eli
>>>> -- 
>>>> Employee of Qualcomm Innovation Center, Inc.
>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>> 
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits>
>>>> 
>>>> 
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits>
>>> 
>>> 
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits>
>>> 
>>> 
>>> 
>> 
>> 
>> 
>> 
>> -- 
>> Jakub Kuderski
>> 
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits>
>> 
>> 
>> 
> 
> 
> 
> 
> -- 
> Jakub Kuderski

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20171012/7a88b7d1/attachment.html>


More information about the llvm-commits mailing list