[llvm] r314435 - [JumpThreading] Preserve DT and LVI across the pass

Jakub (Kuba) Kuderski via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 11 16:24:21 PDT 2017


>
> I plan on re-running these tests when https://reviews.llvm.org/D38802 lands
> to see if it further reduces the overhead or not.

The new verifier is only run when -verify-dom-info is provided or when LLVM
is compiled with EXPENSIVE_CHECKS, so it won't make a difference here.

On Wed, Oct 11, 2017 at 7:20 PM, Jakub (Kuba) Kuderski <kubakuderski at gmail.com> wrote:

> Does fullLTO use a different pass manager configuration, or where does the
> difference come from?
>
> On Wed, Oct 11, 2017 at 5:53 PM, Michael Zolotukhin <mzolotukhin at apple.com> wrote:
>
>>
>> On Oct 11, 2017, at 2:40 PM, Brian M. Rzycki <brzycki at gmail.com> wrote:
>>
>> I did more testing on the DT/LVI patch for JumpThreading and I am unable
>> to reproduce the 10% sqlite3 regression Michael Zolotukhin reported.
>>
>> Could you please try building the full sqlite3 test with -flto? The 10%
>> regression was detected on an O3-LTO bot, while in my O3-only example I
>> saw just 5%. That’s probably what you’re seeing now (plus or minus
>> differences in our configurations and noise).
>>
>> Thanks for the testing!
>>
>> Michael
>>
>>
>> There are two compilers: upstream and dt. The upstream compiler has the
>> following patch as its HEAD:
>> $ git log HEAD
>> commit c491ccc7bbddf8c04885fd3ab90f90cc9c319d33 (HEAD -> master)
>> Author: Ayal Zaks <ayal.zaks at intel.com>
>> Date:   Thu Oct 5 15:45:14 2017 +0000
>>
>>     [LV] Fix PR34743 - handle casts that sink after interleaved loads
>>
>>     When ignoring a load that participates in an interleaved group, make
>>     sure to move a cast that needs to sink after it.
>>
>>     Testcase derived from reproducer of PR34743.
>>
>>     Differential Revision: https://reviews.llvm.org/D38338
>>
>> The dt compiler is based on the exact same patch as upstream, only with
>> https://reviews.llvm.org/D38558 applied.
>>
>> The sqlite version:
>> $ cat sqlite/VERSION
>> 3.20.1
>>
>> The host is an X86 server-class CPU:
>> processor       : 30
>> vendor_id       : GenuineIntel
>> cpu family      : 6
>> model           : 79
>> model name      : Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz
>> stepping        : 1
>>
>> Here are the compiler runs:
>>
>> # upstream compiler, best of 3 runs
>> $ time taskset -c 30 /work/brzycki/upstream/install/bin/clang  -DNDEBUG
>> -O3 -DSTDC_HEADERS=1  -DSQLITE_OMIT_LOAD_EXTENSION=1 -DSQLITE_THREADSAFE=0
>> -o sqlite3.c.o -c sqlite3.c
>>
>> real    7m57.157s
>> user    7m56.472s
>> sys     0m0.560s
>>
>> # dt patched compiler, best of 3 runs
>> $ time taskset -c 30 /work/brzycki/dt/install/bin/clang  -DNDEBUG -O3
>> -DSTDC_HEADERS=1  -DSQLITE_OMIT_LOAD_EXTENSION=1 -DSQLITE_THREADSAFE=0 -o
>> sqlite3.c.o -c sqlite3.c
>>
>> real    8m14.753s
>> user    8m13.932s
>> sys     0m0.496s
>>
>> That’s a difference of 4.3%. I'm unsure why Arm64 is showing a further
>> regression of 6%. I've also done some native Aarch64 Linux testing on a
>> Firefly ARM72 device and see a difference very close to this number. If
>> anyone is interested I can post this data too.
>>
>> I plan on re-running these tests when https://reviews.llvm.org/D38802
>> lands to see if it further reduces the overhead or not.
>>
>> On Wed, Oct 4, 2017 at 3:12 PM, Brian M. Rzycki <brzycki at gmail.com>
>> wrote:
>>
>>> I have just pushed two patches into Phabricator:
>>> Take two of the DT/LVI patch:
>>>
>>> https://reviews.llvm.org/D38558
>>>
>>>
>>> A patch to place DT/LVI preservation under a flag:
>>>
>>> https://reviews.llvm.org/D38559
>>>
>>>
>>> Jakub, I too am planning on doing some profiling later this week. I
>>> would like to collaborate so we can avoid duplicating effort.
>>>
>>> On Wed, Oct 4, 2017 at 12:01 PM, Jakub (Kuba) Kuderski via llvm-commits
>>> <llvm-commits at lists.llvm.org> wrote:
>>>
>>>> I will try to profile the sqlite compilation later this week. It is not
>>>> easy to come up with some benchmarks for the batch updater, and because of
>>>> that it hasn't really been optimized yet. JumpThreading seems to be the
>>>> first pass to put significant pressure on the incremental updater, so it'd
>>>> be valuable to learn how much faster we can get there.
>>>>
>>>> On Wed, Oct 4, 2017 at 12:47 PM, Michael Zolotukhin <mzolotukhin at apple.com> wrote:
>>>>
>>>>>
>>>>> On Oct 4, 2017, at 9:25 AM, Daniel Berlin <dberlin at dberlin.org> wrote:
>>>>>
>>>>> +jakub for real.
>>>>>
>>>>> On Wed, Oct 4, 2017 at 9:21 AM, Daniel Berlin <dberlin at dberlin.org>
>>>>> wrote:
>>>>>
>>>>>> +jakub
>>>>>>
>>>>>> I don't think preservation should be controlled by a flag. That
>>>>>> doesn't seem like a great path forward.
>>>>>> Either we should make it fast enough, or we should give up :)
>>>>>>
>>>>>> Do you have any profiles on this?
>>>>>>
>>>>> I don’t, but it seems pretty clear what’s going on there - we do extra
>>>>> work (preserving DT) for no benefit (we’re recomputing it a couple
>>>>> of passes later anyway). The pass after which it’s recomputed is
>>>>> SimplifyCFG, and I think it would be a much bigger endeavor to teach it to
>>>>> account for DT - so putting it under a flag for JumpThreading could help
>>>>> us do this work incrementally.
>>>>>
>>>>> Michael
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, Sep 30, 2017 at 12:19 PM, Michael Zolotukhin via llvm-commits
>>>>>> <llvm-commits at lists.llvm.org> wrote:
>>>>>>
>>>>>>> For the record, we also detected significant compile-time increases
>>>>>>> caused by this patch. It caused up to 10% slowdowns on sqlite3 and some
>>>>>>> other tests from the CTMark/LLVM test suite (on O3/LTO and other optsets).
>>>>>>>
>>>>>>> Reproducer:
>>>>>>>
>>>>>>> time clang-r314435/bin/clang  -DNDEBUG -O3 -arch arm64 -isysroot
>>>>>>> $SYSROOT -DSTDC_HEADERS=1  -DSQLITE_OMIT_LOAD_EXTENSION=1
>>>>>>> -DSQLITE_THREADSAFE=0  -w -o sqlite3.c.o -c $TESTSUITE/CTMark/sqlite3/sqlite3.c
>>>>>>> real    0m20.235s
>>>>>>> user    0m20.130s
>>>>>>> sys     0m0.091s
>>>>>>>
>>>>>>> time clang-r314432/bin/clang  -DNDEBUG -O3 -arch arm64 -isysroot
>>>>>>> $SYSROOT -DSTDC_HEADERS=1  -DSQLITE_OMIT_LOAD_EXTENSION=1
>>>>>>> -DSQLITE_THREADSAFE=0  -w -o sqlite3.c.o -c $TESTSUITE/CTMark/sqlite3/sqlite3.c
>>>>>>> real    0m19.254s
>>>>>>> user    0m19.154s
>>>>>>> sys     0m0.094s
>>>>>>>
>>>>>>> JumpThreading started to consume much more time trying to preserve
>>>>>>> DT, but DT was still being recomputed a couple of passes later. In general, I
>>>>>>> agree that we should preserve DT, and in the end it should be beneficial for
>>>>>>> compile time too, but can we put it under a flag for now, until we can
>>>>>>> actually benefit from it?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Michael
>>>>>>>
>>>>>>>
>>>>>>> On Sep 30, 2017, at 5:02 AM, Daniel Jasper via llvm-commits <
>>>>>>> llvm-commits at lists.llvm.org> wrote:
>>>>>>>
>>>>>>> I found other places where this Clang build was running into the same
>>>>>>> segfault. Reverted for now in r314589.
>>>>>>>
>>>>>>> On Fri, Sep 29, 2017 at 10:19 PM, Friedman, Eli via llvm-commits <
>>>>>>> llvm-commits at lists.llvm.org> wrote:
>>>>>>>
>>>>>>>> On 9/28/2017 10:24 AM, Evandro Menezes via llvm-commits wrote:
>>>>>>>>
>>>>>>>> Author: evandro
>>>>>>>> Date: Thu Sep 28 10:24:40 2017
>>>>>>>> New Revision: 314435
>>>>>>>>
>>>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=314435&view=rev
>>>>>>>> Log:
>>>>>>>> [JumpThreading] Preserve DT and LVI across the pass
>>>>>>>>
>>>>>>>> JumpThreading now preserves dominance and lazy value information across the
>>>>>>>> entire pass.  The pass manager is also informed of this preservation with
>>>>>>>> the goal of DT and LVI being recalculated fewer times overall during
>>>>>>>> compilation.
>>>>>>>>
>>>>>>>> This change prepares JumpThreading for enhanced opportunities; particularly
>>>>>>>> those across loop boundaries.
>>>>>>>>
>>>>>>>> Patch by: Brian Rzycki <b.rzycki at samsung.com>,
>>>>>>>>           Sebastian Pop <s.pop at samsung.com>
>>>>>>>>
>>>>>>>> Differential revision: https://reviews.llvm.org/D37528
>>>>>>>>
>>>>>>>>
>>>>>>>> It looks like this is causing an assertion failure on the polly
>>>>>>>> aosp buildbot (http://lab.llvm.org:8011/builders/aosp-O3-polly-before-vectorizer-unprofitable/builds/266/steps/build-aosp/logs/stdio):
>>>>>>>>
>>>>>>>> clang: /var/lib/buildbot/slaves/hexagon-build-03/aosp/llvm.src/include/llvm/Support/GenericDomTreeConstruction.h:906: static void llvm::DomTreeBuilder::SemiNCAInfo<llvm::DominatorTreeBase<llvm::BasicBlock, false> >::DeleteEdge(DomTreeT &, const BatchUpdatePtr, const NodePtr, const NodePtr) [DomTreeT = llvm::DominatorTreeBase<llvm::BasicBlock, false>]: Assertion `!IsSuccessor(To, From) && "Deleted edge still exists in the CFG!"' failed.
>>>>>>>> #0 0x00000000014f6fc4 PrintStackTraceSignalHandler(void*) (llvm.inst/bin/clang+0x14f6fc4)
>>>>>>>> #1 0x00000000014f72d6 SignalHandler(int) (llvm.inst/bin/clang+0x14f72d6)
>>>>>>>> #2 0x00007f694afddd10 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x10d10)
>>>>>>>> #3 0x00007f6949bbb267 gsignal /build/buildd/glibc-2.21/signal/../sysdeps/unix/sysv/linux/raise.c:55:0
>>>>>>>> #4 0x00007f6949bbceca abort /build/buildd/glibc-2.21/stdlib/abort.c:91:0
>>>>>>>> #5 0x00007f6949bb403d __assert_fail_base /build/buildd/glibc-2.21/assert/assert.c:92:0
>>>>>>>> #6 0x00007f6949bb40f2 (/lib/x86_64-linux-gnu/libc.so.6+0x2e0f2)
>>>>>>>> #7 0x000000000104a1dc (llvm.inst/bin/clang+0x104a1dc)
>>>>>>>> #8 0x0000000001357fd9 llvm::JumpThreadingPass::ProcessBlock(llvm::BasicBlock*) (llvm.inst/bin/clang+0x1357fd9)
>>>>>>>> #9 0x0000000001356e6e llvm::JumpThreadingPass::runImpl(llvm::Function&, llvm::TargetLibraryInfo*, llvm::LazyValueInfo*, llvm::AAResults*, llvm::DominatorTree*, bool, std::unique_ptr<llvm::BlockFrequencyInfo, std::default_delete<llvm::BlockFrequencyInfo> >, std::unique_ptr<llvm::BranchProbabilityInfo, std::default_delete<llvm::BranchProbabilityInfo> >) (llvm.inst/bin/clang+0x1356e6e)
>>>>>>>> #10 0x0000000001363c49 (anonymous namespace)::JumpThreading::runOnFunction(llvm::Function&) (llvm.inst/bin/clang+0x1363c49)
>>>>>>>> #11 0x00000000010a53c4 llvm::FPPassManager::runOnFunction(llvm::Function&) (llvm.inst/bin/clang+0x10a53c4)
>>>>>>>> #12 0x00000000031f7376 (anonymous namespace)::CGPassManager::runOnModule(llvm::Module&) (llvm.inst/bin/clang+0x31f7376)
>>>>>>>> #13 0x00000000010a5b05 llvm::legacy::PassManagerImpl::run(llvm::Module&) (llvm.inst/bin/clang+0x10a5b05)
>>>>>>>> #14 0x0000000001688beb clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) (llvm.inst/bin/clang+0x1688beb)
>>>>>>>> #15 0x0000000001ed7672 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (llvm.inst/bin/clang+0x1ed7672)
>>>>>>>> #16 0x0000000002195235 clang::ParseAST(clang::Sema&, bool, bool) (llvm.inst/bin/clang+0x2195235)
>>>>>>>> #17 0x0000000001a75628 clang::FrontendAction::Execute() (llvm.inst/bin/clang+0x1a75628)
>>>>>>>> #18 0x0000000001a37181 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (llvm.inst/bin/clang+0x1a37181)
>>>>>>>> #19 0x0000000001b06501 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (llvm.inst/bin/clang+0x1b06501)
>>>>>>>> #20 0x000000000080e433 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (llvm.inst/bin/clang+0x80e433)
>>>>>>>> #21 0x000000000080c997 main (llvm.inst/bin/clang+0x80c997)
>>>>>>>> #22 0x00007f6949ba6a40 __libc_start_main /build/buildd/glibc-2.21/csu/libc-start.c:323:0
>>>>>>>> #23 0x0000000000809479 _start (llvm.inst/bin/clang+0x809479)
>>>>>>>>
>>>>>>>> I'll try to come up with a reduced testcase this afternoon.
>>>>>>>>
>>>>>>>> -Eli
>>>>>>>>
>>>>>>>> --
>>>>>>>> Employee of Qualcomm Innovation Center, Inc.
>>>>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> llvm-commits mailing list
>>>>>>>> llvm-commits at lists.llvm.org
>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Jakub Kuderski
>>>>
>>>>
>>>>
>>>
>>
>>
>
>
> --
> Jakub Kuderski
>



-- 
Jakub Kuderski

