[llvm] r314435 - [JumpThreading] Preserve DT and LVI across the pass

Jakub (Kuba) Kuderski via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 11 16:20:36 PDT 2017


Does fullLTO use a different pass manager configuration, or where does the
difference come from?

On Wed, Oct 11, 2017 at 5:53 PM, Michael Zolotukhin <mzolotukhin at apple.com>
wrote:

>
> On Oct 11, 2017, at 2:40 PM, Brian M. Rzycki <brzycki at gmail.com> wrote:
>
> I did more testing on the DT/LVI patch for JumpThreading and I am unable
> to reproduce the 10% sqlite3 regression Michael Zolotukhin reported.
>
> Could you please try building full sqlite3 test with -flto? The 10%
> regression was detected on an O3-LTO bot, while in my example (just O3) I
> also saw just 5%. That’s what you’re probably seeing now (plus-minus
> differences in our configurations and noise).
>
> Thanks for the testing!
>
> Michael
>
>
> There are two compilers: upstream and dt. The upstream compiler has the
> following patch as its HEAD:
> $ git log HEAD
> commit c491ccc7bbddf8c04885fd3ab90f90cc9c319d33 (HEAD -> master)
> Author: Ayal Zaks <ayal.zaks at intel.com>
> Date:   Thu Oct 5 15:45:14 2017 +0000
>
>     [LV] Fix PR34743 - handle casts that sink after interleaved loads
>
>     When ignoring a load that participates in an interleaved group, make
>         sure to move a cast that needs to sink after it.
>
>     Testcase derived from reproducer of PR34743.
>
>     Differential Revision: https://reviews.llvm.org/D38338
>
> The dt compiler is based on the exact same patch as upstream, only with
> https://reviews.llvm.org/D38558 applied.
>
> The sqlite version:
> $ cat sqlite/VERSION
> 3.20.1
>
> The host is an X86 server-class CPU:
> processor       : 30
> vendor_id       : GenuineIntel
> cpu family      : 6
> model           : 79
> model name      : Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz
> stepping        : 1
>
> Here are the compiler runs:
>
> # upstream compiler, best of 3 runs
> $ time taskset -c 30 /work/brzycki/upstream/install/bin/clang  -DNDEBUG
> -O3 -DSTDC_HEADERS=1  -DSQLITE_OMIT_LOAD_EXTENSION=1 -DSQLITE_THREADSAFE=0
> -o sqlite3.c.o -c sqlite3.c
>
> real    7m57.157s
> user    7m56.472s
> sys     0m0.560s
>
> # dt patched compiler, best of 3 runs
> $ time taskset -c 30 /work/brzycki/dt/install/bin/clang  -DNDEBUG -O3
> -DSTDC_HEADERS=1  -DSQLITE_OMIT_LOAD_EXTENSION=1 -DSQLITE_THREADSAFE=0 -o
> sqlite3.c.o -c sqlite3.c
>
> real    8m14.753s
> user    8m13.932s
> sys     0m0.496s
>
> That’s a difference of 4.3%. I'm unsure why Arm64 is showing a further
> regression of 6%. I've also done some native Aarch64 Linux testing on a
> Firefly ARM72 device and see a difference very close to this number. If
> anyone is interested I can post this data too.
>
> I plan on re-running these tests when https://reviews.llvm.org/D38802
> lands to see if it further reduces the overhead or not.
>
> On Wed, Oct 4, 2017 at 3:12 PM, Brian M. Rzycki <brzycki at gmail.com> wrote:
>
>> I have just pushed two patches into Phabricator:
>> Take two of the DT/LVI patch:
>>
>> https://reviews.llvm.org/D38558
>>
>>
>> A patch to place DT/LVI preservation under a flag:
>>
>> https://reviews.llvm.org/D38559
>>
>>
>> Jakub, I too am planning on doing some profiling later this week. I would
>> like to collaborate to minimize duplicated efforts.
>>
>> On Wed, Oct 4, 2017 at 12:01 PM, Jakub (Kuba) Kuderski via llvm-commits <
>> llvm-commits at lists.llvm.org> wrote:
>>
>>> I will try to profile the sqlite compilation later this week. It is not
>>> easy to come up with some benchmarks for the batch updater, and because of
>>> that it hasn't really been optimized yet. JumpThreading seems to be the
>>> first pass to put significant pressure on the incremental updater, so it'd
>>> be valuable to learn how much faster we can get there.
>>>
>>> On Wed, Oct 4, 2017 at 12:47 PM, Michael Zolotukhin <
>>> mzolotukhin at apple.com> wrote:
>>>
>>>>
>>>> On Oct 4, 2017, at 9:25 AM, Daniel Berlin <dberlin at dberlin.org> wrote:
>>>>
>>>> +jakub for real.
>>>>
>>>> On Wed, Oct 4, 2017 at 9:21 AM, Daniel Berlin <dberlin at dberlin.org>
>>>> wrote:
>>>>
>>>>> +jakub
>>>>>
>>>>> I don't think preservation should be controlled by a flag.  That seems
>>>>> like not a great path forward.
>>>>> Either we should make it fast enough, or we should give up :)
>>>>>
>>>>> Do you have any profiles on this?
>>>>>
>>>> I don’t, but it seems pretty clear what’s going on there - we do an
>>>> extra work (preserving DT) for no benefits (we’re recomputing it a couple
>>>> of passes later anyway). The pass after which it’s recomputed is
>>>> SimplifyCFG, and I think it would be a much bigger endeavor to teach it to
>>>> account for DT - so putting it under a flag for JumpThreading could help
>>>> doing this work incrementally.
>>>>
>>>> Michael
>>>>
>>>>
>>>>>
>>>>>
>>>>> On Sat, Sep 30, 2017 at 12:19 PM, Michael Zolotukhin via llvm-commits
>>>>> <llvm-commits at lists.llvm.org> wrote:
>>>>>
>>>>>> For the record, we also detected significant compile-time increases
>>>>>> caused by this patch. It caused up to 10% slow downs on sqlite3 and some
>>>>>> other tests from CTMark/LLVM testsuite (on O3/LTO and other optsets).
>>>>>>
>>>>>> Reproducer:
>>>>>>
>>>>>> time clang-r314435/bin/clang  -DNDEBUG -O3 -arch arm64 -isysroot
>>>>>> $SYSROOT -DSTDC_HEADERS=1  -DSQLITE_OMIT_LOAD_EXTENSION=1
>>>>>> -DSQLITE_THREADSAFE=0  -w -o sqlite3.c.o -c $TESTSUITE/CTMark/sqlite3/sqli
>>>>>> te3.c
>>>>>> real    0m20.235s
>>>>>> user    0m20.130s
>>>>>> sys     0m0.091s
>>>>>>
>>>>>> time clang-r314432/bin/clang  -DNDEBUG -O3 -arch arm64 -isysroot
>>>>>> $SYSROOT -DSTDC_HEADERS=1  -DSQLITE_OMIT_LOAD_EXTENSION=1
>>>>>> -DSQLITE_THREADSAFE=0  -w -o sqlite3.c.o -c $TESTSUITE/CTMark/sqlite3/sqli
>>>>>> te3.c
>>>>>> real    0m19.254s
>>>>>> user    0m19.154s
>>>>>> sys     0m0.094s
>>>>>>
>>>>>> JumpThreading started to consume much longer time trying to preserve
>>>>>> DT, but DT was still being recomputed couple of passes later. In general, I
>>>>>> agree that we should preserve DT and in the end it should be beneficial for
>>>>>> compile time too, but can we put it under a flag for now until we can
>>>>>> actually benefit from it?
>>>>>>
>>>>>> Thanks,
>>>>>> Michael
>>>>>>
>>>>>>
>>>>>> On Sep 30, 2017, at 5:02 AM, Daniel Jasper via llvm-commits <
>>>>>> llvm-commits at lists.llvm.org> wrote:
>>>>>>
>>>>>> I found other places where this Clang was running into this segfault.
>>>>>> Reverted for now in r314589.
>>>>>>
>>>>>> On Fri, Sep 29, 2017 at 10:19 PM, Friedman, Eli via llvm-commits <
>>>>>> llvm-commits at lists.llvm.org> wrote:
>>>>>>
>>>>>>> On 9/28/2017 10:24 AM, Evandro Menezes via llvm-commits wrote:
>>>>>>>
>>>>>>> Author: evandro
>>>>>>> Date: Thu Sep 28 10:24:40 2017
>>>>>>> New Revision: 314435
>>>>>>>
>>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=314435&view=rev
>>>>>>> Log:
>>>>>>> [JumpThreading] Preserve DT and LVI across the pass
>>>>>>>
>>>>>>> JumpThreading now preserves dominance and lazy value information across the
>>>>>>> entire pass.  The pass manager is also informed of this preservation with
>>>>>>> the goal of DT and LVI being recalculated fewer times overall during
>>>>>>> compilation.
>>>>>>>
>>>>>>> This change prepares JumpThreading for enhanced opportunities; particularly
>>>>>>> those across loop boundaries.
>>>>>>>
>>>>>>> Patch by: Brian Rzycki <b.rzycki at samsung.com> <b.rzycki at samsung.com>,
>>>>>>>           Sebastian Pop <s.pop at samsung.com> <s.pop at samsung.com>
>>>>>>>
>>>>>>> Differential revision: https://reviews.llvm.org/D37528
>>>>>>>
>>>>>>>
>>>>>>> It looks like this is causing an assertion failure on the polly aosp
>>>>>>> buildbot (http://lab.llvm.org:8011/buil
>>>>>>> ders/aosp-O3-polly-before-vectorizer-unprofitable/builds/266
>>>>>>> /steps/build-aosp/logs/stdio):
>>>>>>>
>>>>>>> clang: /var/lib/buildbot/slaves/hexagon-build-03/aosp/llvm.src/include/llvm/Support/GenericDomTreeConstruction.h:906: static void llvm::DomTreeBuilder::SemiNCAInfo<llvm::DominatorTreeBase<llvm::BasicBlock, false> >::DeleteEdge(DomTreeT &, const BatchUpdatePtr, const NodePtr, const NodePtr) [DomTreeT = llvm::DominatorTreeBase<llvm::BasicBlock, false>]: Assertion `!IsSuccessor(To, From) && "Deleted edge still exists in the CFG!"' failed.
>>>>>>> #0 0x00000000014f6fc4 PrintStackTraceSignalHandler(void*) (llvm.inst/bin/clang+0x14f6fc4)
>>>>>>> #1 0x00000000014f72d6 SignalHandler(int) (llvm.inst/bin/clang+0x14f72d6)
>>>>>>> #2 0x00007f694afddd10 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x10d10)
>>>>>>> #3 0x00007f6949bbb267 gsignal /build/buildd/glibc-2.21/signal/../sysdeps/unix/sysv/linux/raise.c:55:0
>>>>>>> #4 0x00007f6949bbceca abort /build/buildd/glibc-2.21/stdlib/abort.c:91:0
>>>>>>> #5 0x00007f6949bb403d __assert_fail_base /build/buildd/glibc-2.21/assert/assert.c:92:0
>>>>>>> #6 0x00007f6949bb40f2 (/lib/x86_64-linux-gnu/libc.so.6+0x2e0f2)
>>>>>>> #7 0x000000000104a1dc (llvm.inst/bin/clang+0x104a1dc)
>>>>>>> #8 0x0000000001357fd9 llvm::JumpThreadingPass::ProcessBlock(llvm::BasicBlock*) (llvm.inst/bin/clang+0x1357fd9)
>>>>>>> #9 0x0000000001356e6e llvm::JumpThreadingPass::runImpl(llvm::Function&, llvm::TargetLibraryInfo*, llvm::LazyValueInfo*, llvm::AAResults*, llvm::DominatorTree*, bool, std::unique_ptr<llvm::BlockFrequencyInfo, std::default_delete<llvm::BlockFrequencyInfo> >, std::unique_ptr<llvm::BranchProbabilityInfo, std::default_delete<llvm::BranchProbabilityInfo> >) (llvm.inst/bin/clang+0x1356e6e)
>>>>>>> #10 0x0000000001363c49 (anonymous namespace)::JumpThreading::runOnFunction(llvm::Function&) (llvm.inst/bin/clang+0x1363c49)
>>>>>>> #11 0x00000000010a53c4 llvm::FPPassManager::runOnFunction(llvm::Function&) (llvm.inst/bin/clang+0x10a53c4)
>>>>>>> #12 0x00000000031f7376 (anonymous namespace)::CGPassManager::runOnModule(llvm::Module&) (llvm.inst/bin/clang+0x31f7376)
>>>>>>> #13 0x00000000010a5b05 llvm::legacy::PassManagerImpl::run(llvm::Module&) (llvm.inst/bin/clang+0x10a5b05)
>>>>>>> #14 0x0000000001688beb clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) (llvm.inst/bin/clang+0x1688beb)
>>>>>>> #15 0x0000000001ed7672 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (llvm.inst/bin/clang+0x1ed7672)
>>>>>>> #16 0x0000000002195235 clang::ParseAST(clang::Sema&, bool, bool) (llvm.inst/bin/clang+0x2195235)
>>>>>>> #17 0x0000000001a75628 clang::FrontendAction::Execute() (llvm.inst/bin/clang+0x1a75628)
>>>>>>> #18 0x0000000001a37181 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (llvm.inst/bin/clang+0x1a37181)
>>>>>>> #19 0x0000000001b06501 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (llvm.inst/bin/clang+0x1b06501)
>>>>>>> #20 0x000000000080e433 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (llvm.inst/bin/clang+0x80e433)
>>>>>>> #21 0x000000000080c997 main (llvm.inst/bin/clang+0x80c997)
>>>>>>> #22 0x00007f6949ba6a40 __libc_start_main /build/buildd/glibc-2.21/csu/libc-start.c:323:0
>>>>>>> #23 0x0000000000809479 _start (llvm.inst/bin/clang+0x809479)
>>>>>>>
>>>>>>> I'll try to come up with a reduced testcase this afternoon.
>>>>>>>
>>>>>>> -Eli
>>>>>>>
>>>>>>> --
>>>>>>> Employee of Qualcomm Innovation Center, Inc.
>>>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> llvm-commits mailing list
>>>>>>> llvm-commits at lists.llvm.org
>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>>>>
>>>>>>>
>>>>>> _______________________________________________
>>>>>> llvm-commits mailing list
>>>>>> llvm-commits at lists.llvm.org
>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> llvm-commits mailing list
>>>>>> llvm-commits at lists.llvm.org
>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Jakub Kuderski
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>
>>>
>>
>
>


-- 
Jakub Kuderski
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20171011/7ebe95c4/attachment.html>


More information about the llvm-commits mailing list