[llvm-dev] RFC: New support for triaging optimization-related failures in front ends

Tue Apr 5 07:35:47 PDT 2016

Hi,

I'm late to the party on this one due to having been off on my holidays but
I wanted to add an enthusiastic +1 to this idea.  I presented our SNC
compiler's similar max_opts approach at EuroLLVM 2014 (slide 38 onwards in
http://llvm.org/devmtg/2014-04/PDFs/Talks/GBedwell_PS4CPUToolchain_EuroLLVM2014_distribution.pdf
) so I'm very excited to see this.   In that compiler we had it implemented
per-transformation so a single difference between max_opts levels would
often only relate to a single line of IR changing.  The ability to take a
1000+ source file game and reduce any difference in behaviour to a single
line of IR changing somewhere in an automated (or mostly automated) manner
was absolutely invaluable (in fact, the only disadvantage I can think of is
that it made triaging optimizer bugs so trivially easy that we stopped
needing to use our debugger to investigate them and lost out on a lot of
dogfooding of debug data as a result).

I think this would also be of tremendous benefit in our triage of LTO
issues so I'd like to see the option exposed there too.

We paired up the option with our own 'autochop' harness which would build
the project we were triaging in two configurations (one with no transforms
allowed and one with unlimited transforms allowed) which would then bisect
at link time based on mixing the two sets of object files, until it found
the first one to make some difference in observable behaviour followed by
rebuilding that file with different values of max_opts to bisect to the
first bad transformation (determined either by parsing console output or,
more often in our case, asking the user 'is what appears on the screen
correct?').
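
To make the first stage concrete, here is a minimal sketch of the object-file
bisection. It is not the autochop tool itself: `first_bad_object` and the
`behaves_correctly` callback are hypothetical names, and the sketch assumes a
single culprit file, so that the verdict flips only once as more optimized
objects are mixed in.

```python
from typing import Callable, FrozenSet, Sequence


def first_bad_object(names: Sequence[str],
                     behaves_correctly: Callable[[FrozenSet[str]], bool]) -> str:
    """Return the object file whose optimized build first changes behaviour.

    `behaves_correctly(optimized)` is a hypothetical callback: it links the
    optimized objects for the names in `optimized` with the unoptimized
    objects for everything else, runs the result and reports the verdict.
    """
    if behaves_correctly(frozenset(names)):
        raise ValueError("the fully optimized build does not fail")
    if not behaves_correctly(frozenset()):
        raise ValueError("the unoptimized build already fails")

    # Invariant: optimizing names[:lo] is fine, optimizing names[:hi] is not.
    lo, hi = 0, len(names)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if behaves_correctly(frozenset(names[:mid])):
            lo = mid
        else:
            hi = mid
    return names[hi - 1]
```

The second stage would then run the same kind of search over max_opts values
for the file returned here.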

Just a thought, but I'm wondering whether once this change lands it would
be worth also having some version of that autochop tool available in the
open source.

Thanks

-Greg

On 28 March 2016 at 18:36, Kaylor, Andrew via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> I agree that the more fine grained this becomes the more useful it can be.
>
>
>
> I’ve updated my prototype to use a single number approach.  I’m going to
> clean this up and post a review in the next day or two.
>
>
>
> -Andy
>
>
>
> *From:* llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] *On Behalf Of *Pete
> Cooper via llvm-dev
> *Sent:* Friday, March 25, 2016 10:22 PM
> *To:* Matthias Braun <matze at braunis.de>
>
> *Cc:* llvm-dev at lists.llvm.org
> *Subject:* Re: [llvm-dev] RFC: New support for triaging
> optimization-related failures in front ends
>
>
>
> I've worked on a compiler with a counter, but for individual
> optimisations, not just passes. It was incredibly useful!
>
>
>
> In the llvm world, it would let you bisect exactly which instcombine,
> dagcombine, or whatever causes an issue.
>
>
>
> I support the addition of a pass counter if it helps bisecting, but just
> wanted to point out that this can be as fine grained as the community is
> willing to accept.
>
>
>
> Incidentally, would be great to have some fuzzers running the pass
> bisector to make sure nothing crashes when a pass is added/removed,
> excepting any backend passes required to run.
>
>
>
> Pete
>
>
>
> On Mar 25, 2016, at 5:29 PM, Matthias Braun via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Ok I cleaned it up and added some explaining comments. It's in
> llvm/utils/abtest now.
>
>
>
> - Matthias
>
>
>
> On Mar 25, 2016, at 4:40 PM, Michael Gottesman <mgottesman at apple.com>
> wrote:
>
>
>
>
>
> On Mar 25, 2016, at 4:37 PM, Matthias Braun <matze at braunis.de> wrote:
>
>
>
> And as we are on the topic of bisecting/diagnosing scripts I attached my
> personal script I used before.
>
>
>
> You give it two directories with assembly files (typically from a known
> good compiler and a "bad" compiler). The script will then go on and create
> permutations by picking all files from the "good" directory and combining
> them with a single file from the "bad" directory, to see which files make
> it fail. In a 2nd step it can do the same thing by replacing functions in a
> "good" with functions from a "bad" file.
>
> In case of a compiler revision that triggers a miscompile this is usually
> enough to track down the function that is miscompiled.
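
As an illustration of the first step described above (this is not the
attached abtest.py, just a sketch under assumptions): `failing_files`, the two
name-to-path dictionaries and the `test` callback are hypothetical stand-ins
for assembling, linking and running the mixed set of files.

```python
from typing import Callable, Dict, List


def failing_files(good: Dict[str, str], bad: Dict[str, str],
                  test: Callable[[List[str]], bool]) -> List[str]:
    """Return the file names that make the test fail when taken from "bad".

    `good` and `bad` map a file name to its path in the respective directory.
    `test(paths)` is a hypothetical callback that builds and runs the program
    from the given files and returns True when it behaves correctly.
    """
    culprits = []
    for name in sorted(good):
        if name not in bad:
            continue
        # Every file comes from "good" except `name`, which comes from "bad".
        mix = [bad[n] if n == name else good[n] for n in sorted(good)]
        if not test(mix):
            culprits.append(name)
    return culprits
```

Unlike a bisection this needs one run per file, but it also reports every
file that fails on its own rather than only the first one.
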
>
>
>
> Andy's proposed scheme should be strictly more powerful though as it is
> robust against different optimization decisions between the two
> compilations and even allows you to track down the pass that introduced the
> failure, but maybe my script is useful for somebody else in the meantime.
>
>
>
> Yes these sorts of scripts are useful (I have written the same script 3-4
> times).
>
>
>
> Why not clean it up and commit it under utils, maybe creating a debug
> utility directory and stick it with bisect? (Just a random thought). I
> would love not to have to write it again ; ).
>
>
>
> Michael
>
>
>
>
>
> - Matthias
>
>
>
> <abtest.py>
>
>
>
> On Mar 25, 2016, at 4:05 PM, Michael Gottesman via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>
>
> I will describe the whole process, hopefully forestalling all questions
> [tell me if I did not ; )]. There is not much to it TBH.
>
>
>
> ./utils/bisect is a dumb Python script that allows for arbitrary bisecting
> via the exit status of a script it runs. The way you use it is you write a
> script (let's call it test.sh). Then you invoke:
>
>
>
> ./utils/bisect --start=N --end=M ./test.sh "%(count)s"
>
>
>
> The bisect script will just invoke test.sh over and over again,
> replacing "%(count)s" with whatever the current count is. test.sh uses
> the count argument in its internal computations.
>
>
>
> In these test.sh scripts I invoke the swift compiler using an option
> called "-sil-opt-pass-count". The SIL pass manager is aware of this option
> and when the option is set, the pass manager increments a counter for all
> "actions" that it performs. Keep in mind this means ALL actions. You are
> correct that this /does/ make the counter value balloon. But bisecting is
> quick (think about log2 of UINT64_MAX). When the counter reaches the given
> count, the pass manager stops running passes. If the action to perform
> takes a long time, I just let it run overnight or do it on an extra cpu or
> something like that. The simplicity is worth it though IMHO.
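
A driver in this style can be sketched in a few lines. This is a guess at the
shape of such a script, not the actual utils/bisect: the function names are
made up, and it assumes the command succeeds (exit status 0) at `start`, fails
at `end`, and flips only once in between.

```python
import subprocess
import sys
from typing import Sequence


def run_with_count(cmd: Sequence[str], count: int) -> bool:
    """Run cmd with "%(count)s" replaced by count; True means exit status 0."""
    # Sketch only: a literal "%" elsewhere in an argument would need escaping.
    argv = [arg % {"count": count} for arg in cmd]
    return subprocess.call(argv) == 0


def bisect(cmd: Sequence[str], start: int, end: int) -> int:
    """Return the smallest count in (start, end] for which cmd fails."""
    lo, hi = start, end  # assumed: cmd succeeds at lo and fails at hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if run_with_count(cmd, mid):
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    # Stand-in for test.sh: a child process that fails once the count reaches 37.
    fake_test = [sys.executable, "-c",
                 "import sys; sys.exit(int(sys.argv[1]) >= 37)", "%(count)s"]
    print(bisect(fake_test, 0, 100))
```

Each step halves the range, so even a full 64-bit counter range needs at most
64 runs of the test script.
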
>
>
>
> That's all it is.
>
>
>
> Michael
>
>
>
> On Mar 25, 2016, at 1:56 PM, Kaylor, Andrew <andrew.kaylor at intel.com>
> wrote:
>
>
>
> > In the swift-world we use utils/bisect + a single number all the time +
> extra verifications. It works really well.
>
>
>
> Can you describe to me what you mean by that exactly?
>
>
>
> Are you using the single number in the LLVM back end or somewhere else?
>
>
>
>
>
> *From:* llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org
> <llvm-dev-bounces at lists.llvm.org>] *On Behalf Of *Michael Gottesman via
> llvm-dev
> *Sent:* Friday, March 25, 2016 12:35 PM
> *To:* Adrian Prantl <aprantl at apple.com>
> *Cc:* llvm-dev at lists.llvm.org
> *Subject:* Re: [llvm-dev] RFC: New support for triaging
> optimization-related failures in front ends
>
>
>
>
>
> On Mar 25, 2016, at 12:10 PM, Adrian Prantl via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>
>
>
> On Mar 25, 2016, at 11:56 AM, Mehdi Amini via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>
>
> Hi Andy,
>
>
>
> On Mar 25, 2016, at 11:41 AM, Kaylor, Andrew via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>
>
> The Intel C++ compiler has long had an internal facility to help our
> development teams track down the source of optimization related bugs by
> allowing a sort of binary search in which optimization passes and even
> individual optimizations are selectively disabled in response to front end
> command line options.  As we've been doing development work on LLVM our
> developers have found that we miss this capability and so we would like to
> introduce something like this to the LLVM code base.
>
>
>
> I am aware of the many existing facilities in LLVM for tracking down bugs
> such as llc, opt and bugpoint.  Our development teams are, of course,
> making use of these facilities, but we believe that the feature I am going
> to propose today can powerfully complement these existing tools and provide
> a better way to automate defect finding in products that provide a front
> end for LLVM.  This can also be used by customers (who may not be able to
> share their source code) to narrow down problems and provide a concise
> reproducer.
>
>
>
> While the combination of llc, opt and bugpoint is very effective at
> locating compilation failures that can be reproduced from an LLVM IR or
> bitcode file, they can be cumbersome and difficult to automate,
> particularly when debugging runtime failures in programs with non-trivial
> build systems.  The new feature that I am proposing provides a way to
> selectively turn off ranges of optimizations directly from a set of front
> end command line options.
>
>
>
> The proposed feature works by assigning arbitrary but repeatable values to
> modules, functions, passes, etc. as a compilation unit is being optimized.
> The developer can then use these numbers to selectively disable parts of
> the optimization without otherwise altering the behavior of the compiler.
>
>
>
> In keeping with the current LLVM pass manager structure, I am handling
> module passes, SCC passes and function passes separately.  I propose
> handling loop, region and basic block passes as if they were function
> passes.  (I think what this means will become clear below.)  Because the
> handling of function passes illustrates well all of the steps involved in
> using the feature I am proposing, I'll start by describing that case in
> detail.  I'm describing this as a series of manual steps, but I intend that
> these steps could be easily automated.
>
>
>
> I propose introducing three command line options to support locating
> problems with function pass optimizations.  I envision these as options
> available through the front end but implemented in the core LLVM library.
>
>
>
> -num-fn
>
>   sets an upper limit on the function to optimize
>
>   -1 enables all and displays the numbering
>
>
>
> -num-fn-pass
>
>   sets an upper limit on the function pass to run on the limit function
> specified by -num-fn
>
>   -1 enables all and displays the numbering
>
>   ignored if -num-fn is not used or is -1
>
>   all passes are run on functions below the limit function
>
>   only necessary and analysis passes are run on functions above the limit
> function
>
>
>
> -num-fn-case
>
>   sets an upper limit on the optimization case to apply within the limit
> function pass specified by -num-fn-pass
>
>   -1 enables all and displays the numbering
>
>   ignored if -num-fn-pass is not used or is -1
>
>   all cases are applied in passes below the limit function pass
>
>   no cases are applied in optional passes above the limit function pass
>
>
>
> As an example, a developer searching for function pass related
> optimization problems (assuming a case which fails when compiled with -O2
> but passes when compiled with -O0) would begin by using the '-num-fn=-1'
> option to see the function numbering.
>
>
>
> clang -c -O2 -num-fn=-1 test.c
>
>
>
> Optimizing function (1) prune_match
>
> Optimizing function (2) free_S
>
> Optimizing function (3) hash_S
>
> Optimizing function (4) insert_S
>
> Optimizing function (5) zero_S
>
> Optimizing function (6) init_S
>
> Optimizing function (7) matches_S
>
> Optimizing function (8) clean_up
>
>
>
> The developer would then use a binary search, recompiling with selective
> optimization and re-running the test case to determine the function in
> which the problem occurs.
>
>
>
> clang -c -O2 -num-fn=4 test.c
>
> <test passes>
>
> clang -c -O2 -num-fn=6 test.c
>
> <test passes>
>
> clang -c -O2 -num-fn=7 test.c
>
> <test fails>
>
>
>
> Having found the problem function, the developer would use the
> '-num-fn-pass=-1' option to see the numbering of function passes (including
> loop, region and basic block passes) that are run on that function.
>
>
>
> clang -c -O2 -num-fn=7 -num-fn-pass=-1 test.c
>
>
>
> Optimizing function (1) prune_match
>
> Optimizing function (2) free_S
>
> Optimizing function (3) hash_S
>
> Optimizing function (4) insert_S
>
> Optimizing function (5) zero_S
>
> Optimizing function (6) init_S
>
> Optimizing function (7) matches_S
>
> running necessary pass Function Pass Manager on function (7) matches_S
>
> running necessary pass Module Verifier on function (7) matches_S
>
> running necessary pass Add DWARF path discriminators (3) on function (7)
> matches_S
>
> running pass (1) Simplify the CFG on function (7) matches_S
>
> running analysis pass Dominator Tree Construction on function (7) matches_S
>
> running pass (2) SROA on function (7) matches_S
>
> running pass (3) Early CSE on function (7) matches_S
>
> running pass (4) Lower 'expect' Intrinsics on function (7) matches_S
>
> <additional passes ignored for brevity>
>
> NOT Optimizing function (8) clean_up
>
>
>
> The user would again use a binary search to determine the problem pass.
>
>
>
> clang -c -O2 -num-fn=7 -num-fn-pass=2 test.c
>
> <test passes>
>
> clang -c -O2 -num-fn=7 -num-fn-pass=3 test.c
>
> <test fails>
>
>
>
> Having determined the problem pass, the developer would use the
> '-num-fn-case=-1' option to see the numbering of individual optimizations
> applied by the pass.
>
>
>
> clang -c -O2 -num-fn=7 -num-fn-pass=3 -num-fn-case=-1 test.c
>
>
>
> Optimizing function (1) prune_match
>
> Optimizing function (2) free_S
>
> Optimizing function (3) hash_S
>
> Optimizing function (4) insert_S
>
> Optimizing function (5) zero_S
>
> Optimizing function (6) init_S
>
> Optimizing function (7) matches_S
>
> running necessary pass Function Pass Manager on function (7) matches_S
>
> running necessary pass Module Verifier on function (7) matches_S
>
> running necessary pass Add DWARF path discriminators (3) on function (7)
> matches_S
>
> running pass (1) Simplify the CFG on function (7) matches_S
>
> running analysis pass Dominator Tree Construction on function (7) matches_S
>
> running pass (2) SROA on function (7) matches_S
>
> running pass (3) Early CSE on function (7) matches_S
>
> Another case (1): EarlyCSE CSE value
>
> Another case (2): EarlyCSE CSE value
>
> Another case (3): EarlyCSE CSE value
>
> Another case (4): EarlyCSE CSE load
>
> Another case (5): EarlyCSE CSE value
>
> Another case (6): EarlyCSE CSE load
>
> Another case (7): EarlyCSE CSE value
>
> Another case (8): EarlyCSE CSE load
>
> NOT running pass (4) Lower 'expect' Intrinsics on function (7) matches_S
>
> <additional passes ignored for brevity>
>
> NOT Optimizing function (8) clean_up
>
>
>
> Finally, the developer would use one last binary search to determine which
> optimization case was causing the failure.
>
>
>
> clang -c -O2 -num-fn=7 -num-fn-pass=3 -num-fn-case=4 test.c
>
> <test fails>
>
> clang -c -O2 -num-fn=7 -num-fn-pass=3 -num-fn-case=2 test.c
>
> <test passes>
>
> clang -c -O2 -num-fn=7 -num-fn-pass=3 -num-fn-case=3 test.c
>
> <test fails>
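
The three searches above compose mechanically. Below is a sketch of that
automation using the proposed option names; `flags`, `smallest_failing`,
`triage` and the simulated `example_oracle` are hypothetical, and the upper
bounds are assumed to have been read from the corresponding '=-1' listings.
It also assumes that a limit of 0 at any level would pass, a configuration
the search never actually runs.

```python
from typing import Callable, Dict, List

Oracle = Callable[[List[str]], bool]  # True when the test passes

LEVELS = ("num-fn", "num-fn-pass", "num-fn-case")


def flags(limits: Dict[str, int]) -> List[str]:
    """Render limits such as {"num-fn": 7} as the proposed options."""
    return ["-%s=%d" % (name, value) for name, value in limits.items()]


def smallest_failing(option: str, upper: int,
                     fixed: Dict[str, int], passes: Oracle) -> int:
    """Binary search for the lowest limit of `option` at which the test fails."""
    lo, hi = 0, upper  # assumed: limit 0 passes, limit `upper` fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if passes(flags({**fixed, option: mid})):
            lo = mid
        else:
            hi = mid
    return hi


def triage(upper: Dict[str, int], passes: Oracle) -> Dict[str, int]:
    """Narrow the failure to a function, then a pass, then a case."""
    fixed: Dict[str, int] = {}
    for option in LEVELS:
        fixed[option] = smallest_failing(option, upper[option], fixed, passes)
    return fixed


def example_oracle(flag_list: List[str]) -> bool:
    """Stand-in for compile + run: the test fails once case 3 of pass 3 in
    function 7 is applied, mirroring the walk-through above."""
    lim = dict(f.lstrip("-").split("=") for f in flag_list)
    want = {"num-fn": 7, "num-fn-pass": 3, "num-fn-case": 3}
    for option in LEVELS:
        if option not in lim:
            return False  # no limit at this level: everything is applied
        if int(lim[option]) != want[option]:
            return int(lim[option]) < want[option]
    return False


if __name__ == "__main__":
    bounds = {"num-fn": 8, "num-fn-pass": 27, "num-fn-case": 8}
    print(flags(triage(bounds, example_oracle)))
```

With these bounds the function-level search probes limits 4, 6 and 7, and the
case-level search probes 4, 2 and 3, the same sequence as in the manual steps.
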
>
>
>
> Most of the above functionality can be implemented by inserting hooks into
> the various pass managers.  Support for the '-num-fn-case' option would
> require instrumentation of individual passes.  I propose implementing this
> on an 'opt in' basis so that it can be added to passes as needed and passes
> which do not opt in will simply not report having selectable cases.
>
>
>
> Note that I have introduced the concept of a "necessary" pass.  Because
> the goal of this is to have the front end produce the same sort of object
> or executable file that it normally would, some passes (such as the
> register allocator and its various helper passes) cannot be skipped.  I am
> also, for now, proposing that all analysis passes be run always to avoid
> the problem of determining which analysis passes would be required for
> later "necessary" passes.  We can revisit this as needed.
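
The gating rule for a single function can be sketched as follows. The `Pass`
record and the `plan` function are hypothetical and only mimic the log format
of the examples above; the proposal itself places this logic in hooks inside
the pass managers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Pass:
    name: str
    kind: str  # "necessary", "analysis" or "optional"


def plan(passes: List[Pass], limit: Optional[int]) -> List[str]:
    """Describe which passes would run under an upper limit on optional passes.

    Only optional passes are numbered. Necessary and analysis passes always
    run, and `limit=None` means that no limit is in effect.
    """
    log = []
    number = 0
    for p in passes:
        if p.kind != "optional":
            log.append("running %s pass %s" % (p.kind, p.name))
            continue
        number += 1
        if limit is None or number <= limit:
            log.append("running pass (%d) %s" % (number, p.name))
        else:
            log.append("NOT running pass (%d) %s" % (number, p.name))
    return log
```

Because necessary and analysis passes do not consume numbers, the numbering of
the optional passes stays the same whatever limit is chosen, which is what
makes the numbers usable for a binary search.
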
>
>
>
> As I've said, I intend to handle loop, region and basic block passes as if
> they were just another function pass.  While I would like to display an
> associated number for the target loop, region or block, I don't see value
> in being able to filter passes based on this value (because they are
> relatively dynamic constructs), so consecutive runs of a single loop pass
> (for instance) on different loops would be identified as distinct function
> passes.  For example:
>
>
>
> running pass (25) Loop Invariant Code Motion on loop (1), function (7)
> matches_S
>
> running pass (26) Loop Invariant Code Motion on loop (2), function (7)
> matches_S
>
> running pass (27) Loop Invariant Code Motion on loop (3), function (7)
> matches_S
>
>
>
> I would also like to have a feature that would allow the developer to
> request that an LLVM IR/bitcode file be emitted just before the first
> disabled pass so that the isolated IR could be used with opt, llc or
> bugpoint for further debugging.  I haven't worked out exactly how this
> would work, so I'm just mentioning it here as a fuzzy part of the proposal.
>
>
>
> Circling back now to module and SCC passes, these would work in pretty
> much the same way as the above method for function passes so I won't cover
> them in as much detail.  I propose the following new command line options.
>
>
>
> -num-mod
>
>   sets an upper limit on the module on which to run module passes
>
>   -1 enables all and displays the numbering
>
>   defaults to 1 since I expect most front ends to usually operate on a
> single module
>
>
>
> -num-mod-pass
>
>   sets an upper limit on the module pass to run on the limit module
> specified by -num-mod
>
>   -1 enables all and displays the numbering
>
>   ignored if -num-mod is -1
>
>   all passes are run on modules below the limit module
>
>   only necessary and analysis passes are run on modules above the limit
> module
>
>
>
> -num-mod-case
>
>   sets an upper limit on the optimization case to apply within the limit
> module pass specified by -num-mod-pass
>
>   -1 enables all and displays the numbering
>
>   ignored if -num-mod-pass is not used or is -1
>
>   all cases are applied in passes below the limit module pass
>
>   no cases are applied in optional passes above the limit module pass
>
>
>
> -num-scc-pass
>
>   sets an upper limit on the SCC pass to run
>
>   -1 enables all and displays the numbering
>
>
>
> -num-scc-case
>
>   sets an upper limit on the optimization case to apply within the limit
> SCC pass specified by -num-scc-pass
>
>   -1 enables all and displays the numbering
>
>   ignored if -num-scc-pass is not used or is -1
>
>   all cases are applied in passes below the limit SCC pass
>
>   no cases are applied in optional passes above the limit SCC pass
>
>
>
> Similar to loops, regions and basic blocks, I would like to present
> numbering information for each unique SCC, but I do not believe there is
> value in being able to filter on a specific SCC because they are subject to
> reorganization.  Rather I would treat each invocation of an SCC pass as a
> separate selectable pass.  The output for an SCC problem search would look
> something like this:
>
>
>
> Optimizing SCC (1): <null function>
>
> running pass (1) Remove unused exception handling info on SCC (1)
>
> running pass (2) Function Integration/Inlining on SCC (1)
>
> running pass (3) Deduce function attributes on SCC (1)
>
> Optimizing SCC (2): isupper
>
> running pass (4) Remove unused exception handling info on SCC (2)
>
> running pass (5) Function Integration/Inlining on SCC (2)
>
> running pass (6) Deduce function attributes on SCC (2)
>
> <snip>
>
> Optimizing SCC (34): or_purge_E_list, and_purge_E_list, purge_Exp
>
> running pass (101) Remove unused exception handling info on SCC (34)
>
> running pass (102) Function Integration/Inlining on SCC (34)
>
> running pass (103) Deduce function attributes on SCC (34)
>
> <snip>
>
>
>
> Here I'm printing the functions (if any) associated with nodes in the SCC
> just to give the developer some feedback on which part of the code is being
> worked on.
>
>
>
> I have a working prototype of all of the above functionality using both
> the new pass manager and the legacy pass manager.
>
>
>
> So what do you think?  Do you agree that support for a feature of this
> sort would be useful in the LLVM core library?  What problems do you see in
> my proposal?  What additional options would you like to see added?
>
>
>
> Thanks in advance for any feedback.
>
>
>
>
>
> You're right that we could benefit from support to help "bisecting" where
> a miscompile happens in a more automated way.
>
>
>
> Reading through your (long but nice) description, I thought about something
> simpler: a single number incremented every time a pass in the optimizer is
> run (in the order shown by -debug-pass=Executions).
>
> That way you run once to get the total number of passes that ran and then,
> a simple bisect script can find quite quickly at which point the miscompile
> is introduced (between 0 and max).
>
> I haven't thought about it much but I'm interested in your view of the
> pros/cons.
>
>
>
> This reminds me a bit of utils/bisect.
>
>
>
> In the swift-world we use utils/bisect + a single number all the time +
> extra verifications. It works really well.
>
>
>
> Michael
>
>
>
>
>
>
> -- adrian
>
>
>
>
> --
>
> Mehdi
>
>
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>