[LLVMdev] Jump Theading/GVN bug - moving discussion to llvm-dev
Hal Finkel
hfinkel at anl.gov
Tue Feb 24 10:51:17 PST 2015
----- Original Message -----
> From: "Philip Reames" <listmail at philipreames.com>
> To: "Chandler Carruth" <chandlerc at google.com>, "Katya Romanova" <Katya_Romanova at playstation.sony.com>, "Nick Lewycky"
> <nlewycky at google.com>
> Cc: llvmdev at cs.uiuc.edu, "Hal Finkel" <hfinkel at anl.gov>, "Rafael Espindola" <rafael_espindola at playstation.sony.com>,
> "Sanjoy Das" <sanjoy at playingwithpointers.com>, "David Majnemer" <david.majnemer at gmail.com>
> Sent: Tuesday, February 24, 2015 12:45:51 PM
> Subject: Re: Jump Theading/GVN bug - moving discussion to llvm-dev
>
> Whatever we end up deciding, I think we do need to preserve the
> ability of a language frontend to generate complete garbage in the
> form of unreachable code.
And this is exactly what I don't understand. Why would a frontend need to produce "complete garbage" in unreachabe code?
-Hal
> If we decide to move in the direction of
> disallowing the production of unreachable code, we need to write a
> canonicalization pass that runs first thing in the pipeline. A
> frontend should be able to produce unreachable code. This enables
> important simplifications in the frontend.
>
> I happen to agree with Chandler, but I don't have a strong opinion
> either way. His testing point in particular is one that resonates
> with me. I'll, in practice, take testability over verification any
> day. :)
>
> Sanjoy's suggestion of having an "unreachable code analysis" might be
> a way to split the difference. Every pass could depend on it. If it
> doesn't report "no unreachable code", the pass calls a cleanup
> transform utility. Any pass that wanted to spend time to "preserve"
> the analysis (by making sure there was no unreachable code), could
> do so. Everything would "just work".
>
> Do we have a set of unreachable, but otherwise valid, code examples?
> It doesn't seem to hard to write extra run lines for each pass. The
> corpus of valid but problematic examples should be common to all
> passes.
>
> Philip
>
>
> On 02/23/2015 08:32 PM, Chandler Carruth wrote:
>
>
>
> So, first things first.
>
>
> Handling unreachable code is annoying. I agree. However, its
> important to characterize how annoying. Most code is not
> unreachable, and we're (obviously) fine just bailing on such crazy
> code paths. So in practice the common cost is keeping a local set to
> detect cycles in the graph where we aren't expecting them. We are
> essentially always traversing a linked list representing this graph.
> Because these linked list nodes don't have any particular reason to
> have high locality, we have a cache-hostile traversal where we have
> to do primarily local and cache friendly book-keeping at each step
> to catch duplication. I suspect that this has a very small (but
> non-zero) impact on compile time.
>
>
> Handling unreachable code is also error prone. We have had a long
> history of bugs here. So if it were free to get rid of unreachable
> code, that would be super awesome.
>
>
>
>
> The first problem to realize is that for a large chunk of LLVM, we
> can't actually get rid of handling unreachable code. Passes are
> quite likely to create unreachable code while running, and that will
> mean that all of the utilities will end up needing to be
> conservatively correct and handle unreachable code even when they
> don't need to. =/
>
>
>
>
> The second problem is that cleaning up unreachable code isn't free.
> The code which is most likely to create unreachable code is
> similarly likely to destroy the analysis information we would use to
> remove it cheaply. And even then, just walking lists of basic blocks
> as we go out of the function is going to dirty cache lines that we
> might not have any other reason to look at. I can genuinely imagine
> cases where batching this will be beneficial. Will it outstrip the
> cost of handling it? I don't know. But I think it will mitigate the
> gains, especially if the gains aren't as significant as we might
> naively think.
>
>
>
>
> The third problem I have is that this makes it much harder to
> constructively produce a correct IR transformation pass. When
> transforming the IR we must never forget about regions we have made
> unreachable. A single mistake here will cascade to a verification
> failure. This is the reverse of the problem we have today where your
> pass must *accept* unreachable IR. But I think the constructive
> reasoning is much easier. It makes it harder to have a bug through
> omission of a case.
>
>
>
>
> The fourth problem I have is related to the third problem. How do we
> gain confidence in the correctness of any IR transformation? We must
> construct test cases that we believe will exercise the system. It
> seems *incredibly* easier to construct test cases with interesting
> unreachable IR patterns that should all be handled, than to
> construct test cases where the pass will happen to form unreachable
> IR patterns in each way and ensure that none escape the pass. One is
> essentially covering input diversity, the other has to cover
> *output* diversity, and must prove the negative. We fundamentally
> must gain confidence that the pass *never* produces unreachable IR.
> This seems much harder than demonstrating that it does handle
> unreachable IR.
>
>
>
>
> The fifth problem I have is also related to the third and fourth
> problems. Precluding unreachable code seems likely to make tools
> like delta test case reduction dramatically less effective. Now,
> when cutting edges in the CFG to minimize a test case they must
> actually build the graph and prune all unreachable code.
>
>
>
>
> All of these problems seem surmountable, but in aggregate, they make
> me strongly doubt that the benefit is worth the cost.
>
>
> -Chandler
>
>
>
>
>
>
>
>
> On Mon, Feb 23, 2015 at 1:04 PM, Romanova, Katya <
> Katya_Romanova at playstation.sony.com > wrote:
>
>
>
>
>
>
> Hello,
>
> I encountered a problem triggered by Jump-Threading optimization.
> This pass is creating an unreachable block with an instruction that
> is not well formed, which then causes the subsequent GVN pass to
> enter an infinite loop.
>
>
>
> I have submitted a bug report and proposed fix to llvm-commits. This
> bug opened a can of worms. I was asked to move the discussion to
> llvm-dev to reach for a wider audience.
>
>
>
> >> Can we move the general discussion to llvm-dev? This probably
> >> warrants a wider audience.
>
>
>
> Here is the original post and a couple of replies from last week:
>
> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150216/261330.html
>
>
>
> Here is a thread of replies from today:
>
> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150223/261463.html
>
>
>
> Katya.
>
>
>
>
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
More information about the llvm-dev
mailing list