[LLVMdev] Jump Theading/GVN bug - moving discussion to llvm-dev

Mon Feb 23 20:32:30 PST 2015

So, first things first.

Handling unreachable code is annoying. I agree. However, its important to
characterize how annoying. Most code is not unreachable, and we're
(obviously) fine just bailing on such crazy code paths. So in practice the
common cost is keeping a local set to detect cycles in the graph where we
aren't expecting them. We are essentially always traversing a linked list
representing this graph. Because these linked list nodes don't have any
particular reason to have high locality, we have a cache-hostile traversal
where we have to do primarily local and cache friendly book-keeping at each
step to catch duplication. I suspect that this has a very small (but
non-zero) impact on compile time.

Handling unreachable code is also error prone. We have had a long history
of bugs here. So if it were free to get rid of unreachable code, that would
be super awesome.

The first problem to realize is that for a large chunk of LLVM, we can't
actually get rid of handling unreachable code. Passes are quite likely to
create unreachable code while running, and that will mean that all of the
utilities will end up needing to be conservatively correct and handle
unreachable code even when they don't need to. =/

The second problem is that cleaning up unreachable code isn't free. The
code which is most likely to create unreachable code is similarly likely to
destroy the analysis information we would use to remove it cheaply. And
even then, just walking lists of basic blocks as we go out of the function
is going to dirty cache lines that we might not have any other reason to
look at. I can genuinely imagine cases where batching this will be
beneficial. Will it outstrip the cost of handling it? I don't know. But I
think it will mitigate the gains, especially if the gains aren't as
significant as we might naively think.

The third problem I have is that this makes it much harder to
constructively produce a correct IR transformation pass. When transforming
the IR we must never forget about regions we have made unreachable. A
single mistake here will cascade to a verification failure. This is the
reverse of the problem we have today where your pass must *accept*
unreachable IR. But I think the constructive reasoning is much easier. It
makes it harder to have a bug through omission of a case.

The fourth problem I have is related to the third problem. How do we gain
confidence in the correctness of any IR transformation? We must construct
test cases that we believe will exercise the system. It seems *incredibly*
easier to construct test cases with interesting unreachable IR patterns
that should all be handled, than to construct test cases where the pass
will happen to form unreachable IR patterns in each way and ensure that
none escape the pass. One is essentially covering input diversity, the
other has to cover *output* diversity, and must prove the negative. We
fundamentally must gain confidence that the pass *never* produces
unreachable IR. This seems much harder than demonstrating that it does
handle unreachable IR.

The fifth problem I have is also related to the third and fourth problems.
Precluding unreachable code seems likely to make tools like delta test case
reduction dramatically less effective. Now, when cutting edges in the CFG
to minimize a test case they must actually build the graph and prune all
unreachable code.

All of these problems seem surmountable, but in aggregate, they make me
strongly doubt that the benefit is worth the cost.

-Chandler

On Mon, Feb 23, 2015 at 1:04 PM, Romanova, Katya <
Katya_Romanova at playstation.sony.com> wrote:

>  Hello,
>
> I encountered a problem triggered by Jump-Threading optimization.  This
> pass is creating an unreachable block with an instruction that is not well
> formed, which then causes the subsequent GVN pass to enter an infinite loop.
>
>
>
> I have submitted a bug report and proposed fix to llvm-commits. This bug
> opened a can of worms. I was asked to move the discussion to llvm-dev to
> reach for a wider audience.
>
>
>
> >> Can we move the general discussion to llvm-dev?  This probably warrants
> a wider audience.
>
>
>
> Here is the original post and a couple of replies from last week:
>
>
> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150216/261330.html
>
>
>
> Here is a thread of replies from today:
>
>
> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150223/261463.html
>
>
>
> Katya.
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150223/e86f1b09/attachment.html>