[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.

Daniel Berlin via llvm-dev llvm-dev at lists.llvm.org
Tue Aug 1 13:03:31 PDT 2017


On Tue, Aug 1, 2017 at 12:28 PM, Sean Silva <chisophugis at gmail.com> wrote:

>
>
> On Tue, Aug 1, 2017 at 10:28 AM, Daniel Berlin <dberlin at dberlin.org>
> wrote:
>
>>
>>>
>>> Also as a side note, I think in the original MachineOutliner RFC thread
>>> there was some confusion as to whether it was possible to solve the code
>>> folding outlining problem exactly as a graph problem on SSA using standard
>>> value numbering algorithms in polynomial time.
>>>
>>
>>
>>> I can elaborate further, but
>>> 1. it is easy to see that you can map an arbitrary dag to an isomorphic
>>> data flow graph in an SSA IR e.g. in LLVM IR or pre-RA MIR
>>>
>>
>>
>>> 2. Given two dags, you can create their respective isomorphic data flow
>>> graphs (say, put them into two separate functions)
>>> 3. An exact graph based code folding outliner would be able to discover
>>> if the two dataflow graphs are isomorphic (that is basically what I mean by
>>> exact) and outline them.
>>> 4. Thus, graph isomorphism on dags can be solved with such an algorithm
>>> and thus the outlining problem is GI-hard and a polynomial time solution
>>> would be a big breakthrough in CS.
>>>
>>
>> First, you'd have to reduce them in both directions to prove that ;)
>>
>> All that's been shown is that you can reduce it to a hard problem.
>> You can also reduce it to 3-sat., but that doesn't mean anything unless
>> you can reduce 3-sat to it.
>>
>
> Only one direction is needed to prove it is GI-hard.
>

Yes sorry, but you still haven't done it.
So if you really want to do it, let's start with a definition:

"The precise definition here is that *a problem X is NP-hard, if there is
an NP-complete problem Y, such that Y is reducible to X in polynomial time*
."

For this to be GI-hard you must be able to reduce *every* GI problem to one
about SSA-based outlining, and you must be able to do so in *polynomial
time*.

The above does not do this.

Things that would need proof:

1. it is easy to see that you can map an arbitrary dag to an isomorphic
data flow graph in an SSA IR e.g. in LLVM IR or pre-RA MIR

A routine or proof to do this in polynomial time ;)  Please make sure you
do an arbitrary dag, and not just rooted DAGs, as isomorphism on rooted
DAGs can be solved in linear time ;)


3. in exact graph based code folding outliner would be able to discover if
the two dataflow graphs are isomorphic (that is basically what I mean by
exact) and outline them.

Proof that  code outlining necessarily requires discovering full
isomorphism between the graphs.

(IE that whatever process code outlining uses must end up doing a thing
that would prove isomorphism).

Keep in mind that the way we ended up with linear time dataflow was
precisely to prove that it did not actually require doing things that
reduced to boolean matrix multiplication


But let's take this offline, as it's going to go into a very theoretical
route.

I would suggest reading some of the very early papers. Early value
numbering, and complete formulations, assume uninterpreted operators and in
some cases, operands.  There is also the distinction between whether you
are trying to discover the set of equivalent values that already exist in
the program vs things that are equivalent to those values vs ....

It's 100% possible to come up with formulations of VN that are still
exponential time depending precisely on the details you are giving.

It's also interesting to note that Babai still has his paper on graph
isomorphism in quasi-polynomial time, and nobody has found holes in it yet
(and since it's in NP but not NP complete, it's little more believable than
the standard P = NP fair)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170801/1569703b/attachment.html>


More information about the llvm-dev mailing list