[LLVMdev] How to decide whether a function is executed or not

Sean Silva chisophugis at gmail.com
Tue May 20 12:28:50 PDT 2014


(sorry for the long post; hopefully you will find it informative)

On Tue, May 20, 2014 at 8:19 AM, lyh.kernel <lyh.kernel at gmail.com> wrote:

> Hello everyone,
>
> I want to decide whether a function is executed or not. For example (the
> value of
> cond is not determined at compile time):
>
> br i1 %cond, label %if, label %else
>
> if:
>   ...
>   call void f()
>   ...
>   br label %exit
>
> else:
>   ...
>   br label %exit
>
> We could say that function f is control dependent on cond and may not be
> executed.
>
> On the other hand:
>
> br i1 %cond, label %if, label %else
>
> if:
>   ...
>   call void f()
>   ...
>   br label %exit
>
> else:
>   ...
>   call void f()
>   ...
>   br label %exit
>
> No matter the value of cond is, function f would be executed.
>
> I am wondering whether there exist any algorithm to decide whether
> function f is
> executed or not. Any suggestion or key word are appreciated.
>

There is not. Almost any question that depends on the result of evaluating
"arbitrary code" is undecidable. "undecidable" is a technical term which
basically means that there is no algorithm that answers the question.
"algorithm" also has a technical meaning in this context which is basically
a computer program *that is guaranteed to return an answer in all cases*
(the alternative is to keep running forever). If you want to learn more
about this, the keyword is "theory of computation".

Intuitively, this means that there is no general "shortcut" for
understanding the behavior of an arbitrary computer program when it runs:
the only way to know is to actually run it. This issue is the crux of the
so-called "Full Employment Theorem for Compiler Writers"; i.e. you cannot
generate optimal code in all cases (not just that it is hard to do so, but
actually provably impossible). The connection with compilers is that if you
could short-cut the execution, then you could constant-fold the entire
program into a single return expression, basically.

As another intuition for why this might be a difficult problem, if you had
an algorithm for deciding whether a function is called, you would put all
mathematicians out of business by running your algorithm on the following
program:

void foo() {}

int main() {
for K = 1, 2, ... (forever):
  for every string S of length K:
    if S is a valid proof of <<insert theorem of your choice>>:
      foo();
}

The same intuition can be applied to almost any question about the behavior
of a program at runtime (for example, with a trivial modification of the
above program, you could just as easily put mathematicians out of business
by being able to tell whether a function returns true or false; same goes
for the Halting Problem (deciding whether the program ever stops running)).



There is a small hitch with everything I just said. The Halting Problem
(and all these related problems) can be easily solved if you can compute an
upper bound on the maximum number of steps taken to produce an answer: run
the program for longer than that upper bound, and if it the program hasn't
given you an answer by then, it must be stuck in an infinite loop.

This upper bound can be established for real computers, or any model of
computation where there is a (computable) limit on the amount of storage
space. A real computer has a finite amount of state (say N bits) in RAM,
registers, etc.; furthermore, each "clock cycle" it deterministically
transitions from one particular configuration of those N bits to another
configuration (ignoring I/O, "random seed" instructions, etc.); there are
only 2^N such configurations. So if it runs for 2^N + 1 cycles, at least
one configuration must have been entered twice. Because the computer is
deterministic, the same path that took it from the repeating configuration
back to itself will now take it back to that configuration again, and
again, and again, so the program is stuck in an infinite loop.

Of course, 2^N is a completely useless running time for such an algorithm.
Also, the arbitrary restriction to the N bits of a particular computer is
more likely to give you an answer to "does this program happen to go into
an infinite loop when it runs out of memory" (or similar) than about the
actual "ideal" behavior of the program.

Generally speaking, the more closely you can bound the amount of state used
in a computation, the more you can know about its runtime behavior without
actually running it.

-- Sean Silva



>
> Many thanks
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140520/21288539/attachment.html>


More information about the llvm-dev mailing list