[LLVMdev] missed optimizations

Thu Sep 4 17:45:19 PDT 2008

On Thu, Sep 4, 2008 at 8:39 AM, Nuno Lopes <nunoplopes at sapo.pt> wrote:
> Hi,
>
> I have two questions about optimizations performed by llvm.
>
> Consider these simple functions:
> int x(int b) { return b?4:6; }
> int y() { return x(0); }
>
> int x2() { return 5; }
> int y2() { return x2(); }
>
> the optimized bitcode (with clang + opt -std-compiler-opts) is:
> define i32 @y(...) nounwind {
> entry:
>  ret i32 6
> }
>
> define i32 @y2(...) nounwind {
> entry:
>  %call = call i32 (...)* @x2( )  ; <i32> [#uses=0]
>  ret i32 5
> }
>
> So why does LLVM optimizes the more difficult case, but leaves behind the
> function call in the easiest case? :)

Pretty simple: LLVM doesn't know how to eliminate the call.

There are a couple of ways it could be eliminated: DCE or inlining.
The current DCE infrastructure isn't clever enough to do
interprocedural analysis, so that doesn't eliminate it.  And the
inliner immediately gives up on varargs functions, so that doesn't
eliminate it.  And

That said, clang really should be turning int x2() { return x(0); }
into "define i32 @x2()" rather than "define i32 @x2(...)"; the
function isn't varargs, and marking it as such could lead to wrong
code for exotic calling conventions.

>
> Second question:
> int f();
>
> int g() {
>  if (f() == 5)  return 3;
>  return 4;
> }
>
> int h() {
>  if (g() == 6) return 2;
>  return 1;
> }
>
> gives the following optimized bc:
>
> define i32 @g(...) nounwind {
> entry:
>  %call = call i32 (...)* @f( )  ; <i32> [#uses=1]
>  %cmp = icmp eq i32 %call, 5  ; <i1> [#uses=1]
>  %retval = select i1 %cmp, i32 3, i32 4  ; <i32> [#uses=1]
>  ret i32 %retval
> }
>
> define i32 @h(...) nounwind {
> entry:
>  %call = call i32 (...)* @g( )  ; <i32> [#uses=1]
>  %cmp = icmp eq i32 %call, 6  ; <i1> [#uses=1]
>  %retval = select i1 %cmp, i32 2, i32 1  ; <i32> [#uses=1]
>  ret i32 %retval
> }
>
> In function h(), llvm doesn't realize that g() never returns 6, and thus it
> could reduce that function to { g(); return 1; }. Doesn't llvm produce some
> sort of summary for functions with e.g. their possible return values? If
> not, is there any pass where I could implement this?  (I have some code that
> has many opportunities for this kind of optimization)

No, there isn't any such infrastructure at the moment.  The LLVM pass
that might do that sort of thing is predsimplify, but it isn't an
interprocedural pass.  It's not necessarily difficult to implement,
depending on how precise you want it to be, but nobody's implemented
it; as far as I know, it doesn't provide significant benefits for
normal C/C++ code.  If there's some specific pattern in your code, you
could probably throw together a specialized pass pretty quickly.

> Also, shouldn't the function g() get inlined in the h() function? It is
> inlined only if I change the g() function to be static. So isn't llvm being
> too conservative when inlining global functions?

The inliner doesn't like varargs; see above.

-Eli