[LLVMdev] Cross-module function inlining
mark.i.r.muir at gmail.com
Wed Jan 13 12:05:56 PST 2010
On 13 Jan 2010, at 16:43, Nick Lewycky wrote:
> Mark Muir wrote:
>> - Run the existing Clang tool on each source file, using -emit-llvm to generate a .bc file for each module.
>> - Run llvm-link to merge them into a single .bc file.
>> - Run llc to generate a complete machine assembly.
>> However, with optimisations enabled, the resulting code is not as efficient as it would be if all the code were in a single module. In particular, function inlining is only performed by clang (i.e. only on a module-by-module basis), and not by llvm-link or llc.
> It sounds like you're not running the LTO optimizations. You could try replacing llvm-link with llvm-ld which will, or run 'opt -std-link-opts' between llvm-link and llc.
Yep, that sorted inlining. Thanks.
But... now there's a small problem with library calls. Symbols such as 'memset', 'malloc', etc. are being removed by global dead code elimination, even though they are live. They are implemented in one of the bitcode modules that are linked together (the implementations are based on newlib). I get the same behaviour with just the following:
opt -internalize -globaldce
Other (non-standard-library) functions that are implemented in a different module from the one in which they are called are correctly seen as live. So could this have something to do with what is declared as a built-in? I haven't provided any list of built-ins (or overridden the defaults), nor could I figure out how to do that.
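One possible explanation, not confirmed anywhere in this thread, is that some library calls never appear as ordinary calls in the bitcode. A minimal C sketch of that situation follows; the struct and function names are hypothetical. The source contains no call to memset(), yet a compiler may choose to lower the zero-initialisation to a memset call during code generation, after passes such as -globaldce have already run.

```c
#include <stddef.h>

/* Hypothetical aggregate, large enough that a compiler may prefer a
 * memset() call over individual stores when clearing it. */
struct packet {
    unsigned char payload[256];
    size_t length;
};

/* Zero-initialises a packet without naming memset() in the source.
 * Returns the number of non-zero payload bytes plus the length field,
 * which is 0 if the initialisation cleared everything. */
size_t count_nonzero_after_clear(void)
{
    struct packet p = {0};   /* may become a memset call in the backend */
    size_t n = 0;

    for (size_t i = 0; i < sizeof p.payload; ++i) {
        if (p.payload[i] != 0) {
            ++n;
        }
    }
    return n + p.length;
}
```

If this is what is happening, a definition of memset in the linked bitcode has no visible users once it has been internalized, so it looks dead. The internalize pass in LLVM releases of this era accepts a list of symbols to keep external (-internalize-public-api-list), which may be worth trying; treat that as an assumption to check against your build.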
I've also noticed other problems related to built-ins. In one example, the code made use of abs() but didn't #include <stdlib.h>. It compiled without warning or error, but the generated code was broken because the arguments were not seen as live, e.g.:
Without #include <stdlib.h>:
0x181e8b0: i32 = TargetGlobalAddress <i32 (...)* @abs> 0 [TF=1]
=> JUMP_CALLi <ga:abs>[TF=1], %r2<imp-def>, %r3<imp-def>, %r4<imp-def,dead>, %r5<imp-def,dead>, %r6<imp-def,dead>, %r7<imp-def,dead>, %r8<imp-def,dead>, %r9<imp-def,dead>, %r10<imp-def,dead>
With #include <stdlib.h>:
0x181e8b0: i32 = TargetGlobalAddress <i32 (i32)* @abs> 0 [TF=1]
=> JUMP_CALLi <ga:abs>[TF=1], %r3<kill>, %r2<imp-def>, %r3<imp-def>, %r4<imp-def,dead>, %r5<imp-def,dead>, %r6<imp-def,dead>, %r7<imp-def,dead>, %r8<imp-def,dead>, %r9<imp-def,dead>, %r10<imp-def,dead>
Where r2 is the link register, and r3 to r10 are argument/retval registers. LowerFormalArguments() doesn't see any arguments in the former, and consequently doesn't add input register nodes to the DAG.
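To illustrate the abs() case above, here is a minimal C sketch. It assumes a C89-style compilation mode, where calling an undeclared function implicitly declares it as int abs() with no parameter information; that corresponds to the i32 (...)* type in the first dump. Providing a prototype, either through <stdlib.h> or an explicit declaration as below, gives the i32 (i32)* type seen in the second dump. The helper name magnitude is hypothetical.

```c
/* Explicit prototype, compatible with the declaration that <stdlib.h>
 * provides. With it, the call below is typed as int (int), so the
 * argument is known to the compiler at the call site. */
int abs(int);

/* Hypothetical caller. Without the prototype above (and without
 * <stdlib.h>), a C89-mode compiler would treat abs as int abs(), and
 * the callee type would carry no parameter list. */
int magnitude(int x)
{
    return abs(x);
}
```

Compiling with implicit-declaration warnings enabled (for example -Wimplicit-function-declaration in Clang and GCC) should flag the missing prototype at the call site.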
I guess I need help with the concept of built-ins, and what code is related to them in the Clang driver and back-end.