Here is my idea for an inliner that can also handle splitting cold edges:<br>The idea is to keep using the inlined version of each function, but to take the body that gets inlined into callers from an uninlined copy.<br><br>1. Do normal optimizations without inlining<br>
2. Split the cold edges into separate functions<br>3. Copy each function @xyz to @xyz.i (or another name that does not clash) and mark @xyz with 'inlines @xyz.i'<br> ==> this needs an identifier to mark the shadow function for each function<br>
All other operations are performed on @xyz; @xyz.i keeps the uninlined version.<br>4. Each function @xyz can now inline the functions @abc it calls, but it does not inline @abc directly; it inlines @abc.i instead.<br>5. When a function @xyz becomes shorter than @xyz.i, the inline remark and the non-inlined<br>
6. After everything is inlined, new cold edges can appear. They can now be split again. New optimizations are possible.<br>7. Functions that call functions containing these cold edges can now inline them. These functions can be optimized again.<br>
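<br>Steps 3 and 4, plus the final stripping of the 'inlines' dependencies, can be sketched on a toy model. This is only a rough Python sketch under stated assumptions, not LLVM code: a module is assumed to be a dict from function name to a flat list of (kind, target) tuples, and make_shadows, inline_calls, strip_shadows and the cold set are hypothetical helpers invented for illustration.<br>

```python
SHADOW_SUFFIX = ".i"  # assumed naming scheme for the shadow copy (step 3)


def make_shadows(module):
    """Step 3: copy each function @xyz to @xyz.i and record 'inlines @xyz.i'."""
    inlines = {}
    for name in list(module):  # list() because the dict grows while we iterate
        shadow = name + SHADOW_SUFFIX
        module[shadow] = list(module[name])  # pristine, uninlined body
        inlines[name] = shadow
    return inlines


def inline_calls(module, inlines, caller, cold=()):
    """Step 4: replace each call to @abc in `caller` with the body of @abc.i.

    Calls to functions in `cold` (hypothetical set of split-off cold
    functions) and recursive self-calls are left as calls.
    """
    body = []
    for kind, target in module[caller]:
        if (kind == "call" and target in inlines
                and target != caller and target not in cold):
            body.extend(module[inlines[target]])  # splice the shadow body
        else:
            body.append((kind, target))
    module[caller] = body


def strip_shadows(module, inlines):
    """Final step: drop every shadow function once inlining is finished."""
    for shadow in inlines.values():
        module.pop(shadow, None)


# Toy version of the state after steps 1 and 2 of the example.
module = {
    "findNumber": [("call", "numberFits"), ("call", "findNumber.c")],
    "findNumber.c": [("op", "cold_work")],
    "numberFits": [("op", "prime_test")],
}
inlines = make_shadows(module)
inline_calls(module, inlines, "findNumber", cold={"findNumber.c"})
strip_shadows(module, inlines)
print(module["findNumber"])  # [('op', 'prime_test'), ('call', 'findNumber.c')]
```

In this sketch a caller always splices in the untouched @abc.i body, so the result does not depend on the order in which functions are processed; one call to inline_calls inlines a single level of calls.<br>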
<br>Example (function signatures in LLVM IR, bodies in pseudocode):<br><br>Starting point:<br>---------------<br><br>define i32 @findNumber() {<br> for i = 0 to 10000<br> if @numberFits(i) then<br> MYSTUFFINTHEFUNCTION<br>
THATISREALLYCOLD<br> end<br> end<br>}<br><br>define i1 @numberFits(i32 i) {<br> ret DO_A_SIMPLE_PRIME_TEST<br>}<br><br>After 1 and 2:<br>--------------<br><br>define i32 @findNumber() {<br> for i = 0 to 10000<br> if @numberFits(i) then<br>
call @findNumber.c(i)<br> end<br> end<br>}<br><br>define void @findNumber.c(i32 i) {<br> MYSTUFFINTHEFUNCTION<br> THATISREALLYCOLD<br>}<br><br>define i1 @numberFits(i32 i) {<br> ret DO_A_SIMPLE_PRIME_TEST_i<br>
}<br><br>After 3 and 4 (5-7 do not apply here):<br>--------------------------------------<br><br>define i32 @findNumber() inlines @findNumber.i {<br> for i = 0 to 10000<br> if DO_A_SIMPLE_PRIME_TEST_i then<br> call @findNumber.c(i)<br>
end<br> end<br>}<br>define i32 @findNumber.i() {<br> for i = 0 to 10000<br> if @numberFits(i) then<br> call @findNumber.c(i)<br> end<br> end<br>}<br><br>define void @findNumber.c(i32 i) {<br> MYSTUFFINTHEFUNCTION<br>
THATISREALLYCOLD<br>}<br><br>define i1 @numberFits(i32 i) {<br> ret DO_A_SIMPLE_PRIME_TEST_i<br>}<br><br>The linker can now strip all 'inlines' dependencies<br>---------------------------------------------------<br>
<br><br>define i32 @findNumber() {<br> for i = 0 to 10000<br> if DO_A_SIMPLE_PRIME_TEST_i then<br> call @findNumber.c(i)<br> end<br> end<br>}<br><br>define void @findNumber.c(i32 i) {<br> MYSTUFFINTHEFUNCTION<br>
THATISREALLYCOLD<br>}<br><br><br>What do you think about the idea?<br>