<div class="__aliyun_email_body_block"><div  style="clear:both;">Hi,</div><div  style="clear:both;"><br ></div><div  style="clear:both;">A patch that implements function specialization is currently under review at https://reviews.llvm.org/D93838.</div><div  style="clear:both;">The pass works under LTO, and I would like to see whether it can also be made available under ThinLTO, although the patch has not been accepted yet.</div><div  style="clear:both;"><br ></div><div  style="clear:both;">I know that the current ThinLTO framework imports functions, which are then available to all the passes in the current translation unit. However,</div><div  style="clear:both;">the problem is that the current cost model only imports functions whose IR size is below a threshold (which is relatively small). I guess</div><div  style="clear:both;">that it is designed for inlining. My point is that, under the current cost model, most imported functions would be inlined directly, so there is little room left for function</div><div  style="clear:both;">specialization to help. My goal is to import more functions that inlining cannot handle but function specialization can.</div><div  style="clear:both;"><br ></div><div  style="clear:both;">My initial idea was simple. We could add value information to the edges of the call graph. Then we could make the call graph bidirectional</div><div  style="clear:both;">(as far as I can tell, the graph currently only contains edges from caller to callee). 
Finally, in the import stage, we could traverse the call graph to pick suitable </div><div  style="clear:both;">functions to import.</div><div  style="clear:both;"><br ></div><div  style="clear:both;">However, there seem to be some problems:</div><div  style="clear:both;">- We can't see a function's body before we import it.</div><div  style="clear:both;">- The call graph traversal would be repeated in each translation unit, which is very redundant.</div><div  style="clear:both;">- The same specialized version of a function may be created in several translation units, which would increase code size with redundant copies.</div><div  style="clear:both;"><br ></div><div  style="clear:both;">I have some possible solutions:</div><div  style="clear:both;">- We could extract the analysis part of the function specialization pass into its own analysis pass and use it to generate summary information. However,</div><div  style="clear:both;">the downside of this approach is that it may make summary generation take longer (it looks like summary generation isn't parallel).</div><div  style="clear:both;">- I haven't found a solution to the second problem. If we moved this work into summary generation, it would only make that step slower.</div><div  style="clear:both;">- For the third problem, my solution is to add a special marker to specialized functions so that we can eliminate the redundant copies at the end. However, this looks serial too,</div><div  style="clear:both;">and I don't know whether traversing and merging the functions would take too much time.</div><div  style="clear:both;"><br ></div><div  style="clear:both;">What do you think about this?</div><div  style="clear:both;"><br ></div><div  style="clear:both;">Thanks,</div><div  style="clear:both;"><span  style="font-size:14.0px;">Chuanqi</span><br ></div></div>