[LLVMdev] LiveVariables/LiveInterval on huge functions

Mon Apr 14 01:39:02 PDT 2008

Evan Cheng wrote:
> On Apr 13, 2008, at 1:28 PM, Török Edwin wrote:
>
>   
>> Hi,
>>
>> In PR2193 LiveVariables runs out of memory on a 512M limit, after
>> processing 11557 basicblocks.
>> VirtRegInfo has ~180000 entries with ~700 bytes each.
>> If I give it more memory (1.5G) it runs out of memory in LiveInterval.
>>     
>
> Some of the information kept by LiveVariables are somewhat redundant  
> and can be removed. I think there was talk of folding much of its  
> computation into LiveIntervalAnalysis. It's not clear whether that  
> would make a difference without careful analysis.
>
> It's not clear to me what to do about LiveInterval. It does keep a lot  
> of information but it's all essential.
>
> I suspect StrongPHIElimination may help the situation somewhat but  
> again I can't guess its impact.
>
>   

Another question to ask, is why that function became so large in the
first place [X86DAGToDAGISel::SelectCode(llvm::SDOperand)]
We have inline limits, don't we?

>> I don't see any easy solution to reduce memory usage, but should we
>> optimize such a huge function at once?
>> If the function is over some reasonable limit (5k basic-blocks?) we
>> could split it into smaller functions, and avoid
>> these problems.
>>     
>
> But what is the right limit? How do you estimate the limit given  
> different host configuration? 

I think having a limit is better than having none. It will prevent
memory usage to get out of control.
The memory usage of LiveVariables/LiveInterval looks like O(N^2) to me,
if we limit N, the memory usage can differ at most by a constant factor.
Having a function twice the size, won't suddenly require 4x the memory.
Determining a right limit, to fit exactly into a given amount of memory
is hard.
We could gather some statistics on medium and maximum basic-block count
in applications from llvm-test. Then choose a default value, and allow
the user to override via a command-line option.

For example once gcc once gave me a warning, something like: more than
15000 BBs, disabling gcse (IIRC, I can't find the source file that gave
me that warning anymore).
Disabling an optimization entirely doesn't seem the best choice to me, I
think it is better if we split it into smaller functions, and optimize
those.

But maybe the right solution is to avoid generating functions that
large, instead of splitting it afterwards.
> If you can make the decision earlier,  
> than it can default to the local register allocator path which doesn't  
> require LiveIntervalAnalysis.
>   

Yep, the local allocator works.
How much worse is the performance of generated code with the local
allocator vs. the default?

>
>   
>> I am writing a pass to do this split at llvm IR level. What do you  
>> think?
>>     
>
> This functionality sounds useful, but I am not sure it's the *right*  
> fix. We should definitely try harder to make codegen more scalable.  
> But it's not a trivial problem.

Good point. If we have the size-limiting pass, we could become lazy and
not care of scalability too much. So maybe it is better if I don't write
it ;)

Best regards,
--Edwin