[cfe-dev] ld taking too much memory to link clang

Karen Shaeffer shaeffer at neuralscape.com
Wed Jan 23 19:13:28 PST 2013


On Wed, Jan 23, 2013 at 08:31:32AM +0000, David Chisnall wrote:
> On 23 Jan 2013, at 00:30, Karen Shaeffer wrote:
> 
> > I think your ideas are good. You might want to fully generalize the issue, and
> > monitor the actual system memory usage and availability in deciding when to
> > run the linker. You can do that in real-time easily:
> > 
> > cat /proc/meminfo
> > 
> > You can also monitor individual process memory usage through /proc/pid/...,
> > but the system stats are more appropriate here.
> 
> It's a bit more tricky than it first appears.  When linking LLVM, memory usage of ld grows fairly slowly, at about 10MB/s for me (some months ago, when I actually bothered to track it) because it's doing a load of work in between allocations.  If you start one linker process, by the time you check again, your build is only using 20MB, so you start another.  Check again, and you're using 60MB, so start a third.  Check again, now you're using 120MB, still space for a fourth.  Then you're at your -j4 limit, so you stop, but these all then grow to 1GB each and you're well over the 2GB that the VM has and you're almost out of swap.  
> 
> Ideally, the build system would watch the compiler or linker grow and kill some of the processes if the total memory usage grew too high, then restart them, but then the question is which one do you kill?  You can take the Linux OOM killer approach, and identify the process with the most unsaved data to kill, or possibly kill the largest one (although this is the one that will have made the most progress, so imposes the greatest time penalty on the build), or the newest one (which may only grow to 50MB).  This is a fairly active research area in warehouse-scale computing, as it's the same problem that you encounter with map-reduce / hadoop style jobs.
> 
> It's also not obvious what you should count towards the 'used' memory, as some linkers will mmap() all of the object code, which can then be swapped out for free but must go to disk again to swap it back in, slowing things down.  On the other hand, some of the accesses through mmap'd files are fairly linear, so swapping them out isn't a problem.
> 
> While I'd love to see this solved in a build system, doing it in a sensible way is far from trivial.
> 
> David

Hi David,
Thanks for your comments.

You talk about killing off processes and the consequences. But you are missing the
point: the build process should never drive a system to the point where it needs to
kill off processes not associated with the build. Build processes are not
mission-critical real-time processes, and monitoring memory provides a simple way to
realize the goal here. And mmap'd memory is not an issue at all, if you use a set of
policies that prevent the system from ever reaching the critical point of an
out-of-memory failure. Sure, the kernel may decide to reclaim such memory more
aggressively than other allocations, but it will do so without loss of data, as long
as the system is not out of memory.
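
To be concrete, here is a rough sketch, in Python, of the kind of monitoring I
have in mind. The helper name read_meminfo() and the kB parsing are my own,
purely for illustration:

def read_meminfo():
    """Return /proc/meminfo as a dict mapping field name -> value in kB."""
    info = {}
    with open('/proc/meminfo') as f:
        for line in f:
            name, value = line.split(':', 1)
            # Lines look like "MemFree:  123456 kB"; keep just the number.
            info[name] = int(value.split()[0])
    return info

if __name__ == '__main__':
    mem = read_meminfo()
    print('MemFree: %d kB, SwapFree: %d kB'
          % (mem['MemFree'], mem['SwapFree']))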

Some simple observations (a rough sketch implementing them follows the list):

1.) If I don't have enough memory on a system, then I would hope the build process
would self-terminate, with a log message to inform me of the memory shortage.

2.) If there is sufficient memory to proceed slowly, then I would hope the build
process would note the limited memory in its logs and run in a slow mode. If
things deteriorate, with an increasing risk of an out-of-memory crash, then I
would hope the build process would, as a last resort, self-terminate rather than
crash the computer. No processes should ever be killed except build processes.

3.) If there are sufficient resources, then I would hope the build process would
make every effort to use them to complete the build as quickly as possible. If
system memory deteriorates during the build, then I would hope the build process
could slow down to respect the resource limit, and possibly kill some of its own
build processes if needed. In no case should the build system force the kernel
to kill off other processes.

4.) What if the computer(s) are running other jobs, and those jobs need all the
resources while a build is in progress? The sensible policy is for the build
process to yield the resources, terminating the build entirely as a last resort.
Why? Because a build process is never real-time critical, while other processes
may have real-time constraints. The operator of the system(s) has made an error;
the build process should yield the resources as a courtesy, writing logs to
inform the operator of the circumstances. And if the other processes demand
resources faster than the build process can respond, then the whole system may
crash -- but it won't be due to the build process.
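
To make observations 1.) through 3.) concrete, here is a rough sketch of the
decision a build system could take before starting each new link job. Every
name and number in it is an assumption of mine (the 1 GB per-job peak, the
headroom floor, the headroom formula), not any real build system's logic:

def read_meminfo():
    # Same helper as in the earlier sketch.
    info = {}
    with open('/proc/meminfo') as f:
        for line in f:
            name, value = line.split(':', 1)
            info[name] = int(value.split()[0])
    return info

JOB_PEAK_KB = 1024 * 1024      # assumed peak demand of one link job (1 GB)
MIN_HEADROOM_KB = 256 * 1024   # below this floor, log and give up

def decide(jobs_running):
    """Return 'spawn', 'wait', or 'abort' before starting another job."""
    mem = read_meminfo()
    # One plausible estimate of the memory we can still claim safely:
    # free pages, easily reclaimed inactive pages, and unused swap.
    headroom = mem['MemFree'] + mem['Inactive'] + mem['SwapFree']
    if headroom < MIN_HEADROOM_KB:
        return 'abort'   # observation 1.): self-terminate, with a log entry
    if headroom < JOB_PEAK_KB * (jobs_running + 1):
        return 'wait'    # observation 2.): proceed slowly
    return 'spawn'       # observation 3.): use the available resources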

All that is needed is some knowledge of how much memory these systems require
to do the work they are asked to do. Surely those numbers are known, and if
they are not, it shouldn't be too hard to acquire them. The build process then
simply needs to enforce a reasonable policy on memory use. The system memory
stats in /proc/meminfo provide the following metrics, which should be
sufficient to implement such a policy:


MemTotal: Total usable RAM (i.e. physical RAM minus a few reserved bits and the
kernel binary code)

MemFree: The sum of LowFree + HighFree

SwapTotal: Total amount of swap space available

SwapFree: Amount of swap space currently unused (SwapTotal minus SwapFree is
the memory that has been evicted from RAM and is temporarily on disk)

SwapCached: Memory that was swapped out and has been swapped back in, but is
still also in the swap file (if memory is needed, it doesn't have to be written
out again because it is already in the swap file; this saves I/O)

Active: Memory that has been used more recently and is usually not reclaimed
unless absolutely necessary

Inactive: Memory which has been less recently used; it is more eligible to be
reclaimed for other purposes

I believe those few metrics can provide all the information the build process
needs to implement a sane policy.
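
For instance, the "deteriorating memory" case from observations 2.) and 3.)
could be detected by sampling those counters over time. Again, just a sketch;
the interval, the sample count, and the headroom estimate are assumptions of
mine:

import time

def read_meminfo():
    # Same helper as in the earlier sketches.
    info = {}
    with open('/proc/meminfo') as f:
        for line in f:
            name, value = line.split(':', 1)
            info[name] = int(value.split()[0])
    return info

def headroom_kb():
    # Free pages, plus Inactive pages ("more eligible to be reclaimed"),
    # plus unused swap -- one plausible estimate, not the only one.
    mem = read_meminfo()
    return mem['MemFree'] + mem['Inactive'] + mem['SwapFree']

def watch(interval=5.0, samples=3):
    """Yield True whenever headroom has shrunk across `samples` checks."""
    history = []
    while True:
        history.append(headroom_kb())
        history = history[-samples:]
        yield (len(history) == samples and
               all(b < a for a, b in zip(history, history[1:])))
        time.sleep(interval)

A build driver could poll watch() between job launches and shift into the slow
mode, or abort, when it reports a downward trend.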

Thanks for your comments. Enjoyed reading them.
Karen
-- 
Karen Shaeffer
Neuralscape, Mountain View, CA 94040


