[PLUG] tracking/accounting for memory use

Tim Wescott tim at wescottdesign.com
Sat Dec 12 00:58:54 UTC 2009


Russell Senior wrote:
> I've got a problem tracking down where memory is disappearing to on an
> embedded linux platform.  I know basically about caches and buffers
> and such and have looked at /proc/meminfo and /proc/slabinfo and kind
> of understand about how slabs work.  However, I don't know what
> numbers in /proc/meminfo are supposed to add up to what in a way that
> is going to give me clues where the memory is disappearing to.
>
> A little more detail.  We've got a bunch of Netgear WGT634U devices
> running a customized OpenWrt scattered around town and we collect data
> from them every 5 minutes or so via SNMP.  We have Cacti graphs of
> memory utilization, e.g.:
>
>   https://personaltelco.net/graphs/graph.php?action=view&local_graph_id=253&rra_id=all
>
> We are running NoCatAuth (a captive portal system that uses Perl),
> OpenVPN, OLSRd and SNMPd.  The WGT634U has 32 meg of RAM.  After a
> reset, we usually have about 18 megabytes in free+cache+buffers.  Over
> time, that total tends to degrade down to about 14 meg (or less), at
> which time we become more susceptible to running out of memory during
> forking (e.g. the NoCatAuth software forks 10 processes for every
> authorization).  Typically, we see the failure in NoCat which causes
> it to die (breaking our node in the process), but sometimes other
> programs die instead.  Usually, the system stays running and we get
> alerted and we can log in and fix it, but with 30-50 of these and a
> 1/week failure rate, we have to fix a few every day, which is
> annoying.
>
> When I have looked, it did not *seem* that our userspace programs were
> growing fast enough to account for the degradation in
> free+cache+buffers, but maybe I was looking at the wrong thing.
> Busybox ps is only giving me VSZ and not RSS.  Can someone suggest a
> robust way of tracking memory usage?  I am particularly interested in
> figuring out and accounting for what userspace is using, and what the
> kernel is using, to see what exactly is growing so that I can make it
> stop!
>
> Grateful for any pointers.  Thanks!
>   
This isn't what you want to hear, but I would be concerned about an
18 meg vs. 14 meg difference between life and death.  Are you sure that
your problem isn't heap fragmentation, instead of lack of raw memory?
Heap memory is a notorious cause of problems in embedded systems, and a
notoriously popular resource among desktop programmers (which is one
reason why embedded programmers tend to sneer at desktop programmers).
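
One rough way to check whether fragmentation is in play at the kernel's
page-allocator level is to look at /proc/buddyinfo, which lists, per
zone, how many free blocks exist at each power-of-two size.  The sketch
below is only an illustration, not a diagnosis: the function names are
made up for this example, the 4 kB page size is an assumption about the
platform, and it says nothing about fragmentation inside a userspace
process's own heap.  It parses text copied from the device, so it can
run on another machine if Python is not available on the router.

```python
def parse_buddyinfo(text):
    """Return {(node, zone): [free block count per order]} from /proc/buddyinfo text."""
    zones = {}
    for line in text.splitlines():
        parts = line.split()
        # Expected shape: "Node 0, zone Normal c0 c1 c2 ..."
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = parts[1].rstrip(",")
        zones[(node, parts[3])] = [int(n) for n in parts[4:]]
    return zones


def free_kb_at_or_above(counts, min_order, page_kb=4):
    """Free memory (kB) held in blocks of at least 2**min_order pages.

    page_kb=4 is an assumption; check the real page size on the target.
    """
    return sum(
        count * (1 << order) * page_kb
        for order, count in enumerate(counts)
        if order >= min_order
    )


if __name__ == "__main__":
    try:
        with open("/proc/buddyinfo") as f:
            snapshot = f.read()
    except OSError:
        snapshot = ""
    for (node, zone), counts in parse_buddyinfo(snapshot).items():
        total = free_kb_at_or_above(counts, 0)
        large = free_kb_at_or_above(counts, 2)
        print("node %s zone %s: %d kB free, %d kB in blocks of 4+ pages"
              % (node, zone, total, large))
```

If the total free figure holds steady over time while the amount held in
larger blocks shrinks, that would point toward fragmentation rather than
a plain leak.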

If the kernel has different flavors of heap it may be that it's running 
out of one and not another, and you can shake the cereal down in the box 
a bit by reapportioning how it uses them.
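
As a starting point for the accounting asked about above, here is a
minimal sketch that pulls the free+cache+buffers total out of
/proc/meminfo and the resident set size out of /proc/<pid>/status (the
VmRSS line), which gets around Busybox ps only showing VSZ.  The
function names are hypothetical, and the sketch assumes the usual
"Name:   1234 kB" line layout; fields without a kB unit are skipped.
Note that summing VmRSS across processes counts shared pages more than
once, so treat the per-process numbers as a trend indicator rather than
an exact budget.

```python
def parse_kb_fields(text):
    """Parse 'Name:   1234 kB' lines into {name: value_in_kB}.

    Works on the text of /proc/meminfo and /proc/<pid>/status alike;
    lines that do not end in a kB value are ignored.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and len(parts) >= 2 and parts[0].isdigit() and parts[1] == "kB":
            fields[name.strip()] = int(parts[0])
    return fields


def free_cache_buffers_kb(meminfo):
    """The free+cache+buffers figure from the original question, in kB."""
    return (meminfo.get("MemFree", 0)
            + meminfo.get("Buffers", 0)
            + meminfo.get("Cached", 0))


def rss_kb(status_text):
    """Resident set size in kB from /proc/<pid>/status text, or None.

    Kernel threads have no VmRSS line, hence the None case.
    """
    return parse_kb_fields(status_text).get("VmRSS")


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            info = parse_kb_fields(f.read())
        print("free+cache+buffers: %d kB" % free_cache_buffers_kb(info))
        print("Slab: %s kB" % info.get("Slab", "n/a"))
    except OSError:
        print("no /proc/meminfo on this system")
```

Logging these numbers every few minutes alongside the existing SNMP data
would show whether the decline tracks a particular process's RSS, the
Slab line, or neither.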

-- 
Tim Wescott
Wescott Design Services
Voice: 503-631-7815
Cell:  503-349-8432
http://www.wescottdesign.com




