Details
Summary: gmond memory leak when deaf = no & no other gmond instanc...
Bug#: 327
Product: Ganglia Monitoring System
Component: gmond
Status: NEW
Resolution:
Hardware: PC
OS: Linux
Version: trunk
Priority: P5
Severity: major

People
Reporter: Aaron Nichols <anichols@trumped.org>
Assigned To: Sourceforge Bugzilla <ganglia-bugzilla@lists.sourceforge.net>

Attachments
Gmond memory utilization, before and after the change (26.37 KB, image/png)
2012-03-29 18:10, Vladimir Vuksan



Description:   Opened: 2012-03-29 12:48
Configuration:
- gmond 3.3.1 built from source into RPM using specfile provided with source
- OS: CentOS 5
- Using unicast with all nodes in a cluster configured with udp_send_channel to
send to designated aggregation nodes
- All nodes have deaf = no & both a tcp_accept_channel & udp_recv_channel
configured for simplicity
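
For illustration, a gmond.conf fragment matching the setup described above might look like the following. This is a minimal sketch, not the reporter's actual file: the host name is hypothetical, port 8649 is assumed because it is the usual Ganglia default, and mute = no is assumed since the report only states the deaf setting.

```
/* Sketch of a non-aggregator node as described above (values are assumptions). */
globals {
  deaf = no    /* node listens, as stated in the report */
  mute = no    /* assumed; the report does not state it */
}

/* Unicast: send metrics to a designated aggregation node. */
udp_send_channel {
  host = aggregator.example.com   /* hypothetical host name */
  port = 8649                     /* assumed port */
}

/* Receive channels configured on every node, although nothing sends to them. */
udp_recv_channel {
  port = 8649
}

tcp_accept_channel {
  port = 8649
}
```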

Problem:
When configured as above, those nodes which have deaf = no leak memory at a
pretty significant rate, reaching hundreds of megs within 12 hours. The nodes
which are aggregators, which are the destination of the udp_send_channel
configured on all nodes, do not leak memory. 

So it appears that if you have a gmond node which is not receiving metrics
through a configured rx channel it will leak memory. 

Modifying the above configuration by removing the tcp_accept_channel &
udp_recv_channel and setting mute = yes will cause the leak to stop.
------- Comment #1 From Vladimir Vuksan 2012-03-29 18:09:33 -------
Per a comment from Kostas Georgiou on IRC, I changed gmond.c from
"if ((now - udp_last_heard) > 60 * APR_USEC_PER_SEC)" to 60000, and the
problem "goes away".

I am attaching an image that shows the memory consumption originally and with
the above change.
------- Comment #2 From Vladimir Vuksan 2012-03-29 18:10:41 -------
Created an attachment (id=277)
Gmond memory utilization, before and after the change
------- Comment #3 From Martin Knoblauch 2012-03-30 03:31:17 -------
Just curious. You write "Modifying the above configuration by removing the
tcp_accept_channel & udp_recv_channel and setting mute = yes will cause the
leak to stop."

But if you set "mute=yes", the gmond will not send any data any more. Did you
mean "deaf=yes"?
------- Comment #4 From Aaron Nichols 2012-03-30 05:33:52 -------
Good catch - yes, I meant that you must set deaf = yes.
------- Comment #5 From Kostas Georgiou 2012-03-30 17:26:31 -------
Current attempt at a fix is at git://github.com/georgiou/monitor-core.git in the
fixes/bz327 branch.

The first commit only resets the channels once every 60 secs instead of every
cycle reducing the effects of the leak.

The second commit fixes a leak in join_mcast and should also be safe.

The third commit fixes the main leak but it will need a second pair of eyes and
some testing to make sure it doesn't break anything.
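
The idea behind the first commit (reset the channels at most once every 60 seconds rather than on every cycle) can be sketched roughly as below. This is not the code from the fixes/bz327 branch: the struct, the function name and the constants are hypothetical stand-ins, the real reset call is replaced by a counter, and it is an assumption that the timestamp is restarted after each reset.

```c
#include <stdint.h>

/* Stand-in for APR's APR_USEC_PER_SEC so the sketch builds without APR. */
#define SKETCH_USEC_PER_SEC INT64_C(1000000)
#define SKETCH_RESET_INTERVAL (60 * SKETCH_USEC_PER_SEC)

/* Hypothetical state: when we last heard a packet (or last reset),
 * and how many resets have been performed so far. */
typedef struct {
    int64_t last_heard;  /* microseconds */
    int resets;
} sketch_state;

/* Called once per poll cycle in which nothing was received.
 * Performs the (placeholder) reset only if more than 60 seconds have
 * passed since last_heard, then restarts the window so the next
 * cycles do not reset again. Returns 1 if a reset happened, else 0. */
static int sketch_maybe_reset(sketch_state *s, int64_t now)
{
    if ((now - s->last_heard) > SKETCH_RESET_INTERVAL) {
        s->resets++;          /* placeholder for the real channel reset */
        s->last_heard = now;  /* assumed: restart the 60-second window */
        return 1;
    }
    return 0;
}
```

Without the line that restarts the window, the condition would stay true on every later cycle once 60 seconds of silence had passed, which is the every-cycle behaviour the first commit is described as avoiding.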
------- Comment #6 From Carlo Marcelo Arenas Belon 2012-04-04 09:15:57 -------
In review, to be merged after 3.3.5 is released:

  https://github.com/ganglia/monitor-core/pull/30
