# TCP Stats Interface

A simple TCP management interface is available by default on port `8126`;
the port can be overridden in the configuration file. Inspired by the
memcache stats approach, it can be used to monitor a live statsd server.
You can interact with the management server by telnetting to port `8126`.
The following commands are available, depending on the running server.

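As a non-interactive alternative to telnet, a small shell helper can send one command and print the reply. This is a minimal sketch, assuming `nc` is installed and closes the connection once its input ends; `statsd_admin`, `STATSD_HOST`, and `STATSD_PORT` are illustrative names, not part of statsd.

```shell
# Host and port of the management interface. Both variable names are
# illustrative; the default port matches the documented 8126.
STATSD_HOST="${STATSD_HOST:-127.0.0.1}"
STATSD_PORT="${STATSD_PORT:-8126}"

# Send a single management command and print whatever the server replies.
# (Hypothetical helper, not shipped with statsd.)
statsd_admin() {
  printf '%s\n' "$*" | nc "$STATSD_HOST" "$STATSD_PORT"
}
```

For example, `statsd_admin health` would print the current health status, and `statsd_admin counters` would dump the current counters.
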
## Common commands

* health [up|down] - get or set the health status of statsd. On its own, health returns the current health status. Passing an argument sets the status to the new value. Accepted values are _up_ and _down_.
* config - a dump of the current configuration
* quit - close the connection from the server side

## Statsd specific commands

* stats - some stats about the running server
* counters - a dump of all the current counters
* gauges - a dump of all the current gauges
* timers - a dump of the current timers
* delcounters - delete a counter or folder of counters
* delgauges - delete a gauge or folder of gauges
* deltimers - delete a timer or folder of timers

The stats output currently gives you:

* uptime: the number of seconds elapsed since statsd started
* messages.last_msg_seen: the number of seconds elapsed since statsd last received a message
* messages.bad_lines_seen: the number of bad lines seen since startup

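To pull a single value out of the stats output in a script, a small filter is enough. The sketch below assumes each statistic arrives as a `name: value` line, which this document does not specify, so check it against the output of your own server; `stat_value` is a hypothetical helper name.

```shell
# Print the value of one statistic from stats output read on stdin.
# Assumes "name: value" lines; stat_value is a hypothetical helper.
stat_value() {
  awk -v key="$1" -F': ' '$1 == key { print $2; exit }'
}
```

It could be used as `echo "stats" | nc 127.0.0.1 8126 | stat_value messages.bad_lines_seen`.
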
You can use the del commands to delete an individual metric like this:

    # to delete counter sandbox.test.temporary
    echo "delcounters sandbox.test.temporary" | nc 127.0.0.1 8126

Or you can use the del commands to delete a folder of metrics like this:

    # to delete counters sandbox.test.*
    echo "delcounters sandbox.test.*" | nc 127.0.0.1 8126

Each backend will also publish a set of statistics, prefixed by its module name.

Graphite:

* graphite.last_flush: unix timestamp of last successful flush to graphite
* graphite.last_exception: unix timestamp of last exception thrown whilst flushing to graphite
* graphite.flush_length: the length of the string sent to graphite
* graphite.flush_time: the time it took to send the data to graphite

These statistics are also sent to graphite under the namespaces
`stats.statsd.graphiteStats.last_exception` and
`stats.statsd.graphiteStats.last_flush`.

A simple Nagios check in the `utils/` directory can be used to check metric
thresholds, for example the number of seconds since the last successful flush
to graphite.

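The threshold logic behind such a check can be sketched in a few lines of shell. This is an illustration only, not the check shipped in `utils/`: `check_flush_age` is a hypothetical function, and the exit codes follow the usual Nagios plugin convention (0 for OK, 2 for CRITICAL).

```shell
# Compare the age of the last successful flush against a threshold.
# Arguments: current unix time, graphite.last_flush value, max age in seconds.
# Hypothetical helper; returns 0 (OK) or 2 (CRITICAL) as Nagios plugins do.
check_flush_age() {
  now="$1"; last_flush="$2"; max_age="$3"
  age=$((now - last_flush))
  if [ "$age" -gt "$max_age" ]; then
    echo "CRITICAL: last flush to graphite was ${age}s ago"
    return 2
  fi
  echo "OK: last flush to graphite was ${age}s ago"
  return 0
}
```

It could be called as `check_flush_age "$(date +%s)" "$last_flush" 60`, where `last_flush` holds the graphite.last_flush value read from the stats output.
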
The health command:

* the health command alone shows the current health status.
* health up or health down changes the current health status.
* the healthStatus configuration option sets the default health status at startup.

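A monitoring script can turn the health reply into an exit status. The sketch below assumes the reply contains the word `up` as a separate field when statsd is healthy; the exact reply format is not specified here, so treat the pattern as an assumption. `statsd_is_up` is a hypothetical helper name.

```shell
# Succeed (exit status 0) if the health reply on stdin reports "up".
# Assumes "up" appears as a whitespace-separated word; hypothetical helper.
statsd_is_up() {
  awk '{ for (i = 1; i <= NF; i++) if ($i == "up") found = 1 }
       END { exit !found }'
}
```

It could be used as `echo "health" | nc 127.0.0.1 8126 | statsd_is_up && echo "statsd is healthy"`.
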
## Statsd Proxy specific commands

* status - the status of the current server

The __status__ output currently gives you:

* uptime: the number of seconds elapsed since statsd proxy started
* nodes: a space-separated list of host:port entries, one for each active node in the ring
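
To act on each node separately, the nodes list can be split into one host:port entry per line. The sketch assumes the status output carries a line of the form `nodes: host:port host:port ...`; that layout is an assumption, and `proxy_nodes` is a hypothetical helper name.

```shell
# Print each active node from status output (read on stdin), one per line.
# Assumes a "nodes: host:port host:port ..." line; hypothetical helper.
proxy_nodes() {
  awk '$1 == "nodes:" { for (i = 2; i <= NF; i++) print $i }'
}
```

It could be used as `echo "status" | nc 127.0.0.1 8126 | proxy_nodes`, assuming the proxy's management interface listens on the default port.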