Norbert Hartl

mostly brackets and pipes

System Monitoring for Pharo Images

In our daily programming tasks we all know that having tests is a great way of keeping our software under control. Tests are something we execute regularly to check that we didn’t break any of our previous assumptions. If we are keen on tests we even aim for 100% code coverage, meaning that every available code path is executed at least once.

Deploying the software adds further precautions to take. Apart from running without errors, the software also has to cope with the resources that are available, and resources sometimes run short. Tests that pass don’t tell us whether the software wastes resources in a way that prevents it from running for a long time. The software can leak memory, which goes unnoticed while we execute isolated single test cases. Only running the software for a longer time will reveal this problem.

Even if there isn’t a significant leak, the software can run short on resources when it is used more heavily. There is a point where we need to think about scaling the deployment to provide the resources needed. Regardless of where the shortage comes from, we need a way to spot an immediate problem or to see a trend that predicts a shortage within a foreseeable time frame.

This is where system monitoring comes into play. System monitoring is the task of continuously inspecting the resources of an environment. These inspections are combined with thresholds that we consider reasonable for the environment. Whenever a threshold is exceeded, the inspecting software can send us an alert so that we have a look at the software before it breaks. The inspected values can also be graphed so we can see trends in resource usage over time.

I’ve implemented a utility that helps with monitoring a pharo image. It is called Monitoring. The source is available on smalltalkhub.

Install a fresh pharo image (pharo 4.0 at the moment)

$ curl get.pharo.org | bash

Open the image using

$ ./pharo-ui Pharo.image

Open a playground and load the project with the following expression

Gofer it
    smalltalkhubUser: 'NorbertHartl' project: 'Monitoring';
    configurationOf: 'Monitoring';
    loadStable.   

The following text describes some ways to enable monitoring of a pharo system.

munin

Munin is a tool that can produce graphs like this one.

a memory graph produced by munin

It inspects the image every 5 minutes and stores the collected values in an RRDtool database. From this data it creates overview graphs by day, week, month and year. This enables us to see the progression of resource usage over a whole year, which makes it quite easy to project something into the future.

If you’ve installed the project with the snippet above you should now execute the following in the playground (formerly known as workspace).

MonitorMuninExampleServer image

It starts a zinc server on port 5000 and adds everything needed to monitor vm memory and the garbage collector.

You can request it using curl and the following command line

$ curl 'http://127.0.0.1:5000/memory?output=munin-values'

and you should see something like this (the values will be different of course)

oldspace.value 163980984
youngspace.value 1336248
tenures.value 165317588
free.value 5123340
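
The raw byte counts are a bit hard to read on the command line. As a minimal sketch, assuming the `name.value number` format shown above, the output can be piped through a small filter that prints mebibytes instead. The helper name `to_mib` is made up for this illustration and is not part of the Monitoring project.

```shell
#!/bin/sh
# to_mib: hypothetical helper, not part of the Monitoring project.
# Reads "field.value <bytes>" lines on stdin and prints each field
# with its size in MiB, rounded to two decimals.
to_mib() {
    awk '{
        name = $1
        sub(/\.value$/, "", name)     # strip the ".value" suffix
        printf "%s %.2f MiB\n", name, $2 / 1048576
    }'
}

# Example using two of the sample lines from above (no running image needed).
printf '%s\n' 'oldspace.value 163980984' 'free.value 5123340' | to_mib
```

With the server running, the same filter could be attached to the curl call, for example `curl -s 'http://127.0.0.1:5000/memory?output=munin-values' | to_mib`.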

A munin graph needs to be configured. We can request that using the command

$ curl 'http://127.0.0.1:5000/memory?output=munin-config'

You should see this

graph_title VM memory
graph_category test-server
graph_vlabel bytes
graph_args --base 1000
memory.label Memory
memory.warning 60000000
memory.critical 80000000
memory.type GAUGE
oldspace.label Oldspace
oldspace.type GAUGE
youngspace.label Youngspace
youngspace.type GAUGE
tenures.label Tenures
tenures.type GAUGE
free.label Free
free.type GAUGE

The configuration of the graph is read by munin at startup. It configures the graph and then every 5 minutes it requests the value and is able to plot it.
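
To read such a configuration it helps to know that lines starting with `graph_` describe the graph as a whole, while lines of the form `field.attribute` describe a single data series. The following is a rough sketch that extracts the declared series names by looking for `.label` lines. The function name `list_fields` is made up for illustration, and the approach assumes every series has a label line, as in the output above.

```shell
#!/bin/sh
# list_fields: hypothetical helper for illustration only.
# Reads a munin graph configuration on stdin and prints the name of
# every data series that declares a label ("<name>.label <text>").
list_fields() {
    sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)\.label .*/\1/p'
}

# Example using a few lines of the configuration shown above.
printf '%s\n' \
    'graph_title VM memory' \
    'oldspace.label Oldspace' \
    'oldspace.type GAUGE' \
    'free.label Free' | list_fields
```

The example prints one series name per line and skips both the `graph_` lines and the other attributes such as `type`.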

Now we need to install munin. On a Debian-based Linux execute

$ apt-get install munin

adding a web server

Installing a web server depends highly on which web server you like to use and which environment you are targeting. In-depth configuration of several web servers is beyond the scope of this post. I’ll take a shortcut here just to get you something to see. I assume a current installation of ubuntu (in my case a 14.10 distribution). I leave the fine-grained configuration as an exercise for the reader. You can find the munin documentation here

The default installation of munin includes a default configuration that is suitable for us. It assumes everything is local. If you need to adjust it you can find the configuration in /etc/munin/munin.conf.

This example uses nginx and assumes you don’t have it installed yet. Execute

$ apt-get install nginx

and add the following

    location /munin/static/ {
            alias /etc/munin/static/;
            expires modified +1w;
    }

    location /munin/ {
            alias /var/cache/munin/www/;
            expires modified +310s;
    }

in the server section of

/etc/nginx/sites-enabled/default

You will find a location statement there already. Just add the above on the same level as the existing statement. After you have saved the file, restart the nginx server with

$ service nginx restart

If everything went well you should be able to open the following URI using a web browser

http://localhost/munin/

and you should see something like this

a memory graph produced by munin

Connecting the parts

We have now downloaded a pharo image, installed the Monitoring project, started a zinc server in pharo, installed munin, and installed and configured a web server. Now the only thing left to do is to add our memory graph to munin.

Create a file /etc/munin/plugin-conf.d/pharo-monitoring with the following content

[test-server_*]
env.monitorUrl http://127.0.0.1:5000/

This tells munin that every plugin that has a prefix of test-server_ will get an environment containing a variable monitorUrl pointing to our pharo image.

Next we need to create a munin plugin (a shell script that translates the way munin calls its plugins to our zinc server). I’ve made a git repository to ease that task. Install it using

$ cd /opt
$ git clone https://github.com/noha/pharo-scripts 

Activate the plugin using

$ cd /etc/munin/plugins
$ ln -s /opt/pharo-scripts/monitoring/munin-plugin test-server_memory

We created the plugin with the prefix test-server_ so munin will assign it the monitorUrl we configured above. The script takes the suffix after the underscore to construct the complete URI.
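
To make the idea concrete, here is a minimal sketch of how such a wrapper could derive the URI. It is not the actual script from the pharo-scripts repository. The function name `build_url` is made up, and the sketch assumes that munin passes `config` as the first argument when it wants the graph configuration and nothing when it wants values.

```shell
#!/bin/sh
# Sketch of the URL construction in a munin wrapper plugin.
# NOT the actual script from the pharo-scripts repository.

# build_url BASE PLUGIN_NAME [MODE]
# Takes the part after the last underscore of the plugin (symlink) name
# as the graph name and picks the output format from munin's argument.
build_url() {
    base=${1%/}                  # drop a trailing slash from the base URL
    graph=${2##*_}               # test-server_memory -> memory
    if [ "$3" = "config" ]; then
        output=munin-config
    else
        output=munin-values
    fi
    printf '%s/%s?output=%s\n' "$base" "$graph" "$output"
}

# Print the URLs that would be requested in each mode.
build_url 'http://127.0.0.1:5000/' test-server_memory
build_url 'http://127.0.0.1:5000/' test-server_memory config
```

In a real plugin the last step would be to fetch that URL, for example with `curl -fsS "$(build_url "$monitorUrl" "$(basename "$0")" "$1")"`, where `monitorUrl` is the environment variable configured above.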

You can test it by issuing

$ munin-run test-server_memory

that should show the values we saw above and

$ munin-run test-server_memory config

that should show the configuration of the graph from above. These two invocations are exactly how munin will call the script, so they are a great way to debug problems such as permissions.

If everything went well we can restart munin

$ service munin-node restart

and after a while the new plugin should appear as you can see here.

the plugin is there

Clicking on the test-server link shows us

more to come

and you can see on the right side of the left graph that a few lines are starting to build up. Now we just have to wait for the data to fill the graph.

Conclusion

I might say this is an easy way to start monitoring pharo images. But as this post shows there are quite a few things to set up in order to make it work. These are all basic building blocks like installing and configuring a web server and doing system administration tasks. I assume that if you are interested in system monitoring you know most of this already and can easily translate everything written here into your own environment.

This post took a quick and rough shortcut through the components that need to play together. I tried every step but I’m sure there are a lot of things that can go wrong, like permission problems. If you have problems don’t hesitate to bug me about it.

If you got this far the future of monitoring is bright. The next task, monitoring your garbage collector, is as simple as

$ cd /etc/munin/plugins
$ ln -s /opt/pharo-scripts/monitoring/munin-plugin test-server_gc
$ service munin-node restart

and wait. The graph will appear automatically. You are now prepared to orchestrate your whole server farm. With munin it is quite easy to collect graphs from a lot of hosts and combine them in a single web page. The example server also sets some thresholds for warning and critical. Just waste a bit of memory and see the label in the graph turn yellow when the warning threshold is exceeded and red when the critical threshold is exceeded. You can configure munin to send e.g. an email if the thresholds are exceeded.
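
The threshold logic can be pictured with a simplified sketch. It treats warning and critical as plain upper limits, which is how the example configuration uses them, and it is only an illustration of the idea, not munin's own implementation. The function name `classify` is made up.

```shell
#!/bin/sh
# classify VALUE WARNING CRITICAL
# Simplified illustration of a threshold check with plain upper limits.
# This is not how munin itself is implemented.
classify() {
    if [ "$1" -gt "$3" ]; then
        echo critical
    elif [ "$1" -gt "$2" ]; then
        echo warning
    else
        echo ok
    fi
}

# Thresholds taken from the example configuration (memory.warning and
# memory.critical); the measured values are invented for the example.
classify 50000000 60000000 80000000
classify 70000000 60000000 80000000
classify 90000000 60000000 80000000
```

The three calls cover a value below both limits, one between them, and one above the critical limit.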

This should be enough for an introduction to system monitoring in pharo. In the following articles I will present more tools, dig into the code and explain how you can easily monitor any parameter of your system you like.

As always I like to have feedback about things being good and things being bad. Hope you enjoy system monitoring your pharo images.

Happy Monitoring!
