Norbert Hartl

mostly brackets and pipes

Maintaining GemStone Disk Space

It is very easy to start development with GemStone. You can use the virtual appliance provided or you can do a manual installation. In both cases everything necessary is preconfigured, so you don’t have to care about the details of GemStone. And GemStone is a Smalltalk image and a database in one, which we find way cool while working with it. Persistence is just there, yeah!

Being a database

But this database thingy is not magic after all. It is down-to-earth technology working underneath, and you need to learn about it if you want to take GemStone maintenance more seriously. Users run into the nuts & bolts of “being a database” in different ways. Some notice files growing on the disk and eating up a lot of space. Others discover through GemTools that their image is growing all the time and don’t know why. I learned it the hard way, back when GemStone had a size limit on the community edition.

It just stopped working as soon as it reached the limit. Why is that? We said GemStone is a database. A database takes your data and writes it to disk (or something similar). A fault-tolerant database uses transaction logs to minimize the possibility of data corruption. A transaction log is a sequential stream of data changes that is written to disk first, before the database content itself is changed. So the data is written in two places. If an error occurs between the write of the transaction log and the write of the database content, this is not a problem. As soon as the database is restarted it detects that the transaction log has newer entries than the database’s last change. It can then replay all the transactions that are missing in the database. That’s why they are sometimes called replay logs.
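The log-first-then-replay idea can be sketched in a few lines of Python. This is a toy model only: the class `TinyStore`, its fields and the `crash_before_apply` switch are made up for illustration and say nothing about how GemStone lays out its tranlogs or extents.

```python
class TinyStore:
    """Toy key-value store with a transaction log (not GemStone's format)."""

    def __init__(self):
        self.log = []      # sequential list of (sequence_number, key, value)
        self.data = {}     # stands in for the database content
        self.applied = 0   # sequence number of the last change applied to data

    def write(self, key, value, crash_before_apply=False):
        seq = len(self.log) + 1
        self.log.append((seq, key, value))  # 1. write the log entry first
        if crash_before_apply:
            return                          # simulated failure between the two writes
        self.data[key] = value              # 2. then change the database content
        self.applied = seq

    def recover(self):
        """Replay every log entry newer than the last applied change."""
        replayed = 0
        for seq, key, value in self.log:
            if seq > self.applied:
                self.data[key] = value
                self.applied = seq
                replayed += 1
        return replayed


store = TinyStore()
store.write("a", 1)
store.write("b", 2, crash_before_apply=True)  # logged, but never reached the data
missing = store.recover()
print(missing, store.data)  # 1 {'a': 1, 'b': 2}
```

The second write only makes it into the log, and `recover()` brings the data back in line by replaying exactly that one entry.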

You can find the same thing in filesystems, where it is called a journaled filesystem. The journal is something like a transaction log. In a normal Glass installation you can find the database and tranlog files in

/opt/gemstone/product/seaside/data 

and it looks like:

-rw-r--r-- 1 gemstone gemstone  520093696 2011-06-03 14:55 extent0.dbf
-rw-r--r-- 1 gemstone gemstone  47648256 2011-06-03 14:55 tranlog2.dbf

Understanding growth

The watchful reader has already spotted that there is a double growth. First, the image grows, which is reflected in the size of the database file (usually extent0.dbf). The size of the database/image depends on the number of objects in the image. If new objects are created and persisted, the image/database file grows. Note: the database/image file will not shrink if objects are deleted. The space is freed within the database/image file, leaving the file on the filesystem at the same size. If new objects are created, the file will not grow until the freed space inside it has been filled.

Second, the transaction logs grow. They grow whenever data is changed. Tranlogs are written sequentially. Each tranlog grows until a size limit is reached; then a new tranlog file is created and used. The files are numbered sequentially: if tranlog2.dbf is active, the next file will be tranlog3.dbf. The maximum size of a tranlog file is configurable.
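To make the tranlog rotation concrete, here is a small simulation in Python. The function `simulate_tranlogs` is hypothetical, and the exact rule for when GemStone switches files is simplified here to "start a new file when the next write would exceed the limit", which is an assumption.

```python
def simulate_tranlogs(write_sizes_mb, limit_mb, first_number=2):
    """Return {filename: size_mb} after appending the writes in order."""
    number = first_number
    files = {f"tranlog{number}.dbf": 0}
    for size in write_sizes_mb:
        current = f"tranlog{number}.dbf"
        if files[current] + size > limit_mb:
            # Limit reached: continue in the next sequentially numbered file.
            number += 1
            current = f"tranlog{number}.dbf"
            files[current] = 0
        files[current] += size
    return files


logs = simulate_tranlogs([40, 40, 40, 40, 40], limit_mb=100)
print(logs)  # {'tranlog2.dbf': 80, 'tranlog3.dbf': 80, 'tranlog4.dbf': 40}
```

Nothing is ever removed in this model, which is the point: the total only goes up until somebody deletes the old files.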

Keeping the image from growing

In case you are using Glass I assume you are using Seaside. Seaside sessions are not small, and Seaside does not automatically clean up old sessions for you. You need to do it yourself. If your image is growing steadily, then this could be one reason why. But even after the Seaside sessions have been cleaned up, the image keeps growing. By cleaning Seaside sessions we just detached the objects from the active object graph; they still exist in the image. GemStone does not automatically run the garbage collector to collect these objects and free the space. Have a look at the script in

/opt/gemstone/product/seaside/bin/startMaintenance

This script runs forever and becomes active every 60 seconds. It then removes old Seaside sessions and after that runs a garbage collection. Well, it does a so-called “mark for collection”, which means it detects all detached objects and adds them to a list of “candidates to remove”. It can take a few minutes before GemStone actually starts to remove the objects. As the database/image file does not shrink, you won’t see any of this by watching the file.
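The “mark” step can be pictured as a walk over the object graph. The following Python sketch is a generic reachability pass and not GemStone’s actual implementation; the graph, the object names and the function `mark_for_collection` are invented for the example.

```python
def mark_for_collection(graph, roots):
    """Return the set of objects that are not reachable from the roots."""
    reachable = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in reachable:
            continue
        reachable.add(obj)
        stack.extend(graph.get(obj, ()))
    # Everything that was never visited is a candidate to remove.
    return set(graph) - reachable


graph = {
    "root": ["app"],
    "app": ["session2"],
    "session1": ["cart1"],  # expired session, already detached from the app
    "cart1": [],
    "session2": ["cart2"],
    "cart2": [],
}
candidates = mark_for_collection(graph, ["root"])
print(sorted(candidates))  # ['cart1', 'session1']
```

Note that the pass only produces the candidate list. Actually freeing the space is a separate step, which matches the delay described above.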

The best option might be the “File Size Report” in GemTools. Once you are logged in you can see a button “Admin…”. Pressing it opens a menu with an item “DoIt…”. Pressing that opens another menu which contains the “File Size Report”. It looks like this:

Extent #1
-----------   
   Filename = !TCP@localhost#dir:/opt/gemstone/log#dbf!/opt/application/myapp/data/extent0.dbf   
   File size =       496.00 Megabytes   
   Space available = 333.22 Megabytes

Here we can see the file is 496 MB big but more than 333 MB inside it are free. So your actual image size is around 163 MB. If the maintenance script is running and your own model is cleaned up as well, then you should see only little growth in the database/image file. Or better, the growth should correspond to the growth of your domain model.
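If you want to do this subtraction automatically, a few lines of Python are enough. This is a rough sketch that assumes the report text has been copied out of GemTools by hand and looks exactly like the sample above with a single extent; the function `used_megabytes` is hypothetical.

```python
import re


def used_megabytes(report):
    """Return file size minus free space for a single-extent report."""
    size = re.search(r"File size\s*=\s*([\d.]+)\s*Megabytes", report)
    free = re.search(r"Space available\s*=\s*([\d.]+)\s*Megabytes", report)
    if size is None or free is None:
        raise ValueError("report does not look like a File Size Report")
    return float(size.group(1)) - float(free.group(1))


report = """Extent #1
-----------
   Filename = !TCP@localhost#dir:/opt/gemstone/log#dbf!/opt/application/myapp/data/extent0.dbf
   File size =       496.00 Megabytes
   Space available = 333.22 Megabytes
"""
print(round(used_megabytes(report), 2))  # 162.78
```

Tracking this number over a few days tells you more about real growth than the size of extent0.dbf on disk.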

Maintaining tranlog files

The database/image is now stable in size but the tranlog files keep growing. Even if you just have a small website you can accumulate more than 1GB of tranlog files a day. If traffic rises or you have multiple stones with multiple sites running, it adds up pretty quickly to a noticeable amount of space. It might not be a problem. You can just let it run and clean up the space once a month or once a week. Disk space is cheap so you probably don’t need to care.

I need to. We have a dedicated server at a big hoster. The machine has 1TB of disk space but backup space is only 100GB and there is no option to increase it. We have a lot of domains, sites and email accounts. Doing a triple full backup with incremental backups in between fills the space pretty quickly. Reason enough to clean up unnecessary garbage. And old tranlogs are exactly that.

Minimizing disk waste

I wrote a script to help with maintaining tranlog files. The script is available in the stone-creator package in the bin/ directory. The script is called delete-old-tranlogs.sh

The script detects old tranlog files that aren’t needed anymore (by investigating the extent0.dbf file). It can be used to automatically remove those files. It has very few options:

usage: delete-old-tranlogs.sh -d [directory] -g [gemstonedir] -r
   -d [directory]   data directory of stone (containing extent0.dbf)
   -g [gemstonedir] directory of gemstone installation (default: /opt/gemstone/product)
   -r               really remove tranlogs. Without this switch they are only shown
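The selection logic of such a cleanup can be sketched in Python. This is not the script itself: how delete-old-tranlogs.sh reads the oldest still-needed tranlog out of extent0.dbf is not shown here, so the sketch takes that number as a plain parameter (`oldest_needed`, a hypothetical name). The dry-run default mirrors the behaviour without -r.

```python
import re
import tempfile
from pathlib import Path

TRANLOG_NAME = re.compile(r"^tranlog(\d+)\.dbf$")


def obsolete_tranlogs(data_dir, oldest_needed):
    """Return tranlog paths numbered below oldest_needed, sorted by number."""
    found = []
    for path in Path(data_dir).iterdir():
        match = TRANLOG_NAME.match(path.name)
        if match and int(match.group(1)) < oldest_needed:
            found.append((int(match.group(1)), path))
    return [path for _, path in sorted(found)]


def delete_old_tranlogs(data_dir, oldest_needed, really_remove=False):
    """List obsolete tranlogs; delete them only if really_remove is set."""
    candidates = obsolete_tranlogs(data_dir, oldest_needed)
    if really_remove:
        for path in candidates:
            path.unlink()
    return [path.name for path in candidates]


# Demo on a throwaway directory with empty stand-in files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("extent0.dbf", "tranlog1.dbf", "tranlog2.dbf", "tranlog3.dbf"):
        (Path(tmp) / name).touch()
    shown = delete_old_tranlogs(tmp, oldest_needed=3)  # dry run
    after_dry_run = sorted(p.name for p in Path(tmp).iterdir())
    removed = delete_old_tranlogs(tmp, oldest_needed=3, really_remove=True)
    remaining = sorted(p.name for p in Path(tmp).iterdir())

print(shown, remaining)
```

Making deletion opt-in is the safer design: a wrong guess about which tranlogs are still needed costs nothing as long as the files are only listed.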

This keeps old tranlogs from lying around unused. The default maximum size of a tranlog file is 1GB. To reduce the disk usage even further we can decrease the maximum size of the tranlogs. Looking at the file

/opt/gemstone/product/seaside/system.conf

there is a line

STN_TRAN_LOG_SIZES = 1000, 1000;

I just changed it to

STN_TRAN_LOG_SIZES = 100, 100;

which makes tranlog files of 100MB in size. The files are smaller and become obsolete much earlier. We can remove them with the script above on a regular basis. The script detects when there is nothing to do. I just start it early enough before the backup process runs.
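If you have several stones, you may prefer to script the change instead of editing each system.conf by hand. A minimal Python sketch, assuming the setting sits on a single uncommented line like the one shown above; the function `set_tranlog_sizes` is hypothetical and works on the text of the file, so reading and writing the file is left to you.

```python
import re

# Matches an uncommented "STN_TRAN_LOG_SIZES = ...;" line.
SETTING = re.compile(r"^[ \t]*STN_TRAN_LOG_SIZES[ \t]*=([^;\n]*);", re.MULTILINE)


def set_tranlog_sizes(conf_text, size_mb):
    """Replace every value in STN_TRAN_LOG_SIZES with size_mb, keeping the count."""

    def replace(match):
        count = len(match.group(1).split(","))
        return "STN_TRAN_LOG_SIZES = " + ", ".join([str(size_mb)] * count) + ";"

    updated, hits = SETTING.subn(replace, conf_text)
    if hits == 0:
        raise ValueError("no STN_TRAN_LOG_SIZES line found")
    return updated


conf = "# tranlog settings\n\nSTN_TRAN_LOG_SIZES = 1000, 1000;\nOTHER_SETTING = 1;\n"
new_conf = set_tranlog_sizes(conf, 100)
print(new_conf)
```

The sketch keeps the number of values in the list as it was and leaves every other line untouched, including comments.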
