Abel Bernabeu's blog: Backup automation with FlyBack

Making frequent backups of your business or institutional data is something everyone involved in IT sees as essential. However, everybody admits that we never do our backups as often as we would like.

Over the years I've learned a maxim: without automation there are no backups. The responsibility of a competent system admin is not to fill a schedule with reminders that a weekly backup must be done. The role of a system admin, instead, consists in having the backup task automated and acting as a supervisor who periodically checks that everything is working as it should.

What's more, the system admin should ideally be a delegate of the automation system: the element of the system that handles the most uncommon and difficult tasks.

Some time ago I used to write my own shell scripts to do my backups. While my scripts allowed enough customization to fit our needs perfectly, I felt they lacked two important virtues:

Being easy for any coworker to use.
Robustness.

Missing those virtues in my previous backup solution, I felt it was time to search for something better. If I were an Apple fanatic I would have scrapped my Linux server, bought an Apple one and limited my search to coping with the well-known Time Machine. But being the Linux guy that I am, I preferred to take a look at some open source alternatives.

There are plenty of open source backup solutions around. A very popular one is Amanda. Some notable features of Amanda are:

Being highly configurable.
Having a client/server architecture that scales well to institutions whose data is spread across many machines.
Having clients for almost any operating system (Windows included).

On the other hand, Amanda's configuration is not trivial. I must sadly say that Amanda is an oversized solution for people like me, who just want to back up a single Linux server (or any other Unix variant of your choice).

At this point I started looking for desktop-oriented solutions and found a really nice piece of Python software: FlyBack. According to its author, FlyBack is inspired by Apple's Time Machine, but the author is modest enough to warn that it still does not compare. IMO you should keep in mind that FlyBack is not the same as Apple's Time Machine, but rest assured it will be a decent counterpart in the future.

A notable capability of Time Machine is its ability to run seamlessly in the background. Time Machine periodically copies the latest changes made to the selected file set without the user ever noticing its activity. A Mac OS kernel subsystem allows monitoring for changes in a selected file set. This kernel subsystem is exposed to applications through the FSEvents framework.

The Linux kernel now has a similar subsystem that notifies applications about changes to monitored files: the inotify subsystem, exposed to applications through the inotify API in the C library.

However, as of today FlyBack does not make use of the inotify subsystem, and a slow per-file check for modifications is done whenever the program is invoked (consuming a noticeable amount of machine resources). FlyBack cannot run seamlessly in the background (as Gnome's Tracker does, for example), so an explicit invocation of the FlyBack program is needed to get a backup done. The user can of course schedule the invocation as a crontab task (indeed, this is easily accomplished through FlyBack's GUI).
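To illustrate why a per-file check gets expensive, here is a minimal Python sketch of the general idea. It is not FlyBack's actual code: the function name `files_changed_since` and the choice of comparing modification times are my own assumptions for illustration.

```python
import os
import tempfile
import time


def files_changed_since(root, last_backup_time):
    """Return the paths under root modified after last_backup_time.

    Hypothetical helper, not taken from FlyBack. Every file is stat()-ed
    on every run, so the cost grows with the size of the tree even when
    nothing has changed.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                # The file vanished between listing and stat; skip it.
                continue
            if mtime > last_backup_time:
                changed.append(path)
    return sorted(changed)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        old = os.path.join(root, "old.txt")
        new = os.path.join(root, "new.txt")
        for path in (old, new):
            with open(path, "w") as handle:
                handle.write("data\n")
        now = time.time()
        os.utime(old, (now - 3600, now - 3600))  # pretend it is an hour old
        # Only new.txt is newer than "one minute ago".
        print(files_changed_since(root, now - 60))
```

A notification mechanism such as inotify avoids this full walk, because the kernel reports the changed paths instead of the program having to look for them.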

If you are curious about how difficult it would be to add the “seamless background running” feature to FlyBack, I can give you a hint: inotify-tools would do a good job without a major rewrite of FlyBack's code.
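As a rough idea of what that hint could look like, the following Python sketch wraps the `inotifywait` program from inotify-tools and launches a backup command when files change. It is only a sketch under assumptions: the helper names, the rate limit and the `backup_command` argument are hypothetical, and I am not assuming any particular FlyBack command line, so you would pass in whatever command starts a backup on your system.

```python
import subprocess
import time


def inotifywait_command(watch_dir):
    """Build an inotifywait call that prints one changed path per line."""
    return [
        "inotifywait",
        "--monitor",      # keep watching instead of exiting after one event
        "--recursive",    # include subdirectories
        "--quiet",        # less verbose: no "watches established" notice
        "--event", "close_write",
        "--event", "create",
        "--event", "delete",
        "--event", "move",
        "--format", "%w%f",
        watch_dir,
    ]


def should_run_backup(last_run, now, min_interval):
    """Rate limit: allow a run if none happened yet or the interval passed."""
    return last_run is None or now - last_run >= min_interval


def watch_and_backup(watch_dir, backup_command, min_interval=300.0):
    """Run backup_command on changes, at most once per min_interval seconds.

    backup_command is a hypothetical argument list such as
    ["/path/to/your-backup-command"]; it is not a real FlyBack invocation.
    """
    watcher = subprocess.Popen(
        inotifywait_command(watch_dir), stdout=subprocess.PIPE, text=True
    )
    last_run = None
    try:
        for _changed_path in watcher.stdout:
            if should_run_backup(last_run, time.monotonic(), min_interval):
                subprocess.run(backup_command, check=False)
                last_run = time.monotonic()
    finally:
        watcher.terminate()
```

Note the simplification: changes that arrive inside the rate-limit window are only picked up by the next backup that runs, so a real integration would probably also schedule a delayed run instead of dropping those events.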

Another point where Apple's Time Machine still performs better is integration with the desktop's file browser. However, we can expect that once a popular background backup tool is available, a Nautilus plugin allowing per-file/directory version tracking will soon appear for Gnome (the same can be said for KDE, of course).

What about the client/server capabilities I pointed out in Amanda? Well, don't feel that by using FlyBack you are sacrificing the potential network scalability of your backup solution. A seamless client/server architecture can easily be set up by using ssh storage. As of today, the ssh storage feature is present in FlyBack's GUI while the program's internal logic remains unimplemented in the mainstream distribution. This should not be seen as a sign of incompleteness but as a taste of what the next versions will bring us, and a suggestive starting point for your contribution to the FlyBack community :)
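If you want a starting point for such a contribution, one common way to store snapshots on a remote machine is rsync over ssh. The Python sketch below only builds the command line. It is my own illustration and not how FlyBack implements (or will implement) ssh storage; the function name, host and paths are made up.

```python
import shlex


def rsync_over_ssh_command(source_dir, remote, dest_dir, previous_snapshot=None):
    """Build an rsync command that copies source_dir to remote:dest_dir via ssh.

    Hypothetical helper. If previous_snapshot is given, unchanged files are
    hard-linked against it with --link-dest, so each snapshot looks complete
    but only changed files take up new space on the remote side.
    """
    cmd = ["rsync", "--archive", "--rsh=ssh"]
    if previous_snapshot is not None:
        # A relative path here is resolved against the destination directory.
        cmd.append("--link-dest=" + previous_snapshot)
    # The trailing slash makes rsync copy the directory's contents.
    cmd.append(source_dir.rstrip("/") + "/")
    cmd.append(f"{remote}:{dest_dir}")
    return cmd


if __name__ == "__main__":
    # Made-up host and paths, for illustration only.
    command = rsync_over_ssh_command(
        "/home/user",
        "backup@backup.example.org",
        "/backups/2008-10-21",
        previous_snapshot="../2008-10-20",
    )
    print(shlex.join(command))
    # To actually run it: subprocess.run(command, check=True)
```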

Even if you are a pragmatic person who wants to evaluate software by its present capabilities and not by its potential ones, you should admit that customizing FlyBack to your own needs is a convenient middle point between using a really complex solution you don't really need and continuing with your ancient shell scripts. At least the tradeoff seems reasonable to me.


20081021
