TarTheft is an open-source PACS ('Picture Archiving and Communication System') project. Instead of creating a whole new system, we opted to patch the popular Subversion tool and make it work with a tape library.
TarTheft is supported by the Belgian Royal Institute for the Conservation and Restoration of Cultural Heritage (KIKIRPA). This is an academic institute with 900'000 hi-res black-and-white and color photos and 300'000 associated documents. The pictures include optical, X-ray, infrared and other images, and are mostly used by art-history researchers and restorers for their publications. The challenge for the IT crew of KIKIRPA was to build an easily accessible and secure digital archive hosting both the pictures and the associated study material.
WHAT DOES TARTHEFT DO
TARTHEFT is a solution for on-line archiving of large files (like pictures). The end result should be some type of PACS (picture archiving and communication system).
Tapes are still a preferred technology for long-term archiving because data cartridges can last for ±30 years without power-supply, BUT... tape storage is difficult to handle.
Disk-based archives, on the other hand, are easier to use but not efficient for archiving: disks demand power and maintenance (you have to replace failed drives). Furthermore, an archive can be very large (our picture archive is >33 TByte) and most of the data is seldom accessed.
The ideal solution is a combination of the two, called 'near-line media'. What you need is a large hard drive (I have a 2 TB partition), a tape drive (IBM here, WORM capable) and a tape changer (QualStar in my case). The disk is basically a cache where data sits before it is stored on a tape volume.
The problem is then to maintain the link between the hard drive and the tapes.
That's where tartheft comes in. Because tartheft uses a version-control system (Subversion) for the disk-based cache, it has several advantages:
use Subversion's properties to remember where a revision was backed up (i.e. "which tape it went to")
even if the file structure on the hard drive changes (e.g. people move a file), the version repository remembers the history of that file and the link to the correct tape remains.
as an extra, tartheft-subversion allows external (svn and svn+ssh) access to your archive.
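The tape bookkeeping described above can be sketched with Subversion's unversioned revision properties. This is only an illustration: the property name `tartheft:tape`, the tape label and the repository URL are assumptions, not names taken from tartheft itself. Changing a revision property also requires a repository whose pre-revprop-change hook allows it.

```python
import subprocess

# Hypothetical property name; tartheft's real property name is not stated here.
TAPE_PROP = "tartheft:tape"


def propset_cmd(repo_url, revision, tape_label):
    """Build the svn command that records which tape holds a revision."""
    return ["svn", "propset", "--revprop", "-r", str(revision),
            TAPE_PROP, tape_label, repo_url]


def propget_cmd(repo_url, revision):
    """Build the svn command that reads the tape label back."""
    return ["svn", "propget", "--revprop", "-r", str(revision),
            TAPE_PROP, repo_url]


def tape_for_revision(repo_url, revision):
    """Return the tape label of a revision, or None if none is recorded."""
    result = subprocess.run(propget_cmd(repo_url, revision),
                            capture_output=True, text=True)
    label = result.stdout.strip()
    return label if result.returncode == 0 and label else None
```

Because the property is attached to the revision rather than to a path, moving or renaming a file on disk does not disturb the recorded tape label.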
HOW DOES TARTHEFT DO IT
TARTHEFT has 4 sub-programs to juggle:
1/ servett & svnserve: svnserve here is a patched version of your svnserve program. It watches *update requests* and, if the requested revision has a TARTHEFT property, logs the request to a logfile (see config.h).
The program servett watches that logfile live (it acts as a daemon) and launches the script 'ttextract'; see the scripts under item (3) below. NOTE: all output goes to syslog.
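The logfile-watching idea can be sketched as follows. This is not servett's code: the log line format ('REVISION PATH'), the arguments passed to ttextract and the polling interval are all assumptions made for the example.

```python
import subprocess
import time


def read_new_lines(handle):
    """Return the complete lines appended to the logfile since the last call."""
    lines = []
    while True:
        position = handle.tell()
        line = handle.readline()
        if not line:
            break
        if not line.endswith("\n"):
            # Partial line still being written: rewind and retry later.
            handle.seek(position)
            break
        lines.append(line.rstrip("\n"))
    return lines


def parse_request(line):
    """Parse a hypothetical 'REVISION PATH' log line into (int, str)."""
    revision, _, path = line.partition(" ")
    return int(revision), path


def serve(log_path, script="ttextract", poll_seconds=1.0):
    """Poll the logfile forever and launch the script for each new request."""
    with open(log_path) as handle:
        while True:
            for line in read_new_lines(handle):
                revision, path = parse_request(line)
                # Argument order is an assumption, not ttextract's real interface.
                subprocess.run([script, str(revision), path], check=False)
            time.sleep(poll_seconds)
```

Polling keeps the sketch portable; a real daemon might prefer a file-notification mechanism and would report errors to syslog, as the note above says servett does.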
2/ looktt: looktt is like Subversion's 'svnlook'; it administers your tartheft repository.
looktt has two operating modes:
CHECK (-c) - gives an overview of all revisions in a repository and checks whether...
(1) the TARTHEFT revision property is set or not
(2) if not set, then lists all updated or added files
DUMP (-d) - the same as CHECK, but dumps all updated/added files to a directory
HELP (-h) - shows some help.
You can specify your repository path using the -r switch.
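The CHECK step can be approximated with the stock svnlook tool. The sketch below is not looktt's implementation; it only shows how the updated or added files of one revision could be picked out of `svnlook changed` output. The repository path and file names used with it are made up.

```python
def changed_cmd(repo_path, revision):
    """Build the svnlook command that lists the paths changed in a revision."""
    return ["svnlook", "changed", "-r", str(revision), repo_path]


def files_to_archive(svnlook_changed_output):
    """Return added or updated file paths from `svnlook changed` output."""
    paths = []
    for line in svnlook_changed_output.splitlines():
        if len(line) < 5:
            continue
        # Columns 1-2 hold the status flags; the path starts at column 5.
        status, path = line[:2], line[4:]
        # Keep content additions/updates, skip deletions, property-only
        # changes ("_U") and directories (paths ending in "/").
        if status[0] in ("A", "U") and not path.endswith("/"):
            paths.append(path)
    return paths
```

A revision whose TARTHEFT property is missing would be passed through `changed_cmd`, and the resulting file list is what a DUMP run would copy out.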
3/ scripts (see the scripts directory):
- ttextract: retrieves a TAR from file/tape
- ttassemble: puts revision dumps (from looktt -d) on tape and sets the revision property to mark success
WARNING: the scripts are not fully functional in tartheft v0.9
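The packing half of ttassemble can be illustrated with Python's tarfile module. This is a sketch under assumptions, not the script itself: the archive is written to an ordinary file, the directory layout (one dump directory per revision) is hypothetical, and copying the result to a tape volume is left out.

```python
import os
import tarfile


def write_revision_tar(dump_dir, archive_path):
    """Pack one revision dump directory into an uncompressed tar archive."""
    top_name = os.path.basename(os.path.normpath(dump_dir))
    with tarfile.open(archive_path, "w") as archive:
        # Store paths relative to the dump directory's own name.
        archive.add(dump_dir, arcname=top_name)
    return archive_path


def list_archive(archive_path):
    """Return the sorted names of the regular files stored in the archive."""
    with tarfile.open(archive_path, "r") as archive:
        return sorted(m.name for m in archive.getmembers() if m.isfile())
```

Only after the archive has been written and verified would the revision property be set, so that a failed run leaves the revision marked as not yet on tape.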
4/ ttthin: hacks the FSFS repository and removes all the heavy files from it if the tartheft property is set. This way the disk-based repository can be made thin again.
· Subversion 1.4.5
What's New in This Release:
· added chopped file extension support.
· more documentation and increased shard depth.