DAR's Documentation

Presentation


This is the summary page for dar's documentation. The documentation is organized by point of view:
  1. I discovered dar but I don't know what it is or whether it would address my needs
  2. I want to use dar with as many features as I need
  3. I wonder how dar works
  4. I want to use libdar's API in my program
  5. Unclassified documentation




1 - Dar's general presentation

2 - Dar's command-line tools

Dar's package provides a set of six command-line tools, all (or nearly all) of which rely on a library (called libdar) that is also part of this package. To address your needs, the following documents are suggested, classified by logical order of use:

3 - Dar's internals

4 - Libdar and its API

Libdar has a well-documented interface that lets you access dar's features from your own program. It is written in C++, so you have to either use C or C++ or define your own bindings for the language of your choice (you are welcome to make such bindings public, on your web site for example; let me know so that I can add a link from dar's home page to your website).

5 - Miscellaneous information:



Hall of Fame


Who: Ralph Slooten
What: Uses dar to backup all the company's development and live websites for all customer websites, MySQL databases, and company documents, artwork, software etc.
How: Synchronizes files from several servers (Windows and Linux) to one central Linux backup server using rsync. The local backup files are then incrementally backed up and rotated every week (per server), and stored for 8 weeks.
When: Feb. 2008
Notes: Due to bandwidth limitations in New Zealand, using a synchronization tool like rsync is required to save bandwidth and get a copy of the altered files to a local Linux backup server. Here Ralph is able to keep months worth of compressed data for each server. Every week an offsite copy is created of all dar backups.

Who: Michael W.Cocke
What: Archived all of his data (in particular a huge web site) in a single, multi-slice dar backup. The result is the biggest archive that I have ever heard about: 1.4 Terabytes
How: using dar 2.0.x and two hundred (!) DVD-Rs
When: Feb. 2004
Notes: Mike has created Daromizer to parallelize CD/DVD burning and slice creation (see Related Projects). After extensive testing, he finally let the backup cover his whole server. He was a very lucky man: a few days later he got a disk crash (hardware failure), but was able to recover the entire 1.4 TB of data thanks to Dar. To quote him: "DVDs are a good deal faster (and less error prone) than the DAT tapes I used to use."

Who: François Monnet, CEO FMC Luxembourg
What: Uses DAR to make differential backup over a VPN joining Paris, Luxembourg and Brussels, in the insurance companies area.
How: using dar 1.3.0
When: May 2003
Notes: At the time of the version 2.1.0 release FMC Luxembourg is still satisfied and still uses DAR for his remote backup.



To compile dar from source you need at least:
  1. C++ compiler (tested with gcc-3, but other versions and compilers should work)
  2. make (tested with gnu make)

You may also have installed the following tools:

The consequences of building dar without these optional tools are the following:
  • If you lack the libz library, dar will compile but will not be able to compress or uncompress an archive using the gzip algorithm
  • If you lack the libbzip2 library, dar will compile but will not be able to compress or uncompress an archive using the bzip2 algorithm
  • If you lack openssl, dar will still compile but you will not be able to use strong encryption
  • If you lack Doxygen, dar will still compile but you will not have the reference documentation for libdar after calling make
  • If you lack upx, dar will still compile but the resulting binary will not be compressed after calling make install-strip
  • If you lack groff, dar will not generate man pages in html format

Dar is compiled from source code in the following way:

./configure [options...]
make
make install-strip

Important: due to a bug in the autoconf/libtool software used to build the configure script, the path where dar's sources are extracted must not contain spaces. You can install the dar binary anywhere you want; the problem does not concern dar itself but the ./configure script used to build dar: to work properly, it must not be run from a path that has a space in it.
Important too: by default the configure script sets optimization to -O2. Depending on the compiler, this may lead to problems in the resulting binary (or even in the compilation process). Before reporting a bug, first try to compile with less optimization:

CXXFLAGS=-O
export CXXFLAGS
make clean distclean
./configure [options...]
make
make install-strip

The configure script accepts several options, listed in the section "Available options for the configure script".

Note for packagers: the DESTDIR variable may be set at installation time to install dar in another directory. This makes the creation of dar binary packages very easy. Here is an example:

./configure --prefix=/usr [options...]
make
make DESTDIR=/some/where install-strip

This will install dar in /some/where/usr/{bin | lib | ...} directories. You can build a package from files under /some/where and install/remove the package at the root of your filesystem (thus here files will go in /usr/{bin | lib | ... }).




A binary package for Windows is provided here. If you need a binary package for your (Linux) distro, check the download area of your distro's site; it probably provides packages for dar (Red Hat, Debian, Ubuntu, SUSE, Gentoo, and probably many others provide dar binary packages).

There are several points to be aware of when using dar under Windows:

The binary package is a *.zip file (thus you need winzip to unpack it). It contains a subdirectory (named dar) that you have to extract wherever you want in your directory tree. Optionally, you can add the path to dar to the PATH variable in autoexec.bat. Assuming dar has been extracted under C:\dar, you can add the following line to autoexec.bat:

set PATH=%PATH%;C:\Dar

Then you have to reboot. (Just kidding! This was to respect the Windows usage and way of life ;-) ) Otherwise, if you don't set up the PATH variable, you need to specify the full path to dar's executables to use them from the Windows command-line prompt.

IMPORTANT NOTES !

Note that paths given to the dar suite's programs must follow the UNIX convention (use slashes "/", not backslashes "\"); thus you have to use /temp in place of \temp. Moreover, drive letters cannot be used the usual way, like c:\windows\system32. Instead you have to give the path /cygdrive/c/windows/system32. As you can see, the /cygdrive directory is a virtual directory that has all the drives as child directories:

X:\some\file  has to be written  /cygdrive/X/some/file


for example:

c:\dar_win-1.2.1\dar -c /cygdrive/f/tmp/toto -s 2G -z1 -R "/cygdrive/c/My Documents"

  ^             ^         ^   ^                     ^
  |             |         |   |                     |
 ---------------         ---------------------------
here use backslash         but here we use slash
as usually under           in arguments given to dar
windows to point
the command
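
The two path rules above (slashes instead of backslashes, and /cygdrive/X in place of a drive letter X:) can be applied mechanically. The following is a minimal sketch for a POSIX shell such as the one found in a Cygwin environment. The function name to_cygdrive is made up for this illustration and is not part of dar, and it only covers drive-letter and plain paths.

```shell
# to_cygdrive: hypothetical helper, not part of dar.
# Rewrites a Windows-style path into the form dar expects under Windows.
to_cygdrive() {
    # 1) turn every backslash into a slash
    # 2) turn a leading drive letter "X:" into "/cygdrive/X"
    printf '%s\n' "$1" | sed -e 's|\\|/|g' -e 's|^\([A-Za-z]\):|/cygdrive/\1|'
}

to_cygdrive 'c:\windows\system32'   # prints /cygdrive/c/windows/system32
```

With such a helper, the -R argument of the example above could be written as "$(to_cygdrive 'C:\My Documents')".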


Another point, reported by "HansS713" on Sourceforge, concerns network paths. The Windows notation \\host\file\path cannot be translated by simply replacing \ with /, because dar would then see an empty name as a directory, which is not a valid path. The workaround found by "HansS713" is to translate the path as follows:

\\host\file\path     becomes     \\\\host/file/path


Binary packages for MacOS

 Dave Vasilevsky provides binary packages for MacOS.




Available options for the configure script

Optimization option:

--enable-mode=32 or --enable-mode=64
If set, replace infinint by 32-bit or 64-bit integers. This makes the executable faster and less memory-hungry, but with several restrictions (for example on the ability to handle large files or large dates; see the limitations for more).

Deactivation options:

--disable-largefile
Whatever your system is, dar will not be able to handle files larger than 4GB
--disable-ea-support
Whatever your system is, dar will not be able to save or restore Extended Attributes (see the Notes paragraphs I and V)

--disable-nodump-flag
Whatever your system is, dar will not be able to take the nodump flag into account (through the --nodump option)
--disable-dar-static
dar_static binary (statically linked version of dar) will not be built
--disable-special-alloc
dar uses a special allocation scheme by default (it gathers the many small allocations into fewer big ones), which improves dar's execution speed. This option disables it.
--disable-upx
If upx is found in the PATH, binaries are upx-compressed at the installation step. Use this option to disable that when upx is available and you don't want compressed binaries.
--disable-gnugetopt
On non-GNU systems (Solaris, etc.) configure looks for libgnugetopt to provide long-option support through the GNU getopt_long() call. This can be disabled.
--disable-thread-safe
libdar may need POSIX mutexes to be thread safe. If you don't want libdar to rely on POSIX mutexes even if they are available, use this option. The resulting library may not be thread safe. But it will always be thread safe if you use --disable-special-alloc, and it will never be thread safe if --enable-test-memory is used.
--disable-libdl-linking
Ignore any libdl library and avoid linking with it
--disable-libz-linking
Disable linking to libz, thus -z option (gzip compression) will not be available
--disable-libbz2-linking
Disable linking to libbz2, thus -y option (bzip2 compression) will not be available
--disable-libcrypto-linking
Disable linking with openssl's libcrypto library. Strong encryption will not be available
--disable-build-html
Do not build API documentation reference with Doxygen (when it is available)


Troubleshooting option:

--enable-os-bits
If set, dar uses the given argument (32 or 64) to determine which integer type to use. This must match your CPU register size. By default dar uses the system <stdint.h> file to determine the correct integer type to use


Debugging options:

--enable-examples
If set, example programs based on infinint will also be built
--enable-debug
If set, use debug compilation option, and if possible statically link binaries
--enable-pedantic
If set, transmits the -pedantic option to the compiler
--enable-build-usage
If set, rebuild usage files (requires libxml2)
--enable-test-memory
If set, wrap memory allocation routines to track memory leaks (makes a very slow executable)
--enable-profiling
Enable executable profiling



Dar has several mailing-lists:

For support requests, there is the dar-support mailing-list (please read previous posts before asking for support).
For important news about security problems, new releases, etc. (fewer than 10 emails per year), there is the dar-news mailing-list.
For support or discussions about libdar and its API, there is the libdar-api mailing-list.
For any discussion about dar that does not fit the previous mailing-lists, there is the dar-discussion mailing-list.
During pre-release phases only: to participate in the stress testing, download the pre-release package and report problems to the pre-release mailing-list.

Important: I no longer answer support requests sent by email directly to me. The reason is simple: posting your request in a public area (like the dar-support mailing-list) makes it visible to anyone. Answers to your problem might concern other people, so a public forum is the best place for answers to reside as well. I do not have as much time as I wish to develop DAR (adding new features and porting to new systems), so keeping support public saves me a little time, since it spares me from repeating the same answers to the same questions.
Sharing must go in both directions.



Dar Release Process

Development Phase:
Dar receives new features during the development phase. At this stage, sources are modified and tested after each feature addition. The development sources are stored in a CVS repository at sourceforge, which you can access read-only.

Frozen API Phase:
No new feature that would change the API is added. The API shall be documented well enough to let API users give their feedback about its design and implementation. During this time, development continues on whatever is necessary as long as it does not change the API: documentation of the whole project, problem fixes in libdar, new features in the command-line part of the source, and so on.

Pre-release Phase:
Once the documentation and API are stable, the pre-release phase comes. This phase starts and ends with an email to the dar-news mailing-list. During this period, intensive testing is done on the pre-release source; feedback and information about new pre-release packages are exchanged through the pre-release mailing-list. This mailing-list lives only during the pre-release phases and is neither archived nor visible through a mail-to-news gateway. Of course, you are welcome to participate in the testing process and report to the pre-release mailing-list any problem you meet with a given pre-release package.

Release Phase:
Shortly after the pre-release phase has ended, a first package is released (the last number of the version is zero) and made available at sourceforge for download. This phase also begins with an email to the dar-news mailing-list. During that phase, users may report bugs or problems with the released software. Depending on the number of bugs found and their importance, a new release takes place that only fixes these bugs (no feature is added): the last number of the version is incremented by one and a new mail is sent to dar-news with the list of problems fixed by the new release. The release phase ends when a new release phase begins. Thus, during a release phase, a concurrent development phase takes place, then a frozen API phase, then a pre-release phase, but for a new major version (the first or the second number of the version changes).

Dar's Versions

package release version

Dar packages are released during the pre-release phase (see above). Each version is identified by three numbers separated by dots, for example version 2.3.0. The last number is incremented between releases that take place in the same release phase (only bugs have been fixed); the middle number is incremented at each pre-release phase. Finally, the first number is incremented when a major change in the software structure has taken place [version 2.0.0 saw dar's code split into one part related to the command line and the rest put in a library called libdar, which can be accessed through a well-defined API even by external software (like kdar for example). Version 2.0.0 also saw the introduction of the configure script and the use of the GNU tools autoconf, automake, libtool and gettext, and thus in particular the possibility of internationalization].

Note that release versioning is completely different from what is done for the Linux kernel: for dar, all versioned packages are stable released software, and thus stability increases with the last number of the version.

Libdar version

Unfortunately, the release version does not give much information about the compatibility of different libdar versions from the point of view of an external application, which has not been released with libdar and may thus be faced with different libdar versions. So libdar has its own version. It is also a three-number version (for example, the current libdar version is 3.1.2), but each number has a different meaning. The last number increases with a new version that only fixes bugs. The middle number increases when new features have been added while staying compatible with older libdar versions in the way older features are used. Finally, the first number changes when the API has been changed in a way that makes backward compatibility no longer possible for some features.
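
The numbering rule above suggests a simple check for an external application: a libdar version should be usable if its first number equals the one the application was written for and its middle number is at least as high. The following shell sketch encodes that reading of the rule. The function name libdar_compatible is hypothetical, and the check is an interpretation of the paragraph above, not something libdar provides.

```shell
# libdar_compatible: hypothetical helper, not part of dar or libdar.
# $1 = libdar version the application was written for (e.g. 3.0.0)
# $2 = libdar version actually installed               (e.g. 3.1.2)
# Succeeds when the first numbers match and the installed middle
# number is not lower than the required one.
libdar_compatible() {
    req_major=${1%%.*}          # 3.0.0 -> 3
    req_rest=${1#*.}            # 3.0.0 -> 0.0
    req_medium=${req_rest%%.*}  # 0.0   -> 0
    inst_major=${2%%.*}
    inst_rest=${2#*.}
    inst_medium=${inst_rest%%.*}
    [ "$inst_major" -eq "$req_major" ] && [ "$inst_medium" -ge "$req_medium" ]
}

if libdar_compatible 3.0.0 3.1.2; then
    echo "compatible"
else
    echo "not compatible"
fi
```

The last number is deliberately ignored, since it only reflects bug fixes.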

Other versions


Besides the libdar library, you can find five command-line applications: dar, dar_xform, dar_slave, dar_manager and dar_cp. All of these except dar have their own version, which is also made of three numbers. Their meaning is the same as for the package release version: the last number increases upon bug fixes, the middle one upon new features, the first upon major architecture changes.

Archive format version

When new features come, it is sometimes necessary to change the structure of the archive. To make it possible to know the format used in an archive, each archive contains a field that defines this format. Each dar version can thus read all archive formats, except of course that a particular version cannot guess the format of archives that were defined *after* that dar version was released. If you try to open a recent archive with an old dar version, you will get a warning that dar is probably not able to read the archive; dar will then ask you whether you want to proceed anyway. Of course, you can try to read it, but this is at your own risk. In particular, depending on the features used (see the Changelog to know which features required an upgrade of the archive format), you may succeed in reading a recent archive with an old dar version and get neither error nor warning, but this does not mean that dar did all that was necessary to restore the files properly. It is therefore advised to avoid using an archive with a version of dar that is too old to handle it properly (reserve this possibility for cases of necessity) and rather to upgrade dar as necessary so that the warning does not appear.

Cross reference matrix

OK, you may now find that this is a bit complex, so a list of versions is given below. Just remember that there are two points of view: the command-line user and the external application developer.

Date              Package and dar  Archive format  libdar  dar_xform  dar_slave  dar_manager  dar_cp
April 2nd, 2002   1.0.0            01              -----   -----      -----      -----        -----
April 24th, 2002  1.0.1            01              -----   -----      -----      -----        -----
May 8th, 2002     1.0.2            01              -----   -----      -----      -----        -----
May 27th, 2002    1.0.3            01              -----   -----      -----      -----        -----
June 26th, 2002   1.1.0            02              -----   1.0.0      1.0.0      -----        -----
Nov. 4th, 2002    1.2.0            03              -----   1.1.0      1.1.0      1.0.0        -----
Jan. 10th, 2003   1.2.1            03              -----   1.1.0      1.1.0      1.0.0        -----
May 19th, 2003    1.3.0            03              -----   1.1.0      1.1.0      1.1.0        -----
Nov. 2nd, 2003    2.0.0            03              1.0.0   1.1.0      1.1.0      1.2.0        1.0.0
Nov. 21st, 2003   2.0.1            03              1.0.1   1.1.0      1.1.0      1.2.0        1.0.0
Dec. 7th, 2003    2.0.2            03              1.0.2   1.1.0      1.1.0      1.2.0        1.0.0
Dec. 14th, 2003   2.0.3            03              1.0.2   1.1.0      1.1.0      1.2.1        1.0.0
Jan. 3rd, 2004    2.0.4            03              1.0.2   1.1.0      1.1.0      1.2.1        1.0.0
Feb. 8th, 2004    2.1.0            03              2.0.0   1.2.0      1.2.0      1.2.1        1.0.0
March 5th, 2004   2.1.1            03              2.0.1   1.2.1      1.2.1      1.2.2        1.0.0
March 12th, 2004  2.1.2            03              2.0.2   1.2.1      1.2.1      1.2.2        1.0.0
May 6th, 2004     2.1.3            03              2.0.3   1.2.1      1.2.1      1.2.2        1.0.1
July 13th, 2004   2.1.4            03              2.0.4   1.2.1      1.2.1      1.2.2        1.0.1
Sept. 12th, 2004  2.1.5            03              2.0.5   1.2.1      1.2.1      1.2.2        1.0.1
Jan. 29th, 2005   2.1.6            03              2.0.5   1.2.1      1.2.1      1.2.2        1.0.1
Jan. 30th, 2005   2.2.0            04              3.0.0   1.3.0      1.3.0      1.3.0        1.0.1
Feb. 20th, 2005   2.2.1            04              3.0.1   1.3.1      1.3.1      1.3.1        1.0.1
May 12th, 2005    2.2.2            04              3.0.2   1.3.1      1.3.1      1.3.1        1.0.2
Sept. 13th, 2005  2.2.3            04              3.1.0   1.3.1      1.3.1      1.3.1        1.0.2
Nov. 5th, 2005    2.2.4            04              3.1.1   1.3.1      1.3.1      1.3.1        1.0.2
Dec. 6th, 2005    2.2.5            04              3.1.2   1.3.1      1.3.1      1.3.1        1.0.2
Jan. 19th, 2006   2.2.6            04              3.1.3   1.3.1      1.3.1      1.3.1        1.0.3
Feb. 24th, 2006   2.2.7            04              3.1.4   1.3.1      1.3.1      1.3.1        1.0.3
Feb. 24th, 2006   2.3.0            05              4.0.0   1.4.0      1.3.2      1.4.0        1.1.0
June 26th, 2006   2.3.1            05              4.0.1   1.4.0      1.3.2      1.4.0        1.1.0
Oct. 30th, 2006   2.3.2            05              4.0.2   1.4.0      1.3.2      1.4.0        1.1.0
Feb. 24th, 2007   2.3.3            05              4.1.0   1.4.0      1.3.2      1.4.1        1.2.0
June 30th, 2007   2.3.4            06              4.3.0   1.4.0      1.3.2      1.4.1        1.2.0
Aug. 28th, 2007   2.3.5            06              4.4.0   1.4.1      1.3.3      1.4.2        1.2.1
Sept. 29th, 2007  2.3.6            06              4.4.1   1.4.1      1.3.3      1.4.2        1.2.1
Feb. 10th, 2008   2.3.7            06              4.4.2   1.4.2      1.3.4      1.4.3        1.2.2
June 20th, 2008   2.3.8            07              4.4.3   1.4.2      1.3.4      1.4.3        1.2.2



Reporting a Bug:

First check in the dar-support mailing-list that your problem has not already been reported there (and maybe even solved). It sometimes happens that what seems to be a user problem is in fact a bug (the opposite is much more often true). Anyway, if you conclude that you have found a bug, please use the Bug Tracker. Note that you need to register at Sourceforge to be able to open a bug (subscription to Sourceforge is free and won't spam your email; I have used it since 2002 and never got spam on the address I use only for Sourceforge). Giving a real email address is beneficial, because you'll get a notification when the status of the bug changes (for example when it has been resolved).

You can also consult old bug descriptions (what was in place before the Sourceforge bug tracker was used).

Asking for a new feature:

Please use the New Feature tracker. Same remark as above about Sourceforge.

Submitting a patch:

Please use the Patch Tracker at Sourceforge. Same remark as above about Sourceforge.




Known Projects that use dar or libdar:
  • kdar is a KDE Graphical User Interface to dar made by Johnathan Burchill
  • Daromizer by Michael W.Cocke manages dar and growisofs for parallelizing slice creation and CD/DVD burning, making huge archives faster to complete.
  • SaraB: Schedule And Rotate Automatic Backups - by Tristan Rhodes. SaraB works with DAR to schedule and rotate backups. Supports the Towers of Hanoi, Grandfather-Father-Son, or any custom backup rotation strategy.
  • bzSaraB is a merge from SaraB and bzbackup
  • Lazy Backup by Daniel Johnson. Lazy Backup is intended to be so easy even lazy people will do their backups
  • A Dar plugin has been made by Guus Jansman for Midnight commander (mc)
  • HUbackup : Home User backup (Thanks to Tristan Rhodes for informing about this project)
  • Baras by Aaron D. Marasco is a rewrite of SaraB in Perl.
  • Python Bindings by Wesley Leggette
  • Disk archive interface for Emacs by Stefan Reichör
If I have forgotten or am not aware of your project you are welcome to contact me (see the AUTHOR document in the source package).


Related Sites




Downloading


Source packages and binary packages for Windows are signed with the author's GPG key. You can find the signatures for released packages on the home page and on its mirror site. Binary packages for your distro have to be fetched from your distro's site, which will probably provide its own electronic signatures.

Source packages and windows binary can be found here.




Concurrent Versions System (CVS)

Presentation

Instead of having source code copied from one host to another, which makes the latest version of each file difficult to identify, dar, like much other software, uses CVS. CVS defines a repository, which is a central place where all the source code is stored. Anyone can then 'synchronize' a directory on their own hard disk with the repository and this way retrieve a particular version of the source code. Getting source code this way requires a CVS client; thus, for releases, a copy of the source code is extracted from CVS and provided as a compressed tar (or zip) archive, which is easier for anyone to use. Note also that even though you can download source code from the CVS repository, only authorized developers can upload their changes back to the repository. If you are not an authorized developer you can still submit a patch here. Patches are reviewed and may be integrated as is, integrated with a modified implementation when necessary, or rejected (for some good reason of course).

Dar's repository Organization

Beyond centralizing the source code, CVS makes it possible to store different versions of a given file and to give them a label. A label can be shared between different files, and this is how it is used here: a given label keeps a record of which version of each file is used in a specific release. For example, label v2_3_0 can be used to obtain the source code of release 2.3.0 (a label must start with a letter and must not contain any dot, hence this notation for dar's source labels). Having concurrent versions does not only mean having a stack of versions for each file; it also gives us a tree-like structure:

The trunk of the tree contains the development source code. When enough features have been added and the time comes for a new release, a new branch is created at that point. Branches are like labels except that they also create a fork for each file. Dar branches are of the form branch_2_3_x, for example. A given branch starts with pre-release versions, then come the releases for that branch. All of them have the same major and medium numbers; only the minor number changes between releases of the same branch, which indicates that only bug fixes are made between these releases. At each release, the changes made since the previous release on the same branch are merged into the trunk; this way, the development code inherits the bug fixes.
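
The naming convention described above is regular enough to be derived from a release number. A small shell sketch, where release_label and release_branch are names invented for this illustration (they are not part of dar):

```shell
# Hypothetical helpers, not part of dar: derive CVS names from a release number.

# 2.3.0 -> v2_3_0 (a label must start with a letter and contain no dot)
release_label() {
    printf 'v%s\n' "$(printf '%s' "$1" | tr . _)"
}

# 2.3.0 -> branch_2_3_x (the minor number is replaced by "x")
release_branch() {
    major_medium=${1%.*}    # strip the minor number: 2.3.0 -> 2.3
    printf 'branch_%s_x\n' "$(printf '%s' "$major_medium" | tr . _)"
}

release_label 2.3.0     # prints v2_3_0
release_branch 2.3.0    # prints branch_2_3_x
```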

Usage

To use CVS, you must first set the repository, either on the command line with the -d option or through the CVSROOT environment variable. All this general information about the CVS repository is explained here (note that this documentation speaks about modules: a module is a set of files inside the repository, and dar's repository contains only one module, named dar).

First you must log in; this has to be done only once:

cvs -d:pserver:anonymous@dar.cvs.sourceforge.net:/cvsroot/dar login

Assuming you want to get (or "check out", abbreviated "co") the source code of a particular version (here version 1.2.3):

cvs -z3 -d:pserver:anonymous@dar.cvs.sourceforge.net:/cvsroot/dar co -r v1_2_3 dar

Assuming you want to get the latest version of a branch (code which contains pending bug fixes for the next release):

cvs -z3 -d:pserver:anonymous@dar.cvs.sourceforge.net:/cvsroot/dar co -r branch_1_2_x dar

Thus, if the latest release on branch 1.2.x is release 1.2.3 but some bugs have been reported and fixed, you can download the updated source code this way even before release 1.2.4 takes place (depending on the severity of the bugs and on my free time, such fixes may stay pending for one or more months). Besides this, once a branch has its first release (release v2_3_0 appears on branch_2_3_x), older branches die (branch_2_2_x died when release 2.3.0 was made), so don't expect extra bug fixes except on the latest branch.

Note that if you have already "checked out" a particular branch (not a particular version), you can quickly get the latest changes on that branch (that is, "update") using the following command:

cvs -z3 -d:pserver:anonymous@dar.cvs.sourceforge.net:/cvsroot/dar update
 
This will transfer only the differences between your local copy, as of the last time you checked it out or updated it, and the most recent version of the files on that branch.

Last, if you do not specify any label (-r option) or branch when checking out dar's source code:

cvs -z3 -d:pserver:anonymous@dar.cvs.sourceforge.net:/cvsroot/dar co dar

you get source code from the trunk, that is, the code with new features implemented (read the Changelog file to know what has been implemented so far). Adding new features may have an impact on existing ones. Although testing is done at each feature addition to make sure the feature works, and a new feature is never uploaded to CVS unless it has been tested, the trunk version may still have bugs and must be considered development source code. Note moreover that adding a new feature to the development source may break compatibility with older development versions (while compatibility with released versions is kept). Thus, if you really need to use a development version for something other than just testing it or trying it, it is strongly recommended to keep a copy of the source code in a secure place, to be able to rebuild a dar binary able to read your archives.
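
The commands in this section repeat the repository location with -d each time. As mentioned at the start of the section, the CVSROOT environment variable can carry it instead. A minimal sketch, assuming a POSIX shell; the function name dar_checkout is invented for this illustration and is not part of dar:

```shell
# Same repository as in the commands above, set once for the whole session.
CVSROOT=:pserver:anonymous@dar.cvs.sourceforge.net:/cvsroot/dar
export CVSROOT

# dar_checkout: hypothetical wrapper, not part of dar.
# With an argument (a label or a branch) it checks out that revision,
# without argument it checks out the trunk.
dar_checkout() {
    if [ -n "${1:-}" ]; then
        cvs -z3 co -r "$1" dar
    else
        cvs -z3 co dar
    fi
}
```

After a one-time "cvs login", "dar_checkout v1_2_3" fetches release 1.2.3, "dar_checkout branch_1_2_x" the latest state of that branch, and "dar_checkout" alone the trunk.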

Having the sources ready for compilation

CVS does not always restore the proper timestamps of files. This leads automake, autoconf and other tools to consider that some key files have changed and that they must regenerate the configure script. To avoid having to install these tools in that particular case, do the following after having retrieved the sources from CVS:

cd dar
chmod u+x misc/clean_dates
misc/clean_dates

Also take the time to read the file named "USING_SOURCE_FROM_CVS"