Pantek Incorporated Hires Holland as Project Engineer and Developer

Pantek Incorporated has hired Curtis Holland as a Project Engineer and PHP Developer.  Curtis brings with him over 7 years of technical expertise in Information Technology Administration, Web Development and Data Manipulation. He has developed his skills at such companies as Internet Payment Exchange and The Karcher Group, Inc.

Curtis’ experience with a wide variety of Open Source applications, Linux operating systems and open source development platforms will be a great addition to the Pantek team. His enthusiasm for success is a highly valued trait at Pantek.

Curtis is an alumnus of the University of Toledo, where he earned his Bachelor of Science degree in Computer Engineering. He has studied American Sign Language, studied abroad in London, England, and holds a number of Red Cross certifications. Curtis plans to start work on a Master of Information Science degree in the spring of 2011.

Tar, from basic to advanced, a high quality reliable backup tool

Tar is one of the earliest applications used for backups on Unix, and it’s still a very functional backup tool. Tar is both the name of the application and the file format it generates; the files it produces are generally referred to as tarballs.

Tar stands for Tape ARchiver. It backs up in a sequential manner, storing permissions, directory structure, and basic filesystem information such as ownership and timestamps by default. Typically a tarball has a .tar extension unless it’s compressed, in which case it is named .tar.ext. On Windows the three-letter extension convention is usually followed by shortening the names: .tar.gz -> .tgz, .tar.bz2 -> .tb2, etc.

The actual structure of a tarball is fairly simple. It is a concatenation of the files being archived, each preceded by a small header (512 bytes) and followed by zero padding that rounds its data up to the next multiple of 512 bytes. The archive ends with 1024 bytes of 0’s, which delineates the end of the archive. The smallest possible archive, a single empty file, is therefore 1536 bytes: one header plus the end marker. GNU tar additionally pads the whole archive out to a full record, which is 10,240 bytes with the default blocking factor of 20, so the realistic minimum size is about 10 KB.

The biggest problem with tar is that it is entirely sequential. There is no metadata describing where to find each individual piece of information in the tarball (like an index or table of contents), so it’s often slow to extract a single file from a large tarball.
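The block arithmetic can be checked directly. The following is a minimal sketch, assuming a POSIX shell and a tar that accepts the c and f options; the scratch directory and the file name empty.txt are made up for illustration.

```shell
# Archive a single empty file and inspect the size of the resulting tarball.
set -e
work=$(mktemp -d)              # scratch directory (hypothetical location)
cd "$work"
: > empty.txt                  # create a zero-byte file
tar -cf one.tar empty.txt      # uncompressed, so the raw block layout is visible
size=$(wc -c < one.tar | tr -d ' ')
echo "archive size: $size bytes"
echo "remainder when divided by 512: $((size % 512))"
```

Whatever size your version of tar reports, it should be a multiple of 512 and no smaller than 1536 bytes. With GNU tar's default blocking factor it is usually 10240.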

So now that we understand the basics of what tar is, let’s talk about how to use it.

The most common options we’ll use with tar for creation of an archive are c (create), f (file), v (verbose), z (gzip), p (preserve permissions), and j (bzip2). Let’s talk about what a few of these do. Any time you create a new tarball you will use the c option, which means create. Unless you are backing up to tape, you will almost always use the f option as well; it names the archive file that tar writes to or reads from. The p option preserves file permissions. Whether it also covers ACLs depends on the version of tar (GNU tar, for example, has a separate --acls option), so check the man page if you need ACLs backed up. The z and j options are mutually exclusive, since each selects a different compression format. Although less common, you may also see other extensions that denote alternate compression formats (lz for example); the options for these are not standard and vary from version to version of tar.

The most common options for reading a tarball are x (extract) and t (list). Extract does exactly what you imagine it would do: it extracts the contents of the tarball to the current location unless you specify an alternate location with the -C option. List shows you what files are in the archive and the paths they will extract to.

You will also occasionally see the u (update) option, which cannot be used in conjunction with the x (extract) or c (create) options. It appends new files, and newer copies of files that have changed, to the archive; it works at the file level, not the block level. So let’s see a few examples of commands you will commonly use with tar.

tar -xvzf tarball.tar.gz

This command will extract a gzip compressed tarball into the current location with the stored path.

tar -cvzf tarball.tar.gz .

This command will store the current directory as a gzip compressed tarball.

tar -tvzf tarball.tar.gz

This command will display the contents of the tarball without extracting anything.
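The three commands above can be tried end to end without touching real data. This is a small sketch, assuming a POSIX shell; the directory name project and its contents are hypothetical, and the v option is left off to keep the output short.

```shell
# Create, list, and extract a gzip compressed tarball in a scratch directory.
set -e
work=$(mktemp -d)
cd "$work"
mkdir -p project/docs
echo "hello" > project/docs/readme.txt

# c = create, z = gzip, f = archive file name
tar -czf project.tar.gz project

# t = list the contents without extracting anything
listing=$(tar -tzf project.tar.gz)
echo "$listing"

# x = extract; delete the original first so the restore is visible
rm -rf project
tar -xzf project.tar.gz
restored=$(cat project/docs/readme.txt)
echo "restored contents: $restored"
```

Because the archive was created from a relative path, the files come back under project/ in whatever directory the extract is run from.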

Those are the bare minimum basics of tar, and they cover 90%+ of the times you will use it, but tar has many more features available.

For instance, you can exclude files by using the --exclude "path/filename" syntax.

tar --exclude "/etc/resolv.conf" --exclude "/dev" -czvf tarball.tar.gz /

One caveat for the --exclude syntax: some versions of tar require the excludes to occur before the path to archive, and some require them after. Check the man page for the syntax for the version of tar you’re using.
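A safe way to check how your tar treats excludes is to try them on a throwaway tree and list the result. The sketch below assumes a tar that accepts --exclude before the path, as GNU tar does; the site directory and its files are invented for the example.

```shell
# Build a small tree, archive it with two excludes, and list what was stored.
set -e
work=$(mktemp -d)
cd "$work"
mkdir -p site/cache
echo "keep" > site/index.html
echo "skip" > site/cache/tmp.dat
echo "skip" > site/debug.log

# One exclude names a directory, the other is a wildcard pattern.
tar --exclude "site/cache" --exclude "*.log" -czf site.tar.gz site

listing=$(tar -tzf site.tar.gz)
echo "$listing"
```

Listing the tarball right after creating it is a cheap habit: it shows whether the excludes matched before you rely on the backup.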

You can use -C to extract a tarball to a different directory.

tar -xzvf tarball.tar.gz -C /usr/local/src

You can use the -f option to read the archive from standard input, or write it to standard output, by specifying - as the filename.

cat /usr/local/src/tarball.tar.gz | tar -xzvf -
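The same trick works in both directions at once. As a rough sketch, the following copies a directory tree by piping one tar into another, combining -f - with the -C option; the src and dest directories and the file app.conf are hypothetical.

```shell
# Copy a directory tree through a pipe: one tar writes to standard output,
# the other reads from standard input.
set -e
work=$(mktemp -d)
mkdir -p "$work/src/conf" "$work/dest"
echo "setting=1" > "$work/src/conf/app.conf"

# -C changes directory first, so the stored paths start at conf/
# and are unpacked beneath dest/.
tar -cf - -C "$work/src" conf | tar -xf - -C "$work/dest"

copied=$(cat "$work/dest/conf/app.conf")
echo "copied contents: $copied"
```

No intermediate tarball is written to disk, which is handy when space is tight.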

That last example brings us to one nice thing you can do with tar: because it handles piping correctly, you can pipe it over remote connections such as netcat or ssh.

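cd /usr/local/configuration && ssh root@hostname.domain.tld "cat /backups/server1/configuration.tar.gz" | tar -xzvf -

This command will read a tarball stored on the remote host and extract it into /usr/local/configuration on the local machine.

tar -cvzf - /usr/local/configuration | ssh root@hostname.domain.tld "cat > /backups/server1/configuration.tar.gz"

This command will do the reverse, archiving the local directory and writing the tarball to the remote host.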

The last thing people generally want to do with tar is either update an existing tarball with additional files or create an incremental backup from an existing tarball and directory structure. Both are fairly easily done with the following commands. First, to update a tarball:

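tar -cvf tarball.tar /path/to/backup

First create an initial tarball as above, but leave it uncompressed. GNU tar cannot update a compressed archive, so the z option is left off here.

tar -uvf tarball.tar /path/to/backup

Then update it with the changed contents. Newer copies of changed files are appended to the end of the archive; you can compress the result afterwards with gzip if needed.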
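To see what an update actually does to a tarball, here is a minimal sketch. It assumes GNU tar and uses an uncompressed archive, because GNU tar refuses to update a compressed one. The data directory and its files are hypothetical, and touch -t is used only to give the files fixed timestamps so the result does not depend on timing.

```shell
# Create an archive, change one file, then update the archive.
set -e
work=$(mktemp -d)
cd "$work"
mkdir data
echo "v1" > data/a.txt
echo "same" > data/b.txt
touch -t 202001010000 data/a.txt data/b.txt   # fixed, older timestamps

tar -cf backup.tar data                       # initial, uncompressed archive

echo "v2" > data/a.txt
touch -t 202101010000 data/a.txt              # a.txt is now newer than its archived copy

tar -uf backup.tar data                       # append anything newer than what is archived

copies_a=$(tar -tf backup.tar | grep -c "a.txt")
copies_b=$(tar -tf backup.tar | grep -c "b.txt")
echo "entries for a.txt: $copies_a, entries for b.txt: $copies_b"
```

The changed file is listed more than once because update appends rather than replaces. On extraction the later copy is written last and therefore wins.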

Easy enough, but what if you need previous revisions? That is as easy as setting up tar to do incremental backups.

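tar --listed-incremental /backups/server1/index.snar -czvf $(date +%Y%m%d%H%M%S).tar.gz /path/to/backup

The first time this command runs, the snapshot file index.snar does not exist yet, so tar creates a full backup and records the state of the directory in the snapshot file.

tar --listed-incremental /backups/server1/index.snar -czvf $(date +%Y%m%d%H%M%S).tar.gz /path/to/backup

Each later run of the same command creates a new timestamped tarball containing only what has changed since the previous run, and brings the snapshot file up to date. Note that the f option must come last in the option group, because the next argument is taken as the archive name.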

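The major requirement if you’re using this method is that you extract all of the tarballs in order, oldest first, as tar will remove and create files as required. With GNU tar, pass --listed-incremental /dev/null when extracting so the archives are treated as incremental. This is fairly trivial with a simple for loop.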
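The restore loop might look like the sketch below. It assumes GNU tar, since --listed-incremental is a GNU option. The directory names and the numbered tarball names are hypothetical; they stand in for the timestamped names used above, which sort the same way.

```shell
# Take a full backup and one incremental, then restore both in order.
set -e
work=$(mktemp -d)
cd "$work"
mkdir -p data backups restore
echo "one" > data/first.txt

# Level 0: the snapshot file does not exist yet, so this is a full backup.
tar --listed-incremental backups/index.snar -czf backups/0001.tar.gz data

sleep 1                        # make sure the changes get a later timestamp
echo "two" > data/second.txt   # add one file
rm data/first.txt              # and delete another

# Level 1: only the changes since the snapshot are stored.
tar --listed-incremental backups/index.snar -czf backups/0002.tar.gz data

# Restore oldest first. The glob expands in sorted order, so zero-padded or
# timestamped names come out chronologically. /dev/null as the snapshot file
# makes tar treat each archive as incremental without recording anything.
for archive in backups/*.tar.gz; do
    tar --listed-incremental /dev/null -xzf "$archive" -C restore
done

ls restore/data
```

After the loop, the restored tree matches the state at the last backup: the deleted file is removed again when the second tarball is extracted.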

As you can see, tar is a very functional backup tool that exists on almost every Unix distribution worldwide. The next time you’re looking for a quick backup solution without installing additional applications, take a look at tar. It really is a high quality, tested backup tool that enables you to reliably archive and restore data in a variety of ways.

Remember: back up, back up, back up. Back up before you perform any system changes and you can save yourself dozens of hours of headache.

As always, if you need further assistance with this or any other open source application or issue, the experts at Pantek Inc. are available 24/7 at info@pantek.com, 216-344-1614, and 877-LINUX-FIX.

Pantek Incorporated Hires Doherty as I.T. Manager

Pantek Incorporated has hired Cindy Doherty as Manager of Information Technology. Cindy brings with her 21 years of technical experience in database administration, data architecture, UNIX and server administration, hardware and software installations and upgrades. She has honed her skills at such companies as Westfield Group, FedEx Custom Critical and J.M. Smucker Company.

Cindy’s strength is her uncompromising focus on quality, stability and efficiency. Her expertise in software development lifecycles, change management, and the design and development of corporate business continuity plans will all contribute to the success of the Pantek staff.

She is an alumna of Baldwin-Wallace College in Berea, Ohio, where she earned her Bachelor of Science degree. Cindy also enjoys teaching Sunday school and volunteering at Marian’s Closet in Wadsworth.