Background

Introduction

The purpose of this document is to describe the Computing and Network Services (CNS) Network Storage Archive Management (NSAM) service. For a reasonable charge, this service is available to departments, faculty, and staff of the University of Florida. Specifics on the charging structure can be found in the pricing section. Fundamentally, the NSAM service provides a replacement for departmentally managed backups.

Archive what, when, and how

This section assumes that (a) you need to keep a backup of some set of data, and (b) someone has been identified as the backup administrator to manage the process. There has always been a need to make sure critical data is not lost, and people have come up with clever ways to make sure it stays around. Here are a few common schemes people use to archive their data.

Full backups

The most basic backup scheme is to make a full copy of each file onto a fresh tape every night. Since tape media is not free, backup administrators usually come up with some method to rotate a pool of tapes. For example, one common method is to recycle the Monday through Thursday tapes each week and keep the Friday tapes for a year. Or, to ensure a 30-day restore horizon, this scheme sometimes uses four sets of Monday through Thursday tapes. Keeping more than one set of tapes guards against the case of a file written on Monday and deleted on Thursday being lost forever when the tapes are recycled.
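The rotation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the label format, the `tape_label` function name, and the choice to skip weekends are hypothetical and do not come from any particular backup product.

```python
from datetime import date

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]


def tape_label(day: date, sets: int = 4):
    """Return the label of the tape to load on `day`, or None on weekends."""
    weekday = day.weekday()  # Monday == 0
    if weekday > 4:
        return None  # assumption: no backup runs on Saturday or Sunday
    if weekday == 4:
        # Friday tapes are kept for a year, so every Friday gets its own tape.
        return f"Fri-{day.isoformat()}"
    # Monday through Thursday tapes cycle through `sets` weekly sets.
    # Ordinal day 1 is a Monday, so this index changes every Monday.
    week_index = (day.toordinal() - 1) // 7
    return f"Set{week_index % sets + 1}-{WEEKDAYS[weekday]}"
```

With four sets, a given Monday tape is overwritten only after 28 days, which is roughly where the 30-day restore horizon comes from.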

Restoring files to a specific date requires the tape that was used on the night in question. If there is a media failure on that tape, there is still a chance to get most of the data back by going a day or two forward or backward in the tape set.

The cost of this scheme can be enormous in several ways. One obvious cost is the backup administrator's time for tasks such as switching tapes every night, physically managing tapes, maintaining the tape hardware, and restoring files. Equipment costs include purchasing and maintaining tape drives and purchasing new and replacement tape media. Finally, if the data is stored on client machines with no locally attached tape drive, the network traffic generated during the file copy may impact other services.

Usually, when the size of the backed-up files exceeds the available space on the backup media, administrators will switch to a combination of full and differential, or full and incremental, copies of the files.

Differentials

To reduce the number of tapes needed, the differential scheme copies only the files that have changed since the last full backup to a nightly differential tape. This scheme still uses a full backup (sometimes referred to as level 0), taken either at some regularly spaced interval or when the differential backups grow too large. Unless most of the data changes every day, this scheme will save some space on the tapes and be faster to generate than a full backup, but at a cost in reliability and restore time.
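As a minimal sketch of how a differential run chooses its files, the function below compares each file's modification time against the time of the last full backup. The function name and the use of modification times are assumptions made for illustration; real backup tools may detect changes differently.

```python
def differential_selection(mtimes, last_full):
    """Return the sorted names of files modified after the last full backup.

    mtimes    maps file name -> last-modified time (any comparable value)
    last_full is the time at which the last full backup was taken
    """
    return sorted(name for name, mtime in mtimes.items() if mtime > last_full)


# Hypothetical data: the full backup ran on day 0.
example_mtimes = {"a.txt": 0, "b.txt": 2, "c.txt": 3}
print(differential_selection(example_mtimes, last_full=0))
```

Because the comparison is always against the last full backup, a file changed early in the cycle is copied again on every later differential tape until the next full backup.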

Figure 1. Example of a differential backup

Restoring files to a specific date requires the last full backup tape and the differential tape from the date in question. A media failure on either tape can prevent a full recovery of the data. If the failure is in a differential tape, using the differential from a day or two earlier or later in the tape set should get very close to the correct data. But if the failure is in the full backup tape, the restore process has to fall back to the previous full backup.

Figure 2. Example of a differential restore

Normally, the cost of this scheme is lower than that of full backups because it reduces the number of tapes used for a backup. But there is a higher risk of losing data in the event of a tape media error, at which point the backup administrator may spend a great deal of time piecing together whatever information exists in order to form a reasonable restore.

Incrementals

Extending the differential idea, incrementals copy only the files that have changed since the most recent backup of any kind, whether that was last night's incremental or the last full backup. By including only those changes on each nightly tape, this scheme can save even more tape space. Again, unless most of the data changes every day, this scheme will save the most space and be the fastest backup, but it has the longest restore time and the greatest chance of disaster in the event of a media failure during the restore process.
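To make the difference between the two schemes concrete, the sketch below lists what lands on each nightly tape given the set of files that changed each night after a full backup. The function name and the data are hypothetical.

```python
def nightly_contents(changes_by_night, scheme):
    """Return, for each night after the full backup, the files written to tape.

    changes_by_night is a list of sets; entry i holds the files changed on
    night i + 1. `scheme` is "differential" or "incremental".
    """
    if scheme not in ("differential", "incremental"):
        raise ValueError(f"unknown scheme: {scheme}")
    tapes = []
    changed_since_full = set()
    for changed in changes_by_night:
        changed_since_full |= changed
        if scheme == "differential":
            # Everything changed since the full backup, copied again each night.
            tapes.append(set(changed_since_full))
        else:
            # Only what changed since the previous night's backup.
            tapes.append(set(changed))
    return tapes


changes = [{"a"}, {"b"}, {"a", "c"}]
print(nightly_contents(changes, "differential"))
print(nightly_contents(changes, "incremental"))
```

In this example the third differential tape holds three files while the third incremental tape holds two, which is the space saving the incremental scheme trades for a longer restore.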

Figure 3. Example of an incremental backup

Restoring files to a specific date requires the last full backup tape and each incremental tape up to the date in question. If there is a media failure on any of the incremental tapes, it may be impossible to fully recover the data. For example, if the failure is in the middle of the incremental tapes, there is a good chance that all data after the failed media will be lost. Or, as in the differential example, if the failure is in the full backup tape, the restore has to fall back to the previous full backup.
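The restore requirements of the differential and incremental schemes can be compared with a short sketch. The tape names (`full`, `diff-N`, `incr-N`) and both function names are hypothetical, and the model assumes one tape per night.

```python
def tapes_needed(scheme, night):
    """List the tapes required to restore to `night` nights after the full backup."""
    if night < 0:
        raise ValueError("night must be zero or greater")
    if scheme == "differential":
        # The full backup plus the single differential from the target night.
        return ["full"] + ([f"diff-{night}"] if night else [])
    if scheme == "incremental":
        # The full backup plus every incremental up to the target night.
        return ["full"] + [f"incr-{n}" for n in range(1, night + 1)]
    raise ValueError(f"unknown scheme: {scheme}")


def incremental_restore_point(night, failed_tape):
    """Return the latest night an incremental restore can reach.

    Returns None when the full backup tape itself has failed, in which case
    the restore must fall back to the previous full backup.
    """
    chain = tapes_needed("incremental", night)
    if failed_tape not in chain:
        return night
    position = chain.index(failed_tape)
    return None if position == 0 else position - 1


print(tapes_needed("differential", 4))
print(tapes_needed("incremental", 4))
print(incremental_restore_point(4, "incr-2"))
```

A failure of the second incremental in a four-night chain leaves only the first night recoverable, while the differential restore never needs more than two tapes.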

Figure 4. Example of an incremental restore

The cost of an incremental scheme is less than that of differential backups since incremental schemes lessen the number of tapes used for a backup, but there is an even higher risk of losing data in the event of a tape media error.

Progressive -- Incrementals forever

In the mid-90s, IBM released a software product named Adstar Distributed Storage Manager™ (ADSM), which took incrementals a step further. Though the official IBM nomenclature for the product (Tivoli Storage Manager™ (TSM), IBM/TSM, ITSM, etc.) seems to change every other year, this software product is unlike any other solution on the market. ITSM treats each backup as an incremental, even the first backup, which just happens to be an incremental of every file, similar to a full backup. Every night, only the data that has changed from the previous night is backed up. Further, ITSM is smart enough to know that if only a few bytes of a file have changed, then only those few bytes need to be sent over the network.
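The "incrementals forever" idea can be illustrated with a small catalog that remembers what was last backed up. This is a sketch of the concept only: it does not use or describe the actual ITSM interfaces, the function name is hypothetical, detecting changes by content hash is an assumption, and it does not model sending only the changed bytes of a file.

```python
import hashlib


def progressive_backup(files, catalog):
    """Back up only the files whose content differs from the catalog's record.

    files   maps file name -> current content (bytes)
    catalog maps file name -> SHA-256 digest of the last version backed up;
            it is updated in place
    Returns the sorted names of the files sent in this run.
    """
    sent = []
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        if catalog.get(name) != digest:
            catalog[name] = digest
            sent.append(name)
    return sorted(sent)


catalog = {}
files = {"a.txt": b"alpha", "b.txt": b"beta"}
print(progressive_backup(files, catalog))  # first run: every file is new
files["b.txt"] = b"beta, revised"
print(progressive_backup(files, catalog))  # later run: only the changed file
```

The first run against an empty catalog sends every file, which is why the first backup behaves like a full backup without being treated as a special case.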

Figure 5. Example of a progressive backup

By not requiring periodic full backups, this scheme can save a lot of tape space over an extended period of time, but beyond that ITSM adds many more features, such as:

  • Consolidation of tape hardware
  • Policy-based management
  • Database management of all data
  • Automatic reclamation of tape space
  • Easy collocation of data (offsites)
  • Plug-ins for popular databases and applications (IBM DB2™, MS Exchange™, Oracle™, etc.)
  • No human intervention required to change tapes
  • ... and much more

Restoring files to a specific date requires a request to the ITSM server, which looks up where the specific data is in its storage pools. The ITSM server then mounts the appropriate media and starts to send the data back to the user. No human is involved in the process.
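The lookup step can be pictured as a query against a catalog that records which volume holds each backed-up version of a file. The sketch below is illustrative only: the catalog layout, the function name, the dates, and the volume names are all hypothetical and are not taken from ITSM.

```python
from bisect import bisect_right


def locate_version(catalog, name, as_of):
    """Find the newest version of `name` backed up on or before `as_of`.

    catalog maps file name -> list of (backup_date, volume) tuples sorted by
    date, with dates given as ISO strings such as "2000-01-31".
    Returns the matching (backup_date, volume) tuple, or None if no version
    that old exists.
    """
    versions = catalog.get(name, [])
    dates = [backup_date for backup_date, _ in versions]
    index = bisect_right(dates, as_of)
    return versions[index - 1] if index else None


example_catalog = {
    "report.doc": [("2000-01-06", "VOL001"), ("2000-01-09", "VOL007")],
}
print(locate_version(example_catalog, "report.doc", "2000-01-08"))
```

Once the volume is known, the server can mount it and stream the data back, which is what removes the manual search through tape sets.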

Figure 6. Example of a progressive restore

The cost of this scheme across a large network can be quite reasonable. By centralizing the hardware and administrative staff, CNS can offer the service at a very reasonable price to campus departments.

Specifics of the current implementation

Hardware and Network

As of the last time this document was updated, the ITSM server consisted of the following hardware:

  • Storage Pools and Tape Drive(s)
    • IBM 3494 Enterprise Automated Tape Library
    • Over 3/4 of a TB of disk connected via IBM 7133 Serial Disk System
    • Two offsite storage locations: one on campus and one at another university in Florida.
  • Server
    • IBM RS/6000 SP System™, current node is a 2-processor 333 MHz Power3
    • 2 GB of system RAM
    • Gigabit Ethernet interface

Support Staff

  • 24-hour call center for issues
  • 24-hour staff for managing hardware
  • 1.5 FTE ITSM administrators
  • 1 FTE tape librarian

Future of NSAM service

The CNS IBM 3494 Automated Tape Library still has plenty of room for expansion. There are currently about 1,500 tape slots in the unit, with room to grow to over 6,000. The next generation of tape drives is expected to add 20 GB of capacity per tape. In addition, longer tapes, still being phased into production, offer a further 20 GB per tape. With all of these improvements, the unit can theoretically be expanded to over 1,000 TB of data storage. Please refer to the System status page for more accurate detail on the current configuration.
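As a rough check on these figures, the arithmetic below works out the average capacity per slot that the quoted maximums imply. This is a back-of-the-envelope inference from the numbers above, not a stated specification; it assumes 1 TB = 1,000 GB, and the function name is hypothetical.

```python
def implied_gb_per_tape(total_tb, slots):
    """Average capacity per tape slot, in GB, implied by a total in TB."""
    return total_tb * 1000 / slots


# Figures quoted in the text: about 1,500 slots today, over 6,000 possible,
# and a theoretical total of over 1,000 TB.
print(6000 / 1500)                            # growth factor in slot count
print(round(implied_gb_per_tape(1000, 6000)))  # implied GB per tape
```

Since both quoted maximums are given as "over" a round number, the result is only an order-of-magnitude guide.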