Network backup: understanding your storage options
Before you select a backup strategy, you need to know how the components work together. We explain when to use each backup approach -- with examples from users
In an enterprise environment, more than half of computing dollars are spent on storage. When you're looking at gigabytes or terabytes of data, the cost of replacing that data, should it be lost, could be more than enough to put a company out of business. For that reason, every company, regardless of size, needs a backup strategy for its data. The strategy might be as simple as adding an 8mm tape subsystem to your server, or as involved as building a backup approach with disk, tape, and optical. We put it in perspective. (3,300 words, including sidebar)
Most people use backup plans in their everyday lives -- from keeping an extra car key, to photocopying their credit card statements, to leaving a reminder note on the refrigerator about their mom's birthday. No one is a stranger to the phrase "just in case." But what about losing something critical, like precious computer system data that could shut down your business if it is unrecoverable? At that point, it's time to consult the experts.
More and more businesses these days depend on their computer networks to survive. The cost of lost data is calculated in terms of time, labor, lost billings, and lost customers. At a time when many critical tasks have moved from large systems to servers, and the number of LAN users is growing rapidly, companies cannot afford to lose what has taken years to input. Today, server-based LANs contain an average of four gigabytes of data; data communications servers commonly hold much more. It is estimated that server disk failures account for 26 percent of all system downtime and are the single leading cause of lost processing time.
For the informed buyer, the market holds many different sources for protecting and backing up your server at very reasonable prices.
Juan Orlandini, Direct Connect Systems' backup strategist, cautions, "End-to-end fault tolerant networks are simply too expensive to implement at most small Sun (Microsystems)-based LAN sites; therefore, this is an unrealistic goal. But most sites can take a first step by protecting their server and its associated data."
A three-year-old Atlanta, GA-based consulting firm, Direct Connect Systems advises corporate clients on data protection and backup strategies, on which products to buy to best meet their needs, and on how to implement fault-tolerant, high-availability systems solutions to protect critical network data by adding to technology already in place.
To follow Orlandini's advice, a site must implement a strategy for both data reliability and data availability. "Don't confuse data reliability and data availability -- they are not the same thing. Data reliability protects against data loss or corruption so that recovery procedures can bring the data back. All data backup devices ensure this," he says.
"Data availability," explains Orlandini, "guarantees your ability to access the data easily. The data is reachable because component redundancy assures that the storage subsystem cannot be brought down at a single point of failure. All backup implementations do not provide this fault-tolerant feature."
Some of the older, traditional storage backup devices are no longer practical because their capacity has not kept pace with today's LAN needs. In addition, devices such as helical scan tape (e.g., Digital Audio Tape (DAT) and 8mm) require a person to change tape media, demand storage compartments to hold the cumbersome media, and need much longer timeframes to accomplish the backup task. There are, however, quite a number of backup options that can provide reliable, high-capacity data protection and backup -- including RAID, Digital Linear Tape (DLT), and optical devices. All offer SCSI interface compatibility for ease of connection to most small computer networks. But none is a "magic bullet" that solves all data protection and server backup problems -- so read on before deciding what is best for your system.
RAID: the high-performance choice
We begin with RAID (see sidebar, "Not all RAID levels are created equal"), which stands for redundant array of inexpensive (or independent) disks and can provide server-based networks with a level of data protection and reliability equal to that of their larger mainframe cousins. RAID also provides fast access times to gigabytes of stored information. The RAID redundancy scheme guarantees disk array operation should any one of its components (drives, platters, etc.) fail.
The RAID Advisory Board recognizes seven levels of RAID, of which 1, 3, and 5 are the levels most often used. Though each level provides some degree of data protection, the RAID levels do not refer to performance but to the methodology used to store and protect data.
One advantage to many RAID devices is the opportunity to remove or add an individual drive without interrupting network access to the storage subsystem. Some RAID devices also can automatically activate a built-in spare drive if a working drive should fail. The RAID subsystem proceeds to rebuild the data from the failing drive to the spare, invisible to the user. RAID subsystems commonly store from four gigabytes to a few terabytes of data.
RAID can be more expensive than an optical drive option, but for transaction processing applications and critical, online database access environments, it is the only solution that provides the needed fault tolerance and quick access times.
When one company, Orent Graphics Art, went to the backup strategy consultants at Direct Connect Systems for help in protecting its server, data, and ultimately its client base, a RAID solution was recommended. This Omaha, NE-based company specializes in graphics and prepress data using a high-end scanning process. Its clients are large, well-known national and international publishing houses and retail stores.
For its clients' weekly ads and magazine inserts, Orent Graphics takes transparencies, scans them into the system, and transforms them digitally by revising images, changing color, and applying other graphics "secrets." The end result is a digital file which is sent to the printer for production.
"Our clients expect and receive a two- to four-day turnaround on their projects," says Ken Young, engineering manager. The Sun-based system is used by creative and ad people for desktop publishing tasks; there isn't any extra time for server problems. A typical high-end image processing job generates about seven files, each 40 to 100 megabytes in size. Given the company's client base, multiple gigabytes of data are perpetually online.
Young says Orent turned to Direct Connect Systems because "We were impressed with their record of workable solutions for large corporations having similar situations, as well as their concern for cost and using our in-house components."
Direct Connect's Don Schrenk says, "We found Orent to have very large file sizes and extensive amounts of data to protect. Not only does their data have to be protected, but it must be available at all times." His recommendation was a centralized, fault-tolerant storage array. This system would help guard against errors (human and mechanical), and it would preserve the cost effectiveness of the project by utilizing off-the-shelf components.
Upon first inspection, Direct Connect found that Orent's massive data files (approximately six gigabytes per client) were scattered throughout the LAN without any central storage, spread across about 100 individual LAN nodes. Orent was using Sun SPARC20 servers, two of which Direct Connect configured with RAID units as a central storage subsystem. The servers were set up in a dual fault-tolerant configuration. Some 152 gigabytes of protected storage was provided by adding one Guardian 24-bay RAID subsystem (containing 24 drives with 76 gigabytes of usable storage) to each server.
For the 24 drive bays per server, Direct Connect first integrated 16 of Orent's existing disk drives. Seagate drives were purchased for the rest of the bays. The installation was done in phases. During phase one, data from existing drives was transferred to the first fault-tolerant RAID configuration. The original existing drives were then reinstalled into the second RAID subsystem (phase two).
Guardian RAID units were chosen for this project for several reasons. Among the unit's features are its ability to use off-the-shelf, hot-swappable, standard RAID controller products and its ability to employ RAID striping levels 0, 1, 0+1, 3, and/or 5. Dual components including power supplies, circuits, fans, and drives, and two paths to the hosts ensure fault-tolerant operation. The Sun hosts are also fault-tolerant -- should a server fail, the other utilizes both arrays (the RAID disks are seen by the hosts as one seamless large disk). This system also includes a UPS backup system.
Direct Connect suggested an off-line data backup for the servers, a DLT4000 library with software as well. This particular drive from Quantum Corp. utilizes 20-gigabyte tape cartridges (uncompressed), thus providing capacity for any LAN data Orent might back up each night, with storage to spare. "It's nice to have that safety net," Schrenk says.
"The solution recommended and implemented by Direct Connect not only solved our storage protection and system availability issues, but gave us the additional benefit of improving overall LAN performance by migrating the older disks into RAID sets and automating the backups and data management into a 'lights out' concept," Young says. He adds that the whole project, from concept to operational status, was completed in just 30 days.
Backing up with tape
DLT provides up to four times the capacity and three times the performance of traditional quarter-inch tape, making it an excellent network backup and archiving choice. DLT media can hold up to 35 gigabytes (uncompressed) and sustain a data transfer rate of up to five megabytes per second (uncompressed) -- enough backup capacity for most LANs and fast enough for many sites to use DLT in data libraries and archiving applications as a low-cost alternative to optical media. One user who has implemented this fault-tolerant storage concept is Tropicana Dole Beverages.
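To put those throughput numbers in backup-window terms, a quick back-of-envelope sketch (in Python, using the uncompressed 20-gigabyte cartridge and five-megabyte-per-second figures cited in this article) estimates how long a full cartridge takes to stream:

```python
def backup_hours(capacity_gb: float, rate_mb_per_s: float) -> float:
    """Hours needed to stream a full cartridge at a sustained transfer rate."""
    return (capacity_gb * 1024) / rate_mb_per_s / 3600

# One uncompressed 20-gigabyte DLT4000-class cartridge at 5 MB/s:
print(round(backup_hours(20, 5), 1))  # roughly 1.1 hours
```

In practice, compression, file sizes, and network overhead shift this figure considerably, so treat it as an estimate of raw streaming time, not a measured result.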
Tropicana, the international beverage company, went the DLT route when it wanted to back up its LAN. Direct Connect installed an ATL Products 6/176 Tape Library with six tape drives, each with a native capacity of 20 gigabytes. The unit employs Quantum Corp.'s DLT 4000 drives and can hold up to 176 cartridges, for a total of 3.5 terabytes of uncompressed data.
The ATL library is directly connected to three Unix hosts, with four more servers attached via LAN links. More than 200 users across the company, including procurement, accounting, sales, and planning personnel, back up their server data to the tape library.
Serving as the company's central database backup, the library holds production information, disaster recovery copies, and database file restores. Eric Eriksen, IS director at Tropicana, says, "When we installed the ATL library, we went to a three-host hookup because the network backbone was not solid at the time. We wanted to minimize the amount of data going out over the network for backup. By placing the library directly on three servers (via SCSI interfaces), we were able to minimize network traffic."
"Our backup strategy is always to be scalable, thus protecting investment in current equipment," says Direct Connect's Orlandini. "In the future, the ATL library can be upgraded to accommodate Quantum's new DLT 7000 drives. This would provide over six terabytes of data storage capacity, should the need arise."
The three hosts manage network backup and simplify administration. The library is used for daily, weekly, and monthly backups, as well as disaster recovery backup copies. "Not having to send the data off-site for secure backup is a big time and cost savings," states Eriksen.
"The typical LAN-server is adding six gigabytes of data every three months, and this data is critical business information that must be protected. What's happening is that the LAN administrator's backup window is shrinking while the data to be backed up is growing," says Robert Archibald, Hewlett-Packard's marketing manager for the Information Storage Group. DLT libraries, like the HP SureStore DLT40e, provide up to 40 gigabytes of unattended backup, and their high bandwidth allows for large amounts of data backup in a short period of time.
Of course, DLT is not the only choice for tape backup. The 8mm tape developed by Exabyte Corp. continues to be a popular tape backup medium for small- to mid-range servers. At the very high end, 19 mm tape, such as that from Ampex Corp., offers backup capacities extending up to 330 gigabytes per cartridge.
Tape is the best choice for medium- to long-term data retention (two to 15 years) when speed of access to individual files is not critical -- access times of up to 10 minutes are generally acceptable -- and high-capacity media and high throughput are a must.
Making the optical connection
Though more expensive than tape, optical devices have faster access and data transfer rates, provide online protected access, and have enough backup capacity to accommodate most networks. They are available in several forms: WORM (write-once-read-many), rewritable (erasable), and CD-format (CD-R and CD-writable).
There are a variety of WORM and rewritable optical sizes available. The commonly used 5 1/4-inch media provide 1.3 or 2.6 gigabytes of storage capacity per platter. Most sites with Sun-based LAN servers will find the 640-megabyte capacity of CD-R and CD-writable devices less than adequate to meet their data backup needs.
Optical drives have a relatively long history in the computing industry, having migrated down to the existing size from 12-inch platters. In fact, some of the large drives are still in use today. Like tape drives, optical drives can be installed in libraries where very high capacities can be kept nearline.
Because optical is a random access storage medium, it is more popular than tape when data must be accessed quickly. However, optical never quite won the mindshare battle, and the technology continues to be considered more esoteric than, in some cases, functional.
Optical backup is best employed when extremely long retention periods (greater than 15 years of shelf-life) or fast access (less than 30 seconds, including library operation) to small files (less than one megabyte) is needed.
Using solid state disks
Solid-state disks (SSDs) are storage devices that hold data in electronic circuits instead of on physical disk surfaces. They are like RAM, but through the use of internal drives and batteries they can be implemented in non-volatile form, providing an extremely fast backup medium. Because SSDs are electronic rather than mechanical, they provide reliable, safe backup, with data access times between the CPU and the SSD up to 100 times faster than with standard disks, according to Quantum.
Bay Networks (Santa Clara, CA), a vendor of networking products, installed six gigabytes of solid state disks to raise its network's I/O system performance and shrink nightly backup times. "The Digital Equipment Corp. storage subsystem employed (StorageWorks SW800) utilizes Quantum's 5 1/4-inch, 950-megabyte SSDs, which can provide user-accessible data rates near the SCSI-bus limits," says Paul Massic, the company's IS director. "The SSD additions solved our nightly batch processing backup window dilemma. With worldwide users needing system access around-the-clock, we can't afford to lock them out during extended backup processing times. The SSDs provide dependable backup and decreased the nightly backup window substantially."
SSD is a very expensive storage backup option (it cost Bay Networks about $750,000 for its SSD subsystem). But, notes Massic, "SSDs are a strategic asset. We couldn't do without them at the end of each quarter. They're something you can't afford if you don't really need them; but if you do, you can't afford not to have them."
Because an SSD can be packaged in a traditional 5 1/4-inch form factor, the disks can be used in a standard multi-drive tower or chassis. And because data is transferred at bus speeds, SSD is by far the fastest storage subsystem available. However, size and cost often limit the use of the technology, as does the volatility of the chips themselves. While non-volatile RAM can be used, writing data to a platter or tape is often considered more "permanent" than saving it to an SSD.
Using the software approach: hierarchical storage management
When considering the cost of each option above, some may find that a combination of magnetic disk, optical disk, tape, and SSD will provide the most efficient, yet cost effective solution to accessing, backing up, and archiving important data. Of course, a cohesive environment to provide backup and access to server data -- all invisible to the user -- would be needed for a system of this type, says Direct Connect's Orlandini.
To set up such a system, it would be advantageous to first separate your needs into frequently used files (to be set up on the best price/performance storage medium -- hard disk), less frequently used files (set up on slower, large capacity, and less expensive nearline storage devices such as optical and DLT tape libraries), and those files rarely used (for archiving onto slower, less costly standard tape media). A hierarchical storage management (HSM) system can provide the cohesive environment that will intermix these diverse devices into an efficiently functioning storage subsystem which works transparently.
HSM is software that automatically and transparently moves files between media types, managing the network's complete storage hierarchy (online, nearline, offline, and archive). Migrations can be scheduled for preset off-peak times or predetermined intervals, or initiated by user request.
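The tiering logic described above reduces to a simple age-based lookup. This Python sketch illustrates the idea; the thresholds and tier names are assumptions for illustration only, not any HSM product's actual defaults:

```python
# Age thresholds (in days) mapped to storage tiers, slowest tier first.
# These numbers are illustrative assumptions, not product defaults.
TIERS = [
    (365, "archive tape"),
    (30, "nearline optical/DLT library"),
    (0, "hard disk"),
]

def tier_for(age_days: float) -> str:
    """Return the cheapest tier whose age threshold the file meets."""
    for threshold, tier in TIERS:
        if age_days >= threshold:
            return tier
    return "hard disk"

print(tier_for(2))    # frequently used file stays on disk
print(tier_for(45))   # less frequently used file moves nearline
print(tier_for(400))  # rarely used file is archived
```

A real HSM package layers transparent recall on top of this: when an application opens a migrated file, the software fetches it back from the library automatically.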
Providing a network backup solution while maintaining control of overall storage costs is perhaps the biggest challenge facing network executives today. To be effective, all storage backup devices must work together in a cohesive environment to provide data protection and access to server data without affecting user operations.
Shell Oil Company's Pipeline Division, based in Houston, TX, serves as a typical example of how optical storage and HSM can be utilized on a network. According to Shell Pipeline Senior Systems Analyst Juan Russian, the file migration path on the division's Unix-based workstation network runs from the one-gigabyte hard disk to the 40-gigabyte (32-platter) read/write optical library. "Each day, all files 31 days old are migrated to the library. If disk usage is high, files 15 days old or less are sent to the jukebox. Once each month, all files on the network are copied to the library and then backed up."
A Hewlett-Packard jukebox is used for archiving. It serves as the centralized file storage location for the pipeline division and contains about 27,000 pipeline drawings (images). Says Russian, "The Advanced Software Concepts (ASC) NetArchive Hierarchical Storage Management software knows where the file is located and how to get it, all automatically."
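Shell Pipeline's stated rules reduce to a small decision function. The sketch below assumes a disk-usage trigger of 90 percent, a figure the article does not give:

```python
def should_migrate(age_days: float, disk_usage_pct: float) -> bool:
    """Shell Pipeline's stated policy: migrate files at 31 days of age,
    or at 15 days when disk usage runs high.
    The 90 percent usage trigger is an assumed figure."""
    threshold = 15 if disk_usage_pct >= 90 else 31
    return age_days >= threshold

print(should_migrate(31, 50))   # True: normal daily migration
print(should_migrate(20, 95))   # True: disk pressure lowers the threshold
print(should_migrate(10, 95))   # False: still too young to migrate
```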
While HSM is needed to transparently manage the migration of data between diverse storage devices, it is not a backup replacement. Data backup is still required in an HSM environment. HSM merely manages data storage -- it does not protect data.
Not all RAID levels are created equal
There are seven levels of RAID technologies currently recognized by the industry's RAID Advisory Board, with additional levels (modifications to the seven) claimed by some vendors. However, the most common levels are RAID 1, RAID 3, and RAID 5.
RAID 1 duplicates data files, mirroring all data onto a second disk during write operations. If the primary drive fails, the second drive automatically takes over, so the user never notices the malfunction. RAID 1 features excellent data availability with fast read operations. At least two drives are required, with only half the total capacity usable (due to the mirroring).
RAID 3 utilizes a minimum of three drives. Data is striped across two of them with the remaining drive dedicated to parity-bit recording. Data is recreated from the information on any two working drives if any one drive fails. Available capacity is two-thirds of the total storage space. RAID 3 is excellent for transferring large blocks of data.
RAID 5 also stripes data across a minimum of three drives, but rather than dedicating one drive to parity, it distributes the parity information across all of them. It, too, provides two-thirds of the total storage space as usable capacity in a three-drive array; as additional drives are installed, usable storage increases. While RAID 1 and 3 perform one data transfer (a single read or write) at a time per array, RAID 5 can execute concurrent read/write transfers. RAID 5 offers the best combination of data availability and fault-tolerant protection for online database storage applications.
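The parity mechanism behind RAID 3 and RAID 5 is a bytewise exclusive-OR: the parity stripe is the XOR of the data stripes, and XOR-ing the surviving stripes with the parity rebuilds a lost one. A toy demonstration in Python (just the arithmetic, not a disk driver):

```python
def xor_stripes(*stripes: bytes) -> bytes:
    """Bytewise XOR of equal-length stripes."""
    out = bytearray(len(stripes[0]))
    for stripe in stripes:
        for i, b in enumerate(stripe):
            out[i] ^= b
    return bytes(out)

d0, d1 = b"backup", b"parity"   # two data stripes
p = xor_stripes(d0, d1)         # parity stripe, written to a third drive
# The drive holding d1 fails; rebuild it from the survivors:
rebuilt = xor_stripes(d0, p)
print(rebuilt == d1)            # True
```

The same identity holds for any number of stripes, which is why any single drive in a RAID 3 or RAID 5 set can fail without data loss.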
It is important to note that RAID levels do not represent performance characteristics. Rather, the levels refer to the methodology used to store data on the storage media. For instance, RAID level 1 might meet a user's needs better than RAID level 5 if data availability is more important than capacity.
About the author
Ron Levine is a freelance writer based in Carpinteria, CA. He specializes in networking, storage devices, and emerging technology applications. Reach Ron at firstname.lastname@example.org.