The evolution of storage clustering

We've come a long way from the days of manual disk copying. Take a trip through the progression of storage clustering -- from disk sharing to the storage area network

September 1998

Abstract

This month's Connectivity column offers a reprise of the advances we've made in the area of storage management. Columnist Rawn Shah journeys through time for a look at the evolution of storage clustering, touching on SCSI, RAID, network filing systems like NFS and (PC)NFS, hierarchical storage management systems, and the modern-day storage area network. (2,300 words)

When clustering first arrived on the scene, there were two forms of resource sharing in effect: storage clusters and disk sharing. These are the two oldest forms of resource sharing still in existence today. From this basic beginning, printer and modem sharing, record-level locking databases, application-level redundancy, and parallel processing have emerged. Some forms of storage sharing have been around for decades and are still potent. For example, consider the 14-year-old Network filesystem (NFS); created by Sun Microsystems back in 1984, its basic design has hardly changed since its inception. Now, a new generation of disk sharing system, the storage area network (SAN), is coming alive. The need for storage clusters shows no sign of fading any time soon.

When storage clusters were first created, storage space was fairly expensive. The cheapest storage method was to make one group of disks available to as many users as possible. Today, you can buy gigabyte-sized drives for under $100. What cost $1 per megabyte in the early '90s has decreased to less than 10 cents per megabyte today. Unfortunately, cheaper and more available storage doesn't solve all our problems. In fact, it creates more of them. We keep adding more drive space and we keep using it all up. Software has expanded to fill the available space. More complex operating systems now take up significant room. And data files are always increasing in size and number.

In creating storage clusters we have several key goals to satisfy:

Increase available disk space
Serve multiple consumers of disk space
Reduce access time of data on the disks
Ensure the coherency or reliability of data on the disks
Ensure the security of data on the disks

The solutions for these issues come in all forms -- from RAID (redundant array of independent disks) clusters to HSM (hierarchical storage management) systems. These include:

Direct-drive or partition sharing (Microsoft Common Internet filesystem (CIFS), (PC)NFS)
Coalesced storage (NFS, the Andrew filesystem (AFS), RAID systems, SCSI clusters, Fibre Channel clusters, SANs)
Data access prioritization (RAID systems, some SANs and Fibre Channel products)
Data coherency systems (RAID systems, disk or file replication systems)
Data encryption systems (public key encryption, disk encryption)
Automated backup or archiving inactive data (HSM, SANs)
Multiple consumers and data locking (distributed lock managers, SANs, NFS)

As we said in last month's column, the definition of clustering keeps changing, but the notion of shared resources lies at its very heart. Systems like NFS and RAID can now arguably be called clusters.

SCSI and RAID
The small computer system interface (SCSI) has grown from a fast, simple system that first allowed megabyte-per-second data access speeds from a small group of disks into almost a separate networking architecture in itself. Today we have SCSI-II, Fast SCSI-II, Fast Wide SCSI-II, SCSI-III, UltraSCSI, and Differential Wide SCSI. These are all different standards that allow greater bandwidth and greater drive capacity. The technology has even expanded into SCSI hubs and switches.

SCSI provides the low-level hardware disk infrastructure to connect multiple drives to a single system. The next step combines a logical infrastructure on top of the hard disks. A system like RAID will allow you to combine multiple drives into a single logical drive optimized for performance, reliability, or a combination of both.

RAID introduces disk striping and parity. A stripe is a logical disk volume that is actually spread across several physical drives. By striping across several disks, you can relieve the bottleneck of access to each disk. Since a physical disk can rotate only so fast and a disk head can read or write only so much at a time, it makes sense to spread the data across several drives, each with its own read/write head and interface. Parity makes the data more reliable. It exists essentially because physical disk media is always susceptible to damage or data corruption. A parity block can be used to correct bitwide or bytewide errors. This kind of data consistency at the lowest level does slow down access, because parity has to be calculated or checked on each read or write operation, but in the end it's a good measure to ensure that your data is exact.
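To make striping and parity concrete, here is a minimal Python sketch. It is an illustration only, not how any particular RAID controller works: it assumes byte-wise XOR parity kept in one extra block per stripe (real arrays often rotate parity across the drives), and the function names `xor_blocks`, `stripe_with_parity`, and `rebuild_block` are invented for this example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def stripe_with_parity(data, num_data_disks, block_size):
    """Split data into stripes of num_data_disks blocks plus one parity block each."""
    stripe_bytes = num_data_disks * block_size
    # Pad with zero bytes so the data fills a whole number of stripes.
    padded = data + bytes(-len(data) % stripe_bytes)
    stripes = []
    for start in range(0, len(padded), stripe_bytes):
        blocks = [padded[start + i * block_size : start + (i + 1) * block_size]
                  for i in range(num_data_disks)]
        # The last entry of each stripe is the parity block.
        stripes.append(blocks + [xor_blocks(blocks)])
    return stripes

def rebuild_block(stripe, lost_index):
    """Recover the block at lost_index by XOR-ing all surviving blocks in the stripe."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)

# Hypothetical layout: three data disks plus parity, 4-byte blocks.
stripes = stripe_with_parity(b"storage clustering", num_data_disks=3, block_size=4)
original = stripes[0][1]
recovered = rebuild_block(stripes[0], lost_index=1)
print(recovered == original)  # True: the lost block is rebuilt from the others
```

Each entry in a stripe would live on a different physical drive, so a read can pull blocks from several drives at once, and the loss of any one block in a stripe can be made good from the rest.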

Network filing systems
NFS, CIFS, and AFS are all forms of network filesystems. These protocols combine drives from independent computing nodes into a single structure such that they can be accessed directly by users. They work at the operating system level within the filesystem architecture. NFS and AFS create a tree structure of the filesystems, essentially making an entire volume appear just like a directory on a local drive. In fact, they were built on the principle of the Unix filesystem in which a tree of drive partitions is seamless. CIFS is similar to Windows disk systems with separate drive devices and letters, each signifying a logical or physical drive.

These systems work by translating local disk function calls into network disk function calls. Each maintains records of the remote drives and their hosts. In some cases, they also maintain user identities and access rights. Drives are either mounted manually or automatically to pre-specified mount points. Some of these can even work across multiple operating systems. NFS began only as a network filing mechanism for Unix systems but has since expanded to numerous other platforms, including Windows, Macintosh, OpenVMS, and mainframe environments. The files are stored in the filesystem of the local hard disk; however, when they're accessed, the bytes are translated to the appropriate formats that the remote system understands.
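The record-keeping described above can be pictured as a mount table that maps a local path to a remote host and directory. The following Python sketch is a simplified illustration under stated assumptions, not the actual NFS client logic: the host names, export paths, and the `resolve` function are all hypothetical.

```python
import posixpath

# Hypothetical mount table: local mount point -> (server host, exported directory).
MOUNT_TABLE = {
    "/home": ("fileserver1", "/export/home"),
    "/home/projects": ("fileserver2", "/export/projects"),
}

def resolve(path, mount_table=MOUNT_TABLE):
    """Return (host, remote_path) for a path, or (None, path) if it is local."""
    path = posixpath.normpath(path)
    # Try the longest mount points first so nested mounts take precedence.
    for mount_point in sorted(mount_table, key=len, reverse=True):
        if path == mount_point or path.startswith(mount_point + "/"):
            host, export = mount_table[mount_point]
            remainder = path[len(mount_point):]
            return host, export + remainder
    return None, path

print(resolve("/home/alice/notes.txt"))        # ('fileserver1', '/export/home/alice/notes.txt')
print(resolve("/home/projects/san/plan.txt"))  # ('fileserver2', '/export/projects/san/plan.txt')
print(resolve("/tmp/scratch"))                 # (None, '/tmp/scratch')
```

A real client would go on to send the file operation to the resolved host over the network; the sketch stops at the lookup, which is the part that makes a remote volume appear as a local directory.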

(PC)NFS is probably one of the most commonly used filesystems. Through (PC)NFS, a DOS, Windows, or Macintosh system can store files and executables that can be directly executed on the logical network drive, even if the disk server is running Unix. Users on the Unix server will not be able to read these files without a conversion program, and they will certainly not be able to execute them directly on the Unix server.

The Andrew filesystem, initially developed at Carnegie Mellon University, takes NFS to the next level of global network file sharing. The technical issues involved in server file sharing are different when you have to access servers around the world. It can become a problem of efficiency and reliability -- and AFS is more capable of handling such potential problems.

Almost all Unix systems have NFS these days. Some systems like Solaris also have an automounter system that can handle mounting and unmounting volumes automatically on a per-user basis. There are two variants of NFS: NFS/UDP (better for the LAN) and NFS/TCP (optimized for the WAN).

When it comes to Windows NT, you can share disks through NFS with packages offered by third-party vendors such as Hummingbird and NetManage. Soon Microsoft will also release a package called NT Services for Unix that includes an NFS client and server for your NT system.

The ultimate in NFS servers comes from Network Appliance and Auspex Systems. Their products have specialized heavy-duty servers that are essentially storage servers optimized for disk access and data delivery over NFS.

Hierarchical storage management
HSM is for the big boys. It becomes necessary when you have so much data that you cannot cost-effectively keep it all on hard drives. HSM is an automatic archival system that manages a combination of hard drives, tape drives, and optical drives. The basic principle is that the most actively used data is kept on the hard drives, and the least actively used data is stored in a library or jukebox of tapes or magneto-optical or (in the near future) DVD discs. The system monitors use of files and data and keeps track of them by volume, by directory, or by individual file. As the data becomes less used, the system indicates that it should be transferred to intermediary and then long-term volumes. These systems often have robotic arm mechanisms to keep newer, unused data closer to the drive reader and move older data away. Access time to any data file in the archive can range from a few milliseconds directly from a hard disk to under a minute from the disc or tape library.
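The migration principle can be sketched in a few lines of Python. This is a toy policy under loud assumptions, not the behavior of any HSM product: the tier names, the 30-day and 180-day thresholds, and the `FileRecord`, `target_tier`, and `plan_migrations` names are all invented for illustration.

```python
from dataclasses import dataclass

# Assumed thresholds, in days since last access; real products use their own policies.
INTERMEDIATE_AFTER_DAYS = 30
ARCHIVE_AFTER_DAYS = 180

@dataclass
class FileRecord:
    name: str
    days_since_access: int
    tier: str = "disk"  # one of "disk", "optical", "tape"

def target_tier(record):
    """Pick the tier a file belongs on, based on how long it has been idle."""
    if record.days_since_access >= ARCHIVE_AFTER_DAYS:
        return "tape"
    if record.days_since_access >= INTERMEDIATE_AFTER_DAYS:
        return "optical"
    return "disk"

def plan_migrations(records):
    """Return (name, from_tier, to_tier) for every file that should move."""
    return [(r.name, r.tier, target_tier(r))
            for r in records if r.tier != target_tier(r)]

catalog = [
    FileRecord("payroll.db", days_since_access=2),
    FileRecord("q1-report.doc", days_since_access=45),
    FileRecord("1995-logs.tar", days_since_access=900, tier="optical"),
    FileRecord("old-demo.avi", days_since_access=3, tier="tape"),
]
for move in plan_migrations(catalog):
    print(move)
```

Note that the same rule that pushes idle files outward also recalls a recently used file from tape back to disk, which is what separates this kind of system from a one-way backup.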

If HSM sounds like a glorified backup system, that's because it is. However, the volume of data stored by some of these systems makes it crucial to automate the mechanism. HSM systems make it easier not only to archive the data but also to retrieve it in as short a time as possible. Products such as Legato NetWorker and Cheyenne/Computer Associates ARCserve are at the top of the HSM food chain.

Storage area networks
The storage area network is the new generation of disk sharing systems brought to life by the advent of Fibre Channel technology. Fibre Channel takes the model that SCSI created and expands it into the optical fiber realm while also expanding the roles of hubs, switches, and even routers. Although Fibre Channel is a generic hardware networking technology (layer 1 or layer 2 in the protocol stack), its most common application is in the disk subsystem.

Fibre Channel creates a separate network system intended only for disk subsystems and not for traffic like TCP or UDP packets on a regular LAN. It is a layer 1/layer 2 protocol in that it handles physical signaling between a storage device and the controller. But it also handles a simple form of routing and addressing. In fact, it is flexible enough to carry other disk/peripheral bus protocols like SCSI, the high performance parallel interface (HIPPI), and IBM's ESCON for mainframes. There are even TCP/IP implementations running directly over Fibre Channel. This makes it possible for older drives to hook up through an interface or controller to a SAN, which saves significant money because you won't have to replace all your disks.

Fibre Channel, as the name implies, started out as an optical fiber-only system. It now includes copper transmission systems as well. Running at speeds from 100 megabits per second to (hopefully, in the future) 4 gigabits per second, it's faster than the main buses on many systems. What's more, bandwidth can be guaranteed. Initial Fibre Channel runs are in the 2-to-10 kilometer range, but with future repeaters this can be extended to 20 to 40 kilometers. For a campus area network, this is a great solution.

The beauty of the SAN is the ability to dedicate, multiplex, and fractionalize bandwidth on a network of storage devices and provide access to this data to multiple platforms. This allows lower speed devices to use a smaller channel and faster devices to use a larger one. The ability to channel data is what allows the guaranteed access rates between devices.
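One way to picture guaranteed, fractionalized bandwidth is as a reservation scheme that never hands out more than the link can carry. The Python sketch below illustrates only that idea; it is not the Fibre Channel mechanism itself, and the `Link` class, device names, and the 1,000-megabit capacity are assumptions made for the example.

```python
class Link:
    """A shared link whose bandwidth is carved into guaranteed per-device channels."""

    def __init__(self, capacity_mbps):
        self.capacity_mbps = capacity_mbps
        self.channels = {}  # device name -> reserved megabits per second

    def available(self):
        """Bandwidth not yet promised to any device."""
        return self.capacity_mbps - sum(self.channels.values())

    def reserve(self, device, mbps):
        """Reserve a channel; refuse rather than oversubscribe the link."""
        if mbps > self.available():
            return False
        self.channels[device] = self.channels.get(device, 0) + mbps
        return True

link = Link(capacity_mbps=1000)
print(link.reserve("tape-library", 100))   # True: a slow device gets a small channel
print(link.reserve("raid-array", 800))     # True: a fast device gets a large one
print(link.reserve("second-array", 200))   # False: only 100 Mbps remain
print(link.available())                    # 100
```

Because a request is refused instead of being squeezed in, every channel that was granted keeps its full rate, which is the property the guarantee depends on.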

The storage server is the outlet from the SAN to all the computers on the network. The storage server usually has multiple connections directly to the storage buses of the host computers. For example, the Sun StorEdge A7000 allows IBM mainframes to directly connect by assuming the role of an IBM 3990 storage controller. At the same time, Unix systems can hook in with another Fibre Channel connection, and NT systems can hook their SCSI connectors directly into the server.

There is a drawback in the SAN concept. Its primary reliance on Fibre Channel means it's a developing technology that shares much in common with ATM and Gigabit Ethernet, at least in use, if not in technical structure. Why not just hook up all these disks onto an Ethernet and be done with it? The first problem is the classic set of configuration and management issues associated with IP. The second problem lies in the lack of guaranteed prioritization in standard Ethernet, Fast Ethernet, or Gigabit Ethernet. Additionally, Ethernet lacks the support for layering existing disk protocols that Fibre Channel offers, so backward compatibility is lost. Finally, although both Ethernet and SCSI systems have been around for years, there really aren't that many disk systems that have a built-in protocol stack for IP.

A solution for many
A storage sharing or clustering technology solution must be specific to your needs and budget. Most workgroups make do with simple disk sharing. When it comes to massive data storage and archival, the concept of the storage area network comes into play. The SAN is scalable, starting from simple two-drive systems on a Fibre Channel chain. It is the mechanization and management capabilities that differentiate the storage clusters of today from what we had 20 years ago.

Resources

Network Appliance, an NFS storage server vendor http://www.netapps.com
Auspex, the leader in high-end NFS storage servers http://www.auspex.com
Legato NetWorker, an excellent HSM system http://www.legato.com/prod
Cheyenne, now a division of Computer Associates, another leader in HSM http://www.cheyenne.com
Sun's high-end SAN system, the StorEdge A7000 Intelligent Storage Server http://www.sun.com/storage/white-papers/a7000-arch.html
Sun's Enterprise storage solutions with details on all its latest SAN products http://www.sun.com/storage/index.html
Ancor, a Fibre Channel switch manufacturer, offers a good overview of Fibre Channel technology http://www.ancor.com/inside.htm
"The Intelligent Storage Network: Sun's latest pursuit," May 1998 SunWorld feature story http://www.sunworld.com/swol-05-1998/swol-05-isn.html
"What exactly is a cluster, anyway?" August 1998 Connectivity column http://www.sunworld.com/swol-08-1998/swol-08-connectivity.html
Full listing of previous Connectivity columns in SunWorld http://www.sunworld.com/common/swol-backissues-columns.html#connectivity

About the author
Rawn Shah is chief analyst for Razor Research Group, covering WAN and MAN networking technology and network-centric computing. He has expertise in a wide range of technologies including ATM, DSL, PC-to-Unix connectivity, PC network programming, Unix software development, and systems integration. He helped found NC World magazine in December 1996, and has led the charge to the deployment of network-centric computing in the corporate world. Reach Rawn at rawn.shah@sunworld.com.

URL: http://www.sunworld.com/swol-09-1998/swol-09-connectivity.html