RAID basics, Part 4: Uniting systems and storage

For the past several months I've focused on high-performance redundant storage systems and their internal workings and configurations. But even if you've already designed a disk system, its controllers, and its software support, you've only begun to build a storage subsystem; you must also implement the connection scheme that provides paths from the servers to the storage. After all, it's the servers that decipher, secure, and store the data to the disk system.

How fast can you drink?
When thinking about connecting disks and servers, you might find it helpful to visualize the flow of data as a stream of water. Imagine that you are the water commissioner of a bustling, growing city whose residents demand a continuous, reliable supply of drinking water. Their needs may rise and fall, but the overall trend is towards an increasing demand. Nearby is a huge reservoir of water with a capacity far greater than the needs of the city. Between the city and the reservoir is ... a garden hose.

To placate the thirsty populace, you replace the garden hose with a single, eight-foot-wide pipe. Water is freely available -- that is, until the pipe breaks and the supply is cut off. Seeing that the single, giant pipe is also a single point of failure in your connection scheme, you replace it with four two-foot-wide pipes. Now it seems that you have a solution that strikes a balance between performance and reliability -- but when your customers' demand peaks, you can't get water to them fast enough. To solve this problem, you install holding tanks on the outskirts of town that hook directly to the city system using remnants from the original eight-foot-wide pipe. By filling the holding tank constantly with the smaller two-foot-wide pipes, you can let users drain the tank at peak moments with the full capacity of the eight-foot-wide connection.

Connecting disk systems is exactly like this scenario. Some technologies (like Fibre Channel) give you eight-foot-wide pipes, while cheaper schemes (like SCSI) use smaller pipes. Although redundant connections are expensive, you can use inexpensive connection solutions and help performance with holding tanks (caching controllers). The trick is to build a connection infrastructure that yields high performance and reliability -- and doesn't break the bank in the process.

SCSI: Old faithful
For almost 20 years SCSI has been the solution of choice for connecting disk drives to systems. SCSI originally offered a data transfer rate of just a few megabytes per second (MB/s) and allowed seven devices to be attached to a single shared bus. In those days, you might have had only one or two disk drives for an entire system, so the restrictions were not onerous.

Over time, disk drives became cheaper and larger, and people began to attach more of them to a single system. SCSI still fit the bill, with increases in performance bumping the data rate to 10 or 20 MB/s. Next came software RAID implementations, which took advantage of SCSI and managed a small number of single-system drives with good results.

SCSI has many benefits: it's easily understood, it's a universal connection standard, and, until recently, it's been a leader in the performance arena. Perhaps its most limiting factor is the number of devices on a bus; though the original limit of seven has now been theoretically upgraded to 15, real-world SCSI chains rarely exceed five devices on a single bus.

The other SCSI limit was the length of the bus itself. As bus speeds increased, the overall length of the bus fell to just a few meters. Improvements in packaging allowed more drives to be placed in single enclosures, while the introduction of differential SCSI enabled cable length to grow to over 20 meters.

In general, SCSI is a good choice for small- to medium-sized storage solutions. Because most RAID systems can handle multiple SCSI connections to the host server, you can carefully define your logical storage devices to balance the I/O load across several SCSI connections. Properly balanced, two or four UltraSCSI connections running at 40 MB/s each can together yield effective sustained throughput nearing 80 or 160 MB/s. Alas, that balancing act is tough. On the bright side, though, faster technology has arrived that eliminates all of the disadvantages of SCSI, while adding a few niceties of its own.
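
To make the balancing act concrete, here is a minimal sketch of one way to spread logical devices across channels: a greedy pass that always hands the next-busiest device to the least-loaded channel. The device names and load figures are invented for illustration, and real workloads shift over time, so treat the result as a starting point rather than a recipe.

```python
import heapq

def balance_luns(lun_loads, channel_count):
    """Greedily assign logical devices to channels.

    lun_loads: dict mapping a (hypothetical) logical device name to its
               expected sustained load in MB/s.
    Returns a list with one (total_load, [device names]) entry per channel.
    """
    # Min-heap of (current load, channel index): the least-loaded channel
    # is always at the top.
    heap = [(0.0, i) for i in range(channel_count)]
    heapq.heapify(heap)
    assigned = [[] for _ in range(channel_count)]

    # Place the heaviest devices first so the light ones can fill the gaps.
    for name, load in sorted(lun_loads.items(), key=lambda kv: kv[1], reverse=True):
        total, idx = heapq.heappop(heap)
        assigned[idx].append(name)
        heapq.heappush(heap, (total + load, idx))

    totals = [0.0] * channel_count
    for total, idx in heap:
        totals[idx] = total
    return [(totals[i], assigned[i]) for i in range(channel_count)]

# Illustrative loads only; measure your own before trusting any layout.
layout = balance_luns({"db": 30, "logs": 10, "home": 20, "scratch": 20}, 2)
for total, names in layout:
    print(f"{total:5.1f} MB/s: {', '.join(names)}")
```

A greedy pass like this does not guarantee the best possible split, but it is usually close and is easy to rerun whenever the measured loads change.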

Fibre Channel: New kid on the block
While SCSI uses traditional copper wires to carry signals from the host to the drive, Fibre Channel uses a single strand of glass to accomplish the same feat faster and with greater versatility. As was the case with SCSI, however, it has taken years for a universal Fibre Channel standard to emerge.

Fibre Channel became available when Sun introduced its first line of storage array products back in 1994. Unfortunately, the Sun implementation of Fibre Channel was proprietary and only worked with Sun storage devices. While Fibre Channel had always been envisioned as a 100 MB/s connection medium, Sun's initial implementation ran at just 20 MB/s, with a later version increasing that to 40 MB/s.

Even in its early days, Sun's proprietary Fibre Channel offered lots of advantages: it was faster than SCSI connections, relatively easy to install and use, and it supported cable lengths of up to 150 meters. This last feature made the idea of a datacenter disk farm a reality: you could build an array of storage devices in a central location and run fiber optic cables back to servers located some distance away.

While Sun made hay with its proprietary solution, other vendors finally began producing devices that complied with the real Fibre Channel specification -- running at 100 MB/s and allowing up to 127 devices per Fibre Channel bus. Sun eventually came around, and its latest round of Fibre Channel products adheres to the standard, making open disk systems based on Fibre Channel a reality.

The last important piece finally fell into place 18 months ago when it became possible to attach disk drives directly to a Fibre Channel bus. Until then, Fibre Channel ran from a server to a RAID controller, which managed drives attached by conventional SCSI buses. Systems could not see the individual drives; they only referenced the logical devices presented by the controllers. Now the path from the server to the drive is completely Fibre Channel, allowing greater flexibility in the management of your storage devices. Smart RAID controllers are still critical, but having more methods to connect devices available is a benefit.

Because it runs at much higher speeds (100 or 200 MB/s, depending on duplexing), Fibre Channel eliminates some of the need for multiple load-balanced channels in your disk subsystem. This makes storage configuration and management easier and improves overall system performance. A single 100 MB/s Fibre Channel connection will run faster and consume fewer backplane slots for its controller than an old-style setup with four 20 MB/s SCSI connections.

Creating redundant connections
The redundancy of RAID systems allows you to tolerate individual device failures within your RAID subsystem, with extra drives taking over for a failed unit while you make on-the-fly repairs. Because you've spent a lot of time and effort constructing disk systems that won't fail, you ought to make sure you build redundant connection schemes that tolerate failure as well.

The key phrase in designing your redundant connection is "single point of failure." As you lay out each component in your connection infrastructure, consider what would happen if you were to remove it while the system was running. First and foremost, would the system as a whole continue to run? In many systems, especially low-end servers without the built-in ability to detect component failure, losing a cable or controller card will cause a system crash. Often, simply having a device go offline is enough to confuse the operating system.

With this in mind, your first step should be the installation of servers that gracefully detect and handle component failure. Sun's high-end Enterprise servers, such as the Enterprise 6500 and 10000, can detect board failures, map out the offending hardware, and keep running in a reduced capacity. Mid-range systems may not handle the board-level failures, but can deal with devices going offline, cables being removed, or power failures. Again, the key is to decide which failure modes cannot be tolerated, and how much money you can afford to spend in order to circumvent such modes.

In general, you want to ensure you have at least two data paths to every storage subsystem attached to your server. If one path fails, the other should take over seamlessly. Each path includes the cabling, controller boards, SCSI or Fibre Channel adapters, RAID controller, and anything else physically in the path of the data. To ensure this level of redundancy, you'll need hardware that handles failure and software that makes the hardware all work together. Most importantly, you need to diligently work through the entire data path, deciding how each component can be made redundant and how much that redundancy will cost.
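
One way to work through the data path methodically is to list the components in each path and look for anything that appears in all of them. The sketch below does exactly that; the component names are hypothetical placeholders, not a description of any particular product.

```python
def single_points_of_failure(paths):
    """Return the components shared by every data path.

    paths: list of lists, one per path from server to storage, each
           naming the components the data passes through.
    Anything returned here takes down all paths at once if it fails.
    """
    if not paths:
        return set()
    shared = set(paths[0])
    for path in paths[1:]:
        shared &= set(path)
    return shared

# Hypothetical two-path layout that still shares one power feed.
path_a = ["adapter0", "cable0", "raid_controller_a", "power_feed"]
path_b = ["adapter1", "cable1", "raid_controller_b", "power_feed"]

print(single_points_of_failure([path_a, path_b]))  # the shared power feed
print(single_points_of_failure([path_a]))          # one path: everything is a risk
```

With only a single path, every component in it is reported, which is the point: redundancy starts with the second path, and it is finished only when the shared set is empty.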

For example, in my shop we have an HP K210 server attached to a CLARiiON 3000 RAID array. This system cannot tolerate any downtime during production hours. The K210 has two SCSI controllers in two different backplane slots, each attached to separate SCSI ports on the CLARiiON unit. HP-UX supports automatic failover if one of the SCSI connections fails; thus, the process of remounting file systems from the failed connection to the remaining connection is transparent to our applications.

Within the CLARiiON unit, each SCSI port is attached to a separate hardware RAID controller. The 30 drives in the unit are attached to six SCSI buses, each of which is attached to both controller boards. In normal operation, half the drives are managed by each controller. If one of the controllers fails, the other picks up all the drives and keeps running.
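
The ownership handoff described above can be modeled in a few lines. This is only a toy illustration of the idea, assuming two controllers labeled "A" and "B" and made-up drive names; it says nothing about how the CLARiiON firmware actually implements failover.

```python
def fail_over(ownership, failed, controllers=("A", "B")):
    """Reassign the failed controller's drives to the surviving controller.

    ownership: dict mapping drive name -> controller currently managing it.
    failed:    label of the controller that has gone offline.
    """
    survivors = [c for c in controllers if c != failed]
    if not survivors:
        raise RuntimeError("no surviving controller")
    survivor = survivors[0]  # a dual-controller unit has exactly one survivor
    return {
        drive: (survivor if owner == failed else owner)
        for drive, owner in ownership.items()
    }

# Normal operation: 30 drives, half managed by each controller.
normal = {f"disk{n:02d}": ("A" if n < 15 else "B") for n in range(30)}

degraded = fail_over(normal, "A")
print(sum(1 for owner in degraded.values() if owner == "B"))  # all 30 drives
```

Because every drive bus is wired to both controllers, the handoff is purely a change of ownership; no drive becomes unreachable.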

This may sound like overkill, but it proved its worth early this year. Over the Christmas holiday, one of the controllers in the CLARiiON shut down. When users returned to work at the start of 1999, work on the system proceeded normally. About a week into the new year, an administrator noticed the red light on the front of the CLARiiON, indicating the controller failure. The system had been running in its degraded mode for over a week without a single interruption in service. Data General shipped a new controller board, which was hot-swapped into the unit, and we were back to full redundancy without ever bothering an end user.

Clearly, the design worked and kept our shop running (although the issue of effective systems monitoring just as clearly needed to be dealt with). The overall solution was not expensive, either -- about $50,000 for the entire disk subsystem and SCSI controllers.

Creating high-performance connections
Redundancy doesn't always lead to high-performance solutions, and the same is true of connection technology. Fortunately, high throughput and redundancy tend to work hand-in-hand with your connection architecture, so spending money on one side of the equation often leads to improvements on the other.

The first step in creating high-performance connections is selecting the fastest connection technology you can afford. These days, that could mean shelling out for an infrastructure based entirely on Fibre Channel, and possibly moving to a true storage area network (which we'll cover next month). If you can afford the controllers on the server side, and the appropriate connections on the RAID system side, Fibre Channel gives you a huge performance advantage. It also grants greater flexibility with device addressability and cable lengths, and lets you place more devices, at a greater distance, on a single bus.

Even so, be careful not to overload your Fibre Channel buses. Don't feel compelled to park 127 devices on a single strand just because you can. Carefully assess the aggregate throughput of the devices on the bus and make sure you can sustain that load. With today's high-end RAID systems, you can easily saturate a single Fibre Channel connection if you aren't careful.
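
A quick sanity check for an overloaded bus is to add up what the attached devices can sustain and compare the total with what the bus can carry. The sketch below assumes a 100 MB/s bus and an illustrative 10 MB/s per drive; substitute measured figures for your own hardware.

```python
def bus_utilization(device_rates_mb_s, bus_capacity_mb_s):
    """Ratio of aggregate device throughput to bus capacity.

    A result above 1.0 means the devices can collectively push more
    data than the bus can carry.
    """
    return sum(device_rates_mb_s) / bus_capacity_mb_s

# Assumed figures: twelve drives sustaining 10 MB/s each on a 100 MB/s bus.
drives = [10.0] * 12
utilization = bus_utilization(drives, 100.0)

print(f"utilization: {utilization:.0%}")
if utilization > 1.0:
    print("bus is oversubscribed; split the devices across more buses")
```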

If you can't afford Fibre Channel, SCSI is still a good alternative. It is doubtful that a single SCSI bus will meet any but the smallest system's needs, but running multiple SCSI attachments in parallel is a great way to build an effective high-performance connection scheme. Again, the goal is to determine the maximum throughput you need to keep your servers fed with data, and to provide enough channels to carry it all.
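
Sizing the number of channels is a matter of dividing the throughput you need by what one channel sustains and rounding up. In this sketch the per-channel figure is meant to be the sustained rate you actually observe, not the rated peak, and the optional `spare` parameter is my own addition for holding extra channels in reserve.

```python
import math

def channels_needed(required_mb_s, per_channel_mb_s, spare=0):
    """Channels required to carry the load, plus optional spares."""
    if per_channel_mb_s <= 0:
        raise ValueError("per-channel throughput must be positive")
    return math.ceil(required_mb_s / per_channel_mb_s) + spare

# Assumed workload: servers need 130 MB/s; each channel sustains 40 MB/s.
print(channels_needed(130, 40))           # rounds 3.25 up to 4
print(channels_needed(130, 40, spare=1))  # one more for headroom
```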

Keep in mind that bandwidth on a disk connection is a tricky thing: 1,024 I/O operations of 512 bytes each are not the same as 64 I/O operations of 8 kilobytes each, even though both move the same 512 kilobytes of data. Bandwidth must be measured in both raw bytes per second and in total I/O operations per second. Depending on how your RAID controller works and the amount of cache available, you may be able to initiate anywhere from 100 to several thousand I/Os per second on a single bus, either SCSI or Fibre Channel. If your server can generate 1,000 operations per second, but your low-end RAID controller can only handle 200 per second, it doesn't really matter what the interconnect is: the controller will hit the wall long before you saturate either SCSI or Fibre Channel.
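
The arithmetic behind these two measurements is easy to check. The sketch below totals the bytes moved by each workload and then finds the slowest stage among server, controller, and bus. The 40 MB/s bus rate and the use of decimal megabytes are assumptions made for the example.

```python
def bytes_moved(op_count, op_size_bytes):
    """Total bytes transferred by a batch of equally sized operations."""
    return op_count * op_size_bytes

def effective_iops(server_iops, controller_iops, bus_mb_s, op_size_bytes):
    """Operations per second the whole chain can sustain.

    The chain runs only as fast as its slowest stage. The bus limit is
    derived from its byte rate, assuming 1 MB = 1,000,000 bytes.
    """
    bus_iops = bus_mb_s * 1_000_000 / op_size_bytes
    return min(server_iops, controller_iops, bus_iops)

# Same data volume, very different operation counts.
print(bytes_moved(1024, 512))   # many small operations
print(bytes_moved(64, 8192))    # few large operations

# A server issuing 1,000 ops/s through a controller limited to 200 ops/s,
# on an assumed 40 MB/s bus with 8 KB operations.
print(effective_iops(1000, 200, 40, 8192))
```

Both workloads move 524,288 bytes, but the first costs the controller sixteen times as many operations, and in the second calculation the controller, not the bus, sets the ceiling.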

As you assess and install, you may find that the multiple connections you need to increase throughput also double nicely as redundant data paths in the event of a failure. The trade-off is that a degraded system, missing a controller card or cable, will run more slowly than a fully operational one. This trade saves you money without costing you downtime, and users will often tolerate short periods of slow response while you replace failed components. In general, choosing SCSI as your connection scheme almost always winds up providing multiple redundant paths. Fibre Channel, with its lure of high bandwidth, may cause you to overlook opportunities to balance loads across multiple connections and provide redundant paths in the event of a failure.
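
To see what degraded mode costs, compare the throughput left after a failure under each approach. The figures below (four 40 MB/s SCSI paths versus one 100 MB/s Fibre Channel path) are illustrative assumptions, and the model ignores any failover overhead.

```python
def degraded_capacity(path_count, per_path_mb_s, failed=1):
    """Aggregate throughput remaining after `failed` paths are lost."""
    remaining = max(path_count - failed, 0)
    return remaining * per_path_mb_s

# Four parallel SCSI paths: lose one, keep three quarters of the bandwidth.
print(degraded_capacity(4, 40))   # 120 MB/s remaining of 160

# A single Fibre Channel path: lose it and nothing is left.
print(degraded_capacity(1, 100))  # 0 MB/s remaining
```

The single fast link wins on paper until something breaks, which is why a second Fibre Channel path is worth budgeting for even when one link carries the load comfortably.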

URL: http://www.sunworld.com/swol-09-1999/swol-09-raid4.html

Learn how to connect your new RAID system to your servers