A look at the Ultra hardware
System integration drives performance improvements
Sun Microsystems has introduced an entirely new architecture for high-speed computing based on the UltraSPARC processor. The Ultra line of computers uses advances in system interconnects, memory, multimedia instruction sets, and high-speed networking to bring system performance to a whole new level. PLUS: A chart comparing the Ultras' price and performance to other desktop workstations. (2,000 words including one sidebar)
Sun Microsystems Computer Corp. has begun rolling out the Ultra line of workstations and servers, which incorporates advances in microprocessors, system interconnect technology, memory, and multimedia RISC computing. Sun claims the new computers, designed to be "the basis of our desktop product line in the latter half of the '90s," deliver orders-of-magnitude improvements for some applications. The company expects the Ultra architecture will give it a leg up on its competitors, who seem to be focusing mainly on improving the speed of the processor.
"It is like [competitors] are dropping a Ferrari engine into a Yugo chassis," says Mark Ross, a Sun engineer who helped develop the Ultra architecture. "You can do that, but don't expect to get extra performance out of it. We have upgraded every aspect of the architecture."
"This is obviously a big step forward" for Sun, notes Linley Gwennap, editor of the Microprocessor Report. "They're back in the race. From a performance standpoint it really puts Sun in front of everyone but Digital."
At the heart of the new computer is the UltraSPARC chip. It contains extra floating-point units, which enable the 167-MHz chip to deliver 351 SPECfp92 and 252 SPECint92, according to Sun. But these performance numbers do not highlight the chip's advanced multimedia capabilities and switch-bus architecture.
Sun engineers designed registers in the chip to handle a set of RISC-like instructions called the Visual Instruction Set (VIS). VIS enables a single instruction to operate on multiple pieces of data. It is optimized for multimedia applications in which the same operation must be performed repeatedly, as is the case with video compression, 2D graphics manipulation, and 3D rendering.
For example, most video compression techniques, such as MPEG, require that motion estimation be performed for each pixel. This is done by summing the differences for each pixel in a region between one frame and the next. Most processors must perform calculations on one pixel at a time and require on the order of 48 instructions to operate on 8 pixels. Using VIS, the UltraSPARC is able to perform the same operation with a single instruction that takes 3 clock cycles to run. Consequently, Sun says, the UltraSPARC is the only processor that can do real-time full-screen video compression entirely in software today.
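The motion-estimation step described above boils down to a sum of absolute differences over a block of pixels. The Python sketch below is illustrative only. The function names and sample pixel values are hypothetical, and Python has no access to VIS, so the "packed" version merely mirrors the idea of treating eight 8-bit pixels as one 64-bit operand rather than showing how the hardware does it.

```python
def sad_scalar(block_a, block_b):
    """Sum of absolute differences, computed one pixel at a time."""
    total = 0
    for a, b in zip(block_a, block_b):
        total += abs(a - b)
    return total


def pack8(pixels):
    """Pack eight 8-bit pixels into one 64-bit integer (first pixel in the high byte)."""
    if len(pixels) != 8 or any(not 0 <= p <= 255 for p in pixels):
        raise ValueError("expected eight values in the range 0..255")
    return int.from_bytes(bytes(pixels), "big")


def sad_packed(word_a, word_b):
    """Emulate one packed operation that compares eight pixels at once.

    Assumption: this only models the data layout; a real SIMD instruction
    would do the eight subtractions and the final sum in hardware.
    """
    a = word_a.to_bytes(8, "big")
    b = word_b.to_bytes(8, "big")
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical rows of eight pixels from two consecutive frames.
prev_row = [10, 20, 30, 40, 50, 60, 70, 80]
next_row = [12, 18, 30, 45, 50, 55, 70, 90]

print(sad_scalar(prev_row, next_row))
print(sad_packed(pack8(prev_row), pack8(next_row)))
```

Both calls return the same value. The difference the article describes lies in how many instructions the processor needs to get there.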
Sun is incorporating VIS support into standard graphics libraries like OpenGL and XGL. This will enable companies to run their existing compiled binaries, which use these libraries, on their UltraSPARC computers to take advantage of VIS.
"There are some floating-point-intensive applications used in science where you can pick up some additional performance by recompiling for the UltraSPARC computers, thanks to the extra UltraSPARC floating-point registers," says Ken Okin, vice president of engineering for desktops at Sun. "Recompiling may not be for everyone. But for those who need every ounce of performance, it is an option."
Graphics performance on the Ultra computers is further enhanced by a Fast Frame Buffer (FFB) that contains 3D RAM and a Frame Buffer Control (FBC) ASIC. The FBC is capable of performing more than 2 billion operations per second on some graphics applications, such as filtering out pixels that are not visible in a scene. It controls from four to 12 3D-RAM chips, depending on the Ultra model.
Each 3D-RAM chip contains 10 megabytes of DRAM. These chips are connected via a 256-bit bus to an SRAM-based pixel buffer. This lets the cheaper DRAM deliver the performance of SRAM, which is about 40 times faster than DRAM and an order of magnitude more expensive. In addition, a processor inside the 3D RAM can quickly perform operations that require large quantities of data to be read from memory, processed, and returned to memory without consuming bandwidth on the system interconnect, or even the graphics card. The result: the 3D RAM can deliver 1.8 million triangles per second, compared to only 210,000 when using 2-megabyte VRAM.
Recognizing that 3D RAM's benefits require moving large quantities of data, Sun has developed the Ultra Port Architecture (UPA), a packet-switched crossbar interconnect that enables the processor, graphics, and memory components to communicate with each other at up to 1.3 gigabytes per second, with a sustained rate of up to 300 megabytes per second for transfers between memory and graphics.
Since the UPA is packet-switched, it lets multiple devices talk to each other simultaneously. In addition, it can translate between the different interconnect widths of the devices. (The processor port is 144 bits wide. The SBus and visual computing ports are both 72 bits wide.)
The memory interconnect -- 288 bits wide in the Ultra-1 line and 576 bits wide in the Ultra-2 line -- enables sustained throughput between memory and graphics components over the UPA at 300 megabytes per second for the Ultra-1 computers and 600 megabytes per second for the Ultra-2 line. This throughput rate significantly exceeds the SPARCstation 20's peak sustained throughput of 100 megabytes per second.
"With the switch, a central interconnect, you can make the physical interconnect lines shorter," explains Les Poltrack, group manager of desktop marketing at Sun Microsystems. "You have less capacitance and inductance on a shorter line. So you can run the bus that goes into the interconnect at a faster speed. We can run it at a faster clock, we can do multiple things at once." Poltrack notes that Sun engineers decided in 1991 to employ a switched architecture in their next-generation architecture.
The components themselves need interconnect widths only in multiples of 64 bits; the extra bits on each port carry error-checking codes. This gives the UltraSPARC computers an extra level of protection against data corruption within the computer itself -- an important safeguard against alpha particle disturbances that can disrupt the state of the DRAM. Although quite rare, such disturbances can occasionally corrupt data or even bring down a system.
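The port widths quoted earlier (72, 144, 288, and 576 bits) are all multiples of 72, which is consistent with one common arrangement: 8 check bits for every 64 data bits. That ratio is an assumption made for this sketch; the article says only that the extra bits are used for error-checking codes. The function name is hypothetical.

```python
DATA_BITS_PER_WORD = 64   # from the article: widths are needed in multiples of 64
CHECK_BITS_PER_WORD = 8   # assumption: 8 check bits accompany each 64-bit word


def split_width(total_bits):
    """Split a port width into (data bits, check bits) under the assumption above."""
    word = DATA_BITS_PER_WORD + CHECK_BITS_PER_WORD
    if total_bits % word != 0:
        raise ValueError(f"{total_bits} is not a multiple of {word}")
    words = total_bits // word
    return words * DATA_BITS_PER_WORD, words * CHECK_BITS_PER_WORD


# Port widths as given in the article.
ports = [
    ("SBus / visual computing port", 72),
    ("Processor port", 144),
    ("Ultra-1 memory interconnect", 288),
    ("Ultra-2 memory interconnect", 576),
]

for name, width in ports:
    data, check = split_width(width)
    print(f"{name}: {width} bits = {data} data + {check} check")
```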
Sun has extended error checking and correction (ECC) protection beyond the memory system to include the major system buses. This protects against intermittent signal glitches.
Inside the I/O bridge between the SBus and the UPA is a streaming buffer that can support 16 streams between peripherals and the UPA. The streaming buffer helps reduce the overhead required for communicating between the UPA and SBus cards that transfer fewer than 64 bytes of data at a time. For example, when a device requests only 16 bytes at a time, the streaming buffer grabs 64 bytes of data and holds them for the peripheral. Subsequent requests pull data from the buffer instead of from memory.
When all 64 bytes have been consumed by the peripheral, the streaming buffer pre-fetches the next set of data from memory. This is useful for minimizing the latency required to receive large quantities of data from memory, and so helps to increase the effective throughput of the SBus.
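The read path just described can be modeled in a few lines. This is a simplified sketch, not Sun's implementation: the class and attribute names are hypothetical, it models a single stream, and it assumes requests never cross a 64-byte line boundary.

```python
LINE_BYTES = 64  # line size given in the article


class ReadStream:
    """Toy model of one read stream in the streaming buffer."""

    def __init__(self, memory):
        self.memory = memory
        self.line_start = None   # address of the line currently buffered
        self.line = b""
        self.memory_reads = 0    # number of 64-byte fetches from memory

    def _fetch(self, line_start):
        self.line = self.memory[line_start:line_start + LINE_BYTES]
        self.line_start = line_start
        self.memory_reads += 1

    def read(self, addr, size):
        """Serve a small peripheral request from the buffered line."""
        line_start = addr - addr % LINE_BYTES
        offset = addr - line_start
        if offset + size > LINE_BYTES:
            raise ValueError("sketch assumes requests stay within one line")
        if line_start != self.line_start:
            self._fetch(line_start)
        data = self.line[offset:offset + size]
        # Once the last byte of the line is consumed, prefetch the next line.
        next_start = line_start + LINE_BYTES
        if offset + size == LINE_BYTES and next_start < len(self.memory):
            self._fetch(next_start)
        return data


memory = bytes(range(256))
stream = ReadStream(memory)
chunks = [stream.read(addr, 16) for addr in range(0, 128, 16)]
print(len(chunks), "requests served with", stream.memory_reads, "memory fetches")
```

In this run, eight 16-byte requests are served with three line fetches, the third being the prefetch of a line the device has not yet asked for.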
The streaming buffer is also useful for transferring data to memory. The memory operates on 64-byte chunks of data. When a device writes fewer than 64 bytes directly to memory, the memory system has to read a 64-byte line, modify it, and then write it back. The streaming buffer can instead combine multiple transfers into one 64-byte line. This line can be written directly to the memory without a read or modify operation, thus saving memory bandwidth.
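The saving on the write path can be illustrated by counting line operations with and without combining. Again, this is a minimal sketch under stated assumptions: the class names are hypothetical, writes arrive sequentially, and each stream starts on a 64-byte boundary.

```python
LINE_BYTES = 64  # line size given in the article


class Memory:
    """Toy memory that counts 64-byte line reads and writes."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.line_reads = 0
        self.line_writes = 0

    def write_partial(self, addr, payload):
        """Read-modify-write: the cost of a write smaller than one line."""
        start = addr - addr % LINE_BYTES
        line = bytearray(self.data[start:start + LINE_BYTES])
        self.line_reads += 1
        offset = addr - start
        line[offset:offset + len(payload)] = payload
        self.data[start:start + LINE_BYTES] = line
        self.line_writes += 1

    def write_line(self, start, line):
        """Write a complete line; no read is needed first."""
        self.data[start:start + LINE_BYTES] = line
        self.line_writes += 1


class WriteStream:
    """Toy write-combining stream: gathers small writes into one full line."""

    def __init__(self, memory):
        self.memory = memory
        self.start = None
        self.pending = bytearray()

    def write(self, addr, payload):
        if self.start is None:
            if addr % LINE_BYTES:
                raise ValueError("sketch assumes a line-aligned start")
            self.start = addr
        if addr != self.start + len(self.pending):
            raise ValueError("sketch handles sequential writes only")
        if len(self.pending) + len(payload) > LINE_BYTES:
            raise ValueError("sketch assumes writes do not cross a line")
        self.pending += payload
        if len(self.pending) == LINE_BYTES:   # full line: flush in one write
            self.memory.write_line(self.start, self.pending)
            self.start = None
            self.pending = bytearray()


direct = Memory(128)
combined = Memory(128)
stream = WriteStream(combined)
for addr in range(0, 64, 16):
    payload = bytes([addr] * 16)
    direct.write_partial(addr, payload)
    stream.write(addr, payload)

print("direct:  ", direct.line_reads, "reads,", direct.line_writes, "writes")
print("combined:", combined.line_reads, "reads,", combined.line_writes, "writes")
```

Both memories end up with identical contents, but the direct path costs four line reads and four line writes, while the combined path costs a single line write.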
A well-rounded architecture
From a management standpoint, one of the attractive aspects of these first UltraSPARC computers is that they enable companies to make the most of their investment in peripherals and memory. SBus peripherals can be plugged right into the SBus. You can also reuse the memory modules currently installed in your SPARCstation 20.
The Ultra-1 line of computers is only the first of the Ultra series. The Ultra-2 is just around the corner, and the Ultra-3 is still in the lab. These later models will further enhance the Ultra architecture thanks to even more improvements in microprocessors, interconnects, memory, graphics, and RISC technology.
Although the Ultra computers may offer the most impressive multimedia performance around, they may not offer the best value for every application. Sun says the Ultra computers' key target markets include 3D mechanical CAD, oil and gas, and university and research labs, and no doubt expects the Ultra to win lots of sales in these areas. But some competing machines claim impressive price-performance figures. (See the sidebar "Ultra computers vs. the competition" for a visual comparison.)
"If you are looking at it from an open playing field, it is not clear that the UltraSPARC gets the best price to performance," notes Gwennap. For example, Intel's Pentium Pro (P6) is hitting 275 SPECint92 with a 150-MHz chip, compared to about 250 SPECint92 for the 167-MHz UltraSPARC chip. However, the UltraSPARC still boasts a performance advantage at the system level thanks to the high-speed UPA.
The Pentium Pro and PowerPC-based workstations (such as IBM's low-priced PowerPC 604-based workstation) are putting significant price pressure on workstations, says Gwennap. "We expect to see some relatively low cost Pentium Pro servers coming out in 1996. They may not offer as much performance as the UltraSPARC, but they are a lot cheaper."
"While Sun is jumping out in front of competitors, it isn't necessarily going to stay there," adds Gwennap. "HP's got a path to catch up; SGI's working on the R10000." And others aren't standing still.
"Given enough time, our competitors will build something with similar performance," acknowledges Mark Ross. "But by then we will be onto the next level.
Regardless of who's in the lead, Sun's Ultra computers have catapulted the company to the head of the pack, and their formidable performance and architecture will no doubt keep Sun in the running for some time to come.
Ultra computers vs. the competition
With the Ultra computer, Sun jumps from being an also-ran to the head of the pack. Just how big a leap does the Ultra computer provide over Sun's previous top desktop, the SPARCstation 20? And how does it compare to similarly configured machines in the $10,000-$20,000 price range? Take a look:
At the prices shown above, these computers include at least 64 megabytes of RAM, a 1-gigabyte disk drive, a color monitor, keyboard, mouse, and a Unix operating system. (Intergraph substitutes NT for Unix.)
Ultra 2 uses two CPUs to further boost SPEC numbers
Sun reports its Ultra 2 Creator3D Model 2200 ($59,995) delivers 332 SPECint92 and 505 SPECfp92. The Ultra 2 relies on two processors to match the SPEC numbers of Digital Equipment Corp.'s DEC 600 5/300 ZLXp-L2, which sports a single 300-MHz Alpha 21164.
These SPEC figures were obtained from vendors and may be based upon more expensive configurations with the same CPU.
About the author
George Lawton is a writer and consultant based in Brisbane, California. Reach George at email@example.com.