Building the Internet backbone
Take a virtual tour of a busy Network Access Point
The Internet, of course, consists of many networks strung together. The Internet's skeleton is made up of high-capacity trunk lines maintained by telephone companies at sites called "NAPs." This article outlines the role of NAPs, and profiles Pacific Bell's NAP in particular. (2,700 words)
Over time, you become convinced that the situation is worsening, that the Internet is condemned to a fiery death by overload. And you are not alone in such thoughts. Robert Metcalfe, inventor of Ethernet, has gone so far as to predict that overload will kill the Internet within the next year.
Apocalyptic predictions about the Internet have not deterred the companies constructing tomorrow's Network Access Points (NAPs), which interconnect Internet Service Providers (ISPs) at regional nodes. (See the sidebar A history of the Network Access Point for more background.) As the maintainer of the NAP in California, which carries more Internet traffic than any other, Pacific Bell has taken the lead in building a next-generation backbone with enough capacity to handle up to 20 gigabits per second of raw Internet traffic on its two main backbone switches. In practice that total cannot be fully used unless the traffic on both switches is exactly balanced, but it is still enough bandwidth to handle about 200 times the average loads Pacific Bell has experienced recently. By the end of the year, Pacific Bell will begin to connect ISPs to the NAP at 622 megabits per second, roughly 12 times the capacity other NAPs offer today.
The real problem with maintaining the Internet backbone is not traffic capacity; that will scale into the terabit range with the right ATM switches. The real problem is keeping track of all the networks that form the Internet. ISPs ranging in size from mom-and-pop companies to Sprint have all stumbled into difficulties that briefly shut down their networks. This process of a network shutting down and coming back up is called "flapping." As a network flaps, messages are sent to routers all over the Internet so they can make optimal routing decisions based on the state of the network in real time. As the Internet grew, flapping became such a traffic burden that NAPs deployed a new Routing Arbiter infrastructure, built around SPARCstations, that serves routing information to routers as required.
Upgrading the Net to telephone quality
When telephone companies set standards for quality of service, they shoot for 99.9999 percent reliability, which works out to the average telephone being out of service for no more than a few minutes each year. When the NSF was looking for companies to handle the NAPs, a stipulation of the agreement was that each NAP be maintained at a reliability of at least 99.92 percent, which is quite good by Internet standards.
To achieve this kind of quality, Pacific Bell had to develop an architecture that could scale well. At the time, Cisco routers were the most common choice for switching packets between different networks, and FDDI was the only practical choice for carrying high-speed data across metropolitan areas. But routers did not scale up to support interfaces faster than 45 megabits per second, and FDDI's 100-megabit-per-second capacity does not scale at all. In fact, because FDDI bandwidth is shared, the more people on a given FDDI segment, the slower the network becomes for everyone.
When Warren Williams, current director of NAP Project Management, arrived at Pacific Bell's NAP, they already had two switches in place. Williams said, "I decided not to scrub the technology by staying with ATM because of the promise of scalability, hoping that ATM performance problems could be solved quickly."
ATM switches are easy to scale as needs grow. Since ATM is becoming a standard for voice and data traffic, the equipment makers are building interfaces into SONET, which can carry 2.4 gigabits per second over standard equipment today and 9.6 gigabits per second in the next couple of years. ATM is also ideal for carrying real-time traffic such as voice and video when it is implemented correctly.
Moving to ATM requires something of a leap of faith since it is not an established technology, and a number of ISPs have moved away from ATM after running into difficulties. For example, UUNET made headlines a few years ago when it declared it had moved its backbone to ATM. But it has since switched to Frame Relay, a more established technology.
Alan Taffel, vice president of sales and marketing at UUNET, said, "We started with an ATM backbone in the early 90s but we do not have it any longer. A lot of people think it is going backwards until they look at it closely. We transitioned for manageability, maturity, cost, and overhead." Overhead is the extra data a protocol must send along to manage communications; it varies by the type of traffic.
"Overhead is a very significant element, because unless you are mixing pure voice and pure video with data, it is not worth the overhead that ATM adds to the data," Taffel said. "Frame Relay adds less than 10 percent overhead while ATM's typical overhead can be 50 percent. People don't latch onto the ramification that ATM uses fixed-length cells, and that really means fixed length. Consider a common packet size like 64 bytes and you see the problem right away." ATM cells have a fixed length of 53 bytes, 48 of which can be used to carry data. Therefore, a 64-byte packet would have to be sent in two ATM cells, requiring 106 bytes of bandwidth.
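Taffel's arithmetic can be checked with a short sketch. This is a simplification that ignores AAL5 framing and trailer padding, so real-world overhead can be slightly higher still:

```python
import math

ATM_CELL_BYTES = 53      # fixed ATM cell size
ATM_PAYLOAD_BYTES = 48   # usable payload per cell (5-byte header)

def atm_overhead(packet_bytes: int) -> tuple[int, float]:
    """Return (bytes on the wire, overhead fraction) for one packet.

    Each packet is segmented into ceil(packet/48) fixed-length cells
    of 53 bytes each; unused payload in the last cell is wasted.
    """
    cells = math.ceil(packet_bytes / ATM_PAYLOAD_BYTES)
    wire = cells * ATM_CELL_BYTES
    return wire, (wire - packet_bytes) / wire

wire, frac = atm_overhead(64)
print(wire, round(frac * 100, 1))  # 106 bytes on the wire, ~39.6% overhead
```

A 64-byte packet thus consumes two cells and 106 bytes of bandwidth, just as Taffel describes.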
However, the overhead Pacific Bell experienced with ATM is not much higher than other networking technologies. Pacific Bell has measured ATM overhead at about 20 percent compared to 40 percent overhead for Ethernet and 15 percent for FDDI. "The point here is," according to Williams, "that with variable-length packets crossing a switching matrix, like an FDDI or Ethernet switch, the variability of the packet's size creates the need for an algorithm to detect the beginning of the next packet. With fixed-length cells (as in an ATM switch) there is no need for that extra processing. The switch knows exactly where the next cell begins. This gets into the religious side of the argument.
"Does the gain from the fixed-length processing override the loss from the 'white space' left when a packet does not exactly meet the 48-byte payload of an ATM cell? Frankly, I don't care," said Williams. "The key for us is the fact that the switches are wickedly fast, overhead or not, and that we can scale to six times current LAN-based technologies (and that will go even higher when the chipsets for OC-48 (2.4 gigabits per second) and OC-192 (9.6 gigabits per second) are developed)."
Williams admits that in the early days, there were some serious problems with doing IP over ATM. "In late 1994 Pacific Bell got a black eye from doing IP over ATM. People had not harnessed congestion management and throughput."
The problem with running TCP/IP over ATM is that each TCP connection keeps increasing its transmission rate, roughly doubling its congestion window every round trip, until a given link is operating at capacity. Where that limit falls depends on the throughput of the switch and the size of its buffer. Williams said the first ATM switch Pacific Bell used for voice and video did not handle IP traffic well because of its buffer. After scouring the marketplace, Pacific Bell found much better results with Stratacom's new ATM switch, which was designed to handle IP, thanks in part to a buffer that can hold 24,000 cells per port. A fully loaded switch can buffer 600,000 to 700,000 cells.
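The interaction between TCP's ramp-up and the switch buffer can be sketched as a toy model. It assumes idealized slow start with no loss or backoff, and the starting window of one cell is invented for illustration:

```python
def slow_start_windows(start_cells: int, buffer_cells: int) -> list[int]:
    """Toy model of TCP slow start: the congestion window (measured
    here in ATM cells) doubles every round trip until it would
    overflow the switch's port buffer."""
    windows, window = [], start_cells
    while window <= buffer_cells:
        windows.append(window)
        window *= 2
    return windows

# With a 24,000-cell port buffer, a connection can keep doubling for
# about 14 round trips before the buffer becomes the limiting factor.
print(len(slow_start_windows(1, 24_000)))  # 15 round trips fit
```

The point of the model: a deeper buffer lets the exponential ramp-up run longer before cells are dropped, which is why the per-port buffer size mattered so much for IP traffic.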
Pacific Bell pushed the switch to its limits. Williams said, "We put it through some very thorough testing. Our test results came out to be about an inch and a half thick."
Using its experience, Pacific Bell created a top-20 list of features it wanted in the next switch. Pacific Bell's close work with Stratacom in perfecting IP over ATM may have been a factor in Stratacom's recent acquisition by Cisco.
Pacific Bell's ATM architecture has given it a unique position from which to grow with the Internet. For example, it is the only NAP maintainer to create a virtual NAP, which allows ISPs to run fiber to Pacific Bell offices in Oakland, Palo Alto, and San Francisco. The other NAPs each occupy a single physical location. Williams points out that it would have been technically possible to include Los Angeles as well, but utility regulators would not allow Pacific Bell to carry traffic between local access and transport areas (LATAs). However, Pacific Bell will be allowed to build its NAP into any LATA in the state once the new Telecommunications Reform Bill goes into effect next year.
Pacific Bell's ATM switch will give it room to grow significantly. The company is in the process of upgrading to a 20-gigabit-per-second backbone that should last a while, considering current traffic is still under 100 megabits per second. Even at its current growth rate of 522 percent a year, the 20-gigabit-per-second backbone should be able to meet Williams' goal of 99.92 percent reliability for the next three years.
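The three-year figure follows from simple arithmetic, assuming growth stays at 522 percent a year:

```python
import math

capacity_mbps = 20_000  # the 20-gigabit backbone
current_mbps = 100      # current traffic
growth = 6.22           # 522 percent annual growth = x6.22 each year

# Years until traffic reaches capacity: solve current * growth**t = capacity
years = math.log(capacity_mbps / current_mbps) / math.log(growth)
print(round(years, 1))  # about 2.9 years of headroom
```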
Tracking the networks
While adding more bandwidth to the Internet will enable more traffic, it does nothing to solve the routing problems associated with more networks connecting to the Internet. ISPs have to be able to track all changes in the status of networks to determine the best route between two points. Originally, ISPs had to track every single network on the Internet, but this job has been simplified by the development of Classless Inter-Domain Routing (CIDR), which defines address assignment and aggregation methods with the goal of minimizing the size of routing tables in border (top-level) routers.
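Python's standard ipaddress module can illustrate the aggregation CIDR makes possible. The prefixes below are documentation-style examples, not real assignments:

```python
import ipaddress

# Four contiguous /24 networks assigned to one provider...
nets = [ipaddress.ip_network(f"198.51.{i}.0/24") for i in range(100, 104)]

# ...collapse into a single /22, so a border router advertises and
# stores one aggregate route instead of four individual ones.
summary = list(ipaddress.collapse_addresses(nets))
print(summary)  # [IPv4Network('198.51.100.0/22')]
```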
This approach has worked well, up to a point, but even with CIDR in place, a typical router can spend a considerable amount of time calculating route changes. As more routers are interconnected, the number of connections required for them to inform each other about changes grows at the rate of n(n-1)/2. Ten routers would require only 45 connections to exchange routes with each other; by the time you get to 20 routers, you need 190.
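The quadratic growth is easy to verify: a full mesh of n routers needs n(n-1)/2 pairwise sessions.

```python
def mesh_sessions(n: int) -> int:
    """Pairwise sessions needed for a full mesh of n routers: n(n-1)/2."""
    return n * (n - 1) // 2

for n in (10, 20, 50):
    print(n, mesh_sessions(n))  # 10 -> 45, 20 -> 190, 50 -> 1225
```

Doubling the routers roughly quadruples the sessions, which is why a centralized route server scales so much better than pairwise peering.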
To reduce network traffic and router overhead associated with route processing, the NSF mandated the creation of a Routing Arbiter (RA), which is managed by the Routing Arbiter Project, a joint undertaking of Merit Network Inc. and the University of Southern California Information Sciences Institute (ISI). The RA does all of the route processing so that the individual routers can focus on moving packets.
The Route Servers (RSs) are Sun SPARCstation 20 workstations running SunOS with 128 megabytes of memory and several gigabytes of disk space. The server application is maintained by the Routing Arbiter Project. Each NAP has two Route Servers for redundancy in case one goes down or gets overloaded. ISPs make peering agreements with the RS instead of with each individual ISP. ISPs can also specify to the RS a policy of whom they will exchange traffic with, which is recorded in a Routing Information Base within the RS. Thus when an ISP's router requests a route, the server sends back only valid routes that meet that ISP's routing policy.
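The policy lookup can be sketched with a toy Routing Information Base. All ISP names and prefixes here are invented, and a real route server speaks BGP and evaluates far richer policy than this:

```python
# Toy RIB: prefix -> originating ISP (all names hypothetical).
rib = {
    "198.51.100.0/22": "ISP-A",
    "203.0.113.0/24":  "ISP-B",
    "192.0.2.0/24":    "ISP-C",
}

# Each ISP registers the peers it has agreed to exchange traffic with.
policy = {"ISP-D": {"ISP-A", "ISP-C"}}

def routes_for(isp: str) -> dict:
    """Return only the routes whose origin the requesting ISP peers with."""
    peers = policy.get(isp, set())
    return {prefix: origin for prefix, origin in rib.items() if origin in peers}

print(routes_for("ISP-D"))  # ISP-B's route is filtered out
```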
If the RS relayed all the traffic among NAP-attached ISPs, the load would be almost as bad as if the routers talked to each other directly. Instead, the RS uses the third-party routing capabilities of the Border Gateway Protocol, built into Cisco routers, to exchange routing data. Traffic is then exchanged directly between the routers on the NAP, even though the route is retrieved from the RS.
The RS architecture has solved a number of problems associated with Internet growth. For one, it has reduced the number of connections between routers and the traffic associated with route processing. In addition, it has reduced the amount of processing that each ISP must do on its own routers, so these will have more time to move packets. As more efficient ways of processing routes become apparent, it will be far easier to deploy these technologies in a few centralized route servers than it would be to update all the routers on the Internet.
Not on their shift
In the end, the question of whether or not the Internet melts down will be decided by the engineers in the field who push networking technology to new levels, not the pundits at the podium. The technology for moving and routing packets in the backbone appears well established and should scale to meet the growing demands of more users using more bandwidth for the next couple of years, at least in California. However, that thought provides little solace to those who must endure the "connecting" message while waiting for a Web page to appear.
A history of the Network Access Point

In the early days of the Internet, the National Science Foundation Network (NSFNET) was established to interconnect regional networks as well as schools and universities. At the time, its 56-kilobit-per-second links were enough to carry the modest amounts of traffic traversing the Internet. But as the Internet grew, so did the backbone: to 1.5 megabits per second in 1987, and to 45 megabits per second a few years later. At about the same time, commercial Internet providers began to build their own interconnection, the Commercial Internet eXchange (CIX), to allow their customers to conduct business over the Internet, which was technically a no-no on NSFNET.
Despite the creation of CIX, NSFNET backbone traffic continued to grow at an exponential rate, leading NSF to decide to get out of the network backbone business so that Internet growth could be supported by paying customers. In 1993, the NSF drafted a plan for a new network architecture that included four commercial Network Access Points across the country, a routing arbiter to manage network changes, and a very-high-speed Backbone Network Service (vBNS) for handling traffic between supercomputer centers.
Network Service Providers such as Sprint, MCI, UUNET, PSINet, and Netcom all connected to the four Network Access Points (NAPs) around the country, and the job of managing the NAPs was awarded to four companies: Pacific Bell got the San Francisco Bay Area NAP, Sprint got the New Jersey NAP that serves New York City, Metropolitan Fiber Systems got the Washington, DC NAP, and Ameritech got the Chicago NAP. In addition, other interconnection points have emerged, such as a number of FIXen (Federal Internet eXchanges) and Metropolitan Area Exchanges (MAEs), which supplement the NAPs.
In the Big One
Unless you were looking for it, the one-story brick building in Concord, CA (east of San Francisco) would not strike you as the headquarters of Pacific Bell's NAP maintenance operation, the Networked Data Products Service Center. A reinforced wall inside the building separates the core hardware from any disaster Mother Nature may throw at the building.
Inside the nerve center, individual engineers work on about 50 Sun workstations. Major outages are displayed on one of three big screens at the front where they can be evaluated by engineering teams. In addition to managing NAP traffic, the center also handles all of Pacific Bell's commercial data services for California, and signs up about 500 new Frame Relay customers per day.
About the author
George Lawton (email@example.com) is a computer and telecommunications consultant based in Brisbane, CA. You can visit his home page at http://www.best.com/~glawton/glawton/. Reach George at firstname.lastname@example.org.