Readers Speak Out

This month: Connectivity columnist Rawn Shah on the future of ATM technology; readers continue to react to Rosenblatt's Death March review; and Adrian Cockcroft trades tech notes with a Netscape developer. Plus, SarCheck (a brand-new Unix performance tool) gets an on-the-fly review.

September 1997

Send letters to sweditors@sunworld.com

Performance Column: Q&A with Adrian Cockcroft

News from Netscape

To Adrian Cockcroft:

I wanted to notify you of a known bug in Netscape Proxy 2.52 which affects the new additional cache status information; UP-TO-DATE gets erroneously logged as NON-CACHEABLE. This may have had an effect on your cache hit statistics.

This bug is fixed in NS Proxy 2.53.

Cheers,
Ari Luotonen,
Netscape Communications Corp.
Netscape Proxy Server Development

Adrian replies:

Thanks Ari,

We are currently running 2.52, and I don't see any UP-TO-DATE entries. If I see a NON-CACHEABLE with a 304 status code, does that mean it should really be UP-TO-DATE?

One thing that isn't clear to me is the ordering of events. Is the cache write done asynchronously, or before or after the data is returned to the client? I'm trying to figure out if a slow disk will slow down the user's response.

Do you know of any more information on the behavior and performance of proxy caches, and what did you think of my article?

Regards,
Adrian

Ari's reply:

In response to your questions:

Yes, when you have 304 in the remote server field (in the access log, the 3rd number after the request in the log) you should see an UP-TO-DATE entry.

As each packet arrives from the network, it is sent to the client socket, then to the cache file. Effectively, cache writes are done in parallel, and a slow disk may impact the user-perceived performance.

Overall, I thought your article was very good. It may have idealized ICP a bit much -- I personally think that hash function-based routing by proxy-autoconfig files is the way to go. It basically moves away from the non-deterministic ICP toward a deterministic mathematical solution that makes the decision without any network I/O. However, your article reflects well the current day and what is viable now, with the current products out there. Setting up hash-based proxy selection is more complex, and the support in proxies is just not there yet -- so in that sense, your article is realistic and feet-on-the-ground.

(Opinions my own, not Netscape's.)

Cheers,
Ari
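
The hash-based routing Ari describes can be sketched roughly as follows. This is only an illustration of the idea, not Netscape's implementation: the proxy host names, the choice of SHA-256, and the function name pick_proxy are all assumptions made for the example. Real proxy-autoconfig files are written in JavaScript; Python is used here simply to show that every client can compute the same proxy for a given URL without any network I/O.

```python
import hashlib

# Hypothetical proxy list; a real site would list its own hosts.
PROXIES = [
    "proxy1.example.com:8080",
    "proxy2.example.com:8080",
    "proxy3.example.com:8080",
]

def pick_proxy(url, proxies=PROXIES):
    """Deterministically map a URL to one proxy by hashing the URL."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    # Treat the first 4 bytes as an unsigned integer, then reduce it
    # modulo the number of proxies to get an index into the list.
    index = int.from_bytes(digest[:4], "big") % len(proxies)
    return proxies[index]
```

Because the result depends only on the URL and the proxy list, each object ends up cached on exactly one proxy. A plain modulo like this remaps most URLs whenever the list changes, which is one reason a production setup would likely want a more careful hash scheme.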

Monitoring CPU with perfmeter: performance drain?

To Adrian Cockcroft:

Can you tell me if using perfmeter and rstat is a performance drain? I was recently told by one of our system administrators that using perfmeter on our remote servers could cause serious performance degradation because of the network traffic and the kernel resources that it uses.

Is there another more acceptable alternative for monitoring CPU usage on remote machines? (We are running multiple Oracle instances on several different Suns, and I'd like to know if the CPU is going crazy for extended periods of time. We often have orphaned or invalid Oracle processes that can run up enormous amounts of CPU.)

Sandra Bullock

Adrian replies:

It's very lightweight, but the default update is every 2 seconds. Over the network to several servers this adds up. If you reduce the update rate to 30 seconds (or whatever) it should be OK.
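
To put rough numbers on that advice, here is a small sketch of the request volume at different update rates. It assumes one rstat request per monitored server per update, and the server count of 10 is purely hypothetical.

```python
def polls_per_hour(servers, interval_seconds):
    """Requests generated per hour by one perfmeter user,
    assuming one request per server per update interval."""
    return servers * 3600 // interval_seconds

# Hypothetical site with 10 monitored servers.
default_rate = polls_per_hour(10, 2)    # default 2-second update
relaxed_rate = polls_per_hour(10, 30)   # 30-second update
print(default_rate, relaxed_rate)
```

Under these assumptions, moving from a 2-second to a 30-second update cuts the request count by a factor of 15.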

GB disks: Whatever works

To Adrian Cockcroft:

For database applications, I've used Sun's 1.05 and 2.1 GB disks, striped and mirrored in SPARC Storage Arrays. I avoided the 4.2s because of the performance advantage of having more disks working in parallel. If I remember right, the RPMs of the 4.2 are 1.5 times those of the 2.1, but the data capacity is 2 times, so on an access-time-per-GB basis the 4.2 loses.

Planning a new system, a Sun SE told me that the 2.1s are still preferred, but a Sun sales rep said the 4.2 has better caching and intelligence, which more than makes up for the storage density, and would outperform the 2.1. I haven't found any white papers or anything that compares the two. Can you point me to anything in writing or pass on your experience?

Thanks very much,
Roger Worden

Adrian replies:

There are two versions of the 4 GB disks (actually 4.3 GB, I think): the older 5400 RPM ones and the current 7200 RPM ones. There are several versions of the 2.1 GB as well.

The fastest 2.1s are the Seagate Barracuda 2 7200 RPM drives. The 7200 RPM 4.3 GB drives are even faster at sequential access: I got 36 MB/s from a four-way stripe over UltraSCSI, and 9 MB/s sequential per drive is pretty hot. We use several vendors, so not all drives are completely identical, but they are fairly close. I think the 2.1s max out at 6 to 7 MB/s.

For random access they are basically the same: the seek times are a bit better, and cylinders are bigger, so a bit less seeking is needed.

I'd rather have twice as many 2.1s for a database workload.
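
The arithmetic behind this exchange can be sketched as below. The 36 MB/s four-way stripe and the 5400 and 7200 RPM figures come from the reply above; the 8400 MB target capacity and the nominal drive sizes of 2100 MB and 4200 MB are assumptions chosen only to make the comparison concrete.

```python
def per_drive_throughput_mb_s(stripe_mb_s, drives):
    """Sequential throughput contributed by each drive in a stripe."""
    return stripe_mb_s / drives

def avg_rotational_latency_ms(rpm):
    """Average rotational latency: the time for half a revolution."""
    return 0.5 * 60_000 / rpm

def drives_needed(total_mb, drive_mb):
    """Drives required to hold total_mb, rounded up (integer inputs)."""
    return -(-total_mb // drive_mb)  # ceiling division

print(per_drive_throughput_mb_s(36, 4))        # four-way stripe
print(round(avg_rotational_latency_ms(5400), 2))
print(round(avg_rotational_latency_ms(7200), 2))
print(drives_needed(8400, 2100), drives_needed(8400, 4200))
```

For the same capacity, the smaller drives give twice as many spindles, and so twice as many arms seeking in parallel, which is the reasoning behind preferring the 2.1s for a random-access database workload.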

CPU Power in DiskSuite

To Adrian Cockcroft:

Hi Adrian,

Many thanks for your monthly performance tuning columns. Is there a way to find out how much CPU power Online DiskSuite 4.0 needs? Or can you give some qualitative statements?

I know that there are a lot of ways to configure ODS on various systems, so let's concentrate on my setup.

What about an Ultra 2/167 with two single-ended F/W-SCSI controllers (each with one 9 GB disk attached), mirrored, under heavy disk load (amber on your cool tool)?

In this case, is the CPU power needed negligible? What about a striped or even RAID 5 setup with the same hardware?

Thanks in advance,
Mathias Weiersmueller

Adrian replies:

DiskSuite uses very little CPU power -- I don't have any exact measurements.

The main extra load when mirroring is that one I/O becomes two, so that uses extra time in the sd device driver, which is still very little.

For RAID 5 the main slowdown comes from moving extra data to and from the drives; this adds latency as well as extra CPU for parity calculations.
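
The extra data movement and parity work can be illustrated with the standard RAID 5 read-modify-write update. This is a sketch of the general technique, not of DiskSuite's internals; whether DiskSuite follows exactly this path is an assumption, and the function names and block contents are made up for the example.

```python
def xor_blocks(*blocks):
    """XOR equal-length byte blocks together."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)

def parity_after_small_write(old_data, old_parity, new_data):
    """New parity for a single-block update.

    Needs two reads (old data, old parity) before the two writes
    (new data, new parity), so one logical write becomes four I/Os.
    """
    return xor_blocks(old_parity, old_data, new_data)

# Hypothetical three-data-block stripe.
d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\xff\x00"
parity = xor_blocks(d0, d1, d2)
new_d1 = b"\xaa\x55"
new_parity = parity_after_small_write(d1, parity, new_d1)
```

The updated parity matches what a full recomputation over the stripe would give, but it is obtained by touching only the changed data block and the parity block.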

Look in Brian Wong's book Configuration and Capacity Planning for Solaris Servers. He has some measured numbers for CPU usage in this area.

For an Internet server almost all the system CPU power goes into TCP/IP, especially on 2.5.1. It's a lot more efficient in Solaris 2.6.

SarCheck: Neat

To Adrian Cockcroft:

My company is going to begin porting our Unix performance tool SarCheck to Solaris soon. While serious feedback is probably too much to ask, if you'd like to take a look, please go to http://www.sarcheck.com and check it out. The SCO versions have the most features, so you may want to start there. If you have time to express an opinion, please do. It's really more of a resource analysis tool, but for marketing reasons we call it a performance analysis tool.

Don Winterhalter
Aurora Software Inc.

Adrian replies:

Looks neat. Like a productized version of my virtual_adrian script.

Feel free to use the SE toolkit and rules as a reference for your port and ask me if you need anything. SE includes the sar datafile header definition (which is not normally provided) so you can read the binary files directly and get more data more cleanly than via running sar itself. The only problem is that the sar file does not tell you how many CPUs there are. All CPU data is accumulated together.

The stuff I produce is aimed at making Solaris more manageable for end users so that they buy more hardware. I'm working with several tools vendors to help improve their products, and the SE toolkit is an experimental reference implementation, not a supportable product.

I didn't see what you cost on the Web site, but if you do a good job on Solaris, and the cost is reasonable, I'd point people who want a supported equivalent of my tool in your direction.

You can pick up a pre-release of the next version of SE by ftp from playground.sun.com in the blind directory /incoming/SE3.tar. It has updated rules and other refinements, but is not yet finished.

To Adrian Cockcroft:

First I want to thank you for the valuable information presented in your column. I hope you have the time to answer my question.

When I use the performance utility, I can see the CPU load, interrupts (intr), collisions (colls), errors (errs), load, context (cntxt), disk, swap, page, pkts (packets). What header files and functions should I use in C for the same results?

Regards,
Hassan El-Hajj

Adrian replies:

I assume you are looking at perfmeter, which gets data from rpc.rstatd(1M). See the files /usr/include/rpcsvc/rstat.h and rstat.x.

You need to write an RPC client that calls rstatd to get the data. I've never done that, but I think there must be an AnswerBook section on writing RPC clients; see also the man page for rpcgen.
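
Once such a client is returning data, the CPU load figure has to be derived from two successive samples, because the CPU counters are cumulative ticks. The sketch below shows only that arithmetic, not the RPC call itself. It assumes a four-element counter array like the cp_time field declared in rstat.x, with the idle counter in the last slot; that slot order and the sample values are assumptions to verify against the header on your system.

```python
def cpu_busy_percent(before, after, idle_index=3):
    """Percentage of non-idle ticks between two cumulative samples.

    idle_index is an assumption: check which slot holds idle time
    in the header file on your system.
    """
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total == 0:
        return 0.0  # no ticks elapsed between the samples
    return 100.0 * (total - deltas[idle_index]) / total

# Made-up samples taken some seconds apart.
first = [100, 0, 50, 850]
second = [160, 0, 70, 970]
print(cpu_busy_percent(first, second))
```

The same subtraction and division carry over directly to a C client that reads the structure twice.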

More on Bill Rosenblatt's review of Ed Yourdon's Death March

Shaking the foundations

This was a very good and inspiring review. It shakes the foundations of a tradition of discipline and applied engineering to software development. I teach a software engineering course at a graduate level at Francisco Marroquín University in Guatemala, and I have to confess that I have been preaching Ed Yourdon's methodologies, trying to convert students from a disorganized way of programming to a very structured one. Your article made me think about relaxing a bit and letting programmers be a bit more creative. Maybe they'll be more successful that way.

Alejandro Acevedo

The Big Project Syndrome

Just read your review of Death March by Ed Yourdon.

I have long been teaching that "The Big Project" is an excellent way to create a huge and well-remembered corporate software failure. I refer to the "Big Project Syndrome" as "...we start with a blank slate, all new programmers, all new goals, all new languages, methodologies, tools, and underlying technologies...we'll get the analysis right this time; no mistakes...we'll replace all existing products...we'll leapfrog all existing practices and be the technology leaders in this organization, and rightfully so!"

These are often heartbreaking projects. Death March is bad enough if there's a chance you might succeed. Of course, I recommend less risky ventures, but that's intuitive I suppose.

I know that the "Big Project Syndrome" is dangerous because it takes on maximum risk. I've seen it happen, and I've seen it fail, usually. I am wondering to what extent the problems are also problems of size.

Thanks for a great review.
Tim Ottinger

Understanding ATM networking and network layer switching, part one, by Rawn Shah

To Connectivity columnist Rawn Shah:

Mr. Shah,

I enjoyed your most recent column. It was informative and explained ATM technology very clearly.

I am working on a market research report on the ATM IC market, and I am curious about what trends you see developing for ATM in general, and ATM equipment and ICs specifically. Any insight you could provide would be appreciated!

Thanks,
Shannon Pleasant

Rawn Shah replies:

In brief, I can talk about the ATM industry and vendors, but I'm not as familiar with the chipsets, except for the more popular ones.

ATM has clearly won the high-bandwidth, long-distance market. Almost all future WAN high-speed connections above T3/DS3 will be ATM. IP/SONET is available, but people want to go that extra mile. The lead vendors Fore, Bay, and Cisco will not change, but they will even out.

Fore has too great a market presence to lose the lead in the enterprise market, where most of the action is, but if they suffer two more bad quarters consecutively they might be in trouble. Their upcoming product, shipping in '98, will easily surpass Cisco's Cougar. I believe they're changing the core processors and backplanes in the 2000, and they'll drop the Orion chips for Pentiums.

Cisco is still playing catch up in the enterprise/campus backbone, but will probably catch the edge-switch market completely. Bay's corporate troubles are the only thing preventing them from catching up to Fore.

Newbridge/Siemens, Alcatel, Nortel, and Lucent are the leaders in the carrier/core class. It's unlikely Nortel will break into the higher ranks of the enterprise market. They have a good lead in voice/ATM, but the company focus just isn't there.

Route/switching/L3switching is still too up in the air to make any sense. Ipsilon is losing ground slowly as growth in the number of products supporting their technology slows down. It's too bad Fore didn't choose to implement Ipsilon's IFMP; they have their own internal developments on that. GSMP isn't too big of a deal; vendors really need IFMP to make Ipsilon's technology useful. Cisco will continue blindly with Tag switching/MPLS. I believe their relationship with Microsoft will start to pay off for the low-end corporate market. NT systems with ATM cards might just become the next big thing for businesses that want ATM but not $50,000-$150,000 switches.

Outside the U.S., DSL is gaining ground. Small companies like Orckit and Amati are really going to pay big with ATM/ADSL or ATM/VDSL. The first generation of ADSL/DMT products are still too big and too expensive. Amati has new licensing with Motorola which will drive costs down severely in 1998. However, most of the DSL vendors do not have a significant ATM background and are counting on the big chip makers to pull them through. ATM/DSL can make telecommuting a reality with full speed and full network participation for data, voice, or video.

European and Asian markets are the biggest growth area.

If you have any more specific questions, let me know.

Rawn

If you have technical problems with this magazine, contact webmaster@sunworld.com

URL: http://www.sunworld.com/swol-09-1997/swol-09-letters.html