Letters to the editor -- SunWorld, October 1996

Letters to the editor

October 1996

Sun and NT

Dear Barry Bowen,

I just finished reading your excellent article in last month's SunWorld Online. It seems that a lot of articles SunWorld has published have such an anti-Microsoft slant that I cannot take the authors' opinions on evaluating technology seriously. As a person who uses both Solaris and Windows, developing with CORBA and COM, it is always frustrating to see the posturing of the Unix-Microsoft camps exposed in an article where I just want the facts.

Thanks for a REAL article,

--Jason, (firm indeterminate)

Barry Bowen responds:

Thank you very much for the compliment. Those who love and hate Microsoft, Apple, or Sun often respond to the familiar and bash the unfamiliar. Many have a love/hate relationship with the same company at the same time, including many Sun and MS customers.

If only the industry could make something as easy to use as a Mac and as open and technically solid as Sun/Solaris (SunOS 4.1.3 and before, or post-Solaris 2.4), it would sell like Windows.

I'm sure the editors would welcome specific examples of where commentary is being substituted for what should have been news. (Editors: Indeed we would.)

Inside a NAP

I just looked over http://www.sunworld.com/swol-09-1996/swol-09-nap.html and noticed this in the sidebar:

Network Service Providers such as Sprint, MCI, UUNet, PSInet, and Netcom all connected to the four Network Access Points (NAPs) around the country

You should note that neither PSI nor UUnet connects to the ATM NAPs: the PacBell NAP and the Chicago NAP.

Also, the list of big nets nowadays is:

MCI, Sprint, UUnet, ANS, BBN, AGIS

PSI and Netcom are pretty insignificant, as is everyone else connecting to the SF NAP, excluding my list above.

Also, one of the really big players is making like they will pull out of the SF NAP. If you want to know who, you'll have to pressure PacBell to tell you :-)

Most California internet traffic that is not exchanged via private links (such as between UUnet and MCI, and Sprint and MCI) goes via MAE-West. This is unfortunate: MAE-West has had four extended (> 1 week) brownouts this calendar year.

--Mark, (firm indeterminate)

Java vs. ActiveX feature story

Robert E. Lee's comparison of Java and ActiveX was disappointing, because it ignored one of the most important issues in the debate. Security is not simply a fringe issue that geeks are arguing about in newsgroups. JavaSoft has spent a lot of time arguing the superiority of Java on security grounds, and Microsoft has vigorously debated those arguments. Security is central to the evaluation of these two technologies.

Worst of all, the only mention of security concerns in your article is misleading. Toward the end of the article, Lee wrote this:

Where JavaSoft has gained the required support to keep the technology around, in spite of performance or security concerns ...

No mention is made of any concerns about ActiveX security. It's true that holes have been found in Java's security architecture, but those bugs are being fixed. ActiveX, in contrast, has no security architecture. Microsoft's Authenticode proposal addresses some issues of trust, but provides no real security; essentially, it is a "blame assignment" architecture.

Given the growing importance of the Internet to business operations, it's no longer good enough to evaluate technologies based on the ease with which you can do cool stuff with them. A more mature outlook is called for.

--name and firm indeterminate

Network Appliance changes its name

I noticed a reference to NAC in your NFSv3 web page, and I thought you'd be interested in knowing that Network Appliance has dropped the "Corporation" from its name. As a result, NAC is no longer a valid acronym.

We use NetApp as a short name, in places where the full name is too long.

This is obviously not a big deal, since we've been known by both names, but I thought you'd be interested in the update.

Nice article. I always like to see stuff pumping up NFSv3, since I think it's got a good set of features over NFSv2.

--Dave Hitz, co-founder, Network Appliance

Is the Web sapping your firm's productivity?

Dear Barry Bowen,

In your SunWorld Online article for September you mentioned a product called IMON.

My question is, how does a product such as IMON determine with any kind of accuracy the amount of time an employee spends browsing the web? I have seen a lot of desktops with the browser application loaded and displaying some website; however, the user on that machine is not looking at the browser, but rather paying attention to some other window on his desktop, doing actual work. I'd feel sorry for this employee if his manager confronted him about his personal use of the network during business hours.

--Edsel Adap, (firm indeterminate)

Barry Bowen responds:

I presume the stateless nature of the Web means the product underreports hours/minutes but should do a good job assessing bytes transferred, etc. Perhaps there is some configurable logic that counts dead time if the connection is not idle for more than xx.

I considered the core story issue not so much about how one product works, but rather about the issues firms are concerned about and whether managers are dealing with those concerns constructively or in a petty and fascist manner.

The folks I spoke with seem to be setting the ground rules up front (rather than playing gotcha) and then letting everyone know they have a tool to enforce the rules and would be conducting spot checks.

Of course the more anal-retentive financial service firms did not want to talk.

Free advice from our client/server columnist

Bill:

I have looked at WebObjects and am now debating whether to look at Illustra or ObjectStore. My first question is, which do you recommend (ObjectStore or Illustra)?

Second, it appears Next and Illustra provide two approaches to Web development. In your opinion which approach provides a better solution for Web application development?

Your answers and any referential material will be greatly appreciated.

--Bakhtiar, (firm indeterminate)

Columnist Bill Rosenblatt responds:

I personally feel that Illustra has the right architecture for data-rich web application development, especially when Informix delivers Universal Server. It's more scalable than ODI ObjectStore and has a better set of tools for the Web. Next has a nice toolset (WebObjects) but database connectivity isn't as good.

Performance Q&A questions and answers

Editor's note: See Adrian Cockcroft's frequently asked questions for more answers to reader questions.

Adrian:

Do you know if other operating systems such as AIX, DEC Ultrix, and HP-UX have something like cachefs?

--Mauricio Atanasio, (firm indeterminate)

Adrian Cockcroft responds:

The AFS and DCE DFS filesystems have a similar capability. Some of those companies have licensed rights to NFS from Sun. The latest update to that license includes cachefs, but I don't know who has that level of license or intends to implement it.

Adrian:

On Solaris 2.3, I have seen the following problem several times.

A process listening on a TCP/IP port terminates but the listen port is not cleared. That is, netstat still shows the port as listening. This happens when I attach a debugger (AT&T's pi) to the process and kill the process within the debugger. Once the port is not released, the process will not come up again due to an "Already in Use" error.

Is there a way to clean this LISTEN port other than re-booting the machine?

--Srikrishna Kurapati, (firm indeterminate)

Adrian Cockcroft responds:

TCP mandates that if one end crashes, the other end must time out; if it doesn't, data from the previous connection may get mixed into the new connection.

The timeout is minutes or hours depending on the exact situation.

There is a SO_REUSEADDR option to setsockopt that allows a server process to reuse a port.

For full details, see TCP/IP Illustrated, Volume 1, by W. Richard Stevens.

There are some cases where Solaris can detect the failure more quickly, and some cases where bugs in other systems' TCP implementations cause problems in Solaris. Upgrading the OS to 2.4 or later with full recent patches may help in those cases. Solaris 2.3 does not get all the TCP patch updates.
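
As a rough illustration of the setsockopt option mentioned above, here is a minimal sketch in Python rather than the reader's own language. The helper name make_listener and the loopback address are assumptions made for the example; the point it shows is that SO_REUSEADDR has to be set on the socket before bind() is called.

```python
import socket


def make_listener(port: int = 0, host: str = "127.0.0.1") -> socket.socket:
    """Return a listening TCP socket with SO_REUSEADDR enabled.

    make_listener is a hypothetical helper for this sketch; port 0 asks
    the kernel to pick any free port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option must be set before bind(); setting it afterwards does not
    # help a bind() that has already failed with "address already in use".
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(5)
    return sock


if __name__ == "__main__":
    listener = make_listener()
    print("listening on port", listener.getsockname()[1])
    listener.close()
```

A restarted server that sets the option this way can usually rebind its port while old connections are still timing out. It is not a substitute for the timeout itself.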

Adrian:

I have learned some stuff from your column. It's nice to get some good technical information from the net. I would like to learn more about tuning streams, shared memory, semaphores and interprocess communications. How do these things differ from each other? How does it all work? Is there an old way and a new way of implementing interprocess communications?

--Hal Cooper, (firm indeterminate)

Adrian Cockcroft responds:

Good questions, though I haven't done much IPC programming. The "old way" is System V IPC (message queues and semaphores), which is portable but relatively slow. The "new way" is to use shared memory (mmap) and use the thread library to place synchronization objects (mutexes etc.) in the shared memory. Coming in Solaris 2.6 is a supported version of the "doors" interface, which exists in 2.5 as an undocumented prototype used by the nscd process (the interface changed). This is Solaris specific, but fast (it avoids rescheduling delay).

There isn't really anything you can do to tune the performance of IPC; just be careful how you use it. The "tunables" are just configuration limits.
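
To give a feel for the shared memory approach, here is a minimal sketch in Python, assuming a Unix system where os.fork is available. The function name shared_value_demo is hypothetical. It only shows two processes seeing the same mapped page and leaves out the locking a real application would need.

```python
import mmap
import os
import struct


def shared_value_demo(value: int = 42) -> int:
    """Have a child process write an integer that the parent then reads.

    shared_value_demo is a hypothetical name used only for this sketch.
    """
    # mmap.mmap(-1, n) creates an anonymous mapping. On Unix the default
    # flag is MAP_SHARED, so a forked child shares the same pages.
    buf = mmap.mmap(-1, mmap.PAGESIZE)
    pid = os.fork()
    if pid == 0:
        # Child: store the value in the shared page and exit immediately.
        buf[:4] = struct.pack("i", value)
        os._exit(0)
    # Parent: wait for the child, then read what it wrote.
    os.waitpid(pid, 0)
    result = struct.unpack("i", buf[:4])[0]
    buf.close()
    return result


if __name__ == "__main__":
    print(shared_value_demo())
```

No data is copied through the kernel on each exchange, which is where the speed advantage over message queues comes from. The waitpid call stands in for proper synchronization here.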

Adrian:

We have developed a client-server (the much abused term) application whose main components are a backend server, a database engine (Sybase) and a GUI (client). We spend a great deal of time populating the database. This has to be done through the server application. During the population, system resources seem to be fine (or within acceptable numbers); however, populations run for many hours (about 40%).

My hardware configuration is an S1000 with 2 SSAs. I'm using mirroring and striping and running Solaris 2.4.

Using proctool, I have monitored the server application CPU time. It consistently increases over time. From these figures I see the CPU is about 40% in the idle state on the S1000.

I also ran the DB population on a SPARCstation 20 and it always outperforms the S1000. Populations were always 10% faster on the S20, even though the S20 had half the memory (128 megabytes) of the S1000 (256 megabytes)!

I also ran proctool here and saw that the amount of time the CPU spent in the idle state was higher, about 90%! This was consistent over several runs.

Why does the S20 outperform the S1000? Why does the CPU wait longer?

My conclusions were:

The S20 has a faster CPU than the S1000 (100-MHz vs 40-MHz).
Assuming memory has similar performance figures, a faster CPU will wait longer. This could explain why the S20 is faster and why its CPU stays in the idle state longer.
The S1000 has two CPUs, therefore there is memory and bus contention; this is not the case on the S20.
On the S1000, there is cache synchronization; this doesn't happen on the S20.
The cross-bar architecture on the S1000 is giving me a virtual machine of a 40-MHz CPU and a 40-60 MB/s bus, which is far from an S20 with almost double those figures. My code (application) is not multi-threaded.

If what I'm saying makes any sense, then the performance problem I'm experiencing is due to the slow memory, cache policy, and bus architecture on the S1000. And if so, how could I improve performance on these modules? How could I tune my system to affect these components? How could I monitor, measure, or get a feeling for memory latency, bus contention, and cache policy?

Thanks for your help.

--Carlos Perez, (firm indeterminate)

Adrian Cockcroft responds:

The SS1000 should have 50-MHz SuperSPARC modules with 1 megabyte of cache; what does the SS20 have?

Often, for things like populate, it's a single-threaded problem, so only one CPU on the SS1000 is doing useful work.

It's quite possible for the SS1000 to be slower if the SS20 has a faster CPU. In a multi-user test you have higher ultimate capacity on the SS1000. At the two-CPU level, the SS1000 is not using that capacity. At the four and more CPU level an SS1000 beats an SS20.

The latest modules are 85-MHz for the SS1000, and should be about twice as fast as the old 50-MHz modules (they are SuperSPARC II, which has other speedups apart from clock rate).

Adrian:

Does Solaris support the Microsoft Windows concept of dynamic link libraries, and if it does, where can I go to get information about how to use them? I have to port a Windows application to Sun.

--(name and firm indeterminate)

Adrian Cockcroft responds:

Sun has had shared libraries since 1988 (SunOS 4.0), predating Windows by a long way. Sun were one of the first to use them.

The normal approach is to compile with the -PIC option which generates position independent code -- this is not mandatory, but it makes it more efficient to share the library if the code is PIC.

You then use ld like this:

% /usr/ccs/bin/ld -G *.o -o libthing.so

If you want a default location for the library to be hardwired into it, you use the -R option

% /usr/ccs/bin/ld -G -R /opt/mypackage/lib *.o -o libthing.so

You use the ldd command to see what shared libraries are used.

% ldd /bin/ls
	libc.so.1 =>	 /usr/lib/libc.so.1
	libdl.so.1 =>	 /usr/lib/libdl.so.1

I recommend you get a copy of the Solaris Porting Guide; it has a whole chapter on dynamic linking. It's published by Prentice Hall, ISBN 0-13-443672-5.
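
For readers who want to see a shared library being used at run time, the sketch below loads the C library from Python with the standard ctypes module and calls strlen from it. This is an illustration on a generic Unix system, not something taken from the answer above. The helper name load_c_library is hypothetical, and the same approach would apply to a library of your own such as libthing.so.

```python
import ctypes
import ctypes.util


def load_c_library() -> ctypes.CDLL:
    """Load the system C library as a shared object.

    load_c_library is a hypothetical helper for this sketch. If
    find_library returns None, CDLL(None) falls back to the symbols
    already loaded into the running process.
    """
    name = ctypes.util.find_library("c")
    return ctypes.CDLL(name)


libc = load_c_library()
# Declare the C signature of strlen so ctypes converts arguments correctly.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

if __name__ == "__main__":
    print(libc.strlen(b"libthing.so"))
```

Looking a function up by name in a loaded library is roughly what a Windows program does with its own run-time loading calls, so code ported from Windows often maps onto this pattern.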

Adrian:

I have some questions and thoughts on performance issues: If at least one process is running, there will ALWAYS be some sort of bottleneck (unless it's sleeping). Either CPU, disks, network or other resources will be insufficient (at least for a tiny moment). If that were not so, the process would be finished "within a blink of an eye"...
Is there a way to (automatically) get the highest possible throughput for ALL resources? E.g. one SCSI bus is close to 100% of its max transfer rate and I can't recognize it (I can monitor only data/s to disks...). Wouldn't it be better to generate a representation (graph) of the monitored system and mark the paths with their current throughput (best in %)? Like the disk view of Online DiskSuite plus system bus, SBus cards etc?
LAN performance does not depend only on my own interface's throughput. There is a major delay from simply waiting for the LAN to be clear before the card itself tries to transfer its data. So, is there a way to see these delays or the occupation of the LAN (not the interface)?
Your (fine and excellent) tools (SymbEL) recommend moving data from busy to idle disks. Large installations use volume managers (DiskSuite/Veritas) which should ease the administration of disks. Do you know a way to MOVE data on a RUNNING system with these tools? (I know there are ways with tar(1) or cpio(1) but that's not what I'm looking for...)

--(name and firm indeterminate)

Adrian Cockcroft responds:

That's why I don't report 100% busy CPU as a problem. I wait until there is a queue of jobs having to wait for CPU before I record a bad state.
Yes, this would be nice. Most software does not know enough about the system configuration to draw the map. Solstice Symon gets close, but is only available on the new Enterprise server range.
Most interfaces count something called "defers", which are outputs deferred because the network was too busy to send.
AIM sharpshooter does this by snooping the net.
Create a new volume of the right size, mirror it with the old volume and resync it to do the transfer, then drop the old volume from the mirror. This could be done with Veritas fairly easily.

Adrian:

I would like to know the relationship between the vmstat and the swap -s numbers about the swap space. The ones from the former are different from the latter. For instance on our SPARCstation 10 we have 64 MB RAM allocated and 192.44 MB swap space.

The output from vmstat and swap is:

dimakop@hermes> swap -s ; vmstat
total: 64388k bytes allocated + 33640k reserved = 98028k used, 134544k available
 procs     memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr f0 s1 s3 --   in   sy   cs us sy id
 0 0 0   7572  1088   0  16  4  3  5  0  2  0  1  2  0   39 1065  150 11 76 13

And the output from swap -l is:

swapfile             dev  swaplo blocks   free
/dev/dsk/c0t3d0s1   32,25      8 394112 293896

According to the vmstat manual page the number 7572 in column swap is the amount of swap space currently available in kilobytes. Yes BUT swap -s reports 134544K available which seems to agree with number 293896 from swap -l (this command was run a bit later after the swap and vmstat).

Another question is why, in the swap -s output, the sum of swap allocated + swap reserved + swap available does not equal the total amount of swap space on the disk, which is 192.44 MB as I mentioned above.

--Panos Dimakopoulos, Computer Technology Institute, Greece

Adrian Cockcroft responds:

You should ignore the first line of vmstat output; it's an average since boot.

% vmstat 2
 procs     memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr i0 i1 i1 i1   in   sy   cs us sy id
 0 0 0   7944  8620   0 905  1  3  3  0  2  1  2  1  1  375 2245  152 21 40 39
 1 1 0 525132 93808   3 2624 388 0 0  0  0  0  6  8 10  447 8458  219 34 66  0
 2 1 0 525116 94104   3 1889 426 0 0  0  0 12  8  4 12  567 6454  236 40 60  0
 3 0 0 524800 93992   1 2035 388 0 0  0  0  6  6  3  9  507 7036  251 33 67  0
^C% swap -s
total: 162120k bytes allocated + 32812k reserved = 194932k used, 524748k available

You should find that swap available does actually match.

See my SunWorld Online article "How does swap space work?". Swap space includes swap disk plus a variable amount of RAM, so it's hard to keep track of the total.
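
The arithmetic behind these numbers can be checked with a short script. The sketch below is in Python and uses the reader's own figures. The function names parse_swap_s and blocks_to_kb are hypothetical, and it assumes that swap -l counts 512-byte blocks.

```python
import re

# Pattern for the single summary line printed by "swap -s".
SWAP_S_RE = re.compile(
    r"total:\s*(\d+)k bytes allocated \+ (\d+)k reserved = (\d+)k used,"
    r"\s*(\d+)k\s+available"
)


def parse_swap_s(text: str) -> dict:
    """Extract the four kilobyte figures from swap -s output.

    Whitespace is collapsed first so that a wrapped line still matches.
    """
    match = SWAP_S_RE.search(" ".join(text.split()))
    if match is None:
        raise ValueError("not swap -s output")
    allocated, reserved, used, available = (int(g) for g in match.groups())
    return {"allocated": allocated, "reserved": reserved,
            "used": used, "available": available}


def blocks_to_kb(blocks: int) -> int:
    """Convert a swap -l block count to kilobytes (512-byte blocks assumed)."""
    return blocks * 512 // 1024


if __name__ == "__main__":
    s = parse_swap_s("total: 64388k bytes allocated + 33640k reserved"
                     " = 98028k used, 134544k\navailable")
    disk_kb = blocks_to_kb(394112)
    print("used + available:", s["used"] + s["available"], "KB")
    print("swap disk       :", disk_kb, "KB")
```

With the reader's figures, used plus available comes to 232572 KB while the swap device holds 197056 KB (192.44 MB). The total being larger than the disk is consistent with swap space including some RAM.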

Adrian:

I browsed through How much RAM do I need? (which I think is interesting and useful) but I still can't figure out the way you "walked the kernel tables" to find out the private sizes of programs that are currently active in the system. You included the list with Shared and Private columns for the common processes but I am still trying to find out where the data for this list may come from. Do I have to add up the values from the pmap command or dig around with adb macros to get those sizes? Or is there some other way?

--Alexander Chelyadinov, (firm indeterminate)

Adrian Cockcroft responds:

We built a special loadable device driver that digs out this info. It required source code, and is not portable to releases of Solaris other than 2.5 and 2.5.1. We don't want to distribute it in its current form, but it is under consideration to be added to a future release of Solaris (perhaps 2.6, but it may be too late). If it makes it, there will be a /usr/proc/bin/pmem command and an extra ioctl on /proc.

Adrian:

Thanks for your book, articles etc. You have pretty well convinced me that there isn't much I can do in tuning Solaris 2.4 & 2.5 kernels. I read with much interest your articles re iostat -x 30 (Step 1). This is great, but I can't work out what is happening on 2 different sites with similar configs.

                                 extended disk statistics
disk      r/s  w/s   Kr/s   Kw/s wait actv  svc_t  %w  %b
fd0       0.0  0.0    0.0    0.0  0.0  0.0    0.0   0   0
sd17      0.0  1.4    0.0    5.4  0.0  0.0    7.2   0   1
sd19      0.0  0.0    0.0    0.0  0.0  0.0    0.0   0   0
sd3       0.0  7.0    0.0   49.0  1.1  0.3  198.8   8  11
sd31      0.4 208.9   3.2  849.0  0.0  0.2    1.4   1  23
sd6       0.0  0.0    0.0    0.0  0.0  0.0    0.0   0   0

svc_t on sd3 is often around 300+ with little change in the other sd3 stats. sd3 is a Fast SCSI 2 disk.

The system is an SS10 with 512 megabytes of RAM and 1x100-MHz & 2x125-MHz hyperSPARCs. sd31 is a 24 gigabyte RAID-5 unit. The system is running 3 Oracle instances and 45 users. SGA sizes are 30 megabytes and 2 @ 8 megabytes.

The iostat report says to me the system disk is slow to respond but there is little I/O and no queues of significance. This iostat report is not like any examples I have seen in your columns. Any clues?

I don't expect an answer directly as I am sure you receive lots of questions from all around the world, but maybe this could be incorporated in one of your columns at Sun's home page.

--Philip Sewell, (firm indeterminate)

Adrian Cockcroft responds:

I love these questions -- I just wrote an answer to this for next month's SunWorld Online column. There is actually no real performance problem. If you can't wait for the hyperlink, here is an exercise:

Read the prex(1) manpage, collect an I/O trace, and figure out what causes the large service times. (prex is on Solaris 2.5 and later).

This isn't the first time that a question has come in in-between writing and publishing the answer.

Adrian:

We have a computer installation in our University, currently running an Ingres DBMS. All machines are UltraSPARC (but the 2 servers are going to be migrated to Enterprise 4000 Campfires). We also have two application servers (UltraSPARC 170) with Solaris 2.5 which communicate via Ingres/NET with the DBMS Server. The response time of the overall application was very slow, and we are not sure if the following could be the reason. We've monitored the application using iostat, vmstat and proctool (and are now analyzing the results to find some bottleneck). The DBMS Server has no problem with either memory (scan rate near 0) or CPU (although ~100% busy, there were no processes in the run queue (sar -q)). It is attached to 2 SSAs, and we have detected no I/O contention (the svc_t was low: ~20 ms all the time).

The situation with the client of the DBMS Server (the application server) is the following:

We are unsure how to interpret one of the parameters given by the "iostat -x 30" command: the "svc_t" parameter.

Part of the output of executing this command in an application server:

disk      r/s  w/s   Kr/s   Kw/s wait actv  svc_t  %w  %b 
sd0       0.0  2.3    0.1   13.4  0.0  0.6  253.2   0   3 
sd1       0.0  0.0    0.0    0.0  0.0  0.0    0.0   0   0 
                                 extended disk statistics 
disk      r/s  w/s   Kr/s   Kw/s wait actv  svc_t  %w  %b 
sd0       0.0  0.5    0.0    3.2  0.0  0.1  106.6   0   1 
sd1       0.0  0.0    0.0    0.0  0.0  0.0    0.0   0   0 

disk      r/s  w/s   Kr/s   Kw/s wait actv  svc_t  %w  %b 
sd0       0.0  0.6    0.0    3.5  0.0  0.0   78.1   0   1 
sd1       0.0  0.0    0.0    0.0  0.0  0.0    0.0   0   0 
                                 extended disk statistics 
disk      r/s  w/s   Kr/s   Kw/s wait actv  svc_t  %w  %b 
sd0       0.0  0.7    0.0    3.8  0.0  0.1  111.3   0   1 
sd1       0.0  0.0    0.0    0.0  0.0  0.0    0.0   0   0 
                                 extended disk statistics 
disk      r/s  w/s   Kr/s   Kw/s wait actv  svc_t  %w  %b 
sd0       0.0  0.6    0.0    3.4  0.0  0.1   84.3   0   1 
sd1       0.0  0.0    0.0    0.0  0.0  0.0    0.0   0   0

We are alarmed by the big numbers in the svc_t column. Our question is:

As the svc_t = ((wait + actv) / (rps + wps))*1000, do we have to ignore it (the big value of svc_t) if the "%b" (or tps) is low? We have gotten these kinds of values so often that we are worried. We think this high svc_t value is due to "approximation error in calculating it". Is this true?

We would appreciate it very much if you could tell us your opinion, and in any case, thanks in advance.

--Josep Blanes, Universitat Autonoma de Barcelona, Spain

Adrian Cockcroft responds:

Coincidentally I have already written a full answer to this question, and it will be published in next month's SunWorld Online column.

The short answer is that it is not a problem; it is caused by fsflush writing a burst of I/Os to disk in a short time period every 30 seconds or so.

You may be able to use the techniques described in the article to see where your slow response time is located. For now read the man page for prex(1), and think about adding trace points to your application.
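
To see how the reader's formula behaves on such lightly loaded disks, here is a small Python sketch that applies it to the sd0 rows from the letter. The formula is the one quoted by the reader, not necessarily how iostat computes the figure internally, and the function name svc_t_ms is hypothetical.

```python
def svc_t_ms(wait: float, actv: float, rps: float, wps: float) -> float:
    """Estimate service time in ms as ((wait + actv) / (rps + wps)) * 1000."""
    ops = rps + wps
    if ops == 0:
        # An idle disk has no operations to average over.
        return 0.0
    return (wait + actv) / ops * 1000.0


# sd0 rows from the letter: (r/s, w/s, wait, actv, svc_t reported by iostat)
SD0_ROWS = [
    (0.0, 2.3, 0.0, 0.6, 253.2),
    (0.0, 0.5, 0.0, 0.1, 106.6),
    (0.0, 0.6, 0.0, 0.0, 78.1),
    (0.0, 0.7, 0.0, 0.1, 111.3),
    (0.0, 0.6, 0.0, 0.1, 84.3),
]

if __name__ == "__main__":
    for rps, wps, wait, actv, reported in SD0_ROWS:
        estimate = svc_t_ms(wait, actv, rps, wps)
        print(f"reported {reported:6.1f} ms, recomputed {estimate:6.1f} ms")
```

Because the printed columns are rounded to one decimal place, the recomputed values only roughly track the reported ones at these low rates. The third row shows this most clearly, where actv prints as 0.0 yet the reported svc_t is 78.1.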

If you have problems with this magazine, contact webmaster@sunworld.com

URL: http://www.sunworld.com/swol-10-1996/swol-10-letters.html
Last updated: 1 October 1996
