Performance Q & A by Adrian Cockcroft

Which is better, static or dynamic linking?

At compile time, developers can choose static or dynamic linking.
Which offers better performance?

SunWorld
February  1996

Abstract
Applications can be built using static or dynamic linking. Columnist Cockcroft explains the reasons why you should use the default of shared libraries and dynamic linking. There are some useful performance bonuses, especially with Solaris 2.5 and UltraSPARC-based systems. (2,000 words)



Q:
I've heard that an application built using static linking may run faster than a dynamic-linked application using shared libraries. I've also heard static linking is discouraged in Solaris 2. What should I do?

-- Linkless in La Crosse

Dynamic linking became the default for Solaris 1 in 1988 with the advent of SunOS 4.0, and is, of course, the default for Solaris 2. It has several advantages and, in many cases, offers better performance than static linking.

We'll start by examining the differences between static and dynamic linking, then move on to the reasons why dynamic linking is preferred. We'll also look at using dynamic linking to improve application performance.

I mentioned the difference between interfaces and implementations last month, but it is relevant so I'll repeat my definitions briefly:

Interfaces

Interfaces are the expression of "what" something does, and are distinct from the implementation of "how" it is done. "How" changes all the time, whereas the intent is to keep "what" stable. Application stability and portability are achievable only by defining and maintaining the relationship between an application and the system in terms of the interfaces provided.

Implementations

The implementation hides behind the interface and does the actual work. Bugfixes, performance enhancements, and underlying hardware differences are handled by changes in the implementation. There are often changes from one release to the next, or even from one system to another running the same release.



Static linking
Static linking is the original method used to combine an application program with the parts of various library routines it uses. The linker is given your compiled code, containing many unresolved references to library routines. It also gets archive libraries (for example /usr/lib/libm.a) containing each library routine as a separate module. The linker keeps working until there are no more unresolved references and writes out a single file that combines your code and a jumbled mixture of modules containing parts of several libraries. The library routines make system calls directly, so a statically linked application is built to work with the kernel's system call interface.
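The resolution loop described above can be sketched as a small simulation. This is only an illustration, not how the Solaris linker is implemented: the struct module type, the resolve function, and the fixed-size symbol arrays are hypothetical, and a real linker also honors the order of archives on the command line. The sketch starts from the modules already marked as linked (your own code) and keeps pulling in archive members until nothing new is needed.

```c
#include <string.h>

#define MAX_SYMS 4   /* hypothetical limit, just to keep the sketch small */

/* One object module: what it defines, what it still needs. */
struct module {
    const char *name;
    const char *defines[MAX_SYMS];  /* unused slots are NULL */
    const char *needs[MAX_SYMS];    /* unused slots are NULL */
    int linked;                     /* 1 once copied into the executable */
};

/* Return the index of a module defining sym, or -1 if none does.
 * With only_linked set, modules not yet in the executable are ignored. */
static int find_definer(const struct module *mods, int n,
                        const char *sym, int only_linked)
{
    int i, j;
    for (i = 0; i < n; i++) {
        if (only_linked && !mods[i].linked)
            continue;
        for (j = 0; j < MAX_SYMS && mods[i].defines[j] != NULL; j++)
            if (strcmp(mods[i].defines[j], sym) == 0)
                return i;
    }
    return -1;
}

/* Pull archive members in until a full pass adds nothing new.
 * Returns the number of references that could not be resolved. */
int resolve(struct module *mods, int n)
{
    int changed = 1, unresolved = 0;
    int i, j, k;

    while (changed) {
        changed = 0;
        unresolved = 0;
        for (i = 0; i < n; i++) {
            if (!mods[i].linked)
                continue;
            for (j = 0; j < MAX_SYMS && mods[i].needs[j] != NULL; j++) {
                const char *sym = mods[i].needs[j];
                if (find_definer(mods, n, sym, 1) >= 0)
                    continue;               /* already satisfied */
                k = find_definer(mods, n, sym, 0);
                if (k >= 0) {
                    mods[k].linked = 1;     /* copy the module in */
                    changed = 1;
                } else {
                    unresolved++;           /* nobody defines it */
                }
            }
        }
    }
    return unresolved;
}
```

Given an application module that needs sqrt, and an archive holding separate modules for sqrt, a helper that sqrt itself needs, and cbrt, the loop pulls in the first two and leaves the cbrt module out. That is why a statically linked executable ends up containing only a subset of each library.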

Archive libraries are built with the ar command. In older versions of Unix, the libraries also had to be processed by ranlib to create an index of the contents for random access to the modules. In Solaris 2, ranlib is not needed; ar does the job properly. Sun had so many people ask "where's ranlib" that in Solaris 2.5 it was put back as a script that does nothing! It acts as a placebo for portable Makefiles that expect to find it on every system.

The main problem with static linking is that it binds the application to the kernel system call interface. That interface is in itself a dynamic binding, but it is too low-level. Once upon a time, the kernel interface defined the boundary between applications and the system. The architecture of the system is now based on more sophisticated abstractions than the kernel system call interface. For example, name service lookups use a different dynamic library for each type of server (files, NIS, NIS+, DNS), and this library is linked to the application at runtime.

The performance problems with static linking arise in three areas.

  1. RAM wasted by duplicating the same library code in every static linked process can be significant. For example, if all the window system tools were statically linked, several tens of megabytes of RAM would be wasted for a typical user, and the user would be slowed down by a lot of paging.

  2. A static-linked program contains a subset of the jumbled library routines. The library cannot be tuned as a whole to put routines that call each other onto the same memory page. The whole application could be tuned this way, but very few developers take the trouble.

  3. Subsequent versions of the operating system contain better-tuned and debugged library routines, or routines that enable new functionality. Static linking locks in the old slow or buggy routines and prevents access to the new functionality.

There are a few ways that static linking may be faster. Calls into the library routines have a little less overhead if they are linked together directly, and start-up time is reduced as there is no need to locate and load the dynamic libraries. The address space of the process is simpler, so fork() can duplicate it more quickly. The static layout of the code also makes run times for small benchmarks more deterministic, so that when a benchmark is rerun there will be less variation in the run times. These speed-ups tend to be larger on small utilities or toy benchmarks, and less significant for large, complex applications.

Dynamic linking
When the linker builds a dynamically linked application it resolves all the references to library routines, but it does not copy the code into the executable. Consider the number of commands provided with Solaris, and it is clear that the reduced size of each executable file saves a lot of disk space. The linker adds start-up code to load the required libraries at runtime, and each library call goes through a jump table. The first time a routine is actually called, the jump table is patched to point at the library routine. For subsequent calls, the only overhead is the indirect reference. Use ldd to generate a list of the libraries a command depends on. Shared object libraries have a ".so" suffix and a version number.

% ldd /bin/grep
	libintl.so.1 =>	 /usr/lib/libintl.so.1
	libc.so.1 =>	 /usr/lib/libc.so.1
	libw.so.1 =>	 /usr/lib/libw.so.1
	libdl.so.1 =>	 /usr/lib/libdl.so.1

These libraries include the main system interface library libc.so, the dynamic linking library libdl.so, wide character support (libw.so), and internationalization support (libintl.so). This raises another good reason to use dynamic linking. Statically linked programs may not be able to take advantage of some internationalization, networking, and other features that may vary across configurations and environments.
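The jump-table patching described earlier in this section can be imitated in plain C with a function pointer. This is a minimal sketch of the idea only, not the real procedure linkage table or the runtime linker: lib_square stands in for a library routine, and square_slot, square_stub, and call_square are hypothetical names for a one-entry table, its first-call resolver, and the application's call site.

```c
/* Stand-in for a routine that lives in a shared library. */
static int lib_square(int x)
{
    return x * x;
}

static int resolver_runs = 0;    /* how many times the slow path ran */

static int square_stub(int x);   /* resolver stub, defined below */

/* One-entry "jump table": it starts out pointing at the stub. */
static int (*square_slot)(int) = square_stub;

/* The first call lands here: find the routine, patch the table
 * entry, then forward the call so the caller sees no difference. */
static int square_stub(int x)
{
    resolver_runs++;
    square_slot = lib_square;    /* patch the jump table */
    return square_slot(x);
}

/* What the application's call site amounts to: an indirect call. */
int call_square(int x)
{
    return square_slot(x);
}

/* Lets a caller confirm that resolution happened only once. */
int resolver_run_count(void)
{
    return resolver_runs;
}
```

Calling call_square(3) and then call_square(4) returns 9 and 16, and resolver_run_count() reports 1: only the first call pays for the lookup, and every later call costs one extra indirection.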

Many of the libraries supplied with Solaris 2 have been carefully laid out so that their internal inter-calling patterns tend to reference the minimum possible number of pages. This reduces the working set size for a library and contributes to a significant speedup on small-memory systems. A lot of effort has been put into the OpenWindows and CDE window system libraries. I'm told that Sun's own version of CDE is both smaller and faster than other vendors' implementations.

(See the sidebar, SunSoft's position on linking options, below.)

Solaris 1 compatibility
Many Solaris 1/SunOS 4 applications run on Solaris 2 in a binary compatibility mode. A very similar dynamic linking mechanism is also the default in Solaris 1. Dynamically linked Solaris 1 applications link through specially modified libraries on Solaris 2 that provide the best compatibility and the widest access to new features. Statically linked Solaris 1 applications run on Solaris 2.3 and later releases by dynamically creating a new kernel system call layer for the process. This slows things down a bit and prevents applications from accessing some of the features of Solaris 2. Also, there are problems with access to files that have changed formats. Applications can only make name lookups via the old name services. Solaris 2.5 adds the capability of running some mixed-mode Solaris 1 applications that are partly dynamic and partly statically linked.

Mixed-mode linking
Mixed-mode linking can also be used with Solaris 2 applications. I don't mean the case where you are building an application out of your own archive libraries. Given a choice between linking to either archive or shared libraries, the linker will default to shared libraries. You can force some libraries to be statically linked but you should always dynamically link to the basic system interface libraries and name service lookup library.

Interposition
It is possible to interpose an extra layer of library between the application and its regular dynamically linked library. This can be used to instrument applications at the library interface. You build a new shared library containing only the routines you wish to interpose upon, then set the LD_PRELOAD environment variable to indicate the library and run your application. Interposition is disabled for setuid programs to prevent security problems.

Internally at Sun, two applications have made heavy use of interposition. One was developed to instrument and tune the usage of graphics libraries by real applications. The other is used to help automate testing by making sure that application usage of standard APIs actually conforms to those standards.
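As an illustration of the interposition recipe above, here is a minimal sketch of an interposing routine for fopen that logs each call and then hands it on to the real library routine. It makes several assumptions: that dlsym() accepts the RTLD_NEXT handle (true of current Solaris, Linux, and BSD systems, but not necessarily of every release discussed in this column), that logging to stderr is acceptable, and that the counter and its accessor interposed_fopen_calls, which are hypothetical, are wanted at all.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* glibc exposes RTLD_NEXT only with this set */
#endif
#include <dlfcn.h>
#include <stdio.h>

/* Fallback in case the feature macro above took effect too late.
 * Assumption: Solaris and glibc both use this value. */
#ifndef RTLD_NEXT
#define RTLD_NEXT ((void *)-1L)
#endif

static int fopen_calls = 0;

/* Same name and signature as the C library routine, so the runtime
 * linker binds the application's calls to this version first. */
FILE *fopen(const char *path, const char *mode)
{
    typedef FILE *(*fopen_fn)(const char *, const char *);
    static fopen_fn real_fopen = NULL;

    if (real_fopen == NULL) {
        /* Look up the next definition of fopen in search order,
         * which is normally the one in the C library. */
        real_fopen = (fopen_fn)dlsym(RTLD_NEXT, "fopen");
        if (real_fopen == NULL)
            return NULL;
    }

    fopen_calls++;
    fprintf(stderr, "fopen(\"%s\", \"%s\")\n", path, mode);
    return real_fopen(path, mode);
}

/* Hypothetical accessor so a test harness can read the counter. */
int interposed_fopen_calls(void)
{
    return fopen_calls;
}
```

Built as a shared object (for example with cc -G -Kpic using Sun's compiler, or gcc -shared -fPIC) and named in LD_PRELOAD, a library like this should report every fopen an unmodified, dynamically linked application makes. A statically linked application never reaches it.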

Dynamic performance improvements in Solaris 2.5
There are several new features of Solaris 2.5 libraries that provide a significant performance boost over earlier releases. The standard requirement is to build applications on the oldest Solaris release that you wish to run on. If they are dynamically linked, applications built that way still get these speedups transparently when they run on Solaris 2.5. By using static links you miss out on improvements made in later releases.

The libraries determine at runtime whether the SPARC CPU in use has integer multiply and divide support. These CPU instructions are present in all recent machines, but are not in the CPUs found in the old SPARCstation IPX, SPARCstation 2, and earlier hardware. The new library routines use the instructions if the hardware supports them, and calculate results the old way if not. You no longer have to choose between running fast on recent machines and running optimally on every old SPARC system.

Additionally, parts of the standard I/O library (stdio) were tuned. This also helps some I/O-intensive Fortran programs.

For UltraSPARC-based systems, a special "platform specific" version of some library routines is interposed over the standard routines. These provide a hardware speedup that triples the performance of bcopy, bzero, memcpy, and memset operations. The speedup is transparent as long as the application is dynamically linked.
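The runtime selection described in this section, where a library uses a hardware instruction if the CPU has it and falls back to the old method if not, can be sketched with a function pointer chosen once from a capability flag. This is an illustration under assumptions, not the Solaris library code: select_multiply, mul_hard, and mul_soft are hypothetical names, the capability flag is simply passed in rather than probed from the hardware, and the C multiplication operator stands in for the hardware instruction.

```c
#include <stdint.h>

/* The "old way": shift-and-add multiply for a CPU that has no
 * integer multiply instruction. Results wrap modulo 2^32. */
static uint32_t mul_soft(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    while (b != 0) {
        if (b & 1u)
            result += a;   /* add the shifted multiplicand */
        a <<= 1;
        b >>= 1;
    }
    return result;
}

/* Stand-in for the fast path that uses the hardware instruction. */
static uint32_t mul_hard(uint32_t a, uint32_t b)
{
    return a * b;
}

typedef uint32_t (*mul_fn)(uint32_t, uint32_t);

/* Pick an implementation once, from a capability flag that a real
 * library would obtain by asking the system about the CPU. */
mul_fn select_multiply(int has_hw_multiply)
{
    return has_hw_multiply ? mul_hard : mul_soft;
}
```

Either choice gives the same answer, for example select_multiply(0)(7, 6) and select_multiply(1)(7, 6) both return 42. Only the speed differs, which is why the application does not need to know which version it was given.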

Take the hint
By now the message should be clear: dynamic linking is the only way to go. Sun was an early adopter of this technology, but every other vendor now offers shared libraries in some form. Banish static linking from your Makefiles and figure out how to take best advantage of the technology.

Next month we'll take a look at World Wide Web servers and address the question, "What should I monitor and how can I tell what's happening on a Web server?"



