Delivering multimedia on the Internet
A look at Internet multimedia and
Companies have been making major strides in improving multimedia on the Internet. The progression of VRML and the creation of new Java applets push Internet multimedia forward. This article examines the latest developments from VRML to MPEG to Real Audio. (4,800 words)
With the introduction of Mosaic in 1993, the Internet turned into a delivery mechanism for text, graphics and audio -- multimedia -- in a standard format to any computer platform with a browser. Today, few would argue with the fact that the advent of open standards in the delivery of multimedia has been one of the key drivers in the Internet's growth. Consequently, developers have been able to focus on creating new content for delivery on the Web instead of spending time porting old content to different formats.
In the early days of Mosaic, most multimedia was crude and slow -- unless you had a high-speed link. According to Marc Andreessen, the genius behind Mosaic, the browser was originally designed for the 45-megabit-per-second T3 links available at the University of Illinois. But once people realized the power of the technology, they began tuning Mosaic to work with the slower connections that are more common on the Internet.
A vast number of companies have answered the call to improve the delivery of multimedia over the Net. There are now products for delivering voice-quality audio on demand over links as slow as 9.6 kilobits per second, and low-quality video over 14-kilobit-per-second lines. Although most of these products are available only for PCs on the client side, the audio and video are often delivered from Sun servers.
For people striving to bring 3D technology to the Internet, the development of the Virtual Reality Modeling Language (VRML) has sparked a lot of interest. Although most of the 3D images on the Net today are uninspiring in comparison to other technologies, VRML does provide a base upon which to build more exciting shared 3D space.
VRML 1.0 does not, however, directly support shared virtual worlds that allow multiple people to interact over the Internet. This capability will come with VRML 2.0, currently under development. One other alternative is the OpenFlight real-time 3D standard. It will allow the deployment of real-time interactive 3D games over the Internet.
Multimedia on the Internet does not stop there. Developers are working furiously with Java to create applets that are delivered on demand to Internet clients, regardless of operating system. For servers, we will see many new tools that not only create dynamic Web changes but can also be integrated into a company's information infrastructure, thus improving its ability to share information from existing databases and customer support systems.
Applications on demand
Java is important to the future of multimedia, not only for the animation and interactivity it promotes, but because it provides a platform-independent way of creating applications. Java will enable development of applications that can run on different platforms from a single set of source code. Java will also make it easy for developers to distribute new technologies, such as video compression algorithms, without having to target a specific computer architecture.
Java is fast emerging as a de facto standard for delivering applications across platforms on the Web. Although no standards body has made it official, many of the big-revenue computer companies, including IBM, Silicon Graphics, Netscape, Oracle, and Toshiba, have licensed the Java source code to enhance their browsers, create development tools for Java programming, and port the Java Virtual Machine to various operating systems.
Even Microsoft has expressed an interest in Java. Last December it announced an intent to license Java to incorporate it into its browser and its Visual Basic Scripting Language for Internet programming; to date, however, it has not actually licensed it. (Microsoft is very close to licensing Java, but at press time, has not officially signed on with Sun. --Editor) Perhaps Microsoft wants to see if it can make its own programming environment based on Object Linking and Embedding technology take off instead.
"Since its release in 1995, Java has revolutionized programming for the Internet and other complex networks," said Scott McNealy, president, chairman and CEO of Sun Microsystems Inc. "Creating an infrastructure around this revolutionary technology will help us get the power of Java to every software developer for the public Internet and corporate intranets, and develop Java as the Internet programming standard."
Java is the result of several years of research and development at Sun Microsystems. It is the first programming language to provide a comprehensive, platform-independent solution to the challenges of programming for the Internet and other complex networks. For example, it creates a virtual addressing space on the client, in effect a virtual machine in which all programs run. This provides one layer of protection against infection and helps prevent the spread of a Java virus. Java also supports encryption, so that applications and their data can be delivered over the net without compromising the security or integrity of the data. Applications range from running simple order entry forms, which can validate data as it is entered, to supporting the delivery of 3D worlds over the net.
To promote Java's development, Sun created a separate JavaSoft business unit to develop, market, and support products based on Java technology. The JavaSoft business unit says it will continue to enhance the Java programming language, as well as work with third parties to create applications, tools, systems platforms and services to augment the language's capabilities.
JavaSoft recently released Java 1.0, a development environment for creating Java applets. It incorporates a Java Applet Viewer for running and testing applets, a Java Compiler, a prototype debugger, the Java Virtual Machine to run Java-based programs, and class libraries for graphics, audio, animation, and networking. Java 1.0 is available today for Windows 95 and NT on Intel platforms and for Solaris on SPARC. Java 1.0 for Mac OS 7.5 is expected by the end of the first quarter of 1996.
Other companies have also jumped on the bandwagon and released Java development kits. Rogue Wave Software recently began shipping Jfactory, a visual Java development toolkit for Windows 95, Windows NT, and Sun Solaris. The $795 software provides a graphical interface for creating Java applets.
Borland Software has created a beta version of a Java debugger and compiler for the PC platform. This is currently available on Borland's Web site for testing. These tools will eventually be incorporated into the next release of Borland's C++ development environment.
Symantec Corp. has a Java development environment for Windows 95 and NT code-named "Symantec Cafe." Developers can download it from Symantec's Web site. Also, Natural Intelligence recently released Roaster, a Java development environment and Java interpreter for the Macintosh that is available for download from its Web site.
Bringing 3D to the masses
Recognizing the impact that the Hypertext Markup Language (HTML) had on the deployment of multimedia on the net, Mark Pesce, director of the Community Company, and Anthony Parisi, an engineer with Intervista Software, began working on a three-dimensional interface for the Web that embodies many of the lessons learned in several years of research in both virtual reality and networking in the early 1990s. Both men presented a paper at the First International Conference on the World Wide Web in Geneva, Switzerland, in 1994. During a session to discuss virtual reality interfaces to the Web, attendees agreed there was a need for a common language to specify 3D scene description and WWW hyperlinks. The term Virtual Reality Modeling Language (VRML) was coined, and a group headed by Pesce and Brian Behlendorf, of Wired magazine, began work on a VRML specification immediately following the conference.
Initially, nothing about the proposed standard was set in stone, except that it would have to support 3D across differing platforms. The variety of proposals on the table caused a fierce debate, but a proposal based on Silicon Graphics' Open Inventor technology gathered the strongest support. On May 26th, 1995, at the World Wide Web Forum '94, Gavin Bell of Silicon Graphics, Anthony Parisi, and Mark Pesce announced a VRML specification based on Open Inventor. It is available at http://www.hyperreal.com/~mpesce/vrml/vrml.tech/vrml10-3.html.
Still, in a number of respects, VRML is incomplete. Users can move around the virtual world and click on 3D hyperlinked objects to go to different rooms, but the objects themselves cannot support interactive behavior. This capability was left out to streamline the design and implementation of the language, since describing interactive behaviors is a big job, particularly when the language needs to express behaviors of objects communicating on a network. Another of VRML's limitations is that multiple people cannot participate in a 3D world together -- something absolutely necessary for games and other interactive applications.
These problems, however, may finally be resolved. Last month, 56 companies agreed on a standard for VRML 2.0 called Moving Worlds. Moving Worlds takes VRML to the next level by allowing multiperson collaboration, realistic 3D geographies, and behaviors for the objects within a world. It uses Java to transfer dynamic information between worlds, enabling behaviors, motion, and interaction. It also opens the door for 3D world plug-ins that can add new capabilities to VRML applications. Visit Worlds Inc. for more information.
A variety of companies also announced their support for VRML-based 3D graphics on the World Wide Web. These include: AccelGraphics, Inc., CERN, Digital Equipment Corp., Intergraph, NCD, NEC Technologies, net.Genesis Corp., Netscape Communications, Oki Advanced Products, San Diego Supercomputer Center, Spyglass, Sun Microsystems, Tenet Networks, Viewpoint Datalabs International, Wavefront Technologies, and 3Dlabs Inc.
At least one company, Dimension X, has already introduced a Java-based virtual reality authoring kit and viewer called Liquid Reality. The application, which creates dynamic extensions for motion, sound, and intelligent behaviors, has been available since last December. A copy of the program for Solaris can be downloaded from its Web site. Mark Pesce noted, "Liquid Reality is an incredible breakthrough for VRML -- it represents the complete integration of Java and VRML, and is undoubtedly one of the finest efforts to date to bring interactivity and behaviors to VRML, and 3D scene description to Java."
3D for Web servers
Of course, not everyone has a fast enough client or network to make 3D VRML worlds practical, and some may not want to spend a half hour downloading the latest VRML viewer over a 9.6-kilobit-per-second link.
The Center for Computation and Visualization of Geometric Structures at the University of Minnesota may have found one solution with Cyberview, a server application that manipulates 3D models within Web pages. The software contains CGI script components that process requests and allow users to zoom, rotate, shade, and perform other modifications to the models. The client sends a request to the server when the user wants to change something. The change is rendered on the server, or on a connected machine, and the result is inserted into an HTML page that is sent back to the end user.
Cyberview is suited for creating 3D objects that the end user can display on a Web page. It enables you to deliver 3D viewing capability to those who have HTML browsers but no VRML software. It is ideal for looking at and rotating 3D objects, but it is too slow for creating walk-through environments.
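The request-and-render cycle can be sketched in a few lines of Python. This is only an illustration of the idea, not Cyberview's actual code: the parameter names rotate and zoom, the ViewState class, and the /render.gif URL are all assumptions made for the example, and the rendering step itself is left out.

```python
from dataclasses import dataclass, replace
from urllib.parse import parse_qs


@dataclass(frozen=True)
class ViewState:
    """Hypothetical description of how the model is currently viewed."""
    yaw_degrees: float = 0.0
    zoom: float = 1.0


def apply_request(state: ViewState, query_string: str) -> ViewState:
    """Return the view that results from one client request.

    The query parameters "rotate" and "zoom" are assumed names.
    """
    params = parse_qs(query_string)
    yaw = state.yaw_degrees + float(params.get("rotate", ["0"])[0])
    zoom = state.zoom * float(params.get("zoom", ["1"])[0])
    return replace(state, yaw_degrees=yaw % 360.0, zoom=zoom)


def render_page(state: ViewState) -> str:
    """Build the HTML page sent back to the browser.

    A real server would render the model to an image at this point.
    This sketch only encodes the view in a hypothetical image URL.
    """
    image_url = f"/render.gif?yaw={state.yaw_degrees:g}&zoom={state.zoom:g}"
    return f'<html><body><img src="{image_url}"></body></html>'


state = apply_request(ViewState(), "rotate=90&zoom=2")
page = render_page(state)
```

Every change costs a full round trip to the server, which is consistent with the approach suiting object inspection better than continuous movement.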
The software is available under the terms of the GNU General Public License. It can be downloaded from http://www.geom.umn.edu/docs/W3Kit/W3Kit.html.
Cyberview works with any CGI/1.0 or 1.1 compatible HTTP server, such as the latest versions of NCSA httpd and CERN httpd. One of the limitations of Cyberview is that it must have access to a display server such as X11 or Quick Renderman graphics to operate. Future versions will incorporate their own display server.
Multiplayer 3D worlds
Although there are a number of business uses for multiperson 3D technology, such as collaborative CAD, the primary driver for such technology has been interactive multiplayer games.
As it happens, the military has been using this technology for years in its Distributed Interactive Simulation (DIS) technology to run training exercises on its own internal TCP/IP-based internet. Recognizing the opportunity for this technology, MultiGen Inc. has put its OpenFlight Scene Description Database specification, used in DIS, into the public domain. In addition, MultiGen is working with its partner, Mak Technologies, to supply extensions to the standard to provide an open framework for delivering multiperson 3D applications over the Internet. The standard can be downloaded from http://www.multigen.com/openflig.txt. MultiGen plans to extend the API by the third quarter of 1996 with support for Java.
The OpenFlight database is a hierarchical way to store and organize digital data that describes 3D scenes for use in real-time graphics. OpenFlight encapsulates a significant amount of attribute and hierarchy data to facilitate real-time rendering, and has constructs for efficient data retrieval. The database is optimized to retrieve only the parts of objects that are in a viewing window and need to be rendered.
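To see why a hierarchy helps, consider the following Python sketch. It does not use the OpenFlight format or API. The Node class is hypothetical, and the scene is reduced to one dimension (an extent along a single axis) to keep the example short. The point is that when a parent's extent falls outside the viewing window, everything beneath it is skipped without being examined.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical scene node with a 1D extent and optional children."""
    name: str
    x_min: float
    x_max: float
    children: List["Node"] = field(default_factory=list)


def visible_leaves(node: Node, view_min: float, view_max: float) -> List[str]:
    """Collect the names of leaf objects that overlap the viewing window."""
    if node.x_max < view_min or node.x_min > view_max:
        return []  # the whole subtree is outside the window, so skip it
    if not node.children:
        return [node.name]
    names: List[str] = []
    for child in node.children:
        names.extend(visible_leaves(child, view_min, view_max))
    return names


town = Node("town", 0, 100, [
    Node("west_block", 0, 40, [
        Node("house_a", 0, 10),
        Node("house_b", 20, 30),
    ]),
    Node("east_block", 60, 100, [
        Node("house_c", 60, 70),
        Node("house_d", 90, 100),
    ]),
])

in_view = visible_leaves(town, 25, 65)
```

With a window from 25 to 65, only house_b and house_c are returned. With a window from 45 to 55, both blocks are rejected at the parent level and no house is tested at all.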
For interactive 3D applications, OpenFlight works with Mak Technologies' VRLink to pass communications data to different users. Today, MultiGen's OpenFlight software works only on SGI boxes because it requires SGI Performer graphics. Joe Fantuzzi, president of MultiGen, points out that SGI is in the process of porting Performer to the PC and Sun environments.
The VRLink software for passing simulation data between multiple users is available today for the Sun, SGI, and PC computers. One of the interesting aspects of a DIS environment, and VRLink, is that there is no need for a central server. Each machine in a simulation broadcasts its state to all the others on the network using UDP packets. Because of the nature of DIS, participants can enter and leave an active simulation at will. If a vehicle is not heard from after a set period of time, it is deleted from the other simulations automatically.
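The bookkeeping each participant needs for this serverless scheme is small. The Python sketch below is not VRLink or DIS code. The EntityTable class is hypothetical, the five-second timeout is an assumed value, and the UDP receive loop is omitted. It shows only the rule described above: record when each vehicle was last heard from, and drop any vehicle that has been silent too long.

```python
from typing import Dict, List


class EntityTable:
    """Hypothetical per-participant table of remote vehicles."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        self._last_heard: Dict[str, float] = {}

    def on_state_update(self, entity_id: str, now: float) -> None:
        """Called whenever a state broadcast arrives for a vehicle.

        An unknown vehicle is simply added, so joining needs no handshake.
        """
        self._last_heard[entity_id] = now

    def expire(self, now: float) -> List[str]:
        """Remove and return vehicles that have been silent past the timeout."""
        stale = [
            entity_id
            for entity_id, heard in self._last_heard.items()
            if now - heard > self.timeout_seconds
        ]
        for entity_id in stale:
            del self._last_heard[entity_id]
        return stale

    def active(self) -> List[str]:
        return sorted(self._last_heard)


table = EntityTable(timeout_seconds=5.0)  # assumed timeout value
table.on_state_update("tank-1", now=0.0)
table.on_state_update("jeep-2", now=0.0)
table.on_state_update("tank-1", now=4.0)  # tank-1 keeps broadcasting
removed = table.expire(now=6.0)           # jeep-2 has been silent for 6 s
```

Because every machine runs the same rule independently, no participant has to announce that it is leaving.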
Despite this decentralized approach, DIS is highly scalable. At a Warbreaker Conference sponsored by the US Advanced Research Projects Agency (ARPA) in 1994, participants linked 5,400 vehicles at 20 different sites into a single virtual world. Although many of the virtual vehicles were themselves operated by computer, this demonstration indicates the technology's scalability.
Ben Lubetsky, one of the pioneers at Mak Technologies, said the company is making minor changes to DIS to make it better suited for consumer applications.
For example, the military has a standard, though exhaustive, way of describing the characteristics of objects in the virtual world that includes about 50,000 different parameters. (By comparison, a typical consumer game requires fewer than 100.) The military requires that the position of each object in the database be described by three 64-bit numbers giving X, Y, and Z coordinates on a universal Earth grid.
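Three 64-bit numbers come to 24 bytes per position. The Python sketch below packs a position that way and then, purely as an illustration, as three 16-bit offsets from a local origin. The 16-bit local grid, the origin, and the sample coordinates are assumptions made for the example. It is not Mak's method, and it treats the 64-bit values as floating-point numbers.

```python
import struct

WORLD_FORMAT = ">3d"  # three 64-bit floating-point values (assumed encoding)
LOCAL_FORMAT = ">3h"  # three 16-bit signed integers (assumed game grid)


def pack_world(x: float, y: float, z: float) -> bytes:
    """Pack a position as three 64-bit values."""
    return struct.pack(WORLD_FORMAT, x, y, z)


def pack_local(x: float, y: float, z: float, origin: tuple) -> bytes:
    """Pack a position as whole-unit offsets from a local origin.

    Offsets must fit in 16 bits, so the play area is limited to
    roughly 32,000 units in each direction from the origin.
    """
    offsets = [round(value - base) for value, base in zip((x, y, z), origin)]
    return struct.pack(LOCAL_FORMAT, *offsets)


origin = (1_000_000.0, 2_000_000.0, 3_000_000.0)    # hypothetical game origin
position = (1_000_120.0, 2_000_045.0, 3_000_010.0)  # hypothetical vehicle

world_bytes = pack_world(*position)
local_bytes = pack_local(*position, origin)
```

Here the world encoding takes 24 bytes and the local one 6, at the cost of range and precision that a small game may not need.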
"The coordinate system is big and ugly," Lubetsky said. "You have to do a lot of math to turn it into anything remotely useful. In a game with a fairly small area, you are not working with the entire earth. You can do a lot of optimization. Currently you need to do a fair bit of number crunching to get a useful number [of players] in your game."
The data rate required for a simulation can vary dramatically depending on the kind of simulator application. For a fairly busy simulator with a live person controlling it, the data rate might be 3.5 kilobits per second per vehicle or even higher. Many of the automated virtual entities that are controlled by computer, however, send far fewer network packets, and their rate can be as low as 500 bits per second. In actual battlefield simulations, the average bandwidth requirements are close to 1 kilobit per second per vehicle.
According to Lubetsky, Mak is working on a twist on DIS that will reduce bandwidth by up to 70 percent. Consequently, it would allow significantly more interesting vehicles using a 28-kilobit-per-second modem. He said Mak is also looking at using Java to reduce bandwidth requirements even further. Currently, when a vehicle is modified, all of its configuration information needs to be sent to the other users. Using Java, Mak could send a much smaller Java applet that describes changes to the existing configuration.
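A little arithmetic shows what these figures mean for a modem user. The Python sketch below uses the per-vehicle rates and the 70 percent figure quoted above. It assumes, optimistically, that the whole 28-kilobit-per-second link carries simulation traffic and that there is no protocol overhead.

```python
def max_vehicles(link_bps: int, per_vehicle_bps: int) -> int:
    """Whole number of vehicles whose updates fit on the link."""
    return link_bps // per_vehicle_bps


def reduced_rate(per_vehicle_bps: int, reduction_percent: int) -> int:
    """Per-vehicle rate after cutting bandwidth by the given percentage."""
    return per_vehicle_bps * (100 - reduction_percent) // 100


MODEM_BPS = 28_000

busy = max_vehicles(MODEM_BPS, 3_500)      # live, busy simulators
average = max_vehicles(MODEM_BPS, 1_000)   # battlefield average
automated = max_vehicles(MODEM_BPS, 500)   # computer-controlled entities
average_reduced = max_vehicles(MODEM_BPS, reduced_rate(1_000, 70))
```

Under these assumptions the modem carries 8 busy vehicles, 28 average ones, or 56 automated ones. A 70 percent reduction brings the average vehicle down to 300 bits per second, which raises the ceiling to 93.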
Although few consumers have SPARCstations with which to play games, an interactive simulation could be used for a variety of training and coordination exercises. In addition, companies rolling out interactive games may want to use a server for accounting purposes, allowing players to pay by the bullet.
A high-speed terrain server could also be used to maintain the environment for a simulation. As the environment changes as a result of game play, such as when a bridge is destroyed, the server can relay these changes to the other participants.
In the early days of the Internet, the only way to listen to audio was to download .au files from ftp sites. In 1994, Carl Malamud took this technology to its limit with the introduction of the Internet Multicasting Service, the first cyberstation broadcasting on the Internet.
There were two limitations to this approach: the size of the files and the lack of streaming capability. An hour-long show encoded in .au format generates a file 14 megabytes in size. Consequently, only users with high-speed connections or a lot of patience would download these files.
And even if you did have a high-speed connection that could deliver enough throughput to listen to the broadcast in real time, you still had to wait for the whole file to download before you could start to listen to it.
All that changed last year with the advent of RealAudio from Progressive Networks. RealAudio allows the delivery of audio over a 14.4-kilobit-per-second Internet connection. Hundreds of radio stations and other broadcasters have since established their own cyberstations.
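The numbers make both limitations concrete. The Python sketch below takes the 14-megabyte, one-hour file described above and works out how long it takes to download and what data rate real-time playback would need. The 14.4-kilobit-per-second modem speed is an assumed example, a megabyte is taken as 1,000,000 bytes, and protocol overhead is ignored.

```python
FILE_BYTES = 14 * 1_000_000  # one hour-long show in .au format
SHOW_SECONDS = 60 * 60


def download_seconds(file_bytes: int, link_bps: int) -> float:
    """Time to fetch the whole file at the given link speed."""
    return file_bytes * 8 / link_bps


def playback_bps(file_bytes: int, duration_seconds: int) -> float:
    """Average data rate needed to play the file as it arrives."""
    return file_bytes * 8 / duration_seconds


modem_download = download_seconds(FILE_BYTES, 14_400)
needed_rate = playback_bps(FILE_BYTES, SHOW_SECONDS)
```

On these assumptions the download takes about 7,778 seconds, or a little over two hours for one hour of audio. Real-time playback would need roughly 31 kilobits per second, more than twice what the modem delivers.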
Initially, RealAudio supported Unix servers and PC and Macintosh clients. In the latest release Progressive has added support for a number of Unix clients including Solaris, SunOS, Linux, and SGI/Irix.
Unfortunately, the original version of RealAudio had sound quality that was a bit lower than AM radio -- OK for talk shows, but worthless for listening to music. Progressive Networks has begun beta testing version 2.0 of the server and player, which can use a 28-kilobit-per-second link to transmit FM-quality audio.
A number of other players delivering streaming audio to computers have since emerged. Most of these are targeted at delivering audio to the PC environment, although the audio can be streamed from Unix servers. These applications include Iwave from VocalTec, TrueSpeech from the DSP Group, ToolVox from Voxware, and Streamworks from Xing Technologies. So far, Streamworks is the only other freely available audio client supporting Unix workstations. Streamworks also supports video.
Both Progressive Networks and Xing Technologies are enabling their systems for live broadcasts over the Internet. Unless designed properly, these live events do not scale up very well since the more users listening simultaneously, the greater the bandwidth requirements.
One way around this limitation is to multicast packets, so that the minimum number of packets is sent around the Internet and packets are duplicated only where needed. Although a multicasting backbone, the MBONE, is in place today, a standard is not.
Consequently, Xing Technologies has created its own system for reflecting packets over the Internet. Xing released server software that runs on an Internet service provider's computers, enabling many of the provider's customers to receive high-quality broadcasts simultaneously. Progressive Networks intends to release similar software soon.
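The scaling problem is easy to quantify. The Python sketch below compares the bandwidth a broadcaster must send when every listener gets a separate copy of the stream with the bandwidth needed when a single copy is sent and duplicated downstream. The 14.4-kilobit-per-second stream and the T1 uplink (1.544 megabits per second) are assumed examples, and overhead is ignored.

```python
def unicast_server_bps(listeners: int, stream_bps: int) -> int:
    """The server sends one full copy of the stream per listener."""
    return listeners * stream_bps


def multicast_server_bps(listeners: int, stream_bps: int) -> int:
    """The server sends one copy; the network duplicates it where needed."""
    return stream_bps if listeners > 0 else 0


STREAM_BPS = 14_400
T1_BPS = 1_544_000

unicast_load = unicast_server_bps(1_000, STREAM_BPS)
multicast_load = multicast_server_bps(1_000, STREAM_BPS)
listeners_per_t1 = T1_BPS // STREAM_BPS
```

A thousand simultaneous listeners cost the broadcaster 14.4 megabits per second one copy at a time, and a T1 line tops out at 107 listeners. A reflector at the service provider moves the duplication closer to the audience, so the broadcaster's own link carries far fewer copies.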
A number of products delivering real-time video to the desktop are also emerging. In addition to audio, Streamworks also supports video. It uses MPEG to compress the video and audio, so it is able to transmit AM quality audio and a few frames of video a second over a 28 kilobit per second modem. The system scales up to T1 rates for considerably higher quality. However, you need to record your video at different compression rates in order to support different streaming rates.
VDOLive from VDONet Corp. is another product delivering video over the Internet. Although it runs on a Unix or Windows NT server, VDONet has only released clients for the PC environment so far. The Unix version can support up to 100 simultaneous streams today, and the company is working on a multicasting technique for the future.
VDOLive has considerably higher quality than Streamworks, which is particularly noticeable at lower speeds. It can be delivered at speeds from 14.4 kilobits per second to 256 kilobits per second. At the higher rate, it can deliver 15 frames per second of full screen video. The content provider only needs to record their program once, because the server can scale the bandwidth to meet the capacity of the client in real time.
VDOLive is able to achieve higher quality than MPEG through a wavelet compression technology that appears to be more efficient. Asaf Mohr, president of VDONet, noted, "It is a home brewed technology. We looked at MPEG, but we didn't think it would go down to those low bit rates."
Mohr said VDONet is working on adapting this technology to support video conferencing as well. The nice thing about wavelet compression technology is that it is less computationally intensive to compress the video than MPEG, so that even lower powered machines can deliver decent video to each other.
The makers of both Streamworks and VDOLive are giving their players away for free to build an audience, and then making money by selling the servers.
Precept Software Inc. is pursuing a different strategy for supporting business applications in which you pay for both the client and the server.
There are two components to Precept's technology. One is Flashware, TCP/IP software that adds multicasting support for PCs. The other is Internet Protocol/TV, which runs on top of Flashware, or any other multicasting software.
Initially, Precept will only support Windows. This will enable PC users to take advantage of the MBONE. Judy Estrin, president and CEO of the company, said that IP/TV will be compatible with the MBONE tools available for the Sun platform.
Estrin is bullish about the MBONE as a standard for delivering multimedia over the Internet. "Because we use standards, you don't have to have our server," Estrin said. "Our software can be used as a client for the MBONE tools used to broadcast conferences from the IETF. Consequently you do not have to get all of the software from the same vendor."
But if the MBONE is in place, then why aren't these broadcasting software companies supporting it? Xing Technologies' Gordon points out, "The problem with MBONE is that it is not accessible to many users. It is broadly available to folks in the academic and research environment, but not to the bulk of people that come in from online services. Since that is a large portion of our users we cannot reach them through the MBONE. Eventually we will leverage the MBONE because our protocols are compatible. Our philosophy is conforming to the standards. There is not enough that we are hearing that makes us feel there is a standard yet. When we feel they have arrived, then we will embrace them."
One of the most significant standards for the delivery of real-time multimedia over the Internet is the Netscape LiveMedia framework. It will support both broadcast applications and point-to-point audio and videoconferencing. LiveMedia will enable Netscape and third-party real-time audio and video products to interoperate.
A number of multimedia companies announced their support for the framework, including Progressive Networks, Adobe Systems, Digital Equipment Corp., Macromedia, NetSpeak, OnLive!, Precept, Silicon Graphics, VDOnet, VocalTec, and Xing Technologies. Netscape plans to publish the LiveMedia framework on the Internet, and openly license key technology components of it.
The Netscape LiveMedia framework will be based on the Internet Real-time Transport Protocol (RTP), RFC 1889, and other open audio and video standards such as MPEG, H.261, and GSM to enable products from these and other companies to work together seamlessly, providing users with a range of real-time audio and video capabilities on the Internet. For example, if one participant had the Internet Phone plug-in, and another had the Web phone plug-in, the two ends would look for a common digital signal processing coder and decoder (codec) such as GSM that they could both use to encode conversations between the two sides.
At the heart of LiveMedia is the OpenDVE Software Architecture that Netscape is acquiring through the purchase of InSoft Inc. This enables third parties to plug their own front ends and codecs into a standard media framework. InSoft has also developed a video conferencing application called Communique as well as radio and TV broadcasting applications called CoolTalk and CoolView, all of which run on Windows and Unix platforms.
Netscape is now working with InSoft to develop the LiveMedia framework and promote InSoft's OpenDVE software architecture and development toolkits, the Communique, CoolTalk, and CoolView applications, and the Global Conference telecommunications gateway. Later in 1996, Netscape plans to integrate InSoft's real-time audio and video capabilities into future versions of the Netscape Navigator and Netscape Server.
The LiveMedia framework will use UDP packets to send real-time audio and video streams between users. Although UDP packets can get lost in transit, they do not suffer from the delay associated with waiting for every TCP packet to arrive. Using forward error correction techniques, a small amount of loss can be tolerated without being noticed by the end user.
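The search for a common codec can be as simple as the Python sketch below. It is not taken from the LiveMedia framework, whose negotiation details are not described here. The choose_codec function, the codec lists, and the rule that the caller's preference order wins are all assumptions made for illustration.

```python
from typing import List, Optional


def choose_codec(caller_codecs: List[str], callee_codecs: List[str]) -> Optional[str]:
    """Return the caller's most preferred codec that the callee also supports.

    Returns None when the two sides have no codec in common.
    """
    callee_supported = set(callee_codecs)
    for codec in caller_codecs:  # listed in the caller's order of preference
        if codec in callee_supported:
            return codec
    return None


# Hypothetical capability lists for two different phone plug-ins.
caller = ["vendor-a-codec", "GSM"]
callee = ["vendor-b-codec", "GSM"]

agreed = choose_codec(caller, callee)
```

Each product can keep its own proprietary codec as a first choice and still fall back to a shared standard, here GSM, when talking to a competitor's software.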
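One simple form of forward error correction sends an extra parity packet with each group of data packets. The Python sketch below illustrates that general idea only. It is not described as the scheme LiveMedia uses, and it assumes equal-length packets and at most one loss per group.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise exclusive OR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """Parity packet: the XOR of every data packet in the group."""
    return reduce(xor_bytes, packets)


def recover(received: List[Optional[bytes]], parity: bytes) -> bytes:
    """Rebuild the single missing packet (marked None) from the others."""
    missing = [i for i, packet in enumerate(received) if packet is None]
    if len(missing) != 1:
        raise ValueError("parity can repair exactly one lost packet per group")
    present = [packet for packet in received if packet is not None]
    return reduce(xor_bytes, present, parity)


group = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(group)

# The second packet is lost in transit; the receiver rebuilds it locally.
rebuilt = recover([b"AAAA", None, b"CCCC"], parity)
```

The receiver repairs the loss without asking for a retransmission, which is what keeps delay low. The price is one extra packet for every group sent.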
With UDP, it is also easier to send packets to lots of users to achieve the same results as true IP multicasting. But Daniel Klaussen, product manager for next-generation Netscape Navigator products, pointed out, "Multicast infrastructure is not fully deployed across the Internet. Multicast IP only works when all recipients are expecting the same kind of recipient packets. I don't foresee the use of multicasting in the near future, because it is so dependent on all the routers out there. Until they replace them it will not hit all of the routers on the network. They may do it internally in corporations. I am not saying multicast is a bad thing but it is not a lowest common denominator when you look at it globally."
Thanks to the use of UDP, users plugged into a conversation at different speeds will be able to receive the highest-quality audio and video their Internet connections allow. They will not be tied to the lowest common denominator of the slowest machine connected to the conference.
A look ahead
Multimedia technology is one of the key drivers of the Internet. As users discover the power of audio and video communications, they are driving the demands for bandwidth even higher. The chief limitation for delivering real-time multimedia today is the delay associated with sending packets across multiple routers. However, this limitation is being addressed by asynchronous transfer mode (ATM) technology, which promises to provide the Internet with the same kind of high reliability and low latency associated with the traditional circuit-switched networks used today for voice and video communications. Many of the major carriers, including MCI and Sprint, plan to move to ATM technology as they increase the bandwidth of their backbones beyond 45 megabits per second.
Perhaps it is only a matter of time before organizations have but a single high-speed line connecting them to their local Internet service provider for all communications, including telephone. Most of the voice and video traffic between offices would travel the Internet to keep costs down, while a connection in the ISP's ATM switch could route calls over the existing circuit-switched network as required.
About the author
George Lawton (firstname.lastname@example.org) is a computer and telecommunications consultant based in Brisbane, CA. You can visit his home page at http://www.best.com/~glawton/glawton/.