Why is the multimedia world demanding a standard file format like ASF?

To understand why the multimedia world is demanding a file format like ASF, it is instructive to look first at the world of the Web. From the viewpoint of the client, the Web is one big blob of information. Page after page contains information in one form or another: text, images, interactive forms, animation, audio, and video. The list goes on and on.

It seems pretty obvious that the information coming in over the Internet and being displayed through the client browser is coming from servers residing out there on the Internet. What may not be so obvious is that the information, displayed on the same Web page, is often coming from two different kinds of servers: Web servers and media servers.

Web servers deliver Web pages. More specifically, Web servers deliver some or all of the information contained on Web pages. The predominant types of information delivered by Web servers are text and graphics. Media servers, on the other hand, stream multimedia content, such as audio and video. Together, a Web server and a media server can deliver to the client a Web page with streaming video playing in the page.

Media servers dedicated to streaming media are necessary due to the "bandwidth hogging" and real-time nature of audio and video. It is true that, in addition to delivering other kinds of information, Web servers are also capable of streaming audio and video. However, Web servers cannot do the job as well. For streaming media, Web servers are to media servers what bicycles are to cars for getting around. Sure, you could take your bike on the freeway, but you'd be better off with your car. Conversely, you could open the garage, start your car, back out, and drive to the end of the block, but it would be much easier just to hop on your bike.

Media servers and Web servers are different not only in terms of power and function, but also in terms of standards. With Web servers, standard formats and protocols are fairly well established. Currently, the opposite is true with media servers.

In the Web world, the standard presentation format is HTML. That is, no matter what Web page creation tool you are using -- FrontPage®, Microsoft Word, or Notepad -- the output is the same: an HTML file. Similarly, the network protocols and wire formats used for delivering that HTML file over the Internet are established: HTTP and TCP/IP. Finally, "playing back" the text and graphics in an HTML file is something that computers have been doing for a long time.

These Web standards have contributed greatly to the explosion of the Web in the last few years. The Web standards (HTML, HTTP, and TCP/IP) and the ability to render text and graphics were existing technologies a few years ago when they were picked up and built upon to create the Internet as we know it today. Web servers, which deliver HTML content, and Web browsers, which receive and display the HTML content, sprang up like mushrooms. Content developers knew that so long as they could put their content in the HTML format, pretty much any Web server could deliver that content, and pretty much any client browser could read that content. Accordingly, content flooded onto the Web.

The emergence of multimedia content on the Web has followed a much different path. Rather than finding an established framework of data formats and protocols already in place, the early developers of streaming media technology found themselves in the Wild West. The law was whatever you said it was. These developers were presented with a blank canvas (HTML pages) and the means for connecting them (the Internet). They were limited only by the extent of their imagination, and that imagination has proven fertile.

In the last couple of years, a number of media streaming companies have emerged, bringing all manner of new, wild, and wooly multimedia presentations to the Web. We have seen audio broadcasts accompanied by flipping HTML pages; foreign videos with scrolling subtitles; and live video broadcasts on the Web generated from unlikely locations using little more than a camcorder, portable PC, and a couple of modems. The possibilities seem endless. You have all manner of digital multimedia at your fingertips: graphics, animation, VRML, video, audio, and ActiveX™ controls. In other words, you have a blank page. Have at it.

This is all wonderful except for one thing: it is all incompatible. Each media streaming company has its own proprietary presentation format for multimedia data and lays out that data on disk in its own peculiar and particular way. Each also has its own network protocols and wire formats. In fact, a standard protocol for something as basic as telling a video server to pause a running video clip (RTSP) is only now emerging.
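To make the "pause" case concrete, here is a minimal sketch of what a standardized pause request might look like on the wire, assuming the text-based, HTTP-like message layout that RTSP 1.0 uses. The helper name build_rtsp_pause, the URL, and the session identifier are hypothetical and chosen only for illustration; a real client would obtain the session identifier from the server during setup.

```python
def build_rtsp_pause(url: str, cseq: int, session_id: str) -> str:
    """Return the text of an RTSP/1.0 PAUSE request.

    Hypothetical helper for illustration; it only formats the message
    and does not open a network connection.
    """
    lines = [
        f"PAUSE {url} RTSP/1.0",    # request line: method, stream URL, version
        f"CSeq: {cseq}",            # sequence number pairing request and reply
        f"Session: {session_id}",   # identifies the stream session to pause
    ]
    # Like HTTP, RTSP ends each line with CRLF and the header block
    # with an empty line.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    # Example values are assumptions, not taken from any real server.
    print(build_rtsp_pause("rtsp://media.example.com/clip", 3, "12345678"))
```

The point of the sketch is that any server that understands the standard can act on the same few lines of text, regardless of which vendor wrote the client.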

What this has meant for the consumers of streaming media technology is that they have faced an all-or-nothing decision. Commit to all of the technology of vendor A -- media production tools, media servers, and media clients -- or all of the technology of vendor B. No mixing or matching. Thus, content producers who wanted to use a really cool production tool from vendor A were compelled to ask whether vendor A's server and client were also robust and widely deployed; otherwise, the content would never see the light of day. Similarly, the hosting provider who wanted to buy the powerful media streaming server of vendor B and offer video hosting services would also necessarily be "buying into" vendor B's production tools and client. For the end consumer of the content, this framework has required downloading and installing a new client player every time the consumer comes across some interesting new content in yet another format, or in a newer version of an older format. This is very cumbersome, particularly for consumers accessing the Internet by means of 28.8-Kbps or slower modems.

These all-or-nothing decisions and cumbersome demands have, for now, put a cork in the multimedia bottle. But the pressure in the bottle is building. Content producers, hosting providers, and consumers are all demanding the same thing: interoperable media tools, servers, and clients.

A few technologies are necessary for achieving this interoperability. Among the necessary technologies are a standard real-time network delivery protocol (RTP is emerging here) and a standard real-time control protocol (RTSP is emerging here). But the biggest step toward achieving interoperability will be the emergence of a standard multimedia presentation file format. ASF represents this step.

 

© 1997 Microsoft Corporation. All rights reserved.