Internet Engineering Task Force              Audio-Video Transport WG
INTERNET-DRAFT                                             A. Klemets
draft-klemets-generic-rtp-00                    Microsoft Corporation
                                                       March 13, 1998
                                          Expires: September 13, 1998


                  Common Generic RTP Payload Format

Status of This Memo

This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as ``work in progress.''

To learn the current status of any Internet-Draft, please check the ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast).

Distribution of this document is unlimited.

Abstract

This document specifies a generic payload format for encapsulating arbitrary data into RTP packets. The payload format implements a minimal set of features that are expected to be useful for most applications, while limiting the overhead to as little as one byte. An extension mechanism makes it possible to use the Common Generic Payload Format as a basis for more complex payload formats.

This specification is primarily intended for compression schemes that are not covered by other RTP payload formats. It is expected that this specification will be suitable for, but not limited to, streaming data that is stored in file formats that support multiple media types, such as QuickTime, ASF, and the MPEG-4 Intermedia file format.

1. Introduction and Motivation

The Real-Time Transport Protocol (RTP) [2] is a protocol for carrying arbitrary real-time data.
The RTP specification defines an RTP packet as consisting of an RTP header and an RTP payload. The RTP header is a common header that will always be used, regardless of the kind of data that is being transmitted. The use of certain fields in the RTP header may depend on the media type, or compression scheme, of the data in the RTP payload. In addition, the RTP payload may contain an RTP payload format header, which may be different for each compression scheme. Thus, for each codec that is to be used with RTP, one must specify the syntax of the RTP payload format header, if any, and the use of the RTP header fields. Such a specification is referred to as an RTP payload format.

A few RTP payload formats for media types that were popular in 1996 are listed in [3]. Other RTP payload formats are typically published as RFC documents, usually by the AVT working group of the IETF.

Having a separate RTP payload format for each media type makes it possible to adapt the RTP payload header to the peculiarities of that media type. The RTP payload header typically adds redundancy information that is not otherwise present in the RTP payload. The purpose of the redundancy information is to reduce the negative impact of lost RTP packets.

Although there are clear benefits of tuning RTP payload formats to individual compression schemes, there are also some problems with this approach. For each new compression scheme, a corresponding RTP payload format must be defined, and standardized, before it is possible to use it with RTP in an interoperable manner. If a codec specification is revised, so that the new compression scheme produces a bit-stream that is different from the earlier version, a new RTP payload format may have to be defined. There are currently only a small number of RTP payload formats in existence.
However, a codec registry currently lists 104 different audio codecs, and 123 different video codecs [4].

The advent of file formats that can contain a wide variety of media types, such as ASF [1], also poses problems. Even though a streaming media server supports the file format, it might not be able to transmit some of the streams contained in a file, if it does not have an implementation of the appropriate RTP payload format. Each time content creators decide to employ a new compression scheme, it might be necessary to upgrade the software in the streaming media server with the implementation of the corresponding RTP payload format. If no suitable RTP payload format exists for the new compression scheme, it will be necessary to initiate a process to define and standardize it. It might not be possible to stream the data until the standardization effort has completed.

Another example of this problem is MPEG-4 [7], an ISO/IEC standard for natural and synthetic audio-visual data. MPEG-4 Systems compliant streams not only use a wide variety of compression schemes, but may also contain scene descriptions and other kinds of meta-data. To be able to transmit MPEG-4 data over RTP, it would be necessary to specify a large number of RTP payload formats, one for each kind of compression scheme or media type.

Recent proposals address this problem by specifying an RTP payload format that can be used with arbitrary compression schemes. However, these proposals are all tuned to a particular file format. Currently, mutually incompatible RTP payload format specifications have been proposed for compression schemes found in QuickTime [9], ASF [10], and the MPEG-4 Intermedia file format [11]. It may be argued that these proposals solve one problem by creating a different one. Having multiple incompatible payload formats for streaming the same data may be undesirable.
This document solves the problems mentioned above by specifying a single "generic" RTP payload format that is not tuned to any particular compression scheme or file format. It makes it possible to quickly deploy implementations of new compression schemes. It also simplifies the RTP software, because the number of RTP payload formats that are required can be reduced. However, one should not forget that a generic RTP payload format may not provide the same level of resiliency against lost RTP packets as a codec-specific RTP payload format. Therefore, it may not be advisable to use a generic RTP payload format when a codec-specific alternative is available.

Discussion at a recent IETF meeting [8] made it clear that there are several fundamentally different approaches that can be used to define a generic RTP payload format. The approach chosen for this specification is to define a "common" generic RTP payload format. It is called "common" because it only implements features that are believed to be common to most media streaming systems. An extensibility mechanism makes it possible to include additional features in the payload format header, thus creating what one could call a "specialized" generic RTP payload format.

Section 2 provides a high level overview of the features of the payload format. The media streaming system model that is used throughout this document is described in section 3. The RTP encapsulation format is described in detail in section 4.

2. Overview of features

Some codecs operate on discrete chunks of data of sizes that may be comparable to the size of large network packets. For example, some video codecs might only process entire video frames. In this document, the term AU (Application Unit) is used as a generic reference to these kinds of chunks.

One of the features supported by the Common Generic RTP Payload Format is fragmentation of AU's across multiple RTP packets. Fragmentation is supported in two different degrees of sophistication.
This is necessary because the capabilities of different codecs can vary greatly. At the same time, the payload format has been designed so that no undue performance penalty is imposed on implementations that choose to use only the simple mode of fragmentation.

The simple mode of fragmentation works by splitting the AU into fragments at arbitrary boundaries. If a single fragment is lost in transit, it is assumed that the other fragments cannot be used. To support this mechanism, the payload format provides a way to determine if all of the fragments have been received.

The other mode of fragmentation support assumes that the fragments have been created in a manner such that it might still be possible to use the fragments, even if some of them are lost in transit. A field is provided that specifies where in the AU a fragment fits. If a series of RTP packets is lost, there is a possibility that the receiver might mistakenly believe that two fragments from different AU's belong to the same AU. In order to prevent this failure mode, each fragment has its own sequence number.

The latter mode can be viewed as an example of Application Level Framing (ALF), a design philosophy used in the design of RTP. It is important to provide support for ALF in a generic payload format, so that the payload format does not act as a layer of insulation between the application and RTP.

Another feature supported by the payload format is "grouping", or "bundling". Grouping makes it possible to combine several smaller AU's into a single RTP packet. It is possible to employ grouping and fragmentation at the same time. The first member of a group may be the last fragment in a series, and the last member of a group may be the first fragment of another series. If desired, the combination of grouping and fragmentation can be used to send RTP packets of a constant size, even if the AU's are of variable size.
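
The constant-size packing described above can be illustrated with a short Python sketch. It is not part of this specification: the function name pack_constant_size and its arguments are hypothetical, and the sketch deliberately ignores the bytes consumed by the payload format header fields, which a real packetizer would have to subtract from the available space. Each packet is reported as a list of (AU index, start, end) byte ranges.

```python
from typing import List, Tuple

Slice = Tuple[int, int, int]  # (AU index, start byte, end byte), end exclusive


def pack_constant_size(au_sizes: List[int], payload_size: int) -> List[List[Slice]]:
    """Group and fragment variable-size AU's into fixed-size payloads.

    Illustrative assumption: header overhead is ignored, so payload_size
    counts media bytes only.
    """
    if payload_size <= 0:
        raise ValueError("payload_size must be positive")
    packets: List[List[Slice]] = []
    current: List[Slice] = []
    room = payload_size
    for index, size in enumerate(au_sizes):
        start = 0
        while start < size:
            # Take as much of this AU as still fits in the current packet.
            take = min(room, size - start)
            current.append((index, start, start + take))
            start += take
            room -= take
            if room == 0:
                # Packet is full: close it and open a new one.
                packets.append(current)
                current = []
                room = payload_size
    if current:
        packets.append(current)  # the final packet may be shorter
    return packets
```

For AU's of 300, 500 and 200 bytes and a 400 byte payload, the sketch produces three packets: the first groups the whole first AU with the first 100 bytes of the second, the second carries the remaining 400 bytes of the second AU, and the third carries the last AU. Only the first and last members of a group are ever fragments, which matches the rule stated above.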
Finally, the payload format is extensible. This payload format is described as "Common" and "Generic", because the extension mechanism can be used to create a "Specialized Generic" payload format. The extensibility is achieved by employing the grouping mechanism to carry an object that is opaque to this specification, referred to as "Extension Data". As a result of using the grouping mechanism, any media data can still be easily accessed by the receiver, even if the Extension Data is not understood.

Two different types of Extension Data are anticipated. The first type can be viewed as hints or supplementary information to the receiver. The receiver does not need to be able to understand the Extension Data to correctly process the RTP packet, although presumably the Extension Data serves some purpose, or it would not have been transmitted. For example, this kind of Extension Data could include a "Send Time", a value of the instantaneous bit rate (presentation duration), "seekable" or "key" flags, etc.

The second type of Extension Data is information that the receiver must use in order to correctly process the RTP packet. For example, if the grouping mechanism were to be used to transmit data from different logical "streams" in the same RTP packet, Extension Data would likely include a list of stream numbers.

This document does not define an interpretation for Extension Data. It is expected that the RTP transmitter will convey the interpretation of Extension Data through non-RTP means. For instance, it might be possible to augment SDP [5] for this purpose. The specification of such a mechanism is outside the scope of this document.

When the RTP Payload Type that is assigned to this payload format is used, the RTP receiver is not required to process Extension Data for correct operation.
If the RTP receiver must process Extension Data in order to operate correctly, a different RTP Payload Type must be used.

These features are implemented with very little overhead. In fact, the payload format header is only one byte in RTP packets where none of the features are used. It could be possible to use SDP to specify that the payload format header should not be used at all, and thus save one byte. But such an approach would not allow any features to be enabled later, should the transmitter find that it needs to perform fragmentation or grouping, for instance. Thus, the one byte overhead provides for a lot of flexibility. Features can be controlled on a per-packet basis. If the precise format of the RTP payload format header were to be selected at the time of session establishment, however, this flexibility would be lost.

3. Media Streaming System Model

Figure 1 displays a model for the RTP transmitter and receiver, which is used throughout this document. It should be noted that the three layers in this model may not directly correspond to any tangible layers in a real system. The Network Adaptation Layer, in particular, may not exist in some systems. However, the Network Adaptation Layer may do nothing, in which case its absence from a real system is not a problem.

                   +------------------------------------------+
   media aware     |            COMPRESSION LAYER             |
   network unaware | Codec, processing sequences of AU's.     |
                   +------------------------------------------+
   media and       |         NETWORK ADAPTATION LAYER         |
   network aware   | Converts between AU's and PDU's, through |
                   | optional segmentation and reassembly.    |
                   +------------------------------------------+
   network aware   |              NETWORK LAYER               |
   media unaware   | (De-)encapsulates PDU's in RTP packets.  |
                   +------------------------------------------+

   Figure 1. System model for RTP transmitters and receivers.

In the next section, to simplify the discussion, we will describe the RTP transmitter side only. Section 3.2 describes the RTP receiver.

3.1. RTP transmitter operation

The Compression Layer may represent a "live" encoder that encodes data concurrently with its transmission by the Network Layer. But the Compression Layer could also be implemented as a file format decoder, reading previously encoded data from a file. The output of the Compression Layer is a sequence of Application Units (AU's).

The Network Adaptation Layer receives AU's from the Compression Layer. It outputs Protocol Data Units (PDU's). The Network Adaptation Layer might be aware of a "maximum suitable PDU size", which is provided by the Network Layer. The layer may also be able to convert an AU into several smaller AU's, or be able to split an AU into several fragments such that each fragment can be decoded individually. An implementation of this layer may support any of the following three modes.

Mode 1. Each AU is mapped to a single PDU. This mode essentially does nothing.

Mode 2. Same as Mode 1, but if an AU is larger than the maximum suitable PDU size, it is converted into multiple independent AU's, each of which is mapped to a separate PDU.

Mode 3. Same as Mode 1, but if an AU is larger than the maximum suitable PDU size, it is split into multiple fragments. The boundary of each fragment is chosen in such a manner that the fragment can be decoded individually if some of the fragments are lost during transmission. Each fragment may optionally include redundancy information, to reduce the negative impact of lost fragments. Each fragment is mapped to a separate PDU. Each PDU is marked as containing a fragmented AU.

A simple implementation of the Network Adaptation Layer would only implement Mode 1. A more sophisticated implementation would implement any or all of the modes.
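
As a rough illustration of Mode 1 and Mode 3, the following Python sketch maps an AU to PDU's. It is only a sketch under stated assumptions: the Pdu class, the function names, and the cut_points argument (a list of byte offsets at which the codec says the AU may be split into individually decodable pieces) are all hypothetical and are not defined by this document. Mode 2 is omitted because re-encoding an AU into smaller independent AU's is entirely codec-specific.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Pdu:
    data: bytes
    fragmented_au: bool = False  # True when the PDU carries a Mode 3 AU fragment
    offset: int = 0              # byte offset of data within the complete AU


def adapt_mode1(au: bytes) -> List[Pdu]:
    """Mode 1: each AU is mapped to a single PDU."""
    return [Pdu(au)]


def adapt_mode3(au: bytes, max_pdu_size: int, cut_points: Iterable[int]) -> List[Pdu]:
    """Mode 3: split an oversized AU at codec-supplied boundaries.

    cut_points is a hypothetical input: offsets where a split keeps each
    fragment individually decodable.
    """
    if len(au) <= max_pdu_size:
        return adapt_mode1(au)
    # Candidate fragment ends: the allowed cut points plus the end of the AU.
    ends = sorted({p for p in cut_points if 0 < p < len(au)} | {len(au)})
    pdus: List[Pdu] = []
    start = 0
    while start < len(au):
        later = [p for p in ends if p > start]
        fitting = [p for p in later if p - start <= max_pdu_size]
        # Prefer the farthest boundary that fits. If none fits, emit an
        # oversized PDU and leave it to the Network Layer to fragment it.
        end = fitting[-1] if fitting else later[0]
        pdus.append(Pdu(au[start:end], fragmented_au=True, offset=start))
        start = end
    return pdus
```

The fallback branch reflects a design choice rather than a requirement: when the codec offers no boundary within the size limit, the sketch keeps the media-aware boundary and relies on the media-unaware PDU fragmentation that the Network Layer is described as providing.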

The Network Layer receives PDU's and outputs RTP packets. The Network Layer implements the Common Generic RTP Payload Format. Each PDU is normally mapped into a single RTP packet. However, if a PDU is larger than the maximum suitable PDU size, multiple PDU fragments will be created. Typically, each PDU fragment will be sent in a separate RTP packet. Because the Network Layer is media unaware, the fragmentation boundaries cannot be chosen such that each fragment can be decoded individually.

The Common Generic RTP Payload Format Header distinguishes between PDU fragments, which are created by this layer, and PDU's that contain fragmented AU's, which are created by Mode 3 in the Network Adaptation Layer.

In addition, if the size of a PDU, or PDU fragment, is smaller than some threshold, the Network Layer may group multiple PDU's into a single RTP packet. In this situation, the first and last PDU's may be PDU fragments.

3.2. RTP receiver operation

The Network Layer at the RTP receiver receives RTP packets and outputs PDU's. If an RTP packet contains one or more PDU's, each PDU is delivered independently. If an RTP packet contains a fragmented PDU, the Network Layer will reassemble the fragments into a complete PDU before it is delivered to the Network Adaptation Layer. If it is determined that any of the fragments have been lost, received fragments belonging to the same PDU will be discarded.

The Network Adaptation Layer receives PDU's and outputs AU's. PDU's that were generated through Mode 1 and Mode 2 in the transmitter will be mapped directly to an AU. A Mode 3 PDU contains a fragmented AU and will be marked as such by the Network Layer. Depending on the capabilities of the Compression Layer, the Network Adaptation Layer may attempt to reassemble fragmented AU's into a single AU, or it may deliver each AU fragment individually.

If it is determined that a PDU containing a fragmented AU has been lost, it is expected that this layer will do one of the following.
It may deliver the AU fragments that were received. Or, it may attempt to reduce the negative impact of the lost fragment, by using redundancy information present in the PDU's that contain the other fragments of the same AU.

The Compression Layer receives AU's as input. It may decode AU's, or it may store them in a file for decoding at a later time, for instance. Some implementations of this layer may be able to accept fragmented AU's as input.

4. RTP Encapsulation Format

The following figure gives a high level view of the RTP encapsulation format.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   .                                                               .
   .                          RTP Header                           .
   .                                                               .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   .                                                               .
   .           Common Generic RTP Payload Format Header            .
   .                                                               .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   .                                                               .
   .           Common Generic RTP Payload Format Payload           .
   .                                                               .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 2. RTP packet layout.

An RTP packet consists of an RTP header and the RTP payload. When using the Common Generic RTP Payload Format, one may define the RTP payload as being separated into a Common Generic RTP Payload Format Header, and a Common Generic RTP Payload Format Payload. This is the definition shown in Figure 2. As will become apparent in section 4.2, however, a strict division of the RTP payload into two pieces is not possible, as the various header fields and PDU's (the payload) are intertwined.

4.1. RTP Header Usage

Several fields in the RTP header are assigned a special meaning in the context of this specification. The use of the fields is as follows:

Payload Type: This field may be set to the static RTP payload type assigned to this RTP payload format.
However, if Extension Data is used, and that data must be understood by the receiver in order to properly process the PDU, the static RTP payload type must not be used. The above assumes that a static RTP payload type has been assigned to this RTP payload format.

Marker bit (M bit): This bit is set to 1 if the first PDU in the payload is not a fragmented PDU and does not contain a fragmented AU. The bit is also set to 1 if the first PDU is the last of a series of fragmented PDU's, or if the first PDU contains the last fragment of an AU. Any other PDU's in the RTP payload, other than the first, do not have any effect on the M bit.

Timestamp: The RTP timestamp is set to the "presentation time" of the first PDU contained in the payload. PDU fragments all use the presentation time of the complete PDU. The clock frequency depends on the media type, and its value should be conveyed through non-RTP means, such as SDP [5].

4.2. The Common Generic RTP Payload Format Header

Figure 3 shows an RTP payload, containing all the different Common Generic RTP Payload Format Header fields. In this particular example, the RTP payload contains two OBJECT fields. An OBJECT field may contain either a PDU or Extension Data.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |G|S|   FRAG    |                    OFFSET                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |E|B|          LENGTH           |                               .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               .
   .                OBJECT (PDU or Extension Data)                 .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |E|B|          LENGTH           |        TIMESTAMP DELTA        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   .                OBJECT (PDU or Extension Data)                 .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 3. Sample RTP payload, using the payload format.

The fields have the following meanings:

G (Group): 1 bit.
If this field is 1, it indicates that the E, B and LENGTH fields are present, directly following the optional OFFSET field. Typically, the G field is set to 1 when the payload contains more than one OBJECT field.

S (Shift): 1 bit. If this field is 1, and the FRAG field is not 0, the value given by the OFFSET field should be shifted left by 8 bits. If the S field is 1 and the FRAG field is 0 and the RTP Marker bit is 0, it means that the first PDU is the first fragment in a series of PDU fragments. If the S field is 1 and the FRAG field is 0 and the RTP Marker bit is 1, it means that the first PDU is the last fragment in a series of PDU fragments.

FRAG: 6 bits. When this field is 0, and the S field is 1 or the RTP Marker bit is 0, the first PDU is a PDU fragment. If this field is not 0, the first PDU contains a fragmented AU, and the value of the FRAG field is the fragment sequence number. For each AU, the first fragment sequence number always starts with 1 and increments by 1 for each fragment. The sequence number that follows 63 is 1. 0 is not a valid fragment sequence number. When the FRAG field is not 0, the OFFSET field is present.

Note that because the FRAG field wraps around, the largest burst of lost AU fragments after which the received fragments can still be processed is 62. If more than 62 consecutive AU fragments are lost, other fragments belonging to the same AU must be discarded. Lost packets are detected through the RTP sequence number, which means that correct operation requires the RTP sequence number to increment by 1 for each RTP packet.

OFFSET: 24 bits. This field is only present when the FRAG field is not 0. The presence of this field implies that the next PDU contains a fragmented AU. The value of this field gives the byte offset of the start of the data in the fragmented AU, relative to the start of the complete AU. The default value is 0.
If the S field is 1, the value of the OFFSET field should be shifted left by 8 bits.

E (Extension): 1 bit. This field is only present when the G field is 1. The default value is 0. If this field is 1, the next OBJECT field is Extension Data. If the E field is 0, the next OBJECT field is a PDU.

B (Boundary): 1 bit. This field is only present when the G field is 1. The default value is 0. If this field is 1, and the next OBJECT field does not fully utilize all the remaining space in the RTP payload, it means that an additional PDU is contained in the remaining space. That PDU does not have its own TIMESTAMP DELTA, B, and LENGTH fields. If the next PDU is a PDU fragment or if it contains a fragmented AU, and the next PDU is not also the first PDU in the RTP payload, the value of the FRAG field for the next PDU is the same as the value of the B field.

LENGTH: 14 bits. This field is only present when the G field is 1. The default value is the number of bytes in the RTP payload, counted from the end of this field to the end of the RTP payload, or to the start of the optional RTP padding field. If a TIMESTAMP DELTA field follows this field, its size is not counted. The value of the field gives the length in bytes of the next OBJECT field, if there is sufficient space to fully contain the OBJECT field in the RTP payload. If there is not sufficient space to contain an OBJECT field of the size given by the LENGTH field, the next OBJECT field is the first of a series of PDU fragments or PDU's that contain a fragmented AU. In this situation, the length of the next OBJECT field is equal to the available space in the RTP payload. In addition, the value of the B field should be used as the value for the FRAG field for the next PDU.

TIMESTAMP DELTA: 16 bits. This field is only present when the G field is 1, the next OBJECT field is a PDU and the next PDU is not the first PDU in the RTP payload.
The value of the field is an unsigned 16 bit integer. The default value is 0. Adding the value to the RTP timestamp yields the RTP timestamp for the next PDU.

OBJECT: variable size. This field is either a PDU or Extension Data. A PDU may also be a PDU fragment or may contain a fragmented AU. Extension Data cannot be fragmented.

4.3. Examples

This section has a few examples that illustrate possible ways of using the payload format.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |G|S|   FRAG    |                      PDU                      .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 4. Simplest case: G=0, S=0, FRAG=0.

Figure 4 shows the simplest and perhaps most common situation. The RTP packet contains a single PDU that is not fragmented and contains an entire AU. There is no Extension Data. The Common Generic RTP Payload Format header is one byte. All the fields in the byte are 0, making further header compression simple.

The RTP payload also looks like Figure 4 when the PDU is a fragmented PDU. The RTP M bit and the S field are used to detect the first and last fragment. The first fragment has S=1, M=0. The last fragment has S=1, M=1. All other fragments have S=0, M=0.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |G|S|   FRAG    |E|B|          LENGTH           |               .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               .
   .                OBJECT (PDU or Extension Data)                 .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   .                              PDU                              .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 5. Grouping example. G=1, FRAG=0, B=1.

Figure 5 shows two OBJECT fields in the same RTP payload. The first OBJECT field may contain either a PDU or Extension Data, depending on the value of the E field.
The B field is 1, which means that the E, B, LENGTH and TIMESTAMP DELTA fields for the second OBJECT field are not present. Because the E field defaults to 0, the second OBJECT field must contain a PDU.

If the first OBJECT field contains a PDU, it might be a PDU fragment. However, it cannot be a PDU that contains a fragmented AU, because the OFFSET field is not present. In this example, the last PDU must contain an entire AU. Hence, if the first OBJECT field contains a PDU fragment, it must be the last fragment in a series. The S field and the RTP Marker bit will both be 1 in that case.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |G|S|   FRAG    |                    OFFSET                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   .                              PDU                              .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 6. Fragmentation example. G=0, FRAG>0.

Figure 6 shows an RTP payload with a single PDU that contains a fragmented AU. The OFFSET field and the S field together identify the starting point of this fragment in the complete AU.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |G|S|   FRAG    |                    OFFSET                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |E|B|          LENGTH           |                               .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               .
   .                OBJECT (PDU or Extension Data)                 .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |E|B|          LENGTH           |                               .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               .
   .                              PDU                              .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |E|B|          LENGTH           |        TIMESTAMP DELTA        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   .                              PDU                              .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 7. Multiple PDU's. G=1, FRAG>0, B1=0, E2=0, B2=0, E3=0.
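
As a companion to the layouts in Figures 4 through 7, the following Python sketch decodes the leading fields of an RTP payload as described in section 4.2. It is an illustration only. The function names are hypothetical, and the sketch assumes that bit 0 in the figures is the most significant bit of the first byte and that multi-byte fields are in network byte order, which is the usual convention for such diagrams but is not stated explicitly in this document.

```python
from typing import Dict, Optional


def parse_leading_fields(payload: bytes) -> Dict[str, Optional[int]]:
    """Decode G, S, FRAG, the optional OFFSET, and the first E/B/LENGTH."""
    if not payload:
        raise ValueError("empty RTP payload")
    first = payload[0]
    g = first >> 7            # assumed: G is the most significant bit
    s = (first >> 6) & 1
    frag = first & 0x3F       # 6-bit fragment sequence number, 0 = none
    pos = 1
    offset = None
    if frag != 0:
        # OFFSET is present only when FRAG is not 0.
        if len(payload) < pos + 3:
            raise ValueError("truncated OFFSET field")
        offset = int.from_bytes(payload[pos:pos + 3], "big")
        if s:
            offset <<= 8      # S=1 with FRAG != 0: shift OFFSET left by 8 bits
        pos += 3
    e = b = length = None
    if g:
        # E, B and LENGTH directly follow the optional OFFSET field.
        if len(payload) < pos + 2:
            raise ValueError("truncated E/B/LENGTH fields")
        word = int.from_bytes(payload[pos:pos + 2], "big")
        e, b, length = word >> 15, (word >> 14) & 1, word & 0x3FFF
        pos += 2
    return {"G": g, "S": s, "FRAG": frag, "OFFSET": offset,
            "E": e, "B": b, "LENGTH": length, "header_len": pos}


def next_frag(frag: int) -> int:
    """Next fragment sequence number: 63 is followed by 1, never 0."""
    return 1 if frag == 63 else frag + 1
```

The sketch stops after the first E, B and LENGTH fields. Walking the remaining OBJECT fields would also require the total payload length, the RTP Marker bit, and the rules for the B field and TIMESTAMP DELTA given in section 4.2.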

Figure 7 shows how an RTP payload may look if one wants all RTP packets to be of the same size. In the example, there are three OBJECT fields. The last two contain PDU's, while the first OBJECT field may contain a PDU or Extension Data, depending on the value of the E1 field.

The FRAG field is not 0, and the OFFSET field is present. This means that the first PDU contains a fragmented AU. Which OBJECT field contains the first PDU depends on the value of the E1 field.

If the last PDU is smaller than the value of the last LENGTH field, and the value of B3 is 0, the last PDU is the first in a series of PDU fragments. If the last PDU is smaller than the value of the last LENGTH field, and the value of B3 is 1, the last PDU is the first in a series of PDU's that contain fragments of an AU. Otherwise, the value of B3 should be 0.

The RTP timestamp is applicable to all PDU's except for the last one. The value of the TIMESTAMP DELTA field must be added to the RTP timestamp to obtain the RTP timestamp value for the last PDU.

5. Author's Address

Anders Klemets
1 Microsoft Way
Redmond, WA 98052-6399
USA
E-mail: anderskl@microsoft.com

References

[1] Microsoft Corporation, "Advanced Streaming Format (ASF) Specification", http://www.microsoft.com/asf, February 1998.

[2] H. Schulzrinne, et al., "RTP: A Transport Protocol for Real-Time Applications", IETF RFC 1889, January 1996.

[3] H. Schulzrinne, et al., "RTP Profile for Audio and Video Conference with Minimal Control", IETF RFC 1890, January 1996.

[4] E. Fleischman, "WAVE and AVI Codec Registries", work in progress.

[5] M. Handley, "SDP: Session Description Protocol", work in progress.

[6] Apple Computer, Inc., "QuickTime File Format Specification", May 1996.

[7] ISO/IEC 14496-1 CD, "MPEG-4 Systems", October 1997.

[8] Internet Engineering Task Force, "Proceedings of the 40th IETF", December 1997.

[9] A. Jones, et al., "RTP Payload Format for QuickTime Media Streams", work in progress.

[10] A. Klemets, "RTP Payload Format for ASF Streams", work in progress.

[11] C. Herpel, et al., "RTP payload format for MPEG-4 Elementary Streams", work in progress.