Internet Engineering Task Force                 Audio-Video Transport WG
INTERNET-DRAFT                                                A. Klemets
draft-klemets-generic-rtp-00                       Microsoft Corporation
March 13, 1998                               Expires: September 13, 1998

Common Generic RTP Payload Format

Status of This Memo

This document is an Internet-Draft.  Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups.  Note that other groups may also distribute
working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

To learn the current status of any Internet-Draft, please check the
"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).


Distribution of this document is unlimited.

Abstract

This document specifies a generic payload format for encapsulating 
arbitrary data into RTP packets.  The payload format implements a 
minimal set of features that are expected to be useful for most 
applications, while limiting the overhead to as little as one byte.  
An extension mechanism makes it possible to use the Common Generic 
Payload Format as a basis for more complex payload formats.  This 
specification is primarily intended for compression schemes that are 
not covered by other RTP payload formats.  It is expected that this 
specification will be suitable for, but not limited to, streaming 
data that is stored in file formats that support multiple media 
types, such as QuickTime, ASF, and the MPEG-4 Intermedia file format.

1. Introduction and Motivation

The Real-Time Transport Protocol (RTP) [2] is a protocol for carrying 
arbitrary real-time data.  The RTP specification defines an RTP 
packet as consisting of an RTP header and an RTP payload.  The RTP 
header is a common header that will always be used, regardless of the 
kind of data that is being transmitted.  The use of certain fields in 
the RTP header may depend on the media type, or compression scheme, 


A. Klemets                                                      [Page 1]

Internet Draft        draft-klemets-generic-rtp-00        March 13, 1998


of the data in the RTP payload.  In addition, the RTP payload may 
contain an RTP payload format header, which may be different for each 
compression scheme.  Thus, for each codec that is to be used with 
RTP, one must specify the syntax of the RTP payload format header, if 
any, and the use of the RTP header fields.  Such a specification is 
referred to as an RTP payload format.  A few RTP payload formats for 
media types that were popular in 1996 are listed in [3].  Other RTP 
payload formats are typically published as RFC documents, usually by 
the AVT working group of the IETF.  

Having a separate RTP payload format for each media type makes it 
possible to adapt the RTP payload header to the peculiarities of that 
media type.  The RTP payload header typically adds redundancy 
information that is not otherwise present in the RTP payload.  The 
purpose of the redundancy information is to reduce the negative 
impact of lost RTP packets.

Although there are clear benefits of tuning RTP payload formats to 
individual compression schemes, there are also some problems with 
this approach.  For each new compression scheme, a corresponding RTP 
payload format must be defined, and standardized, before it is 
possible to use it with RTP in an interoperable manner.  If a codec 
specification is revised, so that the new compression scheme produces 
a bit-stream that is different from the earlier version, a new RTP 
payload format may have to be defined.  There are currently only a 
small number of RTP payload formats in existence.  However, a codec 
registry currently lists 104 different audio codecs, and 123 
different video codecs [4].

The advent of file formats that can contain a wide variety of media 
types, such as ASF [1], also poses problems.  Even though a streaming 
media server supports the file format, it might not be able to 
transmit some of the streams contained in a file, if it does not have 
an implementation of the appropriate RTP payload format.  Each time 
content creators decide to employ a new compression scheme, it might 
be necessary to upgrade the software in the streaming media server 
with the implementation of the corresponding RTP payload format.  If 
no suitable RTP payload format exists for the new compression scheme, 
it will be necessary to initiate a process to define and standardize 
it.  It might not be possible to stream the data until the 
standardization effort has completed.

Another example of this problem is MPEG-4 [7], an ISO/IEC standard 
for natural and synthetic audio-visual data.  MPEG-4 Systems 
compliant streams not only use a wide variety of compression 
schemes, but may also contain scene descriptions and other 
kinds of meta-data.  To be able to transmit MPEG-4 data over RTP, it 
would be necessary to specify a large number of RTP payload formats, 
one for each kind of compression scheme or media type.


Recent proposals address this problem by specifying an RTP payload 
format that can be used with arbitrary compression schemes.  However, 
these proposals are all tuned to a particular file format.  
Currently, mutually incompatible RTP payload format specifications 
have been proposed for compression schemes found in QuickTime [9], 
ASF [10], and the MPEG-4 Intermedia file format [11].  It may be 
argued that these proposals solve one problem by creating a different 
one.  Having multiple incompatible payload formats for streaming the 
same data may be undesirable.

This document addresses the problems mentioned above by specifying a 
single "generic" RTP payload format that is not tuned to any 
particular compression scheme or file format.  It makes it possible 
to quickly deploy implementations of new compression schemes.  It 
also simplifies the RTP software, because the number of RTP payload 
formats that are required can be reduced.  However, one should not 
forget that a generic RTP payload format may not provide the same 
level of resiliency against lost RTP packets as a codec-specific RTP 
payload format.  Therefore, it may not be advisable to use a generic 
RTP payload format when a codec-specific alternative is available.

Discussion at a recent IETF meeting [8] made it clear that there are 
several fundamentally different approaches that can be used to define 
a generic RTP payload format.  The approach chosen for this 
specification is to define a "common" generic RTP payload format.  It 
is called "common" because it only implements features that are 
believed to be in common with most media streaming systems.  An 
extensibility mechanism makes it possible to include additional 
features in the payload format header, thus creating what one could 
call a "specialized" generic RTP payload format.  Section 2 provides 
a high level overview of the features of the payload format.  The 
media streaming system model that is used throughout this document is 
described in section 3. The RTP encapsulation format is described in 
detail in section 4. 

2. Overview of features

Some codecs operate on discrete chunks of data of sizes that may be 
comparable to the size of large network packets.  For example, some 
video codecs might only process entire video frames.  In this 
document, the term AU (Application Unit) is used as a generic 
reference to these kinds of chunks. 

One of the features supported by the Common Generic RTP Payload 
Format is fragmentation of AU's across multiple RTP packets.  
Fragmentation is supported in two different degrees of 
sophistication.  This is necessary because the capabilities of 
different codecs can vary greatly.  At the same time, the payload 
format has been designed so that no undue performance penalty is 
imposed on implementations that choose to use only the simple mode of 
fragmentation.

The simple mode of fragmentation works by splitting the AU into 
fragments at arbitrary boundaries.  If a single fragment is lost in 
transit, it is assumed that the other fragments cannot be used.  To 
support this mechanism, the payload format provides a way to 
determine if all of the fragments have been received.

The other mode of fragmentation support assumes that the fragments 
have been created in a manner such that it might still be possible to 
use the fragments, even if some of them are lost in transit.  A field 
is provided that specifies where in the AU a fragment fits in.  If a 
series of RTP packets are lost, there is a possibility that the 
receiver might mistakenly believe that two fragments from different 
AU's belong to the same AU.  In order to prevent this failure mode, 
each fragment has its own sequence number.

The latter mode can be viewed as an example of Application Level 
Framing (ALF), a design philosophy used in the design of RTP.  It is 
important to provide support for ALF in a generic payload format, so 
that the payload format does not act as a layer of insulation between 
the application and RTP.

Another feature supported by the payload format is "grouping", or 
"bundling".  Grouping makes it possible to combine several smaller 
AU's into a single RTP packet.  It is possible to employ grouping and 
fragmentation at the same time.  The first member of a group may be 
the last fragment in a series, and the last member of a group may be 
the first fragment of another series.  If desired, the combination of 
grouping and fragmentation can be used to send RTP packets of a 
constant size, even if the AU's are of variable size.
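
As an illustration of how grouping and fragmentation can combine to 
produce RTP packets of a constant size, the following Python sketch 
slices a sequence of variable-size AU's into fixed-size payloads.  The 
function name pack_constant_size and the (AU index, start, end) tuple 
layout are hypothetical, and the sketch ignores the space taken by the 
payload format header fields.

```python
def pack_constant_size(au_sizes, packet_size):
    """Return a list of packets; each packet is a list of
    (au_index, start, end) byte ranges.  Header overhead is ignored."""
    if packet_size <= 0:
        raise ValueError("packet_size must be positive")
    packets, current, room = [], [], packet_size
    for index, size in enumerate(au_sizes):
        start = 0
        while start < size:
            take = min(room, size - start)
            current.append((index, start, start + take))
            start += take
            room -= take
            if room == 0:          # packet is full: close it
                packets.append(current)
                current, room = [], packet_size
    if current:                    # the final packet may be shorter
        packets.append(current)
    return packets
```

With AU sizes of 5, 3 and 10 bytes and a packet size of 6, the second 
packet holds the last fragment of the second AU followed by the first 
fragment of the third AU.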

Finally, the payload format is extensible.  This payload format is 
described as "Common" and "Generic" because the extension mechanism 
can be used to create a "Specialized Generic" payload format.  The 
extensibility is achieved by employing the grouping mechanism to 
carry an object that is opaque to this specification, referred to as 
"Extension Data".  As a result of using the grouping mechanism, any 
media data can still be easily accessed by the receiver, even if the 
Extension Data is not understood. 

Two different types of Extension Data are anticipated.  The first 
type can be viewed as hints or supplementary information to the 
receiver.  The receiver does not need to be able to understand the 
Extension Data to correctly process the RTP packet, although 
presumably the Extension Data serves some purpose, or it would not 
have been transmitted.  For example, this kind of Extension Data 
could include a "Send Time", a value of the instantaneous bit rate 
(presentation duration), "seekable" or "key" flags, etc.


The second type of Extension Data is information that the receiver 
must use in order to correctly process the RTP packet.  For example, 
if the grouping mechanism were to be used to transmit data from 
different logical "streams" in the same RTP packet, Extension Data 
would likely include a list of stream numbers.

This document does not define an interpretation for Extension Data.  
It is expected that the RTP transmitter will convey the 
interpretation of Extension Data through non-RTP means.  For 
instance, it might be possible to augment SDP [5] for this purpose.  
The specification of such a mechanism is outside the scope of this 
document.  When the RTP Payload Type that is assigned to this payload 
format is used, the RTP receiver is not required to process Extension 
Data for correct operation.  If the RTP receiver must process 
Extension Data in order to operate correctly, a different RTP 
Payload Type must be used.

These features are implemented with very little overhead.  In fact, 
the payload format header is only one byte in RTP packets where none 
of the features are used.  It would be possible to use SDP to specify 
that the payload format header should not be used at all, and thus 
save one byte.  But such an approach would not allow any features to 
be enabled later, should the transmitter suddenly realize that it 
would need to perform fragmentation or grouping, for instance.  Thus, 
the one byte overhead provides for a lot of flexibility.  Features 
can be controlled on a per-packet basis.  If the precise format of 
the RTP payload format header were to be selected at the time of 
session establishment, however, this flexibility would be lost.

3. Media Streaming System Model

Figure 1 displays a model for the RTP transmitter and receiver, which 
is used throughout this document.  It should be noted that the 
three layers in this model may not directly correspond to any 
tangible layers in a real system.  The Network Adaptation Layer, in 
particular, may not exist in some systems.  However, the Network 
Adaptation Layer may do nothing, in which case its absence from a 
real system is not a problem.
 

                  +------------------------------------------+
  media aware     |            COMPRESSION LAYER             |
  network unaware | Codec, processing sequences of AU's.     |
                  +------------------------------------------+
  media and       |         NETWORK ADAPTATION LAYER         |
  network aware   | Converts between AU's and PDU's, through |
                  | optional segmentation and reassembly.    |
                  +------------------------------------------+
  network aware   |              NETWORK LAYER               |
  media unaware   | (De-)encapsulates PDU's in RTP packets.  |
                  +------------------------------------------+

Figure 1. System model for RTP transmitters and receivers.

In the next section, to simplify the discussion, we will describe the 
RTP transmitter side only.  Section 3.2 describes the RTP receiver.

3.1. RTP transmitter operation

The Compression Layer may represent a "live" encoder that encodes 
data concurrently with its transmission by the Network Layer.  But 
the Compression Layer could also be implemented as a file format 
decoder, reading previously encoded data from a file.  The output of 
the Compression Layer is a sequence of Application Units (AU's).

The Network Adaptation Layer receives AU's from the Compression 
Layer.  It outputs Protocol Data Units (PDU's).  The Network 
Adaptation Layer might be aware of a "maximum suitable PDU size", 
which is provided by the Network Layer.  The layer may also be 
able to convert an AU into several smaller AU's, or be able to split 
an AU into several fragments such that each fragment can be decoded 
individually.  An implementation of this layer may support any of the 
following three modes.

Mode 1.  Each AU is mapped to a single PDU.  This mode essentially 
does nothing.

Mode 2.  Same as Mode 1, but if an AU is larger than the maximum 
suitable PDU size, it is converted into multiple independent AU's, 
each of which is mapped to a separate PDU.

Mode 3.  Same as Mode 1, but if an AU is larger than the maximum 
suitable PDU size, it is split into multiple fragments.  The boundary 
of each fragment is chosen in such a manner that the fragment can be 
decoded individually if some of the fragments are lost during 
transmission.  Each fragment may optionally include redundancy 
information, to reduce the negative impact of lost fragments.  Each 
fragment is mapped to a separate PDU.  Each PDU is marked as 
containing a fragmented AU.


A simple implementation of the Network Adaptation Layer would only 
implement Mode 1.  A more sophisticated implementation would 
implement any or all of the modes.

The Network Layer receives PDU's and outputs RTP packets.  The 
Network Layer implements the Common Generic RTP Payload Format.  Each 
PDU is normally mapped into a single RTP packet.  However, if a PDU 
is larger than the maximum suitable PDU size, multiple PDU fragments 
will be created.  Typically, each PDU fragment will be sent in a 
separate RTP packet.  Because the Network Layer is media unaware, the 
fragmentation boundaries cannot be chosen such that each fragment can 
be decoded individually.  The Common Generic RTP Payload Format 
Header distinguishes between PDU fragments that were created by this 
layer and PDU's containing fragmented AU's that were created by 
Mode 3 in the Network Adaptation Layer.  In addition, if the size of 
a PDU, or PDU fragment, is smaller than some threshold, the Network 
Layer may group multiple PDU's into a single RTP packet.  In this 
situation, the first and last PDU's may be PDU fragments.

3.2. RTP receiver operation

The Network Layer at the RTP receiver receives RTP packets and 
outputs PDU's.  If an RTP packet contains one or more PDU's, each PDU 
is delivered independently.  If an RTP packet contains a fragmented 
PDU, the Network Layer will reassemble the fragments into a complete 
PDU before it is delivered to the Network Adaptation Layer.  If it is 
determined that any of the fragments have been lost, received 
fragments belonging to the same PDU will be discarded.  
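
The reassembly rule above can be sketched in Python as follows.  This 
is a minimal illustration, not a normative algorithm.  It assumes that 
every RTP packet carries exactly one PDU or PDU fragment (G=0, FRAG=0), 
that packets are handed over in arrival order, and that the S field and 
Marker bit follow the convention given with Figure 4 (first fragment 
S=1, M=0; last fragment S=1, M=1; other fragments S=0, M=0).  The 
function name reassemble and the tuple layout are hypothetical.

```python
def reassemble(packets):
    """packets: iterable of (rtp_seq, s, m, data) in arrival order,
    each carrying one PDU or PDU fragment (G=0, FRAG=0).
    Returns the list of complete PDU's."""
    complete, parts, expected = [], None, None
    for seq, s, m, data in packets:
        in_sequence = parts is not None and seq == expected
        if s == 0 and m == 1:            # unfragmented PDU
            complete.append(data)
            parts = None
        elif s == 1 and m == 0:          # first fragment of a series
            parts = [data]
        elif in_sequence:                # middle or last fragment
            parts.append(data)
            if s == 1:                   # S=1, M=1: last fragment
                complete.append(b"".join(parts))
                parts = None
        else:                            # a fragment was lost: discard
            parts = None
        expected = (seq + 1) & 0xFFFF    # RTP sequence numbers are 16 bits
    return complete
```

A gap in the RTP sequence numbers causes the fragments collected so far 
to be dropped, so an incomplete PDU is never delivered.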

The Network Adaptation Layer receives PDU's and outputs AU's.  PDU's 
that were generated through Mode 1 and Mode 2 in the transmitter will 
be mapped directly to an AU.  A Mode 3 PDU contains a fragmented AU 
and will be marked as such by the Network Layer.  Depending on the 
capabilities of the Compression Layer, the Network Adaptation Layer 
may attempt to reassemble fragmented AU's into a single AU, or it may 
deliver each AU fragment individually.  If it is determined that a PDU 
containing a fragmented AU has been lost, it is expected that this 
layer will do one of the following.  It may deliver the AU fragments 
that were received.  Or, it may attempt to reduce the negative impact 
of the lost fragment, by using redundancy information present in the 
PDU's that contain the other fragments of the same AU.

The Compression Layer receives AU's as input.  It may decode AU's, or 
it may store them in a file for decoding at a later time, for 
instance.  Some implementations of this layer may be able to 
accept fragmented AU's as input.


4. RTP Encapsulation Format

The following figure gives a high level view of the RTP encapsulation 
format.


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 .                                                               .
 .                          RTP Header                           .
 .                                                               .
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 .                                                               .
 .            Common Generic RTP Payload Format Header           .
 .                                                               .
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 .                                                               .
 .            Common Generic RTP Payload Format Payload          .
 .                                                               .
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 2. RTP packet layout.

An RTP packet consists of an RTP header and the RTP payload.  When 
using the Common Generic RTP Payload Format, one may define the RTP 
payload as being separated into a Common Generic RTP Payload Format 
Header, and a Common Generic RTP Payload Format Payload.  This is the 
definition shown in Figure 2.  As will become apparent in section 
4.2, however, a strict division of the RTP payload into two pieces is 
not possible, as the various header fields and PDU's (the payload) 
are intertwined.

4.1. RTP Header Usage

Several fields in the RTP header are assigned a special meaning in 
the context of this specification.  The use of the fields is as 
follows:

  Payload Type: This field may be set to the static RTP payload 
  type assigned to this RTP payload format.  However, if Extension 
  Data is used, and that data must be understood by the receiver in 
  order to properly process the PDU, the static RTP payload type 
  must not be used.  The above assumes that a static RTP payload 
  type has been assigned to this RTP payload format.  

  Marker bit (M bit): This bit is set to 1 if the first PDU in the 
  payload is not a fragmented PDU and does not contain a fragmented 
  AU.  The bit is also set to 1 if the first PDU is the last of a 
  series of fragmented PDU's, or if the first PDU contains the last
  fragment of an AU.  Any other PDU's in the RTP payload, other 
  than the first, do not have any effect on the M bit.

  Timestamp: The RTP timestamp is set to the "presentation time" of 
  the first PDU contained in the payload.  PDU fragments all use 
  the presentation time of the complete PDU.  The clock frequency 
  depends on the media type, and its value should be conveyed 
  through non-RTP means, such as SDP [5].

4.2. The Common Generic RTP Payload Format Header

Figure 3 shows an RTP payload, containing all the different Common 
Generic RTP Payload Format Header fields.  In this particular 
example, the RTP payload contains two OBJECT fields.  An OBJECT field 
may contain either a PDU or Extension Data.


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |G|S|   FRAG    |                  OFFSET                       |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |E|B|         LENGTH            |                               .
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               .
 .               OBJECT (PDU or Extension Data)                  .
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |E|B|         LENGTH            |       TIMESTAMP DELTA         |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 .               OBJECT (PDU or Extension Data)                  .
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 3. Sample RTP payload, using the payload format.

The fields have the following meanings:

G (Group): 1 bit.  If this field is 1, it indicates that the E, B and 
LENGTH fields are present, directly following the optional OFFSET 
field.  Typically, the G field is set to 1 when the payload contains 
more than one OBJECT field.

S (Shift): 1 bit.  If this field is 1, and the FRAG field is not 0, 
the value given by the OFFSET field should be shifted left by 8 bits. 
If the S field is 1 and the FRAG field is 0 and the RTP Marker bit is 
0, it means that the first PDU is the first fragment in a series of 
PDU fragments.  If the S field is 1 and the FRAG field is 0 and the 
RTP Marker bit is 1, it means that the first PDU is the last fragment 
in a series of PDU fragments.

FRAG: 6 bits.  When this field is 0, and the S field is 1 or the RTP 
Marker bit is 0, the first PDU is a PDU fragment.  If this field is 
not 0, the first PDU contains a fragmented AU, and the value of the 
FRAG field is the fragment sequence number.  For each AU, the first 
fragment sequence number always starts with 1 and increments by 1 for 
each fragment.  The sequence number that follows 63 is 1.  0 is not a 
valid fragment sequence number.  When the FRAG field is not 0, the 
OFFSET field is present.
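
The way the S field, the FRAG field and the RTP Marker bit jointly 
describe the first PDU can be summarized by the following Python 
sketch.  The function name classify_first_pdu and the returned strings 
are hypothetical and serve only to illustrate the rules stated in this 
section and in section 4.1.

```python
def classify_first_pdu(s, frag, marker):
    """Describe the first PDU of a payload from the S field, the FRAG
    field and the RTP Marker bit."""
    if not 0 <= frag <= 63:
        raise ValueError("FRAG is a 6-bit field")
    if frag != 0:                        # PDU contains a fragmented AU
        kind = "last AU fragment" if marker else "AU fragment"
        return f"{kind}, sequence number {frag}"
    if s == 1:                           # either end of a PDU fragment series
        return "last PDU fragment" if marker else "first PDU fragment"
    return "complete PDU" if marker else "middle PDU fragment"
```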

Note that because the FRAG field wraps around, the largest burst of 
lost AU fragments, in which fragments that were received can be 
processed, is 62.  If more than 62 consecutive AU fragments are lost, 
other fragments belonging to the same AU must be discarded.  Lost 
packets are detected through the RTP sequence number, which means 
that correct operation requires the RTP sequence number to increment 
by 1 for each RTP packet.
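
The wrap-around of the fragment sequence number can be sketched in 
Python as follows.  The helper names are hypothetical.  The second 
function reflects one plausible reading of the rule above: it assumes 
that each RTP packet carries a single AU fragment, so that the number 
of lost fragments equals the gap in the RTP sequence numbers.

```python
def next_frag(frag):
    """Fragment sequence number that follows frag (1..63, never 0)."""
    if not 1 <= frag <= 63:
        raise ValueError("fragment sequence numbers run from 1 to 63")
    return frag % 63 + 1


def frag_after_loss(last_frag, lost):
    """Expected FRAG value of the next fragment of the same AU after
    `lost` consecutive AU fragments went missing, or None if the burst
    exceeds 62 and the remaining fragments must be discarded."""
    if lost > 62:
        return None
    frag = last_frag
    for _ in range(lost + 1):
        frag = next_frag(frag)
    return frag
```

A receiver could compare the FRAG value of the next received fragment 
with this expected value to decide whether the fragment still belongs 
to the same AU.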

OFFSET: 24 bits.  This field is only present when the FRAG field is 
not 0.  The presence of this field implies that the next PDU contains 
a fragmented AU.  The value of this field gives the byte offset of 
the start of the data in the fragmented AU, relative to the start of 
the complete AU.  The default value is 0.  If the S field is 1, the 
value of the OFFSET field should be shifted left by 8 bits.
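
As an illustration, the first header fields might be decoded as shown 
in the following Python sketch.  It assumes that bit 0 in the figures 
is the most significant bit of the first byte and that OFFSET is sent 
in network byte order, as is conventional for RTP; neither is stated 
explicitly in this document.  The function name is hypothetical.

```python
def decode_fragment_header(payload):
    """Return (g, s, frag, offset) from the start of an RTP payload.
    offset is None when FRAG is 0, because OFFSET is then absent."""
    if len(payload) < 1:
        raise ValueError("payload is empty")
    g = payload[0] >> 7
    s = (payload[0] >> 6) & 1
    frag = payload[0] & 0x3F
    if frag == 0:
        return g, s, frag, None
    if len(payload) < 4:
        raise ValueError("OFFSET field is truncated")
    offset = int.from_bytes(payload[1:4], "big")   # assumed big-endian
    if s == 1:
        offset <<= 8                               # S=1: shift left by 8 bits
    return g, s, frag, offset
```

With S=1, the 24-bit field can address offsets up to 2^32 - 256 in 
steps of 256 bytes.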

E (Extension): 1 bit.  This field is only present when the G field is 
1.  The default value is 0.  If this field is 1, the next OBJECT 
field is Extension Data.  If the E field is 0, the next OBJECT field 
is a PDU.

B (Boundary): 1 bit.  This field is only present when the G field is 
1.  The default value is 0.  If this field is 1, and the next OBJECT 
field does not fully utilize all the remaining space in the RTP 
payload, it means that an additional PDU is contained in the 
remaining space.  That PDU does not have its own E, B, LENGTH, and 
TIMESTAMP DELTA fields.  If the next PDU is a PDU fragment or if it 
contains a fragmented AU, and the next PDU is not also the first PDU 
in the RTP payload, the value of the FRAG field for the next PDU is 
the same as the value of the B field.

LENGTH: 14 bits.  This field is only present when the G field is 1.  
The default value is the number of bytes in the RTP payload, counted 
from the end of this field to the end of the RTP payload, or to the 
start of the optional RTP padding field.  If a TIMESTAMP DELTA field 
follows this field, its size is not counted.  The value of the field 
gives the length in bytes of the next OBJECT field, if there is 
sufficient space to fully contain the OBJECT field in the RTP 
payload.  If there is not sufficient space to contain an OBJECT field 
of the size given by the LENGTH field, the next OBJECT field is the 
first of a series of PDU fragments or PDU's that contain a fragmented 
AU.  In this situation, the length of the next OBJECT field is equal 
to the available space in the RTP payload.  In addition, the value of
the B field should be used as the value for the FRAG field for the 
next PDU.

TIMESTAMP DELTA: 16 bits.  This field is only present when the G 
field is 1, the next OBJECT field is a PDU and the next PDU is not 
the first PDU in the RTP payload.  The value of the field is an 
unsigned 16 bit integer.  The default value is 0.  Adding the value 
to the RTP timestamp yields the RTP timestamp for the next PDU.

OBJECT: variable size.  This field is either a PDU or Extension Data.  
A PDU may also be a PDU fragment or may contain a fragmented AU.  
Extension Data cannot be fragmented.
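
The field definitions above can be combined into a receiver-side parser 
along the lines of the following Python sketch.  It is an illustration 
under stated assumptions, not a reference implementation: bit 0 is 
taken to be the most significant bit, multi-byte fields are taken to be 
in network byte order, and any RTP padding is assumed to have been 
removed already.  The function name, the dictionary keys and the tuple 
layout are hypothetical.  When LENGTH exceeds the remaining space, the 
sketch only flags the OBJECT as truncated and does not interpret the B 
field as a FRAG value.

```python
def parse_payload(payload):
    """Split an RTP payload into (header, objects).

    header  is a dict with the G, S, FRAG and OFFSET values.
    objects is a list of (kind, ts_delta, data, truncated) tuples, where
    kind is "pdu" or "ext" and truncated is True when LENGTH exceeds the
    space left in the payload.
    """
    if not payload:
        raise ValueError("payload is empty")
    g, s, frag = payload[0] >> 7, (payload[0] >> 6) & 1, payload[0] & 0x3F
    pos, offset = 1, None
    if frag != 0:                                   # OFFSET is present
        if len(payload) < 4:
            raise ValueError("OFFSET field is truncated")
        offset = int.from_bytes(payload[1:4], "big") << (8 if s else 0)
        pos = 4
    header = {"G": g, "S": s, "FRAG": frag, "OFFSET": offset}
    if g == 0:                                      # single PDU, no E/B/LENGTH
        return header, [("pdu", 0, payload[pos:], False)]

    objects, seen_pdu = [], False
    while pos < len(payload):
        if len(payload) - pos < 2:
            raise ValueError("E/B/LENGTH fields are truncated")
        word = int.from_bytes(payload[pos:pos + 2], "big")
        e, b, length = word >> 15, (word >> 14) & 1, word & 0x3FFF
        pos += 2
        ts_delta = 0
        if e == 0 and seen_pdu:                     # PDU, but not the first one
            if len(payload) - pos < 2:
                raise ValueError("TIMESTAMP DELTA field is truncated")
            ts_delta = int.from_bytes(payload[pos:pos + 2], "big")
            pos += 2
        data = payload[pos:pos + length]
        truncated = length > len(data)
        pos += len(data)
        objects.append(("ext" if e else "pdu", ts_delta, data, truncated))
        seen_pdu = seen_pdu or e == 0
        if b == 1 and pos < len(payload):           # trailing PDU, no fields
            objects.append(("pdu", 0, payload[pos:], False))
            pos = len(payload)
    return header, objects
```

Because every OBJECT is preceded by its LENGTH, a receiver that does 
not understand an Extension Data object can skip it and still reach the 
PDU's that follow.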

4.3. Examples

This section has a few examples that illustrate possible ways of 
using the payload format.


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |G|S|   FRAG    |                    PDU                        .
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 4. Simplest case: G=0, S=0, FRAG=0.
 
Figure 4 shows the simplest and perhaps most common situation.  The 
RTP packet contains a single PDU, and the PDU is not fragmented, and 
it contains an entire AU.  There is no Extension Data.  The Common 
Generic RTP Payload Format header is one byte.  All the fields in the 
byte are 0, making further header compression simple.
 
The RTP payload also looks like Figure 4 when the PDU is a fragmented 
PDU.  The RTP M bit and the S field are used to detect the first and 
last fragment.  The first fragment has S=1, M=0.  The last fragment 
has S=1, M=1.  All other fragments have S=0, M=0.
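
A transmitter could produce these S and M values as shown in the 
following Python sketch.  The function name fragment_pdu and the 
max_chunk parameter are hypothetical, and the sketch assumes that bit 0 
in Figure 4 is the most significant bit of the header byte, so that S=1 
corresponds to the value 0x40.

```python
def fragment_pdu(pdu, max_chunk):
    """Split a PDU into (header_byte, marker_bit, chunk) triples with
    G=0 and FRAG=0, so the header byte carries only the S field."""
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")
    if len(pdu) <= max_chunk:
        return [(0x00, 1, pdu)]          # unfragmented: S=0, M=1
    chunks = [pdu[i:i + max_chunk] for i in range(0, len(pdu), max_chunk)]
    result = []
    for i, chunk in enumerate(chunks):
        first, last = i == 0, i == len(chunks) - 1
        s = 1 if first or last else 0    # S marks both ends of the series
        m = 1 if last else 0             # M tells the last from the first
        result.append((s << 6, m, chunk))
    return result
```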
 
 
  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |G|S|   FRAG    |E|B|          LENGTH           |               .
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               .
  .               OBJECT (PDU or Extension Data)                  .
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  .                              PDU                              .
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 
Figure 5. Grouping example.  G=1, FRAG=0, B=1.


Figure 5 shows two OBJECT fields in the same RTP payload.  The first 
OBJECT field may contain either a PDU or Extension Data, depending on 
the value of the E field.  The B field is 1, which means that the E, 
B, LENGTH and TIMESTAMP DELTA fields for the second OBJECT field are 
not present.  Because the E field defaults to 0, the second OBJECT 
field must contain a PDU.
 
If the first OBJECT field contains a PDU, it might be a PDU fragment.  
However, it cannot be a PDU that contains a fragmented AU, because 
the OFFSET field is not present.  In this example, the last PDU must 
contain an entire AU.  Hence, if the first OBJECT field contains a 
PDU fragment, it must be the last fragment in a series.  The S field 
and the RTP Marker bit will both be 1 in that case.
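
For the case where both OBJECT fields of Figure 5 hold complete PDU's, 
the payload could be built as in the following Python sketch.  The 
function name build_group_payload is hypothetical, and the sketch 
assumes most-significant-bit-first field layout and network byte order 
for LENGTH.

```python
def build_group_payload(first_pdu, second_pdu):
    """Build a Figure 5 style payload holding two complete PDU's."""
    if not 0 < len(first_pdu) <= 0x3FFF:
        raise ValueError("first PDU must fit the 14-bit LENGTH field")
    if not second_pdu:
        raise ValueError("second PDU must not be empty")
    first_byte = 0x80                    # G=1, S=0, FRAG=0
    e, b = 0, 1                          # a PDU follows, then one more PDU
    fields = (e << 15) | (b << 14) | len(first_pdu)
    return (bytes([first_byte]) + fields.to_bytes(2, "big")
            + first_pdu + second_pdu)
```

The grouping overhead in this case is two bytes beyond the one-byte 
header, and both PDU's share the RTP timestamp of the packet.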
 
 
  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |G|S|   FRAG    |                  OFFSET                       |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  .                              PDU                              .
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 
Figure 6. Fragmentation example.  G=0, FRAG>0.
 
Figure 6 shows an RTP payload with a single PDU that contains a 
fragmented AU.  The OFFSET field and the S field together identify 
the starting point of this fragment in the complete AU.
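
A transmitter might choose the S field and the OFFSET value as in the 
following Python sketch.  The function name build_au_fragment is 
hypothetical.  The sketch assumes network byte order for OFFSET and 
uses S=1 only when the offset does not fit in 24 bits; this policy is 
an assumption, since the document does not say when S should be set.

```python
def build_au_fragment(frag, offset, data):
    """Build a Figure 6 style payload: G=0, FRAG>0, OFFSET, then the PDU."""
    if not 1 <= frag <= 63:
        raise ValueError("FRAG must be between 1 and 63")
    if offset < 0:
        raise ValueError("offset must not be negative")
    if offset < 1 << 24:
        s, raw = 0, offset               # fits in 24 bits as is
    elif offset % 256 == 0 and offset >> 8 < 1 << 24:
        s, raw = 1, offset >> 8          # receiver shifts left by 8 bits
    else:
        raise ValueError("offset cannot be represented")
    return bytes([(s << 6) | frag]) + raw.to_bytes(3, "big") + data
```

Offsets of 2^24 bytes or more can only be expressed when they are a 
multiple of 256, which constrains where such fragments may start.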
 
 
  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |G|S|   FRAG    |                  OFFSET                       |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |E|B|         LENGTH            |                               .
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               .
  .               OBJECT (PDU or Extension Data)                  .
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |E|B|         LENGTH            |                               .
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               .
  .                              PDU                              .
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |E|B|         LENGTH            |       TIMESTAMP DELTA         |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  .                              PDU                              .
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 
Figure 7. Multiple PDU's.  G=1, FRAG>0, B1=0, E2=0, B2=0, E3=0.


Figure 7 shows how an RTP payload may look if one wants all RTP 
packets to be of the same size.  In the example, there are three 
OBJECT fields.  The last two contain PDU's, while the first OBJECT 
field may contain a PDU or Extension Data, depending on the value of 
the E1 field.  The FRAG field is not 0, and the OFFSET field is 
present.  This means that the first PDU contains a fragmented AU.  
Which OBJECT field contains the first PDU depends on the value of the 
E1 field.

If the last PDU is smaller than the last LENGTH field, and the value 
of B3 is 0, the last PDU is the first in a series of PDU fragments.  
If the last PDU is smaller than the last LENGTH field, and the value 
of B3 is 1, the last PDU is the first in a series of PDU's that 
contain fragments of an AU.  Otherwise, the value of B3 should be 0.

The RTP timestamp is applicable to all PDU's except for the last one.  
The value of the TIMESTAMP DELTA field must be added to the RTP 
timestamp to obtain the RTP timestamp value for the last PDU. 
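
The timestamp computation can be written as the following Python 
sketch.  The function name pdu_timestamp is hypothetical.  The result 
is reduced modulo 2^32 because the RTP timestamp is a 32-bit field; the 
document itself does not discuss wrap-around.

```python
def pdu_timestamp(rtp_timestamp, timestamp_delta=0):
    """RTP timestamp of a PDU that carries the given TIMESTAMP DELTA.
    A PDU without a TIMESTAMP DELTA field uses the default of 0."""
    if not 0 <= timestamp_delta <= 0xFFFF:
        raise ValueError("TIMESTAMP DELTA is an unsigned 16-bit value")
    return (rtp_timestamp + timestamp_delta) & 0xFFFFFFFF
```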

5. Author's Address

Anders Klemets
1 Microsoft Way
Redmond, WA 98052-6399
USA

E-mail: anderskl@microsoft.com


References

[1] Microsoft Corporation, "Advanced Streaming Format (ASF) 
    Specification", http://www.microsoft.com/asf, February 1998.

[2] H. Schulzrinne, et al., "RTP: A Transport Protocol for Real-
    Time Applications", IETF RFC 1889, January 1996.

[3] H. Schulzrinne, et al., "RTP Profile for Audio and Video 
    Conferences with Minimal Control", IETF RFC 1890, January 1996.

[4] E. Fleischman, "WAVE and AVI Codec Registries", work in progress.

[5] M. Handley, "SDP: Session Description Protocol", work in 
    progress.

[6] Apple Computer, Inc., "QuickTime File Format Specification", May 
    1996.

[7] ISO/IEC 14496-1 CD, "MPEG-4 Systems", October 1997.


[8] Internet Engineering Task Force, "Proceedings of the 40th IETF", 
    December 1997.

[9] A. Jones, et al., "RTP Payload Format for QuickTime Media 
    Streams", work in progress.

[10] A. Klemets, "RTP Payload Format for ASF Streams", work in 
     progress.

[11] C. Herpel, et al., "RTP payload format for MPEG-4 Elementary 
     Streams", work in progress.

