RTP Parameters and Capabilities

RTP parameters describe the media that a producer endpoint sends to mediasoup (RTP send parameters) or the media that mediasoup forwards to a consumer endpoint (RTP receive parameters).

RTP capabilities, by contrast, define what mediasoup or a consumer endpoint can receive. RTP parameters therefore depend on (or are constrained by) the remote RTP capabilities.


RTP Negotiation Overview

When a mediasoup Router is created, it is given a set of RtpCodecCapability entries that define the audio and video codecs enabled in that router. The application then retrieves the computed router.rtpCapabilities (which include the router codecs enhanced with retransmission and RTCP capabilities, and the list of RTP header extensions supported by mediasoup) and provides the endpoints with those RTP capabilities.

The endpoint wishing to send media to mediasoup uses the router's RTP capabilities and its own to compute its sending RTP parameters and transmits them to the router (assuming it has already created a transport to send media). The application then creates a Producer instance in the router by using the transport.produce() API.

When an endpoint wishes to receive media associated with a specific producer from the router (assuming it has already created a transport to receive media), the application takes the endpoint's RTP capabilities and calls the transport.consume() API with those capabilities and the producerId to be consumed. This generates a Consumer instance whose RTP receive parameters are calculated by merging the producer's RTP parameters with the endpoint's RTP capabilities. The application can then signal the resulting consumer.rtpParameters (along with other information) to the endpoint.
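The server-side half of this flow can be sketched as follows. This is a minimal illustration under stated assumptions, not a complete signaling layer: RouterLike and TransportLike are local stand-ins that only mirror the shape of the mediasoup v3 calls used here (router.canConsume(), transport.produce() and transport.consume()), and the function names handleProduce and handleConsume are hypothetical.

```typescript
// Local stand-ins for the parts of the mediasoup v3 API used in this sketch.
// In a real application these objects come from the mediasoup package.
type MediaKind = 'audio' | 'video';
type RtpParameters = { codecs: object[]; encodings: object[] };
type RtpCapabilities = { codecs: object[] };

interface RouterLike {
  rtpCapabilities: RtpCapabilities;
  canConsume(options: { producerId: string; rtpCapabilities: RtpCapabilities }): boolean;
}

interface TransportLike {
  produce(options: { kind: MediaKind; rtpParameters: RtpParameters }): Promise<{ id: string }>;
  consume(options: { producerId: string; rtpCapabilities: RtpCapabilities }):
    Promise<{ id: string; rtpParameters: RtpParameters }>;
}

// The sending endpoint computed its RTP send parameters from
// router.rtpCapabilities and its own capabilities, then signaled them here.
async function handleProduce(
  sendTransport: TransportLike,
  kind: MediaKind,
  rtpParameters: RtpParameters
): Promise<string> {
  const producer = await sendTransport.produce({ kind, rtpParameters });
  return producer.id; // signaled back so that other endpoints can consume it
}

// The receiving endpoint signaled its RTP capabilities and the producerId.
async function handleConsume(
  router: RouterLike,
  recvTransport: TransportLike,
  producerId: string,
  endpointRtpCapabilities: RtpCapabilities
): Promise<{ id: string; producerId: string; rtpParameters: RtpParameters }> {
  if (!router.canConsume({ producerId, rtpCapabilities: endpointRtpCapabilities })) {
    throw new Error(`endpoint cannot consume producer ${producerId}`);
  }
  const consumer = await recvTransport.consume({
    producerId,
    rtpCapabilities: endpointRtpCapabilities,
  });
  // consumer.rtpParameters are the RTP receive parameters for the endpoint.
  return { id: consumer.id, producerId, rtpParameters: consumer.rtpParameters };
}
```

The object returned by handleConsume is what the application would signal to the receiving endpoint, which needs the consumer id, the producerId and the RTP receive parameters to set up its side.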

mediasoup is flexible in what it receives from endpoints, meaning that the producer's RTP parameters can have codec payloadType values and RTP header extension id values that differ from the preferred ones in the router's RTP capabilities. However, the producer's RTP parameters MUST NOT include codecs not present in the router's capabilities.

mediasoup is strict in what it sends to endpoints, meaning that the codec preferredPayloadType values and RTP header extension preferredId values in the endpoint's RTP capabilities MUST match those in the router's RTP capabilities. mediasoup then builds the RTP receive parameters based on the RTP parameters of the producer being consumed and the endpoint's RTP capabilities.

In the end, the rule is simple:

  • The entity sending RTP (mediasoup or an endpoint) decides the sending ids.
  • The entity receiving RTP (mediasoup or an endpoint) must honor those ids.
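The "flexible in what it receives" side of this rule can be illustrated with a small check. This is a deliberately simplified sketch, not mediasoup's own validation: the helper name codecsMissingFromRouter is hypothetical, and it compares only mimeType and clockRate, ignoring channels and codec parameters such as "profile-level-id".

```typescript
type CodecCapability = { mimeType: string; clockRate: number; preferredPayloadType: number };
type CodecParameters = { mimeType: string; clockRate: number; payloadType: number };

// Returns the MIME types of producer codecs that the router does not list.
// Payload types are intentionally not compared: the sender decides its ids.
function codecsMissingFromRouter(
  producerCodecs: CodecParameters[],
  routerCodecs: CodecCapability[]
): string[] {
  return producerCodecs
    .filter((codec) => !routerCodecs.some((capability) =>
      capability.mimeType.toLowerCase() === codec.mimeType.toLowerCase() &&
      capability.clockRate === codec.clockRate))
    .map((codec) => codec.mimeType);
}

const routerCodecs: CodecCapability[] = [
  { mimeType: 'audio/opus', clockRate: 48000, preferredPayloadType: 100 },
  { mimeType: 'video/VP8', clockRate: 90000, preferredPayloadType: 101 },
];

// payloadType 111 differs from the router's preferred 100, which is fine.
const accepted = codecsMissingFromRouter(
  [{ mimeType: 'audio/opus', clockRate: 48000, payloadType: 111 }],
  routerCodecs
);

// video/H265 is not enabled in this router, so it is reported as missing.
const rejected = codecsMissingFromRouter(
  [{ mimeType: 'video/H265', clockRate: 90000, payloadType: 96 }],
  routerCodecs
);
```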

Dictionaries

RtpParameters

There are two types of RTP parameters (RtpSendParameters and RtpReceiveParameters), both sharing the following definition:

| Field | Type | Description | Required | Default |
|---|---|---|---|---|
| mid | String | The MID RTP extension value as defined in the BUNDLE specification. | No | |
| codecs | Array<RtpCodecParameters> | Media and RTX codecs in use. | Yes | |
| headerExtensions | Array<RtpHeaderExtensionParameters> | RTP header extensions in use. | No | [ ] |
| encodings | Array<RtpEncodingParameters> | Transmitted RTP streams and their settings. | Yes | |
| rtcp | RtcpParameters | Parameters used for RTCP. | No | |
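As an illustration, a set of RTP send parameters for a single VP8 video stream with RTX might look like the sketch below. The type aliases are simplified local versions of the dictionaries described in this document, and every concrete value (payload types, SSRCs, cname, mid) is an arbitrary example rather than something mediasoup requires.

```typescript
type RtcpFeedback = { type: string; parameter?: string };
type RtpCodecParameters = {
  mimeType: string;
  payloadType: number;
  clockRate: number;
  channels?: number;
  parameters?: Record<string, string | number>;
  rtcpFeedback?: RtcpFeedback[];
};
type RtpHeaderExtensionParameters = { uri: string; id: number; encrypt?: boolean };
type RtpEncodingParameters = {
  ssrc?: number;
  rid?: string;
  rtx?: { ssrc: number };
  dtx?: boolean;
  scalabilityMode?: string;
};
type RtcpParameters = { cname?: string; reducedSize?: boolean };
type RtpParameters = {
  mid?: string;
  codecs: RtpCodecParameters[];
  headerExtensions?: RtpHeaderExtensionParameters[];
  encodings: RtpEncodingParameters[];
  rtcp?: RtcpParameters;
};

const videoSendParameters: RtpParameters = {
  mid: '1',
  codecs: [
    {
      mimeType: 'video/VP8',
      payloadType: 96,
      clockRate: 90000,
      rtcpFeedback: [{ type: 'nack' }, { type: 'nack', parameter: 'pli' }],
    },
    // RTX codec associated with payload type 96 through the "apt" parameter.
    { mimeType: 'video/rtx', payloadType: 97, clockRate: 90000, parameters: { apt: 96 } },
  ],
  headerExtensions: [{ uri: 'urn:ietf:params:rtp-hdrext:sdes:mid', id: 1 }],
  encodings: [{ ssrc: 11111111, rtx: { ssrc: 11111112 } }],
  rtcp: { cname: 'example-cname' },
};
```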

RtpSendParameters

@inherits RtpParameters

The RTP send parameters describe a media stream received by mediasoup from an endpoint through its corresponding mediasoup Producer.

  • These parameters may include a mid value that the mediasoup transport will use to match received RTP packets based on their MID RTP extension value.
  • mediasoup allows RTP send parameters with a single encoding and with multiple encodings (simulcast). In the latter case, each entry in the encodings array must include an ssrc field or a rid field (the RID RTP extension value).

Check the Simulcast and SVC sections for more information.

RtpReceiveParameters

@inherits RtpParameters

The RTP receive parameters describe a media stream as sent by mediasoup to an endpoint through its corresponding mediasoup Consumer.

  • The mid value is unset (mediasoup does not include the MID RTP extension in RTP packets sent to endpoints).
  • There is a single entry in the encodings array (even if the corresponding producer uses simulcast). The consumer sends a single and continuous RTP stream to the endpoint, and spatial/temporal layer selection is possible via consumer.setPreferredLayers().
  • As an exception, the previous bullet does not apply when consuming a stream over a PipeTransport, in which case all RTP streams from the associated producer are forwarded verbatim through the consumer.
  • The RTP receive parameters will always have their ssrc values randomly generated for all of its encodings (and optional rtx: { ssrc: XXXX } if the endpoint supports RTX), regardless of the original RTP send parameters in the associated producer. This applies even if the producer's encodings have rid set.

RtpCapabilities

The RTP capabilities define what mediasoup or an endpoint can receive at media level.

| Field | Type | Description | Required | Default |
|---|---|---|---|---|
| codecs | Array<RtpCodecCapability> | Supported media and RTX codecs. | Yes | |
| headerExtensions | Array<RtpHeaderExtension> | Supported RTP header extensions. | No | [ ] |

RtpCodecParameters

Provides information on codec settings within the RTP parameters. The list of media codecs supported by mediasoup and their settings is defined in the supportedRtpCapabilities.ts file.

| Field | Type | Description | Required | Default |
|---|---|---|---|---|
| mimeType | String | The codec MIME media type/subtype (e.g. “audio/opus”, “video/VP8”). | Yes | |
| payloadType | Number | The value that goes in the RTP Payload Type Field. Must be unique. | Yes | |
| clockRate | Number | Codec clock rate expressed in Hertz. | Yes | |
| channels | Number | The number of channels supported (e.g. two for stereo). Just for audio. | No | 1 |
| parameters | Object | Codec-specific parameters available for signaling. Some parameters (such as “packetization-mode” and “profile-level-id” in H264 or “profile-id” in VP9) are critical for codec matching. | No | |
| rtcpFeedback | Array<RtcpFeedback> | Transport layer and codec-specific feedback messages for this codec. | No | [ ] |

See the Codec Parameters section below for more info about the codec parameters.
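The uniqueness requirement on payloadType can be checked before handing RTP parameters to mediasoup. The helper below is a hypothetical sketch (duplicatePayloadTypes is not part of the mediasoup API) that reports any payload type used by more than one codec entry.

```typescript
type RtpCodecParameters = { mimeType: string; payloadType: number; clockRate: number };

// Returns every payloadType value that appears in more than one codec entry.
function duplicatePayloadTypes(codecs: RtpCodecParameters[]): number[] {
  const seen = new Set<number>();
  const duplicates = new Set<number>();
  for (const codec of codecs) {
    if (seen.has(codec.payloadType)) {
      duplicates.add(codec.payloadType);
    }
    seen.add(codec.payloadType);
  }
  return Array.from(duplicates);
}

const validCodecs: RtpCodecParameters[] = [
  { mimeType: 'video/VP8', payloadType: 96, clockRate: 90000 },
  { mimeType: 'video/rtx', payloadType: 97, clockRate: 90000 },
];

const clashingCodecs: RtpCodecParameters[] = [
  { mimeType: 'video/VP8', payloadType: 96, clockRate: 90000 },
  { mimeType: 'video/rtx', payloadType: 96, clockRate: 90000 },
];
```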

RtcpFeedback

Provides information on RTCP feedback messages for a specific codec. Those messages can be transport layer feedback messages or codec-specific feedback messages. The list of RTCP feedbacks supported by mediasoup is defined in the supportedRtpCapabilities.ts file.

| Field | Type | Description | Required | Default |
|---|---|---|---|---|
| type | String | RTCP feedback type. | Yes | |
| parameter | String | RTCP feedback parameter. | No | |

RtpEncodingParameters

Provides information relating to an encoding, which represents a media RTP stream and its associated RTX stream (if any).

| Field | Type | Description | Required | Default |
|---|---|---|---|---|
| ssrc | Number | The media SSRC. | No | |
| rid | String | The RID RTP extension value. Must be unique. | No | |
| rtx | Object | RTX stream information. It must contain a numeric ssrc field indicating the RTX SSRC. | No | |
| dtx | Boolean | Indicates whether discontinuous RTP transmission will be used. Useful for audio (if the codec supports it) and for video screen sharing (when static content is being transmitted, this option disables the RTP inactivity checks in mediasoup). | No | false |
| scalabilityMode | String | Number of spatial and temporal layers in the RTP stream (e.g. “L1T3”). See webrtc-svc. | No | |

Check the Simulcast and SVC sections for more information.

RtpHeaderExtensionParameters

Defines an RTP header extension within the RTP parameters. The list of RTP header extensions supported by mediasoup is defined in the supportedRtpCapabilities.ts file.

| Field | Type | Description | Required | Default |
|---|---|---|---|---|
| uri | String | The URI of the RTP header extension, as defined in RFC 5285. | Yes | |
| id | Number | The numeric identifier that goes in the RTP packet. Must be unique. | Yes | |
| encrypt | Boolean | If true, the value in the header is encrypted as per RFC 6904. | No | false |
| parameters | Object | Configuration parameters for the header extension. | No | |

  • mediasoup does not currently support encrypted RTP header extensions.
  • No parameters are currently considered.

RtcpParameters

Provides information on RTCP settings within the RTP parameters.

| Field | Type | Description | Required | Default |
|---|---|---|---|---|
| cname | String | The Canonical Name (CNAME) used by RTCP (e.g. in SDES messages). | No | |
| reducedSize | Boolean | Whether reduced-size RTCP (RFC 5506) is configured (if true) or compound RTCP as specified in RFC 3550 (if false). | No | true |

  • If no cname is given in a producer's RTP parameters, the mediasoup transport will choose a random one that will be used in the RTCP SDES messages sent to all its associated consumers.
  • mediasoup assumes reducedSize to always be true.

RtpCodecCapability

Provides information on the capabilities of a codec within the RTP capabilities. The list of media codecs supported by mediasoup and their settings is defined in the supportedRtpCapabilities.ts file.

Exactly one RtpCodecCapability will be present for each supported combination of parameters that requires a distinct value of preferredPayloadType. For example:

  • Multiple H264 codecs, each with their own distinct “packetization-mode” and “profile-level-id” values.
  • Multiple VP9 codecs, each with their own distinct “profile-id” value.

| Field | Type | Description | Required | Default |
|---|---|---|---|---|
| kind | MediaKind | Media kind (“audio” or “video”). | Yes | |
| mimeType | String | The codec MIME media type/subtype (e.g. “audio/opus”, “video/VP8”). | Yes | |
| preferredPayloadType | Number | The preferred RTP payload type. | Yes | |
| clockRate | Number | Codec clock rate expressed in Hertz. | Yes | |
| channels | Number | The number of channels supported (e.g. two for stereo). Just for audio. | No | 1 |
| parameters | Object | Codec specific parameters. Some parameters (such as “packetization-mode” and “profile-level-id” in H264 or “profile-id” in VP9) are critical for codec matching. | No | |

RtpCodecCapability entries in the mediaCodecs array of RouterOptions do not require the preferredPayloadType field (if unset, mediasoup will choose a random one). If given, make sure it is in the 96-127 range.
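A mediaCodecs array for RouterOptions could look like the sketch below. The codec selection and the H264 parameter values are arbitrary examples, the RtpCodecCapability type is a simplified local version of the dictionary above, and outOfRangePayloadTypes is a hypothetical helper that checks the 96-127 rule for entries that set preferredPayloadType.

```typescript
type MediaKind = 'audio' | 'video';
type RtpCodecCapability = {
  kind: MediaKind;
  mimeType: string;
  preferredPayloadType?: number; // optional in RouterOptions.mediaCodecs
  clockRate: number;
  channels?: number;
  parameters?: Record<string, string | number>;
};

const mediaCodecs: RtpCodecCapability[] = [
  { kind: 'audio', mimeType: 'audio/opus', clockRate: 48000, channels: 2 },
  { kind: 'video', mimeType: 'video/VP8', clockRate: 90000 },
  {
    kind: 'video',
    mimeType: 'video/H264',
    preferredPayloadType: 102,
    clockRate: 90000,
    parameters: {
      'packetization-mode': 1,
      'profile-level-id': '42e01f',
      'level-asymmetry-allowed': 1,
    },
  },
];

// Returns the entries whose explicit preferredPayloadType is outside 96-127.
function outOfRangePayloadTypes(codecs: RtpCodecCapability[]): RtpCodecCapability[] {
  return codecs.filter((codec) =>
    codec.preferredPayloadType !== undefined &&
    (codec.preferredPayloadType < 96 || codec.preferredPayloadType > 127));
}
```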

RtpHeaderExtension

Provides information relating to supported header extensions. The list of RTP header extensions supported by mediasoup is defined in the supportedRtpCapabilities.ts file.

| Field | Type | Description | Required | Default |
|---|---|---|---|---|
| kind | MediaKind | Media kind (“audio” or “video”). If unset, it's valid for all kinds. | No | |
| uri | String | The URI of the RTP header extension, as defined in RFC 5285. | Yes | |
| preferredId | Number | The preferred numeric identifier that goes in the RTP packet. Must be unique. | Yes | |
| preferredEncrypt | Boolean | If true, it is preferred that the value in the header be encrypted as per RFC 6904. | No | false |
| direction | String | If “sendrecv”, mediasoup supports sending and receiving this RTP extension. “sendonly” means that mediasoup can send (but not receive) it. “recvonly” means that mediasoup can receive (but not send) it. | No | |

Enums

MediaKind

| Value | Description |
|---|---|
| “audio” | Audio media kind. |
| “video” | Video media kind. |

Codec Parameters

When a producer includes codec parameters in its RTP send parameters, those parameters are passed verbatim to the RTP receive parameters of the consumers associated with that producer.

Some of those parameters are part of the codec settings and are used for codec matching. Some other codec parameters affect the operation of mediasoup for the corresponding producer and consumers.

Parameters for Codec Matching

These parameters are part of the codec settings, meaning that their values determine whether an entry in rtpParameters.codecs matches an entry in the remote rtpCapabilities.codecs. These parameters are codec-specific:

H264

H264 codec matching rules are complex and involve inspection of the following parameters (see RFC 6184 for more details):

| Parameter | Type | Description | Required | Default |
|---|---|---|---|---|
| “packetization-mode” | Number | 0 means that the single NAL mode must be used. 1 means that the non-interleaved mode must be used. | No | 0 |
| “profile-level-id” | String | Indicates the default sub-profile and the default level of the stream. | Yes | |
| “level-asymmetry-allowed” | Number | Indicates whether level asymmetry is allowed. | No | 0 |

mediasoup uses the h264-profile-level-id JavaScript library to evaluate those parameters and perform proper H264 codec matching.
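To give a feel for what is being inspected, the sketch below splits a “profile-level-id” string into the three bytes defined in RFC 6184 (profile_idc, profile-iop and level_idc). It is only a parsing illustration with a hypothetical function name; it does not reproduce the matching rules of the h264-profile-level-id library.

```typescript
type ProfileLevelIdFields = { profileIdc: number; profileIop: number; levelIdc: number };

// Splits the 6 hex digit "profile-level-id" value into its three bytes.
// Returns undefined if the value is not exactly 6 hexadecimal digits.
function parseProfileLevelId(value: string): ProfileLevelIdFields | undefined {
  if (!/^[0-9a-fA-F]{6}$/.test(value)) {
    return undefined;
  }
  const bits = parseInt(value, 16);
  return {
    profileIdc: (bits >> 16) & 0xff, // e.g. 0x42 (66) is the Baseline profile
    profileIop: (bits >> 8) & 0xff,  // constraint set flags
    levelIdc: bits & 0xff,           // e.g. 0x1f (31) means level 3.1
  };
}

const fields = parseProfileLevelId('42e01f');
```

For the value '42e01f' this yields profileIdc 66, profileIop 224 and levelIdc 31.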

Depending on the negotiated H264 “packetization-mode” and “profile-level-id”, Chrome may use the OpenH264 software encoder or an external H264 hardware encoder. In the latter case, Chrome will NOT generate simulcast but a single stream.

See the reported issue for more information.

VP9

| Parameter | Type | Description | Required | Default |
|---|---|---|---|---|
| “profile-id” | Number | VP9 coding profile (more info). Supported values are 0 and 2. | No | 0 |

Parameters Affecting mediasoup Operation

These parameters influence the mediasoup operation by enabling or disabling some features. These parameters are codec-specific:

OPUS

| Parameter | Type | Description | Required | Default |
|---|---|---|---|---|
| “useinbandfec” | Number | If 1, mediasoup will take the worst packet fraction lost from the RTCP Receiver Reports received from the consuming endpoints and use it in the Receiver Report that mediasoup sends to the OPUS producer endpoint. This will force the producer to generate more in-band FEC in the OPUS packets to accommodate the worst receiver. | No | 0 |
| “usedtx” | Number | If 1, mediasoup will not consider the stream as inactive when there is no RTP traffic. Same behavior is achieved by indicating dtx: true in the corresponding encoding in the RTP send parameters. | No | 0 |
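The “useinbandfec” behavior amounts to picking the worst loss figure among all consumers. The sketch below illustrates that idea only; it is not mediasoup's worker code, the function name is hypothetical, and it assumes loss is expressed as the 8-bit "fraction lost" value (0 to 255) carried in RTCP Receiver Reports. The codec entry shows how both parameters would appear in an OPUS codec within the RTP send parameters.

```typescript
type OpusCodecParameters = {
  mimeType: string;
  payloadType: number;
  clockRate: number;
  channels: number;
  parameters: { useinbandfec?: number; usedtx?: number };
};

// Example OPUS codec entry enabling both features described above.
const opusCodec: OpusCodecParameters = {
  mimeType: 'audio/opus',
  payloadType: 111,
  clockRate: 48000,
  channels: 2,
  parameters: { useinbandfec: 1, usedtx: 1 },
};

// Picks the highest "fraction lost" value reported by the consuming
// endpoints, or 0 when there are no reports yet.
function worstFractionLost(consumerFractionLost: number[]): number {
  return consumerFractionLost.reduce((worst, value) => Math.max(worst, value), 0);
}
```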

Simulcast

Simulcast involves sending N separate video RTP streams (so N different SSRCs) representing N different qualities of the same video source. If RTX is used, there would also be N additional RTP RTX streams, one for each media RTP stream. Each media RTP stream may also contain M temporal layers.

Currently mediasoup supports simulcast (optionally with M temporal layers) for VP8 and H264 codecs.

When creating a simulcast producer, the associated rtpParameters given to transport.produce() must conform to the following rules:

  • There must be N > 1 entries in the encodings array.
  • Each encoding must include an ssrc field or a rid field (the RID RTP extension value) to help the mediasoup producer identify which RTP stream each packet belongs to.
  • Each encoding represents a “spatial layer”. Entries in encodings must be ordered from lowest to highest resolution (encodings[0] means “spatial layer 0” while encodings[N-1] means “spatial layer N-1”, where N is the number of simulcast streams).
  • If the streams have M temporal layers, those must be signaled in each encoding within the scalabilityMode field:
    • Since each stream has a single spatial layer, S must be 1.
    • If there are no temporal layers, the scalabilityMode field can be omitted (it defaults to “S1T1”, that is, one spatial layer and one temporal layer).

Regarding the scalabilityMode syntax, mediasoup uses S for independent spatial layers (simulcast) and L for dependent spatial layers (SVC).
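The syntax can be made concrete with a small parser. This is an illustrative sketch with hypothetical names, not mediasoup's own parsing code; it accepts the S/L prefix, the spatial and temporal layer counts, and the optional _KEY suffix used for K-SVC elsewhere in this document, and falls back to “S1T1” when the value is missing or malformed.

```typescript
type ParsedScalabilityMode = {
  layerType: 'S' | 'L'; // S: independent layers (simulcast), L: dependent (SVC)
  spatialLayers: number;
  temporalLayers: number;
  ksvc: boolean;
};

function parseScalabilityMode(scalabilityMode?: string): ParsedScalabilityMode {
  const match = /^([LS])([1-9]\d?)T([1-9]\d?)(_KEY)?$/.exec(scalabilityMode ?? '');
  if (!match) {
    // Default described above: one spatial layer and one temporal layer.
    return { layerType: 'S', spatialLayers: 1, temporalLayers: 1, ksvc: false };
  }
  return {
    layerType: match[1] === 'L' ? 'L' : 'S',
    spatialLayers: Number(match[2]),
    temporalLayers: Number(match[3]),
    ksvc: Boolean(match[4]),
  };
}

const simulcastStream = parseScalabilityMode('S1T3');
const ksvcStream = parseScalabilityMode('L3T3_KEY');
const defaulted = parseScalabilityMode(undefined);
```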

Simulcast consumers will just get a single stream and hence a single entry in their rtpParameters.encodings array. Such an encoding entry has a scalabilityMode value that determines the number of spatial layers (number of simulcast streams in the producer) and the number of temporal layers.

To clarify, if the producer uses simulcast with 3 streams (3 SSRCs), mediasoup will forward a single and continuous stream (1 SSRC) to the consumer.

The encoding entry in rtpParameters.encodings of the consumer contains a scalabilityMode field whose S value (number of independent spatial layers) matches the number of streams in the producer, and whose T value (number of temporal layers) matches the number of temporal layers in each stream in the producer.
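The relationship between the producer's encodings and the consumer's scalabilityMode can be sketched as a small function. The names are hypothetical and this is not how mediasoup computes the value internally; it only restates the rule above, assuming all producer streams declare the same number of temporal layers.

```typescript
type Encoding = { ssrc?: number; rid?: string; scalabilityMode?: string };

// Extracts the temporal layer count from a scalabilityMode such as 'S1T3'.
// A missing or malformed value counts as one temporal layer.
function temporalLayersOf(encoding: Encoding): number {
  const match = /^[LS][1-9]\d?T([1-9]\d?)/.exec(encoding.scalabilityMode ?? '');
  return match ? Number(match[1]) : 1;
}

// Derives the scalabilityMode seen by a consumer of a simulcast producer:
// S = number of producer streams, T = temporal layers per stream.
function simulcastConsumerScalabilityMode(producerEncodings: Encoding[]): string {
  if (producerEncodings.length < 2) {
    throw new Error('simulcast requires N > 1 encodings');
  }
  const temporalLayers = Math.max(...producerEncodings.map(temporalLayersOf));
  return `S${producerEncodings.length}T${temporalLayers}`;
}

const threeStreams = simulcastConsumerScalabilityMode([
  { ssrc: 111110 },
  { ssrc: 111111 },
  { ssrc: 111112 },
]);

const fourStreams = simulcastConsumerScalabilityMode([
  { rid: 'r0', scalabilityMode: 'S1T3' },
  { rid: 'r1', scalabilityMode: 'S1T3' },
  { rid: 'r2', scalabilityMode: 'S1T3' },
  { rid: 'r3', scalabilityMode: 'S1T3' },
]);
```

Here threeStreams is 'S3T1' and fourStreams is 'S4T3'.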

Examples

The following examples just show the rtpParameters.encodings field and, for simplicity, do not include RTX information.

Simulcast with 3 streams using SSRCs

  • Producer:
encodings :
[
  { ssrc: 111110 },
  { ssrc: 111111 },
  { ssrc: 111112 }
]
  • Consumer:
encodings :
[
  { ssrc: 222220, scalabilityMode: 'S3T1' }
]

Simulcast with 4 streams and 3 temporal layers using RID

  • Producer:
encodings :
[
  { rid: 'r0', scalabilityMode: 'S1T3' },
  { rid: 'r1', scalabilityMode: 'S1T3' },
  { rid: 'r2', scalabilityMode: 'S1T3' },
  { rid: 'r3', scalabilityMode: 'S1T3' }
]
  • Consumer:
encodings :
[
  { ssrc: 222220, scalabilityMode: 'S4T3' }
]

SVC

SVC involves sending a single RTP stream with N spatial layers and M temporal layers. If RTX is used, there would also be an additional RTP RTX stream.

mediasoup implements two types of SVC: full SVC and K-SVC. The main difference is that, in K-SVC, an RTP key frame is required in order to switch up or down the maximum spatial layer that mediasoup forwards to a consumer. For more information about SVC in WebRTC check the webrtc-svc specification (work in progress).

Currently mediasoup supports SVC for the VP9 codec in both full SVC and K-SVC modes. See below for more information about the current state of the art in existing implementations.

When creating an SVC producer, the associated rtpParameters given to transport.produce() must conform to the following rules:

  • There must be just one entry in the encodings array.
  • Such an encoding must include a scalabilityMode field.

SVC consumers will get a single stream and hence a single entry in their rtpParameters.encodings array. Such an encoding entry has a scalabilityMode value that determines the number of available spatial and temporal layers (same value as in the associated producer).
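The two producer rules are easy to verify before calling transport.produce(). The helper below is a hypothetical pre-flight check, not part of the mediasoup API; it returns a list of rule violations, which is empty when the encodings are acceptable for an SVC producer.

```typescript
type Encoding = { ssrc?: number; rid?: string; scalabilityMode?: string };

// Returns human-readable violations of the SVC producer rules listed above.
function svcProducerEncodingErrors(encodings: Encoding[]): string[] {
  const errors: string[] = [];
  if (encodings.length !== 1) {
    errors.push('an SVC producer must have exactly one encoding');
  }
  if (encodings.some((encoding) => !encoding.scalabilityMode)) {
    errors.push('the encoding must include a scalabilityMode field');
  }
  return errors;
}

const validSvc = svcProducerEncodingErrors([{ ssrc: 111110, scalabilityMode: 'L3T2' }]);
const missingMode = svcProducerEncodingErrors([{ ssrc: 111110 }]);
```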

Examples

The following examples just show the rtpParameters.encodings field and, for simplicity, do not include RTX information.

Full SVC with 3 spatial layers and 2 temporal layers

  • Producer:
encodings :
[
  { ssrc: 111110, scalabilityMode: 'L3T2' }
]
  • Consumer:
encodings :
[
  { ssrc: 222220, scalabilityMode: 'L3T2' }
]

K-SVC with 4 spatial layers and 5 temporal layers

  • Producer:
encodings :
[
  { ssrc: 111110, scalabilityMode: 'L4T5_KEY' }
]
  • Consumer:
encodings :
[
  { ssrc: 222220, scalabilityMode: 'L4T5_KEY' }
]

State of the Art

SVC is not yet properly defined for WebRTC and it's not covered by the WebRTC 1.0 specification.

Chrome

mediasoup-client >= 3.1.0 enables VP9 SVC in Chrome >= M74 (without any command line flag) by doing dirty things.

Note that Chrome uses VP9 K-SVC when transmitting webcam video and full SVC when doing screen sharing. This must be properly signaled in the scalabilityMode of the mediasoup producer (otherwise things won't work):

  • Webcam video (K-SVC) with 3 spatial layers and 3 temporal layers:
scalabilityMode: 'L3T3_KEY'
  • Screen sharing (full SVC) with 3 spatial layers and 3 temporal layers:
scalabilityMode: 'L3T3'
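An application can encode this distinction in a small helper when building the producer's encoding. The function and type names below are hypothetical; the sketch only restates the rule above for Chrome's VP9 behavior.

```typescript
type VideoSource = 'webcam' | 'screen';

// Chrome uses K-SVC for webcam video and full SVC for screen sharing, so the
// _KEY suffix is added only for webcam sources.
function chromeVp9ScalabilityMode(
  source: VideoSource,
  spatialLayers = 3,
  temporalLayers = 3
): string {
  const base = `L${spatialLayers}T${temporalLayers}`;
  return source === 'webcam' ? `${base}_KEY` : base;
}

const webcamEncodings = [{ scalabilityMode: chromeVp9ScalabilityMode('webcam') }];
const screenEncodings = [{ scalabilityMode: chromeVp9ScalabilityMode('screen') }];
```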

libwebrtc

VP9 SVC can also be enabled in Chrome and in libwebrtc based native apps via a flag whose value determines the number of spatial and temporal layers:

WebRTC-SupportVP9SVC/EnabledByFlag_3SL3TL/

To enable VP9 SVC in Chrome, the browser must be launched with the following command line argument:

--force-fieldtrials=WebRTC-SupportVP9SVC/EnabledByFlag_3SL3TL/

To enable VP9 SVC using the libwebrtc C++ API:

webrtc::field_trial::InitFieldTrialsFromString("WebRTC-SupportVP9SVC/EnabledByFlag_3SL3TL/");

Note that other variations are valid instead of EnabledByFlag_3SL3TL (such as EnabledByFlag_2SL1TL). What matters is that the scalabilityMode value in the producer matches the number of spatial and temporal layers in the flag.
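The correspondence between the flag and the producer's scalabilityMode can be checked mechanically. Both helpers below are hypothetical; they build the field trial string in the format shown above and compare its layer counts with those in a scalabilityMode value.

```typescript
// Builds the field trial string for the given number of layers.
function vp9SvcFieldTrial(spatialLayers: number, temporalLayers: number): string {
  return `WebRTC-SupportVP9SVC/EnabledByFlag_${spatialLayers}SL${temporalLayers}TL/`;
}

// Returns true when the layer counts in the field trial string equal those in
// the scalabilityMode (the optional _KEY suffix is ignored for this check).
function fieldTrialMatchesScalabilityMode(fieldTrial: string, scalabilityMode: string): boolean {
  const flag = /EnabledByFlag_(\d+)SL(\d+)TL/.exec(fieldTrial);
  const mode = /^L(\d+)T(\d+)(_KEY)?$/.exec(scalabilityMode);
  if (!flag || !mode) {
    return false;
  }
  return flag[1] === mode[1] && flag[2] === mode[2];
}

const fieldTrial = vp9SvcFieldTrial(3, 3);
```

With these definitions, fieldTrial is 'WebRTC-SupportVP9SVC/EnabledByFlag_3SL3TL/', which matches 'L3T3' and 'L3T3_KEY' but not 'L2T1'.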