SIP

VoIP packet format with voice payload and headers

A compressed voice frame must be packetized with Real-Time Transport Protocol (RTP), User Datagram Protocol (UDP), and IP headers, and then encapsulated with network interface headers.

The RTP header is 12 bytes. Voice is sensitive to delay, and RTP supports end-to-end delivery of real-time voice traffic. RTP header compression reduces the number of header bytes, but it is not considered in this topic. Details on RTP are given in RFC 3550.
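To make the 12-byte figure concrete, here is a minimal Python sketch that unpacks the fixed RTP header fields laid out in RFC 3550. The function name `parse_rtp_header` and the sample values are illustrative assumptions, and the sketch ignores CSRC entries and header extensions that may follow the fixed header.

```python
import struct

RTP_FIXED_HEADER_LEN = 12  # bytes, fixed part of the header per RFC 3550


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the 12-byte fixed RTP header (hypothetical helper)."""
    if len(packet) < RTP_FIXED_HEADER_LEN:
        raise ValueError("RTP fixed header needs at least 12 bytes")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, timestamp, ssrc = struct.unpack(
        "!BBHII", packet[:RTP_FIXED_HEADER_LEN]
    )
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": b0 & 0x0F,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Made-up sample: version 2, payload type 18 (G.729), arbitrary SSRC.
sample = struct.pack("!BBHII", 0x80, 18, 1, 160, 0x12345678)
header = parse_rtp_header(sample)
print(header)
```

The sequence number and timestamp are the fields a receiver uses to reorder packets and schedule playout, which is why they matter for delay-sensitive voice.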

The combination of the compressed payload with the RTP, UDP, and IP headers is called a VoIP packet.

VoIP header = (IP + UDP + RTP) = 40 bytes in IPv4 and 60 bytes in IP version 6 (IPv6)
VoIP packet = (VoIP header + voice payload)
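The two formulas above can be sketched in Python as follows. The function names are illustrative, and the 20-byte payload used in the example assumes G.729 at 8 kbps with 20 ms of voice per packet, which is an assumption for illustration rather than a value taken from this topic.

```python
RTP_HEADER = 12  # bytes
UDP_HEADER = 8   # bytes
IP_HEADER = {"ipv4": 20, "ipv6": 40}  # bytes, without options or extension headers


def voip_header_bytes(ip_version: str = "ipv4") -> int:
    """VoIP header = IP + UDP + RTP."""
    return IP_HEADER[ip_version] + UDP_HEADER + RTP_HEADER


def voip_packet_bytes(payload_bytes: int, ip_version: str = "ipv4") -> int:
    """VoIP packet = VoIP header + voice payload."""
    return voip_header_bytes(ip_version) + payload_bytes


# Assumed example payload: 20 bytes (G.729, 20 ms of voice per packet).
print(voip_header_bytes("ipv4"), voip_header_bytes("ipv6"))
print(voip_packet_bytes(20, "ipv4"), voip_packet_bytes(20, "ipv6"))
```

With a payload this small, the headers are larger than the voice data itself, which is the motivation for header compression on slow links.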

VoIP packet format with voice payload and headers.

Physical network […]

The difference between the G.729 and G.729 Annex-B

G.729 is a high complexity algorithm, and G.729A (also known as G.729 Annex-A) is a medium complexity variant of G.729 with slightly lower voice quality. All platforms that support G.729 also support G.729A.

G.729 Annex-B is a high complexity algorithm, and G.729A Annex-B is a medium complexity variant of G.729 Annex-B with slightly lower voice quality.

The difference between the G.729 and G.729 Annex-B codec is that the G.729 Annex-B codec provides built-in IETF voice activity detection (VAD) and Comfort Noise Generation (CNG).

G.729A is a compatible extension of G.729 that requires less computational power. This lower complexity, however, comes at the cost of marginally reduced speech quality.

G.729 has been extended in Annex B (G.729b) which provides a silence compression method that enables a voice activity detection (VAD) module. It is used to detect voice activity in the […]

Voice Troubleshooting

A voice call over a packet network is segmented into discrete call legs. A call leg is a logical connection between two voice gateways or between a gateway and an IP telephony device. Troubleshooting, debugging and sniffing should focus first on each leg independently and then on the VoIP call as a whole. You can isolate where a problem is occurring by determining which dial peer or call leg is having the problem.

You need to understand the call path through the router in order to determine where a problem is occurring.

• Call control application programming interface (CCAPI)—Three clients make use of the CCAPI: command-line interface (CLI), Simple Network Management Protocol (SNMP) agent and Session Application. The CCAPI main functions are:
– Identify the call legs (Which dial peer is it? Where did it come from?).
– Decide which session application takes the call (Who handles it?).
– Invoke the packet handler.
– Conference […]

One-two Way Audio Problem in SIP

During the initial SIP offer/answer exchange, both the originating and terminating SIP user agents (UAs) specify in the SDP payload the desired IP address and port combination for the caller and callee to receive the associated media stream and to properly direct the signaling stream. SIP UAs within the enclave may use private addressing for topology hiding reasons. A problem occurs when private addressing is used within the SDP payload since the private address is not resolvable from the WAN side of the NAT.
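As a troubleshooting aid, the following Python sketch scans an SDP body for IPv4 connection addresses (`c=` lines) that fall in the RFC 1918 private ranges. The helper names and the sample SDP are hypothetical, and the check is deliberately simple: it ignores IPv6, host names, and the `o=` line.

```python
import ipaddress

# RFC 1918 private IPv4 ranges.
RFC1918_NETWORKS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def sdp_connection_addresses(sdp: str) -> list:
    """Return the addresses found on 'c=IN IP4 ...' lines."""
    addresses = []
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("c=IN IP4 "):
            # A multicast form such as 'addr/ttl' may follow; keep the address part.
            addresses.append(line.split()[2].split("/")[0])
    return addresses


def private_sdp_addresses(sdp: str) -> list:
    """Return connection addresses that a remote peer cannot route to."""
    private = []
    for addr in sdp_connection_addresses(sdp):
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            continue  # host name rather than a literal address; skipped here
        if any(ip in net for net in RFC1918_NETWORKS):
            private.append(addr)
    return private


# Made-up SDP offer from a UA behind a NAT.
SAMPLE_SDP = "\r\n".join([
    "v=0",
    "o=alice 2890844526 2890844526 IN IP4 192.168.1.10",
    "s=-",
    "c=IN IP4 192.168.1.10",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0",
])

print(private_sdp_addresses(SAMPLE_SDP))
```

Running a check like this against the SDP seen in a capture taken on the WAN side of the NAT is a quick way to confirm whether private addressing is what the far end is being told to send media to.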

A traditional NAT device will change the IP source address and/or port combination at packet header level, but not the IP address within the SDP payload. Consequently, the callee, or UA in the remote enclave, will not have the correct IP address to respond to from a signaling perspective and the call setup will fail. Even if the call […]

Early Media and Ringing Tone Generation in SIP

The concept of “early media” can sometimes be confusing. RFC 3960 defines it as:

   Early media refers to media (e.g., audio and video) that is exchanged
   before a particular session is accepted by the called user.  Within a
   dialog, early media occurs from the moment the initial INVITE is sent
   until the User Agent Server (UAS) generates a final response.  It may
   be unidirectional or bidirectional, and can be generated by the
   caller, the callee, or both.

   A UAC should develop its local policy regarding local ringing
   generation.  For example, a POTS ("Plain Old Telephone
   Service")-like SIP User Agent (UA) could implement the following
[...]

Codecs and Required Bandwidth

For G.711, the voice payload alone requires 64 kbps for a single call in one direction. With 20 ms of voice per packet, 50 packets must be transmitted per second. Each packet carries 160 voice samples at one byte per sample, which gives 8000 samples per second. Each packet is sent in one Ethernet frame, so the headers of the additional protocol layers are added to the 160-byte payload: RTP (12 bytes), UDP (8 bytes), IP (20 bytes), and Ethernet including the preamble (26 bytes). A total of 226 bytes, or 1808 bits, must therefore be transmitted 50 times per second, which is 90.4 kbps in one direction. For both directions, a single call requires 100 pps or 180.8 kbps, assuming a symmetric flow.
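The arithmetic above can be reproduced with a short Python sketch. The function name and parameter names are illustrative; the defaults are the G.711 figures from the paragraph (160-byte payload, 50 packets per second, 12 + 8 + 20 + 26 bytes of overhead).

```python
def one_way_bandwidth_kbps(
    payload_bytes: int = 160,
    packets_per_second: int = 50,
    overhead_bytes: int = 12 + 8 + 20 + 26,  # RTP + UDP + IP + Ethernet with preamble
) -> float:
    """Bandwidth on the wire for one direction of a call, in kbps."""
    bits_per_packet = (payload_bytes + overhead_bytes) * 8
    return bits_per_packet * packets_per_second / 1000


one_way = one_way_bandwidth_kbps()
both_ways = 2 * one_way  # assumes a symmetric flow

print(f"one direction:   {one_way:.1f} kbps")
print(f"both directions: {both_ways:.1f} kbps")
```

Changing `payload_bytes` and `packets_per_second` lets the same calculation be repeated for other codecs or packetization intervals; the overhead stays fixed per packet, so smaller payloads spend a larger share of the bandwidth on headers.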

  • Waveform codecs:– Directly encode speech in an efficient way […]