Bandwidth Utilised for VoIP

Calculating the bandwidth used for VoIP may seem like a daunting task, but it is rather simple once you understand a few principles.

A VoIP call consists of 2 main parts. The first is call management, handled by SIP, which sets the call up: it signals call initiation, ringing, disconnection, and the other exchanges between the 2 endpoints. The second part is the audio, which is transmitted using RTP. Since the bandwidth consumed by SIP is insignificant, we shall focus on calculating the bandwidth consumed by audio in the rest of this article.

Since raw audio can be rather large, it needs to be encoded before it is sent on the network. This is done using a codec. Different codecs produce different audio quality, consume different levels of bandwidth, and some are more CPU intensive than others. Thus it is important that you select the right codec for your application.

Before we delve into the differences between the common codecs, let’s introduce another principle which will allow us to accurately calculate the bandwidth utilised. When you need to send data over the network, the data needs to be packaged. The ‘packaging’ contains information which allows the data to be sent to the destination and to be rebuilt correctly. As you might imagine, the packaging does not come free – it adds to the bandwidth consumption.

The packaging is applied in layers, in line with the layered network model. The encoded audio is packaged into RTP packets. In turn, the RTP packets are packaged into UDP datagrams, which are then packaged into IP packets. Ethernet is the most common type of network, and it adds one more wrapper, the Ethernet frame.

[Figure: Ethernet packet]

We shall refer to these packages collectively as overhead. Irrespective of the codec used, the overhead added to each packet is fixed. Below is the bandwidth consumed by each layer of overhead (the figures correspond to 50 packets per second, that is, 20 ms of audio per packet):

  • RTP – 4.8 kbps
  • UDP – 3.2 kbps
  • IP – 8 kbps
  • Ethernet (not using QOS) – 15.2 kbps


The total overhead is 31.2 kbps.
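
The overhead figures can be reproduced from per-packet header sizes. The sketch below is illustrative only: the header sizes (12, 8, 20 and 38 bytes) and the rate of 50 packets per second are assumptions that are not stated in this article, chosen because they match the figures above.

```python
# Per-packet header sizes in bytes. These are assumed values, not figures
# from the article; they happen to reproduce the article's numbers.
HEADER_BYTES = {
    "RTP": 12,       # fixed RTP header, no extensions
    "UDP": 8,
    "IP": 20,        # IPv4 header without options
    "Ethernet": 38,  # preamble, header, checksum and inter-frame gap, no QoS tag
}

# Assumed packet rate: one packet per 20 ms of audio.
PACKETS_PER_SECOND = 50


def overhead_kbps(header_bytes, packets_per_second=PACKETS_PER_SECOND):
    """Bandwidth in kbps used by a header of the given size on every packet."""
    return header_bytes * 8 * packets_per_second / 1000


per_layer = {name: overhead_kbps(size) for name, size in HEADER_BYTES.items()}
total_overhead = overhead_kbps(sum(HEADER_BYTES.values()))

for name, kbps in per_layer.items():
    print(f"{name}: {kbps:.1f} kbps")
print(f"Total: {total_overhead:.1f} kbps")
```

Sending fewer, larger packets per second would lower the overhead, which is why the packet rate has to be fixed before the per-second figures mean anything.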



Now that we understand the basics, let’s look at the differences between the common codecs which can be used to encode the audio in a VoIP call. The following table shows the expected audio quality, the CPU resources required to encode and decode the audio, the base bit rate of the encoded audio, and the total bandwidth consumption once the overhead is taken into account.


Codec   Audio Quality   CPU Resources   Base Size   Total Size (Base + Overhead)
G711    Good            Very Low        64 kbps     95.2 kbps
G722    Very Good       Low             64 kbps     95.2 kbps
GSM     Acceptable      Average         13 kbps     44.2 kbps
G729    Average         High            8 kbps      39.2 kbps
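
The totals in the table are simply the base rate plus the 31.2 kbps of overhead. A minimal sketch of that addition follows, using the base rates from the table and the codec names used elsewhere in this article; the function and variable names are hypothetical.

```python
# Total overhead per audio stream in kbps, as stated earlier in the article.
OVERHEAD_KBPS = 31.2

# Base audio bit rates in kbps, one entry per table row.
BASE_KBPS = {"G711": 64, "G722": 64, "GSM": 13, "G729": 8}


def total_kbps(base_kbps, overhead_kbps=OVERHEAD_KBPS):
    """Bandwidth of one audio stream: encoded audio plus packet overhead."""
    return base_kbps + overhead_kbps


totals = {codec: total_kbps(rate) for codec, rate in BASE_KBPS.items()}

for codec, kbps in totals.items():
    print(f"{codec}: {kbps:.1f} kbps")
```

Note how the overhead dominates for the low bit rate codecs: for G729 it is almost four times the size of the audio itself.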

Note that the above bandwidth consumption is in kilobits per second. Divide by 8 to get the equivalent in kilobytes per second. Using the above data, we can come up with the following stats:

Codec   Kilobits per second   Kilobytes per second   Kilobytes per minute   Megabytes per hour
G711    95.2                  11.9                   714                    41.8
G722    95.2                  11.9                   714                    41.8
GSM     44.2                  5.525                  331.5                  19.4
G729    39.2                  4.9                    294                    17.2
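
The conversions in this table can be sketched as below. The function name is hypothetical, and the use of 1024 kilobytes per megabyte is an assumption inferred from the table (it is the factor that yields 41.8 MB for G711).

```python
def convert(kbps):
    """Convert a rate in kilobits per second to the units used in the table.

    Returns (kilobytes per second, kilobytes per minute, megabytes per hour).
    """
    kbytes_per_second = kbps / 8                      # 8 bits per byte
    kbytes_per_minute = kbytes_per_second * 60
    mbytes_per_hour = kbytes_per_minute * 60 / 1024   # assumes 1 MB = 1024 KB
    return kbytes_per_second, kbytes_per_minute, mbytes_per_hour


for codec, kbps in {"G711": 95.2, "G722": 95.2, "GSM": 44.2, "G729": 39.2}.items():
    per_second, per_minute, per_hour = convert(kbps)
    print(f"{codec}: {per_second:.3f} KB/s, {per_minute:.1f} KB/min, {per_hour:.1f} MB/h")
```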

Here are some notes and suggestions for the application of specific codecs:

  • The above are the stats for one audio stream. A VoIP call will use one audio stream for each leg (or endpoint). Thus a call between 2 persons will use double the bandwidth shown above.
  • G729 is the codec which consumes the least bandwidth while keeping a reasonable audio quality. However, that comes with 2 drawbacks:
    1.   Its efficiency comes at the cost of CPU usage. Compressing audio to such a low bit rate while maintaining quality is more CPU intensive.
    2.   G729 is a proprietary codec. Because of this, the number of simultaneous G729 calls cannot exceed half your 3CX Phone System simultaneous call license.
  • Because of these drawbacks, G729 should only be used when really required, such as for external calls to VoIP Providers, calls across a bridge, or for remote extensions (basically all calls made over the internet). You can configure the PBX to fall back to GSM if G729 calls cannot be made.
  • Although G711 and G722 consume over twice as much bandwidth as the other codecs, most Local Area Networks are able to handle this bandwidth. Using the above table, a 1 hour call using G711 is equivalent to transferring a 41.8 MB file. If that causes an issue, you should consider upgrading your network.
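
As a rough planning aid, the notes above can be turned into a small calculation. This is only a sketch: it follows the article's simplification that a call costs two audio streams on the same link, and the 1000 kbps link used in the example is a hypothetical figure.

```python
import math

# One audio stream per leg, so a two-party call uses two streams.
STREAMS_PER_CALL = 2


def call_kbps(stream_kbps, streams=STREAMS_PER_CALL):
    """Bandwidth of a whole call, given the bandwidth of one audio stream."""
    return stream_kbps * streams


def max_calls(link_kbps, stream_kbps, streams=STREAMS_PER_CALL):
    """Number of whole calls that fit in a link of the given capacity."""
    return math.floor(link_kbps / call_kbps(stream_kbps, streams))


# Hypothetical 1000 kbps link, using the per-stream totals from the table.
LINK_KBPS = 1000
for codec, stream in {"G711": 95.2, "GSM": 44.2, "G729": 39.2}.items():
    print(f"{codec}: {call_kbps(stream):.1f} kbps per call, "
          f"{max_calls(LINK_KBPS, stream)} calls on {LINK_KBPS} kbps")
```

In practice you would leave headroom for other traffic rather than fill the link completely.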
