HD Voice - I Hear You Loud and Clear

HD Voice, Wideband, G722 – all marketing terms for when you get a random phone call which sounds like you’re talking to a high definition MP3! You may well have encountered this on a mobile phone call from a friend or colleague using the same network operator as yourself when both mobile phones simultaneously have good 3G or 4G coverage. It can be surprising, almost unnerving, like the first time you accidentally entered a Skype call with your webcam enabled when you were less than ‘office pruned’!

But joking aside, a good quality phone call can make the world of difference, and why shouldn’t we all strive to make the good ‘old fashioned land line’ something that is the peak of what can be achieved when we just want to have a chat?

That extra layer of clarity you get from a physical phone with guaranteed end to end quality of service is vital to many organisations, from life and death services like air traffic control, emergency services, operator and dispatch calls, to businesses which want to make every call a ‘good’ experience. It’s not just life or death scenarios which benefit. Overseas calls already often suffer from language barriers and latency – why make it worse? Audio conferencing can be difficult in large rooms with lots of voices and echoes, a problem which is only made worse by narrowband devices.

Whilst you can’t control how the public network handles your call, you probably already have the capability to provide high definition voice to the edge of the public network – and you can ensure that you’re setting the bar at the highest possible standard to anything beyond that.

The human ear can generally detect anything from around 20Hz to 20,000Hz (or 20kHz). However, ‘back in the day’ it was generally considered that anything up to 3.4kHz was ‘good enough’ – and as a result we have all come to expect that any kind of telephone call will be slightly (and, it turns out, reassuringly) muffled. The public network (PSTN) today, even in the most developed nations, still utilises these standards, but IP based telecommunications (VoIP) means that there is absolutely no reason why you can’t push the envelope within your business and potentially beyond.

The quality of a phone call takes on an entirely new meaning when you’re stuck on a train station platform waiting out a train strike, or trying to avoid driving your car because the cost of fuel is so high. Businesses and organisations which can be flexible – those that enable speech, along with video, messaging and collaboration, across existing data networks – are unaffected by the latest Southern Rail strike or parking charges. The benefits to companies of employees working from home, or from any non-office-based location, become available with reliable and higher-quality communication options.

Video conferencing has gradually evolved from a niche which began to ‘seem’ possible with 128kbps ISDN circuits back in the 1980s to something which now feels part and parcel of many of our electronic devices – FaceTime on an Apple device, Skype Video on a Skype application, and why not your business telephone?

Nearly all telephone solutions now provide SIP as the standard for VoIP telephony – with SIP trunks replacing ISDN and Analogue trunks to the PSTN and SIP (IP) endpoints replacing traditional TDM telephones – and part of SIP is video – so often it’s a case of buying the right device, configuring it and job’s a good-un! Audio conferencing is still traditionally narrowband (or the muffled, standard telephone call we are all used to) and is far from the ideal solution for important meetings.

So what do we need to consider so we can get started?

HD Voice / Wideband / G722 requires that communications are fully IP-based. An ideal route to this objective is to start with a platform (sometimes called a ‘hybrid’) which supports Narrowband TDM devices, Narrowband VoIP devices and Wideband VoIP devices alike. This means that migration to a high definition universe can be achieved over time, without drastic, all-or-nothing, expensive changes being required from the outset.

Why hasn’t everyone already done this?

As with any new concept, one of the main culprits as to why wideband voice has taken so long to reach the masses is that for one wideband endpoint to be useful, it has to have a bunch of other wideband endpoints to call! If you happen to have BT’s Home Hub ADSL service then you’ll already have G722 speech to your landline, although most users, perhaps because they don’t use their landline often anymore, are unaware they have the facility of ‘HD voice’.

A good idea is to begin with a pocket of Wideband-enabled devices which can all communicate in high definition and then work to widen that pocket. Of course, you may well initially end up with lots of pockets which may not necessarily talk to each other in wideband – but as high definition SIP trunking providers to the PSTN improve, along with latency and bandwidth, those pockets eventually all get linked together.

I’ve mentioned G722 a few times – it is one of many wideband codecs, and a very popular and well-established one. Having said that, there is no de facto standard, and when two endpoints wish to talk wideband to one another they both have to support the same codec (and automatically negotiate the best one to use). Equipment and services which support the most comprehensive suite of wideband codecs are a good place to start – just in case Betamax does not turn out to be the industry standard we all think it will.
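The negotiation described above can be sketched in a few lines of Python. This is only an illustration of the idea (the first codec in the caller's order of preference that the other end also supports wins); the function name `negotiate_codec` and the endpoint codec lists are assumptions made for this example, not any vendor's API or a full SIP implementation.

```python
from typing import Optional, Sequence


def negotiate_codec(offered: Sequence[str], supported: Sequence[str]) -> Optional[str]:
    """Return the first codec in the caller's preference order that the
    called party also supports, or None if they share no codec at all."""
    supported_set = set(supported)
    for codec in offered:
        if codec in supported_set:
            return codec
    return None


# Hypothetical endpoints, each listing codecs in order of preference.
desk_phone = ["G.722.1", "G.722", "G.711"]
softphone = ["G.722", "G.711"]
legacy_gateway = ["G.729"]

print(negotiate_codec(desk_phone, softphone))       # G.722
print(negotiate_codec(desk_phone, legacy_gateway))  # None
```

When the result is None, the two parties have nothing in common and the call can only proceed if something in the middle translates between them.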

Transcoding (and Normalisation, part of Transcoding) is also a possibility. There are appliances, servers, services and sometimes PBXs which will take virtually any format and present it to the distant party as something which it will understand, without having to understand it natively. Channels and CPU utilisation should be considered because, remember, all of this has to be done concurrently for the number of conversations occurring at once, and has to be done instantaneously – on the fly – so that a real time chit chat doesn’t get all jumbled up and confusing when John’s last word arrives before his first.
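As a rough way of thinking about channel capacity, the sketch below counts how many concurrent calls have a different codec on each leg and therefore occupy a transcoding channel. The call list and the figure for available channels are invented purely for illustration; real platforms report and license these resources in their own ways.

```python
from typing import Iterable, Tuple


def transcoding_channels_needed(calls: Iterable[Tuple[str, str]]) -> int:
    """Count calls whose two legs use different codecs and so need a transcoder."""
    return sum(1 for leg_a, leg_b in calls if leg_a != leg_b)


# Hypothetical snapshot of calls in progress: (codec on leg A, codec on leg B).
active_calls = [
    ("G.722", "G.722"),    # wideband end to end, no transcoding
    ("G.722", "G.711"),    # wideband to narrowband, needs a channel
    ("G.722.2", "G.711"),  # mobile wideband to narrowband, needs a channel
    ("G.711", "G.711"),    # narrowband end to end, no transcoding
]
AVAILABLE_CHANNELS = 1  # assumed capacity, for illustration only

needed = transcoding_channels_needed(active_calls)
print(needed)                        # 2
print(needed <= AVAILABLE_CHANNELS)  # False
```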

Upgrades are a wise consideration. We probably all own an HD TV which was cutting edge in 2012, but what about UHD which is about to be the best thing since sliced bread? You may not be able to upgrade your Panasonic TV but you almost certainly can firmware-upgrade many modern voice services / PBXs to cope with future concepts and standards.

Which Codec?

There are many narrowband and wideband codecs to consider in terms of backward/cross compatibility and targets. 7kHz is a good target for bandwidth – double the 3.4kHz standard of old, and approaching the 8kHz band of human speech. This is where Wideband is generally expected to reach, whereas Super-Wideband (up to around 14kHz) and Fullband (up to around 20kHz) go well beyond it, towards the limit of human hearing (anything higher should only be considered if you’re having a conversation with a bat). Joking aside, Super-Wideband is useful for certain music applications but overkill for most organisations’ needs.

It’s interesting to note that the difference between the 3.4kHz narrowband and 7kHz wideband codecs means you can much more easily discern between a ‘t’ and a ‘p’, an ‘n’ and an ‘m’, and an ‘s’ and an ‘f’.
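A quick piece of arithmetic shows why these bandwidths go hand in hand with particular sampling rates. By the Nyquist criterion, audio has to be sampled at no less than twice the highest frequency you want to carry, which is why narrowband telephony is typically sampled at 8kHz and wideband at 16kHz. The function name below is made up for this sketch.

```python
def min_sampling_rate_hz(audio_bandwidth_hz: float) -> float:
    """Nyquist criterion: sample at least twice the highest audio frequency."""
    return 2 * audio_bandwidth_hz


# Audio bandwidth targets from the text, with the sampling rates commonly used.
targets = [
    ("Narrowband", 3400, 8000),
    ("Wideband", 7000, 16000),
]

for name, bandwidth_hz, usual_rate_hz in targets:
    minimum = min_sampling_rate_hz(bandwidth_hz)
    print(f"{name}: needs at least {minimum:.0f} Hz, usually sampled at {usual_rate_hz} Hz")
```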

G.722 is one of the original wideband codecs, and is fairly widespread. It splits the signal into two sub-bands, each coded using ADPCM. Interestingly, it is licence-free as its patents have expired. In 2017, G.722 is supported by most SIP endpoints, including most desk phones and softphones.

G722.1 is a modulated lapped transform (MLT) codec. Which means it’s great.

G722.2 is better known as AMR-WB and uses a variety of bit rates to suit its most common use on 3G and 4G cellular networks. If the signal is poor, it will use a low bit rate and if it’s good, it will step up to a higher quality, higher bit rate. This codec definitely uses Algebraic Code Excited Linear Prediction (ACELP) to compress and decompress your dulcet tones. Which is nice.
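To give a feel for that adaptive behaviour, here is a small sketch that picks the highest AMR-WB mode which fits within a given bit rate budget. The nine mode rates are those defined for AMR-WB; the selection rule and the function name `pick_mode` are simplifying assumptions for illustration, as in a real mobile network the mode is steered by the network and radio conditions rather than by a simple threshold.

```python
# The nine AMR-WB codec modes, in kbit/s.
AMR_WB_MODES_KBPS = [6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, 23.85]


def pick_mode(available_kbps: float) -> float:
    """Return the highest mode that fits the budget, or the lowest mode
    if even that does not fit (illustrative rule, not the 3GPP procedure)."""
    fitting = [mode for mode in AMR_WB_MODES_KBPS if mode <= available_kbps]
    return max(fitting) if fitting else AMR_WB_MODES_KBPS[0]


print(pick_mode(30.0))  # 23.85
print(pick_mode(13.0))  # 12.65
print(pick_mode(5.0))   # 6.6
```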

MS RTA was/is commonly used in Microsoft’s OCS, which eventually turned into Lync which eventually turned into Skype and will one day turn into WHESKTISIGS-ngage. Probably.

iSAC is an excellent low-bandwidth codec which is/was used by Google Talk, AOL and Skype, with up to 32kHz sampling.

The most common wideband codec is G722, along with G722.2, which has been widely adopted by mobile applications. If you’ve been impressed with a Skype audio call then you’ll be interested to know that Skype uses its own codec called SILK.

G.711.1 and G.729.1 are other interesting (but not yet widely deployed) codecs which function as extensions to existing and very well-established narrowband codecs – the idea being that they can operate in both Narrowband and Wideband modes. This can have a huge impact on CPU utilisation as it removes the need for transcoding between devices or services which can and can’t understand wideband.

How Flexible?

Your wideband platform should have a likely course of expansion to support emerging standards. It should be flexible enough to hold a priority list based on the type of device calling another, so that a variety of preferred wideband codecs can be tried first, before falling back on tried and tested narrowband codecs such as G.711 and G.729a.
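One way to picture such a priority list is as a table of preferred wideband codecs per device type, with the narrowband codecs always appended as a safety net. The device types, their preferences and the function name `codec_priority` are hypothetical, chosen only to illustrate the shape of the configuration; a real platform will have its own settings for this.

```python
from typing import Dict, List

# Hypothetical wideband preferences per device type, most preferred first.
WIDEBAND_PREFERENCES: Dict[str, List[str]] = {
    "desk_phone": ["G.722", "G.722.1"],
    "softphone": ["G.722.2", "G.722"],
}

# Tried and tested narrowband codecs used as the fallback for every device.
NARROWBAND_FALLBACK: List[str] = ["G.711", "G.729a"]


def codec_priority(device_type: str) -> List[str]:
    """Build the ordered codec list for a device: wideband first, then narrowband."""
    return WIDEBAND_PREFERENCES.get(device_type, []) + NARROWBAND_FALLBACK


print(codec_priority("softphone"))         # ['G.722.2', 'G.722', 'G.711', 'G.729a']
print(codec_priority("analogue_adapter"))  # ['G.711', 'G.729a']
```

A device type the table does not know about simply gets the narrowband list, so nothing is ever left unable to make a call.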

The platform should be a hybrid which can use different signalling mechanisms from both the past and the future – such as TDM, H.323 or SIP. It should be flexible enough to grow with you – carry more endpoints, trunks and simultaneous streams of communication.

Frequency - the number of times that a component of a signal oscillates per second. The ITU-T standards body has established various terms for classifying voice codecs by the range of frequencies they carry. The human ear can detect sounds (on frequencies) up to 20kHz (20,000Hz), but for most applications in most organisations, a telephone call, some music on hold and audio conferencing are all likely to fall within 7kHz, which is why Wideband is being adopted as the ideal new standard.

If you want to put this in context, a decent MP3 runs at about 18kHz, FM radio is at about 15kHz and AM radio is about 5kHz. So you could say AM radio is lower than what Wideband is trying to achieve and FM is above (because, naturally, FM radio is focused on high quality stereo music with varieties of sounds not often reproduced by a human mouth!).