Improving the VoIP foreign number solution

It has been a while, and I’ve noticed a few issues in my solution (more info here) for “assigning” a foreign number to your smartphone. Since solving them was a nice little achievement, I’ll share the description here. It was quite an investigation, too!

The first problem was that the connection kept timing out from time to time. Not always, and I couldn’t determine a pattern (ah, I love these situations!). In such cases there’s no other choice but to get your hands dirty… and in this case that meant some deep packet inspection. Asterisk makes this task fairly easy, so tcpdump was not required. Running sip set debug ip/peer on/off on the Asterisk console shows how my Asterisk and the SIP client on my mobile are talking. That’s possible even if you don’t have a good understanding of the Session Initiation Protocol: simply googling its workflow is enough to see how it should look. Then all you have to do is compare the expected flow with what you actually get.
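The comparison can be done by eye, but it can also be sketched in a few lines of shell. The snippet below is only an illustration: the two traces are made-up samples (not my real debug output), the addresses are placeholders, and the "expected" sequence is just one common flow for an answered call. It reduces each trace to its SIP request methods and response codes, then checks them against the expected flow.

```shell
# Reduce a SIP trace to its sequence of request methods and response codes.
# Only SIP start lines match; headers such as "Via:" or "CSeq:" are ignored.
extract_flow() {
  grep -oE '^(SIP/2\.0 [0-9]{3}|[A-Z]+ sips?:)' |
    sed -E -e 's,^SIP/2\.0 ,,' -e 's, sips?:$,,'
}

# One common flow for a call that the far end answers (an assumption).
expected='INVITE
100
180
200
ACK'

# Hypothetical trace of a working call (addresses are placeholders).
trace_ok='INVITE sip:phone@192.0.2.10 SIP/2.0
Via: SIP/2.0/UDP 198.51.100.5:5060
SIP/2.0 100 Trying
SIP/2.0 180 Ringing
SIP/2.0 200 OK
ACK sip:phone@192.0.2.10 SIP/2.0'

# Hypothetical trace of a failing call: the INVITE is retransmitted
# and the phone never answers.
trace_bad='INVITE sip:phone@192.0.2.10 SIP/2.0
INVITE sip:phone@192.0.2.10 SIP/2.0
INVITE sip:phone@192.0.2.10 SIP/2.0'

flow_ok=$(printf '%s\n' "$trace_ok" | extract_flow)
flow_bad=$(printf '%s\n' "$trace_bad" | extract_flow)

for name in ok bad; do
  eval "flow=\$flow_$name"
  if [ "$flow" = "$expected" ]; then
    echo "trace_$name: matches the expected flow"
  else
    echo "trace_$name: differs from the expected flow:"
    printf '%s\n' "$flow"
  fi
done
```

To use it on real data you would feed a saved copy of the debug output into extract_flow instead of the sample variables. The surrounding log decoration may differ between Asterisk versions, so treat the regular expression as a starting point.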

Because the issue was intermittent, I had two sets of SIP debug data: one from when it works, and one from when it doesn’t. Comparing the two showed me that when the timeout happens, it is actually the phone that stops responding to SIP INVITEs. Having tried a few other SIP clients on the phone, I had a strong feeling this was not going to be a client issue. So maybe the network? Bingo! As soon as I switched off my Wi-Fi and landed on 4G, the connection worked flawlessly and never failed. Since that was weird, I went on to research why it might happen and found some content explaining that most home routers have a faulty implementation of the application layer gateway for SIP. Indeed, it was SIP ALG messing with my SIP traffic in failed attempts to “secure my traffic” by inspecting source/destination addresses in the SIP packets. It would filter out my SIP traffic as soon as the router “forgot” the addresses and mappings, which in my case was 30 seconds after registration (TCP handshake from the mobile to my VPS).

Considering that the shortest interval at which SIP clients can re-register is 60 seconds and that my router would time out after about 30s, the phone was only reachable for half of each registration period. That’s way below my expectations. I can’t reconfigure my home router (thanks to the ISP), so what to do? There’s no way of setting the registration interval lower in the client itself without dirty hacks. I ended up doing the following: I switched from SIP to IAX2 (Asterisk’s preferred protocol for VoIP), and then I wrote a small patch to my ebuild for Asterisk on Gentoo to define the re-register interval as 25s, using the following:

$ grep sed /usr/portage/net-misc/asterisk/asterisk-11.25.1.ebuild
sed -i 's:EXPIRE.*60:EXPIRE 25:' "${S}"/channels/iax2.h && ewarn patched IAX2 registry timeout
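If you want to see what that substitution does before touching the real sources, you can try it on a single line. The sample line below is an assumption on my part about how the define looks (the macro name IAX_DEFAULT_REG_EXPIRE and the spacing may differ in your Asterisk version), so check your own iax2.h first.

```shell
# Hypothetical sample line; the real definition in iax2.h may look different.
sample='#define IAX_DEFAULT_REG_EXPIRE  60'

# Same substitution as in the ebuild, applied to the sample instead of the file.
patched=$(printf '%s\n' "$sample" | sed 's:EXPIRE.*60:EXPIRE 25:')

printf 'before: %s\n' "$sample"
printf 'after:  %s\n' "$patched"
```

Note that the pattern is greedy and not tied to one macro: it rewrites any line where EXPIRE is followed somewhere by 60. Running grep -n 'EXPIRE.*60' against the header first shows exactly which lines would be changed.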

After re-compiling (making such tweaks is very easy thanks to Gentoo) and restarting Asterisk, even though my client kept asking for 60-second sessions, Asterisk would “demand” a new session (which does nothing more than refresh the current IP address of the smartphone) every 25 seconds anyway. And the timeout problem is almost gone!

Almost, because Asterisk has a security setting called “nat”. It has to be set to “no” regardless of whether you use SIP or IAX2, which means that when the agents register, Asterisk doesn’t inspect the IP addresses in the header. Those checks would fail, since NAT translates the IP address of my phone from its local LAN address to the public IP address of my router. Since I control access to my Asterisk using multiple factors and layers (iptables and user/pass), I consider it safe to disable.