It sounds like this may just be a NAT traversal problem; if it is please move or remove this question, as this would only be relevant if you were programming this client. There is a ton of info out there about this issue (for example here is an excellent article that comes up as the first result when I google "voip nat traversal"), but here's a quick summary:
Why NAT causes a problem for VoIP
Most VoIP protocols use a data stream on one port (e.g. 5060 in this case) to negotiate connection information that includes among other things a socket (IP address and port) to receive audio/RTP traffic; there are 2 things about this negotiation of a socket that might be unexpected:
- It can be any IP address and port combination, not just one that is on the VoIP device itself. So you might have for instance a VoIP server that negotiates a socket on another host that is not part of the SIP dialog, and which might be behind a NAT
- The negotation is done at the OSI Application Layer (Layer 7), so it is normally untouched by the NAT process, which operates at Layers 3 and 4
How to diagnose missing audio due to NAT
If you're able to get packet captures (ideally on both WAN and LAN ports, so you can see your VoIP device's traffic before and after NAT), you can see the problem in action: just look for the packets containing SDP payloads (e.g. if you're doing SIP on UDP 5060, just filter for that port and you will see INVITE requests and 200 OK responses that contain SDP payloads); drill down to the c (Connection Information) and m (Media Description) lines, which should look something like the following:
c=IN IP4 192.168.1.114
m=audio 6094 RTP/AVP 0 8 101
If you're seeing something like this going out your WAN port, it means your VoIP device is requesting to be sent audio on 192.168.1.114:6094; the IP address is a private address and cannot be routed over the internet; the port is just one I picked randomly, but the one you see needs to be open and forwarded to your device
How to fix it
There are various solutions to this problem, which all come down to rewriting the private IP address that your device is giving out into the public one that your device's traffic is being NAT'd out on, so that when the remote device parses the Connection Information line in the SDP, it has a valid, publicly routable IP address to send the audio traffic to, and a UDP port that is NAT'd to your device. Sometimes the VoIP device itself can handle the rewriting (e.g. you can either statically tell the device in its configuration what its public IP is, or it can discover it from a protocol like STUN), sometimes the rewriting is done by the firewall/router that is doing the NAT (there are various names for this, like SIP ALG or SIP Fixup).
Unfortunately due to the variations in NAT implementations across various routers and firewalls, no solution can be guaranteed to work 100% of the time; and if you have multiple devices behind the same firewall, having it do the rewriting will only work for one of them.
In your case:
You mention 2 different VoIP devices, a Cisco SPA504G, and a Grandstream GXP1620. The datasheets for both devices say they support STUN, so I'd start looking in its config for the STUN settings or anything else that references NAT traversal. Also, make sure that the ports you are forwarding to the device are the ones that it uses, this is usually just another item in the config, called something like "RTP base port" or "RTP range"
twilio
tag exist then? – Tono Nam