C++ on Linux: Setting up socket & packets for minimum RTP stream latency

Question

I have a Linux device that is supposed to stream from various real-time audio sources over RTP/UDP to a number of clients, and I want to achieve the lowest possible latency. It retrieves frames from various ALSA interfaces and forwards them as RTP streams using ordinary C sockets.

I've done some testing with Wireshark and I'm fairly sure I'm setting the DSCP value for Expedited Forwarding correctly through the socket's IP_TOS option, which, as I understand it, gives the greatest reduction in latency on that front.

However, I'm concerned that I'm not doing anything to tag the packets as VoIP to enforce the best possible QoS at each node on the network (using the 802.11e standard), and that this might be resulting in less-than-optimal latency. What makes me most suspicious is that, according to my Wireshark logs, my packets are tagged as video packets instead of audio/VoIP:

[Image: QoS portion of packet under Wireshark]

So, here are my questions:

1. How does DSCP relate to 802.11e? What I'm thinking is that they do different things within different layers of the network, but I'm not that knowledgeable and may be off about this.
2. Does the above image reveal anything about any non-optimal setup for packets and/or the UDP socket I'm using to send the RTP stream on either the DSCP or 802.11e fronts?
3. How can I tag packets for VoIP priority using standard sockets in C++ (if possible)?
4. Is there any particular configuration I should look out for regarding 802.11e on my router? Should I look for routers that support 802.11e or is that a foregone conclusion? I'm assuming maybe 802.11e isn't about the specific packets but about router configuration.

Again, I'm kinda lost and I think I might need someone to whack me over the head and tell me how all of this works. All I can find online seems to be Cisco-related and I'm not sure how much use it is for my purposes as explained here.

First, understand that QoS does nothing unless there is congestion. Second, QoS does nothing unless the devices through which the packet passes are configured to treat DSCP markings in a particular way under congestion (nothing on the Internet does). Third, different QoS implementations can do different things with DSCP markings, such as making EF worst or best. There is no one answer for how QoS works. You configure your devices for fairness as you see it, but others see it differently. — Ron Maupin

mark mark · Accepted Answer · 2013-10-03T13:39:54

1. My understanding is the QoS (ToS/DS) octet is the second octet of the IP header. 802.11e is specifically for wireless networks and resides at a lower layer than IP.
2. For Expedited Forwarding in DiffServ I think the octet should be 0xb8. I'm not sure what this picture is... 2 octets?
3. I'm more familiar with Windows, and the OS puts restrictions on QoS tagging. For those finding this post, look into qwave and QOSCreateHandle. On Linux I'd guess you could use raw sockets with appropriate permissions.
4. There are several different ways the IP QoS octet can be translated by routers... pick one that suits your needs; DSCP should be common. Note again that this is different than 802.11e.
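
For the Linux side of the tagging question, here is a minimal sketch of marking an ordinary UDP socket, assuming the `IP_TOS` and `SO_PRIORITY` socket options behave as described in ip(7) and socket(7). With those options a raw socket should not be needed. The helper names `dscp_to_tos`, `mark_socket` and `read_tos` are made up for illustration, and the optional `SO_PRIORITY` call only influences local queueing on the sending machine.

```cpp
#include <netinet/in.h>   // IPPROTO_IP, IP_TOS
#include <netinet/ip.h>
#include <sys/socket.h>   // socket, setsockopt, getsockopt, SO_PRIORITY
#include <unistd.h>       // close

// The DS octet carries the 6-bit DSCP in its upper bits; the low 2 bits are ECN.
constexpr int dscp_to_tos(int dscp) { return (dscp & 0x3f) << 2; }

constexpr int kDscpEf = 46;  // Expedited Forwarding
static_assert(dscp_to_tos(kDscpEf) == 0xb8, "EF encodes as 0xb8 in the DS octet");

// Hypothetical helper: marks an IPv4 socket with the given DSCP.
// If sk_priority >= 0, also tries to set the Linux socket priority
// (best effort; values above 6 need CAP_NET_ADMIN).
// Returns true if IP_TOS was set.
bool mark_socket(int fd, int dscp, int sk_priority = -1) {
    const int tos = dscp_to_tos(dscp);
    if (setsockopt(fd, IPPROTO_IP, IP_TOS, &tos, sizeof(tos)) != 0) {
        return false;
    }
    if (sk_priority >= 0) {
        // Set after IP_TOS, because setting IP_TOS also resets the priority.
        (void)setsockopt(fd, SOL_SOCKET, SO_PRIORITY,
                         &sk_priority, sizeof(sk_priority));
    }
    return true;
}

// Hypothetical helper: reads the DS octet back from the socket, or -1 on error.
int read_tos(int fd) {
    int tos = 0;
    socklen_t len = sizeof(tos);
    if (getsockopt(fd, IPPROTO_IP, IP_TOS, &tos, &len) != 0) {
        return -1;
    }
    return tos;
}
```

Usage would be along the lines of `int fd = socket(AF_INET, SOCK_DGRAM, 0); mark_socket(fd, kDscpEf);` before the first `sendto`, then checking the DS field in Wireshark to confirm the marking survives on the wire.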

Other note: All of this really only matters on your transmitting machine and local network. If packets leave your network, most likely all QoS efforts will be ignored (e.g. by your ISP). So unless you have congestion on your local network, or congestion on an egress router, or have I/O issues on the machine itself, your efforts are in vain.
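
On the local Wi-Fi hop, the DS octet has to be translated into an 802.11e (WMM) access category somewhere. The sketch below (C++14) assumes one common default, where the 802.1D user priority is simply the top three bits of the DS octet, combined with the standard WMM table from user priority to access category. Under that assumption EF (0xb8) lands in the video category, which may be why the capture shows video rather than voice. Drivers and access points differ (some follow RFC 8325 and map EF to voice), so treat this as an illustration, not a description of any particular device. The function and enum names are hypothetical.

```cpp
#include <cstdint>

enum class AccessCategory { Background, BestEffort, Video, Voice };

// Assumed default: user priority (0..7) is the top three bits of the DS octet.
constexpr int user_priority_from_tos(std::uint8_t tos) { return tos >> 5; }

// WMM mapping from 802.1D user priority to access category.
constexpr AccessCategory access_category(int user_priority) {
    switch (user_priority) {
        case 1: case 2: return AccessCategory::Background;
        case 0: case 3: return AccessCategory::BestEffort;
        case 4: case 5: return AccessCategory::Video;
        default:        return AccessCategory::Voice;  // 6 and 7
    }
}

// EF (DSCP 46, octet 0xb8) has top bits 101, i.e. user priority 5: video.
static_assert(access_category(user_priority_from_tos(0xb8)) ==
              AccessCategory::Video, "EF maps to video under this default");
// CS6 (DSCP 48, octet 0xc0) has user priority 6: voice.
static_assert(access_category(user_priority_from_tos(0xc0)) ==
              AccessCategory::Voice, "CS6 maps to voice under this default");
```

If the voice category matters more than the EF marking on a given network, a DSCP whose top three bits are 110 or 111 would be classified as voice under this default mapping, at the cost of no longer being EF for wired DiffServ devices.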
