1
votes

I am trying to read packets in a router, like this in python:

# (skipping the exception handling code here)    
s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.ntohs(0x0003))
while True:
    p = s.recvfrom(2000)
    pkt = p[0]
    # process pkt here ...

Answers to a related question (36115971) say that parameters and methods for UDP vs TCP data are different (some say recv is for TCP and recvfrom is for UDP, and others say the opposite, similarly some say 1024 as buffer size for TCP and larger for UDP, and again some say the reverse). In my case of reading in a router, I do not have different sockets for TCP and UDP, so I need to read both from the same socket, so I am bit confused regarding how I should read the incoming packets.

(1) Should I use recv() or recvfrom(), if I want to read both TCP and UDP packets?

(2) Do the calls return data one packet at a time, or do they return after the buffer is filled up? eg, if I have a large buffer of 4096 bytes, and the incoming streaming 2 packets have 2400 bytes each, will the call return as soon as the 1st packet ends, or will it return after filling up the buffer from the 2nd packet also?

(2a) same question, but if I have a smaller buffer of 2000 bytes. It is clear that on the 1st call I will get the first 2000 bytes of the 1st packet. But on the next call, will I get the last 400 bytes of the 1st packet, or the first 2000 bytes of the 2nd packet?

(3) If I am delayed in making the next call, maybe because I was busy processing the 1st dataset, am I in danger of losing data, or will the OS keep its internal queue of the incoming packets to be given to me when I call the next time? If the OS keeps its internal queue, where can I find information about its size?

NOTE: Some of the given replies have been divergent, so let me put in some boundaries to my question. Hopefully these restrictions will help to give more specific answers.

(a) My objective is to sniff the incoming packets with python sockets only. So other solutions involving tcpdump or tshark etc are outside the scope.

(b) The objective is to only sniff for incoming packets. Additional details like packet reordering (for connection oriented protocols like TCP) are outside the scope, actually they are avoidable overhead.

2
You can do what you describe, but I think you'll have to define two different sockets binded on the same port, one for TCP and one for UDP. Though if you intend to do packet-level inspection I would propose a framework like scapy which allows you to check if a packet is TCP or UDP. Also check out this answer which might be useful. Also it might be better to ask this question on Network Engineering.game0ver
Thanks for the suggestion. I have posted this question on the network engineering forum also - networkengineering.stackexchange.com/questions/51862R71
@game0ver, we don't answer questions like this on network engineeringThis
@MikePennington ok, I'm sorry I didn't know that, I just thought since it is a lot related with networking it might fit there better...game0ver

2 Answers

2
votes

If you're reading packets from a raw socket (as shown in your source code), then you can easily read all packets from the same socket. Be sure this is what you intend to do. A raw socket is for doing packet inspection for troubleshooting, forensic, security or educational purposes. You cannot easily communicate with another system this way.

And likewise, the receive calls will not differ here by protocol because you are not actually using TCP or UDP, you're simply receiving the raw packets that those protocols build and decode.

(1) Should I use recv() or recvfrom(), if I want to read both TCP and UDP packets?

Either one will work. recv() will return to you only the actual packet data, while recvfrom will return to you the data along with metadata about the packet, including the interface from which the data was received (and other things defined in struct sockaddr_ll from the packet(7) man page).

(2) Do the calls return data one packet at a time, or do they return after the buffer is filled up? eg, if I have a large buffer of 4096 bytes, and the incoming streaming 2 packets have 2400 bytes each, will the call return as soon as the 1st packet ends, or will it return after filling up the buffer from the 2nd packet also?

When using a raw socket like this, you get exactly one packet at a time. You will never get more than one. If the buffer you give is not large enough, then the packet will be truncated (with the ending bytes discarded).

(2a) same question, but if I have a smaller buffer of 2000 bytes. It is clear that on the 1st call I will get the first 2000 bytes of the 1st packet. But on the next call, will I get the last 400 bytes of the 1st packet, or the first 2000 bytes of the 2nd packet?

Generally speaking, packets on most networks are limited to about 1514 bytes. This is because the traditional "MTU" (Maximum Transfer Unit) that is configured on the network interface is 1500 bytes and usually an Ethernet header containing two MAC addresses (6 bytes each) plus a two-byte Ethertype is prepended to that. In a switch or router, you may also see packets that have an additional 4-byte header containing a VLAN header (IEEE 802.1Q). (But, some networks internally use "jumbo" packets up to about 9K in size for specific purposes.)

You should also understand that, in writing an application, one can send UDP datagrams (or TCP buffers) larger than the maximum packet size. In that case, the OS breaks those up into smaller chunks for sending (and they are re-assembled on the destination side before being handed to an application). When you're receiving raw packets like this, you will see the packets in their low-level, possibly fragmented, state.

(3) If I am delayed in making the next call, maybe because I was busy processing the 1st dataset, am I in danger of losing data, or will the OS keep its internal queue of the incoming packets to be given to me when I call the next time? If the OS keeps its internal queue, where can I find information about its size?

The OS will keep a queue of packets for you. The size is of course limited since there is no way you would be able to keep up with, say, a 1Gb NIC at full line rate (let alone a 10Gb or higher NIC). The size is configured in a system-specific way. On linux -- and probably other Unix-based systems -- you can call getsockopt with SOL_SOCKET / SO_RCVBUF to get an idea of the queue space available.

On linux, at least, the size can be set with setsockopt up to a system-imposed maximum (which itself can be configured with various sysctl settings).

1
votes

I think you should not do that, because TCP assures various things like reliability, ordering, flow control, and congestion. However UDP does not guarantee anything.

These parameters are defined in the moment of creation of the socket by operating system. That is why I think that you cannot do that you are saying.

Open two different sockets, one native UDP sock and one native TCP sock.