50
votes

This morning, there were big problems at work because an SNMP trap didn't "go through" because SNMP is run over UDP. I remember from the networking class in college that UDP isn't guaranteed delivery like TCP/IP. And Wikipedia says that SNMP can be run over TCP/IP, but UDP is more common.

I get that some of the advantages of UDP over TCP/IP are speed, broadcasting, and multicasting. But it seems to me that guaranteed delivery is more important for network monitoring than broadcasting ability. Particularly when there are serious high-security needs. One of my coworkers told me that UDP packets are the first to be dropped when traffic gets heavy. That is yet another reason to prefer TCP/IP over UDP for network monitoring (IMO).

So why does SNMP use UDP? I can't figure it out and can't find a good reason on Google either.

5
"Wikipedia says that SNMP can be run over TCP/IP", if you read the RFC3430 carefully, faqs.org/rfcs/rfc3430.html you will see it is experimental, so you could not expect all vendor product supports it.Lex Li
+1 for the stated practical issuesnj-ath
@PP, man you're hard, he needs to dig through RFC1155, 1157, 1212, 1215, 1901, 1908, 2578, 2579, 2580, 3416 and 3417 (v1 & v2c), as well as RFC1213, 2863, 3418, 4001, 4001, 4022, 4113, 4292, 4293 and 4898 (MIB) :)Matthieu
@LexLi 1) Thanks the RFC link 2) Message from the future: the question was not "over what protocol is it running", but "why uses it UDP" 3) sorry for the late reactpeterh

5 Answers

55
votes

UDP is actually expected to work better than TCP in lossy networks (or congested networks). TCP is far better at transferring large quantities of data, but when the network fails it's more likely that UDP will get through. (in fact, I recently did a study testing this and it found that SNMP over UDP succeeded far better than SNMP over TCP in lossy networks when the UDP timeout was set properly). Generally, TCP starts behaving poorly at about 5% packet loss and becomes completely useless at 33% (ish) and UDP will still succeed (eventually).

So the right thing to do, as always, is pick the right tool for the right job. If you're doing routine monitoring of lots of data, you might consider TCP. But be prepared to fall back to UDP for fixing problems. Most stacks these days can actually use both TCP and UDP.

As for sending TRAPs, yes TRAPs are unreliable because they're not acknowledged. However, SNMP INFORMs are an acknowledged version of a SNMP TRAP. Thus if you want to know that the notification receiver got the message, please use INFORMs. Note that TCP does not solve this problem as it only provides layer 3 level notification that the message was received. There is no assurance that the notification receiver actually got it. SNMP INFORMs do application level acknowledgement and are much more trustworthy than assuming a TCP ack indicates they got it.

13
votes

If systems sent SNMP traps via TCP they could block waiting for the packets to be ACKed if there was a problem getting the traffic to the receiver. If a lot of traps were generated, it could use up the available sockets on the system and the system would lock up. With UDP that is not an issue because it is stateless. A similar problem took out BitBucket in January although it was syslog protocol rather than SNMP--basically, they were inadvertently using syslog over TCP due to a configuration error, the syslog server went down, and all of the servers locked up waiting for the syslog server to ACK their packets. If SNMP traps were sent over TCP, a similar problem could occur.

http://blog.bitbucket.org/2012/01/12/follow-up-on-our-downtime-last-week/

5
votes

Check out O'Reilly's writings on SNMP: https://library.oreilly.com/book/9780596008406/essential-snmp/18.xhtml

One advantage of using UDP for SNMP traps is that you can direct UDP to a broadcast address, and then field them with multiple management stations on that subnet.

4
votes

The use of traps with SNMP is considered unreliable. You really should not be relying on traps.

SNMP was designed to be used as a request/response protocol. The protocol details are simple (hence the name, "simple network management protocol"). And UDP is a very simple transport. Try implementing TCP on your basic agent - it's considerably more complex than a simple agent coded using UDP.

SNMP get/getnext operations have a retry mechanism - if a response is not received within timeout then the same request is sent up to a maximum number of tries.

2
votes

Usually, when you're doing SNMP, you're on a company network, you're not doing this over the long haul. UDP can be more efficient. Let's look at (a gross oversimplification of) the conversation via TCP, then via UDP...

TCP version:

client sends SYN to server
server sends SYN/ACK to client
client sends ACK to server - socket is now established
client sends DATA to server
server sends ACK to client
server sends RESPONSE to client
client sends ACK to server
client sends FIN to server
server sends FIN/ACK to client
client sends ACK to server - socket is torn down

UDP version:

client sends request to server
server sends response to client

generally, the UDP version succeeds since it's on the same subnet, or not far away (i.e. on the company network). However, if there is a problem with either the initial request or the response, it's up to the app to decide. A. can we get by with a missed packet? if so, who cares, just move on. B. do we need to make sure the message is sent? simple, just redo the whole thing... client sends request to server, server sends response to client. The application can provide a number just in case the recipient of the message receives both messages, he knows it's really the same message being sent again.

This same technique is why DNS is done over UDP. It's much lighter weight and generally it works the first time because you are supposed to be near your DNS resolver.