0
votes

I have an application running on Windows XP (i7, 2.1 GHz processor). It implements master/slave communication over UDP: the master sends a request, and the slave node responds in burst mode with a data packet every 5 ms, each 1300 bytes long including the header.

Back on the master node, the main thread receives the data and writes it to a queue, triggering a parallel thread to read from the queue.

Problem: The Winsock call takes very long to read the next packet, so data is being lost from the buffer.

Execution time: recvfrom() takes 200 to 400 microseconds.

Open_Sock ()
{
    socket();
    //Error check

    connect ();
    //Error Check
}

Receivethread()
{
sock_again:

    // Zero timeout: select() returns immediately with the socket state
    select(socket, read, write, excep, (0,0));
    //error check

    rc = recvfrom(socket, buf, len, 0, &s_addr, &cln_alen);
    if (rc > 0) {
        enqueue(queue, buf);
    }
}

I am sure the Winsock API does not need such a long time just to fetch the next packet, but I cannot find any information on what the real execution times should be. Any help in this direction is really appreciated.

2
Use IOCP: queue up a lot of buffers, let the kernel fill them up, and process them later. - Martin James
Being able to call recvfrom at least 12.5 times for each buffer sent seems like it should be quite adequate (5 ms / 400 us = 12.5). Are you sure what you are DOING with the data is completing soon enough? Also, we've found that Windows tends to drop more UDP packets than we think it should. We ported a client/server Windows app to Linux and it dropped ZERO packets using the same hardware. - Steve Valliere
By default, sockets run in blocking mode, so recvfrom() will wait for data to arrive before exiting. Since you are using select(), is the socket actually in a readable state before you call recvfrom()? Knowing that the socket is in blocking mode, you could also just get rid of select(), unless your thread needs to do other things while waiting for data. - Remy Lebeau
Yes, but 12.5 times is still not enough; I drop packets, so this needs optimization. After recvfrom() there is only the push into the queue, which takes 4-5 microseconds, and then the thread returns to fetch the next packet. Yes, after select() I read the state of the socket, check for readability (FD_READ), and only then call recvfrom(). This lets it work in a non-blocking fashion, since select() returns immediately with the state. Removing select() would increase the execution time, as the socket is in blocking mode. - Pipa's
Here, a single thread reads from the RX buffer and pushes into the queue. As I understand it, IOCP is effective for multi-threaded socket handling; for a single socket reading from the buffer, it adds overhead rather than simplifying things. Is my understanding correct? - Pipa's

2 Answers

0
votes

If losing packets is an issue, use TCP. Using TCP, I achieved a response time of less than one millisecond on a less modern machine for a simple loopback connection. Some important points there:

  • Use WSAEventSelect() in combination with WaitForMultipleObjects() when waiting for traffic. I'm not sure if this makes a big difference compared to select(), but it makes handling easier if you want to stop the thread with an additional event.
  • Allocate a buffer before waiting for input; this reduces latency a bit further.
  • Try not to create a thread for each packet but have threads waiting already, i.e. use a thread pool.
  • Try to send as few packets as possible, i.e. try to assemble the whole data in memory and send it with a single call. This avoids the network I/O overhead for multiple packets that are assembled later on.
  • Also take a look at the Nagle algorithm, which you will probably want to turn off for TCP. The Nagle algorithm in combination with delayed acknowledgment can seriously affect your latency.
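
As a rough illustration of the last point, turning off the Nagle algorithm is usually done with the TCP_NODELAY socket option. The sketch below is only a minimal example under assumptions: the helper names disable_nagle and nagle_disabled and the sock_t typedef are made up for illustration, and on Windows WSAStartup() must already have been called before creating the socket.

```c
#ifdef _WIN32
#include <winsock2.h>
#include <ws2tcpip.h>
typedef SOCKET sock_t;          /* hypothetical alias used only in this sketch */
#else
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
typedef int sock_t;             /* hypothetical alias used only in this sketch */
#endif

/* Hypothetical helper: disables the Nagle algorithm on a TCP socket.
   Returns 0 on success, nonzero on failure. */
int disable_nagle(sock_t s)
{
    int on = 1;
    return setsockopt(s, IPPROTO_TCP, TCP_NODELAY,
                      (const char *)&on, sizeof(on));
}

/* Hypothetical helper: reads the option back.
   Returns 1 if Nagle is disabled, 0 if it is enabled, -1 on error. */
int nagle_disabled(sock_t s)
{
    int value = 0;
    socklen_t len = sizeof(value);
    if (getsockopt(s, IPPROTO_TCP, TCP_NODELAY, (char *)&value, &len) != 0)
        return -1;
    return value != 0;
}
```

One would call disable_nagle() right after socket() or accept() and before sending any data; the option applies per socket, so both ends need it if both send small messages.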
0
votes

You are probably hitting a combination of send/receive buffer size and OS scheduler issues. On Windows, context switches between threads are not very frequent, so there are two options you can use:

  1. Increase priority of the server process

    This will reduce the time your server application spends waiting in the scheduler's queue.

  2. Increase the receiving buffer size

    You need to do this on both ends. You can use setsockopt with the SO_RCVBUF option:

    int size = 1 * 1024 * 1024;
    setsockopt(socket, SOL_SOCKET, SO_RCVBUF, (const char*)&size, sizeof(int));
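
Since the operating system may adjust or cap the requested size, it can be worth reading the value back with getsockopt after setting it. The following is only a sketch under assumptions: the helper name set_rcvbuf and the sock_t typedef are invented for illustration, and on Windows WSAStartup() must already have been called.

```c
#ifdef _WIN32
#include <winsock2.h>
#include <ws2tcpip.h>
typedef SOCKET sock_t;          /* hypothetical alias used only in this sketch */
#else
#include <netinet/in.h>
#include <sys/socket.h>
typedef int sock_t;             /* hypothetical alias used only in this sketch */
#endif

/* Hypothetical helper: requests a receive buffer of `requested` bytes and
   returns the size the OS reports afterwards, or -1 on error.
   The reported size may differ from the requested one. */
int set_rcvbuf(sock_t s, int requested)
{
    int granted = 0;
    socklen_t len = sizeof(granted);

    if (setsockopt(s, SOL_SOCKET, SO_RCVBUF,
                   (const char *)&requested, sizeof(requested)) != 0)
        return -1;

    if (getsockopt(s, SOL_SOCKET, SO_RCVBUF, (char *)&granted, &len) != 0)
        return -1;

    return granted;
}
```

For the traffic described in the question (one 1300-byte packet every 5 ms), a 1 MiB buffer would hold roughly 800 packets, or about 4 seconds of data, ignoring any per-datagram bookkeeping overhead in the kernel.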