
I'm working on a very simple lobby system for a game. Each client broadcasts two packets via UDP at regular intervals to initially discover other clients and transmit user info, readiness, etc. The game is being developed for both Windows and Linux (32 & 64 bit).

On the Windows side, I've gotten the lobby system working flawlessly. When I enter the lobby on one Windows machine, the person pops up in the other machines. Similarly, ready checks and disconnects are detected right away. In other words, it works.

Now the problem: Linux. The network code is virtually identical, with only a few necessary platform-specific changes. I first tried Windows<->Linux. Using Wireshark, I found that the Linux side was indeed broadcasting packets and receiving them from the Windows box, but the game never caught the packets. I found a bug in my select() call (I was passing socket instead of socket + 1 as the first argument), but fixing it didn't help. The Windows box was broadcasting packets, but it wasn't receiving packets from the Linux box at all!

I then tried Linux<->Linux, but found that even though both machines were broadcasting and receiving (again, confirmed via Wireshark), the games on both machines couldn't "see" the packets.

I'm pretty sure it's not a firewall issue (turned everything off, tested, turned everything back on, no change, on either platform) and network connectivity seems ok (was able to ping each host manually). I also checked to make sure the ports were indeed available (they were).

Below is the code for broadcasting packets:

    void NetworkLinux::BroadcastMessage(const std::string &msg,
            const char prefix)
    {
        string data(prefix + msg);

        if (sendto(linuxSocket, data.c_str(), static_cast<int>(data.length()), 0,
        reinterpret_cast<sockaddr*>(&broadcastAddr), sizeof(broadcastAddr)) == -1)
        {
            Display_PError("sendto");
        }
    }

And the code for receiving packets:

    const Message NetworkLinux::ReceiveMessage()
    {
        char buffer[recvBufferLength];
        fill(buffer, buffer + recvBufferLength, 0);
        sockaddr_in sender;
        int senderLen = sizeof(sender);

        fd_set read_fds;
        FD_ZERO(&read_fds);
        FD_SET(linuxSocket, &read_fds);

        timeval time;
        time.tv_sec = 0;
        time.tv_usec = 16667; // microseconds, so this is ~1/60 sec

        int selectResult = select(linuxSocket + 1, &read_fds, 
                                      nullptr, nullptr, &time);
        if (selectResult == -1)
        {
            Display_PError("select");
        }
        else if (selectResult > 0) // 0 means it timed-out
        {
            int receivedBytes = recvfrom(linuxSocket, buffer, 
                        recvBufferLength, 0, reinterpret_cast<sockaddr*>(&sender),
                        reinterpret_cast<socklen_t*>(&senderLen));

            if (receivedBytes == -1)
            {
                Display_PError("recvfrom");
            }
            else if (receivedBytes > 0)
            {
                Message msg;
                msg.prefix = buffer[0];
                msg.msg = string(buffer + 1, buffer + receivedBytes);
                msg.address = sender.sin_addr;
                return msg;
            }
        }
        Message m;
        m.prefix = 'N';
        return m;
    }

Why does select() keep coming back with 0 when I can see packets arriving? Moreover, why does it work in the Windows<->Windows scenario, but not Linux<->Linux or Linux<->Windows?

Edit: Here is the socket creation/setup code, as requested. Sample IPs/broadcast addresses calculated are: 192.168.1.3/192.168.1.255, 192.168.1.5/192.168.1.255, which match what the Windows side generated and used.

    bool NetworkLinux::StartUp()
    {
        // zero addr structures
        memset(&machineAddr, 0, sizeof machineAddr);
        memset(&broadcastAddr, 0, sizeof broadcastAddr);

        // get this machine's IP and store it
        machineAddr.sin_family = AF_INET;
        machineAddr.sin_port = htons(portNumber);
        inet_pton(AF_INET, GetIP().c_str(), &(machineAddr.sin_addr));

        // get the netmask and calculate/store the correct broadcast address
        broadcastAddr.sin_family = AF_INET;
        broadcastAddr.sin_port = htons(portNumber);
        GetNetMask();
        broadcastAddr.sin_addr.s_addr = machineAddr.sin_addr.s_addr | ~netmask;

        char bufIP[INET_ADDRSTRLEN], bufBroadcast[INET_ADDRSTRLEN];
        inet_ntop(AF_INET, &machineAddr.sin_addr, bufIP, INET_ADDRSTRLEN);
        inet_ntop(AF_INET, &broadcastAddr.sin_addr, bufBroadcast,
                INET_ADDRSTRLEN);
        Log("IP is: " + string(bufIP) + "\nBroadcast address is: "
                + string(bufBroadcast));

        // create socket
        linuxSocket = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
        if (linuxSocket == -1)
        {
            Display_PError("socket");
            return false;
        }
        Log("Socket created.");

        // switch to broadcast mode
        int broadcast = 1;
        if (setsockopt(linuxSocket, SOL_SOCKET, SO_BROADCAST, &broadcast,
                sizeof broadcast) == -1)
        {
            Display_PError("setsockopt");
            close(linuxSocket);
            return false;
        }
        Log("Socket switched to broadcast mode.");

        // bind it (this simplifies things by making sure everyone is using the same port)
        if (bind(linuxSocket, reinterpret_cast<sockaddr*>(&machineAddr),
                sizeof(machineAddr)) == -1)
        {
            Display_PError("bind");
            close(linuxSocket);
            return false;
        }
        Log("Socket bound.");

        return true;
    }
Something related: You should use multicast instead of broadcast, which is nicer for the network (later you can also route things across multiple networks, or move more easily to IPv6, which does not support broadcast). Also, have you considered using a library that handles most of this for you? That should reduce the number of places where things can go wrong. - PlasmaHH
You need to show how you're setting up the socket, too. sockaddr_in, setsockopt(), and all that. - Warren Young
@PlasmaHH I had looked into multicast before, but read that some routers don't support it; I can still look into it in the future, however. In any case, this is bothering me just because it should work, but doesn't. Most of the libraries I looked at required some sort of server address, which doesn't help in this situation since clients don't know who is out there on the LAN yet. - Gemini14
@Gemini14: I am quite sure that routers that can handle broadcasts can also handle multicasts. - PlasmaHH
@Warren Young Wish granted, I've added the code showing the socket creation stage, option setting, etc. - Gemini14

2 Answers

4
votes
    machineAddr.sin_port = htons(portNumber);
    inet_pton(AF_INET, GetIP().c_str(), &(machineAddr.sin_addr));

    :

    bind(linuxSocket, reinterpret_cast<sockaddr*>(&machineAddr),

This binds the socket so that it only accepts packets sent to portNumber at the machine address returned by GetIP(). That is probably not what you want, because you also need to receive packets sent to that port at the broadcast address. Set sin_addr to INADDR_ANY, the wildcard address, instead; the socket will then receive packets sent to the port at any address that reaches the machine.
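As a rough illustration of the change, here is a minimal sketch of the setup with a wildcard bind. The helper name OpenBroadcastSocket and its port parameter are hypothetical, not taken from the question's code; the calls themselves are the standard BSD socket API.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the question): creates a UDP socket that may
// send broadcasts and is bound to the wildcard address, so it receives
// datagrams sent to the port at any local address, including the subnet
// broadcast address. Returns the descriptor, or -1 on failure.
int OpenBroadcastSocket(std::uint16_t port)
{
    int fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (fd == -1)
        return -1;

    // Allow sending to broadcast addresses.
    int enable = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_BROADCAST, &enable, sizeof enable) == -1)
    {
        close(fd);
        return -1;
    }

    sockaddr_in local;
    std::memset(&local, 0, sizeof local);
    local.sin_family = AF_INET;
    local.sin_port = htons(port);
    // Wildcard address instead of the single address returned by GetIP().
    local.sin_addr.s_addr = htonl(INADDR_ANY);

    if (bind(fd, reinterpret_cast<sockaddr*>(&local), sizeof local) == -1)
    {
        close(fd);
        return -1;
    }
    return fd;
}
```

In the question's StartUp(), the equivalent change would presumably be to bind with a sockaddr_in whose sin_addr is INADDR_ANY, while still using machineAddr to compute the broadcast address.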

0
votes

My guess is that you forgot to set the SO_BROADCAST socket option, so broadcast packets are being filtered out.