I made a simple C++ program for the ARMv7 architecture (compiled with the Linaro gnueabihf toolchain using a raspi rootfs) that takes arguments such as baud rate, data, and serial port, sends the data to the selected serial port, and receives the response. At least that's the goal.

I'm currently using it to send a command that disables or enables the backlight on an industrial screen through a UART port. The screen takes a simple text command terminated with CRLF and returns a response. According to its specification, the screen communicates at 9600 baud with no parity, 8 data bits and 1 stop bit, so pretty much standard.

While sending works flawlessly, I cannot seem to find a way to properly receive the response. I tried configuring the termios port structure in multiple different ways (disabling hardware flow control, using cfmakeraw, configuring the VMIN and VTIME values), but without luck.

First, I'm receiving all the input byte by byte (each read() call returns exactly 1 byte), but that in itself wouldn't be a problem.

When using nonblocking mode without select() I'm receiving all bytes, but I don't know when to stop receiving (and I want it to be universal, so I send a command, expect a simple response and if there is no more data then just exit). I made a time counter since the last message, so if nothing was received in last ~500ms then I assume nothing more will come. But this sometimes loses some bytes of the response and I don't know why.

When using blocking mode, I receive correct bytes (still byte by byte though), but I don't know when to stop and the last call to read() leaves the program hanging, because nothing else comes in the input.

When adding select() to the blocking call, to see if input is readable, I get very frequent data loss (sometimes just receiving a few bytes), and sometimes select returns 1, but read() blocks, and I'm left hanging.

When I just send data without doing any reading and watch the input with cat -v < /dev/ttyS3, I see correct input on the serial port every time. However, when I run both cat and my program as receivers, only one of them gets the data (or cat receives a few bytes and my program a few). This suggests to me that something is "stealing" my bytes in the same way when I try to read them, but what could it be, and why does it happen?

My current code (using the nonblocking read + 500 ms timeout), which still loses some bytes from time to time:

#include <stdio.h>
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>
#include <time.h>


int open_port(char* portname)
{
    int fd; // file descriptor for the serial port

    fd = open(portname, O_RDWR | O_NOCTTY | O_NDELAY);

    if(fd == -1) // if open is unsuccessful
    {
        printf("Error: open_port: Unable to open %s. \n", portname);
    }
    else
    {
        //fcntl(fd, F_SETFL, 0);
        fcntl(fd, F_SETFL, FNDELAY);
    }

    return(fd);
}


int configure_port(int fd, int baud_rate)
{
    struct termios port_settings;

    tcgetattr(fd, &port_settings);
    cfsetispeed(&port_settings, baud_rate);    // set baud rates
    cfsetospeed(&port_settings, baud_rate);
    cfmakeraw(&port_settings);
    port_settings.c_cflag &= ~PARENB;    // no parity
    port_settings.c_cflag &= ~CSTOPB; // 1 stop bit
    port_settings.c_cflag &= ~CSIZE;
    port_settings.c_cflag |= CS8; // 8 data bits


    tcsetattr(fd, TCSANOW, &port_settings);    // apply the settings to the port
    return(fd);

}

/**
 * Convert int baud rate to actual baud rate from termios
 */
int get_baud(int baud)
{
    switch (baud) {
        case 9600:
            return B9600;
        case 19200:
            return B19200;
        case 38400:
            return B38400;
        case 57600:
            return B57600;
        case 115200:
            return B115200;
        case 230400:
            return B230400;
        case 460800:
            return B460800;
        case 500000:
            return B500000;
        case 576000:
            return B576000;
        case 921600:
            return B921600;
        case 1000000:
            return B1000000;
        case 1152000:
            return B1152000;
        case 1500000:
            return B1500000;
        case 2000000:
            return B2000000;
        case 2500000:
            return B2500000;
        case 3000000:
            return B3000000;
        case 3500000:
            return B3500000;
        case 4000000:
            return B4000000;
        default: 
            return -1;
    }
}

unsigned char* datahex(char* string) {

    if(string == NULL) 
       return NULL;

    size_t slength = strlen(string);
    if((slength % 2) != 0) // must be even
       return NULL;

    size_t dlength = slength / 2;

    unsigned char* data = (unsigned char*)malloc(dlength);
    memset(data, 0, dlength);

    size_t index = 0;
    while (index < slength) {
        char c = string[index];
        int value = 0;
        if(c >= '0' && c <= '9')
          value = (c - '0');
        else if (c >= 'A' && c <= 'F') 
          value = (10 + (c - 'A'));
        else if (c >= 'a' && c <= 'f')
          value = (10 + (c - 'a'));
        else {
          free(data);
          return NULL;
        }

        data[(index/2)] += value << (((index + 1) % 2) * 4);

        index++;
    }

    return data;
}

int main(int argc, char **argv) {

    int baud_rate = B9600;
    baud_rate = get_baud(atoi(argv[1]));
    if(baud_rate == -1) {
        printf("Error: Cannot convert baud rate %s, using 9600\n", argv[1]);
        baud_rate = B9600;
    }

    bool convertHex = false;
    char portName[24] = "/dev/ttyS0";
    bool debug = false;
    bool noreply = false;

    for(int i = 3; i < argc; i++) {
        if(!strcmp(argv[i], "hex"))
            convertHex = true;
        else if(strstr(argv[i], "/dev/") != NULL)
            strncpy(portName, argv[i], sizeof(portName));
        else if(!strcmp(argv[i], "debug"))
            debug = true;
        else if(!strcmp(argv[i], "no-reply"))
            noreply = true;
    }

    unsigned char* data = nullptr;
    size_t len = 0;

    if(convertHex) {
        data = datahex(argv[2]);
        if(data == NULL) {
            convertHex = false;
            printf("Error: Couldn't convert hex value! Needs to be even length (2 chars per byte)\n");
        }
        else
            len = strlen(argv[2])/2;
    } 

    if(!convertHex) {
        data = (unsigned char*)argv[2];
        len = strlen(argv[2]);
    }

    int fd = open_port(portName);
    if(fd == -1) {
        printf("Error: Couldn't open port %s\n", portName);
        if(convertHex)
            free(data);
        return 0;
    }

    configure_port(fd, baud_rate);

    if(debug) {
        printf("Sending data (raw): ");
        for(int i =0; i< len; i++) {
            printf("%02X", data[i]);
        }
        printf("\n");
    }
    size_t writelen = write(fd, data, len);

    if(debug)
        printf("Sent %zu/%zu bytes\n", writelen, len);
    if(writelen != len)
        printf("Error: not all bytes were sent (%zu/%zu)\n", writelen, len);
    else if(noreply)
        printf("WRITE OK");

    if(!noreply) {
        unsigned  char ibuff[512] = {0};

        int curlen = 0; // full length

        clock_t begin_time = clock();

        while(( float(clock() - begin_time) / CLOCKS_PER_SEC) < 0.5 && curlen < sizeof(ibuff)) {

            int ret = read(fd, ibuff+curlen, sizeof(ibuff)-curlen-1);

            if(ret < 0) {
                ret = 1;
                continue;
            }

            if(ret > 0) {
                curlen += ret;
                begin_time = clock();
            }
        }

        if(curlen > 0) {
            ibuff[curlen] = 0; // null terminator
            printf("RESPONSE: %s", ibuff);
        }

    }

    if(fd)
        close(fd);

    if(convertHex)
        free(data);
    return 0;
}

I launch the program like ./rs232 9600 [hex string] hex debug

The screen should return a response like #BLIGHT_ON!OK, but sometimes I receive, for example, #BLI_ON!O

What can be the cause of this? I did some serial communication earlier between QtSerial and an STM32 controller and had no such data loss.

Does it work if you remove the timeout part? Also what's the point of ret = 1; continue? - Lundin
Why is while(( float(clock() - begin_time)... called when curlen == sizeof(ibuff) - 1 to read zero bytes? - chux - Reinstate Monica
Instead of your program trying to time the reception of data (which it cannot accurately do since it is merely fetching bytes from a system buffer), you can simply utilize the built-in termios capability. Use blocking mode, and set VMIN=sizeof(ibuff)-1 and VTIME=5. Your termios initialization should also set CLOCAL and CREAD in c_cflag. - sawdust

1 Answer

First thing is that, I'm receiving all the input byte by byte (so each read() call returns exactly 1 byte..) [...]

That's not surprising. The response is coming back at 9600 baud, which is likely much slower per byte than one iteration of the loop requires. It would also arise directly from some configurations of the serial driver. It should be possible to tune this by manipulating VMIN and VTIME, but do note that that requires disabling canonical mode (which you probably want to do anyway; see below).

When using nonblocking mode without select() I'm receiving all bytes, but I don't know when to stop receiving (and I want it to be universal, so I send a command, expect a simple response and if there is no more data then just exit). I made a time counter since the last message, so if nothing was received in last ~500ms then I assume nothing more will come. But this sometimes loses some bytes of the response and I don't know why.

It's all in the details, which you have not presented for that case. We cannot therefore speak to your particular data losses.

Generally speaking, if you're working without flow control, then you have to be sure to read each byte before the next one arrives, on average, or else new bytes will soon overwrite previously received ones. VMIN and VTIME can help with that, or one can try other methods to tune read timing, but note well that a 9600 baud response delivers bytes at a rate of nearly one per millisecond (about 960 bytes per second with 8N1 framing), so a 500 ms delay between read attempts is much too long. Supposing that the particular responses you are trying to read are relatively short, however, this will not explain the data losses.
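
To put numbers on the byte rate, here is a small sketch of the framing arithmetic. The helper names frame_bits and char_time_us are made up for illustration; the only assumption is the usual asynchronous framing of one start bit, the data bits, an optional parity bit, and the stop bits.

```cpp
#include <cassert>

// Bits on the wire per character:
// 1 start bit + data bits + optional parity bit + stop bits.
constexpr int frame_bits(int data_bits, bool parity, int stop_bits) {
    return 1 + data_bits + (parity ? 1 : 0) + stop_bits;
}

// Time needed to transmit one character, in microseconds.
constexpr double char_time_us(int baud, int data_bits, bool parity, int stop_bits) {
    return 1e6 * frame_bits(data_bits, parity, stop_bits) / baud;
}
```

For 9600 baud 8N1, frame_bits(8, false, 1) is 10, so one character takes roughly 1042 microseconds, and a 13-character response such as #BLIGHT_ON!OK occupies the line for about 13.5 ms.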

When using blocking mode, I receive correct bytes (still byte by byte though), but I don't know when to stop and the last call to read() leaves the program hanging, because nothing else comes in the input.

So the command is required to be CRLF-terminated, but the response cannot be relied upon to be likewise terminated? What a rude device you're working with. If it terminated its responses the same way it required terminated commands, then you could probably work in canonical mode, and you could definitely watch for the terminator to recognize end-of-transmission.
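
If the device did terminate its responses with CRLF (an assumption; the question suggests it may not), watching for the terminator could look roughly like the sketch below. The helper read_until_crlf is hypothetical, and it assumes a blocking file descriptor. It reads one byte at a time so that it never consumes data past the terminator.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <unistd.h>

// Hypothetical helper: read from fd until "\r\n" has been seen, max_len bytes
// have been collected, or read() reports end-of-file or an error.
// Returns true only if the terminator was found. The bytes read so far
// (terminator included) are left in out either way.
bool read_until_crlf(int fd, std::string& out, std::size_t max_len) {
    assert(max_len >= 2);
    out.clear();
    char c;
    while (out.size() < max_len) {
        ssize_t n = read(fd, &c, 1);
        if (n < 0 && errno == EINTR)
            continue;              // interrupted by a signal: retry
        if (n <= 0)
            return false;          // end-of-file or a real error
        out.push_back(c);
        if (out.size() >= 2 && out.compare(out.size() - 2, 2, "\r\n") == 0)
            return true;           // terminator seen: response complete
    }
    return false;                  // buffer limit reached without a terminator
}
```

On a serial port this would normally be combined with a read timeout (VTIME) so that a missing terminator cannot hang the program.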

When adding select() to the blocking call, to see if input is readable, I get very frequent data loss (sometimes just receiving a few bytes), and sometimes select returns 1, but read() blocks, and I'm left hanging.

I cannot suggest what the problem may be in that case without any relevant code to analyze, but you really shouldn't need select() for this.

When I just send data without doing any reading, and look at the input using cat -v < /dev/ttyS3 I can actually see correct input on the serial port all the time,

That's a good test.

however when I run both cat and my program as receivers, only one of them gets the data (or cat receives a few bytes and my program a few),

That's exactly as I would expect. Once a program reads a byte from the port, it is no longer available for any other program to read. Thus, if multiple programs try to read from the same port at the same time then the data available will be partitioned among them in some unspecified and not necessarily consistent fashion.

this suggests me that something is "stealing" my bytes the same way when I try to read it, but what could it be, and why is it like that?

That seems unlikely, considering that cat is not affected the same way when you run it alone, nor (you report) are some versions of your own program.


In the first place, if the device supports flow control then I would enable it. Hardware flow control in preference to software flow control if both are viable. This is mainly a fail-safe, however -- I don't see any reason to think that flow control is likely to actually trigger if your program is well written.
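
As a rough sketch of what enabling flow control means at the termios level: the helper set_flow_control and the FlowControl enum below are illustrative names, not part of any library. CRTSCTS is a common Linux/BSD extension rather than a POSIX flag, and whether the screen actually supports either kind of flow control is an assumption to check against its specification.

```cpp
#include <termios.h>

enum class FlowControl { None, Hardware, Software };

// Select the flow-control mode in a termios structure.
// The caller still has to apply the structure with tcsetattr().
void set_flow_control(termios& tio, FlowControl mode) {
    tio.c_cflag &= ~CRTSCTS;                  // RTS/CTS hardware flow control off
    tio.c_iflag &= ~(IXON | IXOFF | IXANY);   // XON/XOFF software flow control off
    if (mode == FlowControl::Hardware)
        tio.c_cflag |= CRTSCTS;
    else if (mode == FlowControl::Software)
        tio.c_iflag |= IXON | IXOFF;
}
```

Software flow control reserves the XON and XOFF bytes, so it is only suitable when the payload never contains them.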

Mainly, then, in addition to setting the serial line parameters (8/n/1), you should

  • Disable canonical mode. This is necessary because you (apparently) cannot rely on the response to be terminated by a line terminator, among other reasons.
  • Disable echo.
  • Avoid enabling non-blocking mode on the file.
  • (Optional) Read the first response byte with VMIN == 1 and VTIME == 0; this allows for an arbitrary delay before the device starts sending the response. Alternatively, if you have a reliable upper bound on the time you're willing to wait for the device to start sending the response, then you can probably skip this step by using a suitable VTIME in the next one. Or perhaps use a larger VTIME for this first byte to accommodate a delay before start of transmission, yet not hang if the device fails to respond.
  • Do read the remaining response bytes with VTIME == 1 (or larger) and VMIN == 0. This probably gets you the whole remainder of the response in one call, but do repeat the read() until it returns 0 (or negative). The 0 return indicates that all available bytes have been transferred and no new ones were received for VTIME tenths of a second -- much longer than the inter-character time in a 9600-baud transmission even for VTIME == 1. Do note that the larger you make VTIME, the longer will be the delay between the device sending the last byte of its response and the program detecting end-of-transmission.
  • Do not implement any artificial delay between successive read attempts.
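
The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not a drop-in replacement: make_raw_8n1, set_read_timing, and read_response are hypothetical helper names, the descriptor is assumed to be opened in blocking mode, and the timeouts are given in tenths of a second because that is the unit of VTIME.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <termios.h>
#include <unistd.h>

// Raw 8N1 with the receiver enabled: no canonical mode, no echo.
void make_raw_8n1(termios& tio, speed_t speed) {
    cfmakeraw(&tio);                            // clears ICANON, ECHO, ISIG, ...
    tio.c_cflag &= ~(PARENB | CSTOPB | CSIZE);  // no parity, 1 stop bit
    tio.c_cflag |= CS8 | CLOCAL | CREAD;        // 8 data bits, receiver on
    cfsetispeed(&tio, speed);
    cfsetospeed(&tio, speed);
}

// Noncanonical read timing: VMIN in bytes, VTIME in tenths of a second.
void set_read_timing(termios& tio, cc_t vmin, cc_t vtime) {
    tio.c_cc[VMIN] = vmin;
    tio.c_cc[VTIME] = vtime;
}

// Read one response from a blocking fd that is already in raw mode.
// first_byte_ds: how long to wait for the first byte (0 = wait indefinitely).
// gap_ds: idle time after which the response is considered complete.
// Returns the number of bytes read, 0 if nothing arrived in time, -1 on error.
ssize_t read_response(int fd, unsigned char* buf, std::size_t cap,
                      cc_t first_byte_ds, cc_t gap_ds) {
    assert(cap > 0 && gap_ds > 0);
    termios tio;
    if (tcgetattr(fd, &tio) != 0)
        return -1;

    // Phase 1: wait for the start of the response.
    if (first_byte_ds == 0)
        set_read_timing(tio, 1, 0);             // block until at least 1 byte
    else
        set_read_timing(tio, 0, first_byte_ds); // give up after the timeout
    if (tcsetattr(fd, TCSANOW, &tio) != 0)
        return -1;
    ssize_t n;
    do {
        n = read(fd, buf, cap);
    } while (n < 0 && errno == EINTR);
    if (n <= 0)
        return n;
    std::size_t total = static_cast<std::size_t>(n);

    // Phase 2: keep reading until the line has been idle for gap_ds.
    set_read_timing(tio, 0, gap_ds);
    if (tcsetattr(fd, TCSANOW, &tio) != 0)
        return -1;
    while (total < cap) {
        n = read(fd, buf + total, cap - total);
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0)
            return -1;
        if (n == 0)
            break;                              // idle timeout: end of response
        total += static_cast<std::size_t>(n);
    }
    return static_cast<ssize_t>(total);
}
```

A caller would open the port without O_NDELAY (or clear it with fcntl), fetch the settings with tcgetattr(), pass them through make_raw_8n1(tio, B9600), apply them with tcsetattr(), write the command, and then call something like read_response(fd, ibuff, sizeof(ibuff) - 1, 20, 1). The 2-second first-byte timeout in that call is an arbitrary example value.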

You should not need non-blocking mode at the fcntl level, and you should not need select(). There may be other termios settings you could apply to better tune your program for the particular device at the other end of the serial link, but the above should be enough for single-command / single-response pairs with ASCII-only data and no control characters other than carriage returns and newlines.