0
votes

In LDD3's (Linux Device Drivers 3rd Edition book) scullpipe, what do the parameters in static int scull_read_p_mem(char *buf, char **start, off_t offset, int count, int *eof, void *data) mean? Specifically, I don't understand the difference between start, page, and offset.

There are a number of questions I have regarding the actual implementation itself (which can be found below):

struct scull_pipe {
        wait_queue_head_t inq, outq;       /* read and write queues */
        char *buffer, *end;                /* begin of buf, end of buf */
        int buffersize;                    /* used in pointer arithmetic */
        char *rp, *wp;                     /* where to read, where to write */
        int nreaders, nwriters;            /* number of openings for r/w */
        struct fasync_struct *async_queue; /* asynchronous readers */
        struct semaphore sem;              /* mutual exclusion semaphore */
        struct cdev cdev;                  /* Char device structure */
};

int scull_p_nr_devs;            /* the number of scull_pipe devices */
struct scull_pipe *scull_p_devices;    /* scull_pipe devices to be malloc'ed */

/* ...... */

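/* forward declaration: the helper is defined after scull_read_p_mem */
static void scullp_proc_offset(char *buf, char **start, off_t *offset, int *len);

/* our proc read implementation */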
static int scull_read_p_mem(char *buf, char **start, off_t offset, int count,
        int *eof, void *data)
{
    int i, len;
    struct scull_pipe *p;

#define LIMIT (PAGE_SIZE-200)   /* don't print any more after this size */
    *start = buf;
    len = sprintf(buf, "Default buffersize is %i\n", scull_p_buffer);
    for(i = 0; i < scull_p_nr_devs && len <= LIMIT; i++) {
        p = &scull_p_devices[i];
        if (down_interruptible(&p->sem))
            return -ERESTARTSYS;
        len += sprintf(buf+len, "\nDevice %i: %p\n", i, p);
        len += sprintf(buf+len, "   Buffer: %p to %p (%i bytes)\n", 
                                    p->buffer, p->end, p->buffersize);
        len += sprintf(buf+len, "   rp %p   wp %p\n", p->rp, p->wp);
        len += sprintf(buf+len, "   readers %i   writers %i\n", 
                                    p->nreaders, p->nwriters);
        up(&p->sem);
        scullp_proc_offset(buf, start, &offset, &len);
    }
    *eof = (len <= LIMIT);
    return len;
}


static void scullp_proc_offset(char *buf, char **start, off_t *offset, int *len)
{
    /* QUESTION: what does this function do? */
    if (*offset == 0)
        return;
    if (*offset >= *len) {
        *offset -= *len;    /* QUESTION: what is the purpose of this? */
        *len = 0;
    }
    else {
        *start = buf + *offset; /* QUESTION: why do you need to change "start"? */
        *offset = 0;
    }
}

2 Answers

1
votes

The function scull_read_p_mem is registered as a proc entry here using the create_proc_read_entry function. A five-minute Google search turned up this page, which explains the parameters of the function pointer passed to create_proc_read_entry. With the formatting fixed, it says:

Arguments:
*buf : The kernel allocates a page of memory to any process that attempts to read a proc entry. The page pointer points to that buffer of memory into which the data is written.
**start: This pointer is used when the reading of the proc file should not start from the beginning of the file but from a certain offset. For small reads this is generally set to NULL.
off : The offset from the beginning of the file where the file pointer currently points to
count : The number of bytes of data to be read
data : The data passed from the create_proc_read_entry function call.
eof: is set to 1 to indicate end of file

After a while I also found some documentation in the kernel source, in fs/proc/generic.c. It's a bit long, but I think it's the only source that sums up the start parameter:

        /*
         * How to be a proc read function
         * ------------------------------
         * Prototype:
         *    int f(char *buffer, char **start, off_t offset,
         *          int count, int *peof, void *dat)
         *
         * Assume that the buffer is "count" bytes in size.
         *
         * If you know you have supplied all the data you
         * have, set *peof.
         *
         * You have three ways to return data:
         * 0) Leave *start = NULL.  (This is the default.)
         *    Put the data of the requested offset at that
         *    offset within the buffer.  Return the number (n)
         *    of bytes there are from the beginning of the
         *    buffer up to the last byte of data.  If the
         *    number of supplied bytes (= n - offset) is 
         *    greater than zero and you didn't signal eof
         *    and the reader is prepared to take more data
         *    you will be called again with the requested
         *    offset advanced by the number of bytes 
         *    absorbed.  This interface is useful for files
         *    no larger than the buffer.
         * 1) Set *start = an unsigned long value less than
         *    the buffer address but greater than zero.
         *    Put the data of the requested offset at the
         *    beginning of the buffer.  Return the number of
         *    bytes of data placed there.  If this number is
         *    greater than zero and you didn't signal eof
         *    and the reader is prepared to take more data
         *    you will be called again with the requested
         *    offset advanced by *start.  This interface is
         *    useful when you have a large file consisting
         *    of a series of blocks which you want to count
         *    and return as wholes.
         *    (Hack by [email protected])
         * 2) Set *start = an address within the buffer.
         *    Put the data of the requested offset at *start.
         *    Return the number of bytes of data placed there.
         *    If this number is greater than zero and you
         *    didn't signal eof and the reader is prepared to
         *    take more data you will be called again with the
         *    requested offset advanced by the number of bytes
         *    absorbed.
         */

Later in the same file, start is used as the source address for copy_to_user. The parameter exists so that proc files too big to return in one call can be read in pieces. The reader may ask for only a small count while the file is big, so the read function is called repeatedly, and *start tells the kernel where in the buffer the data for the requested offset begins. The return value says how many bytes are available there. As far as I can tell from fs/proc/generic.c, the kernel copies at most count of them per call for methods 0 and 2; for method 1 the block is copied whole.
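To make the three conventions from the comment concrete, here is a small user-space model of how the caller could decide where to copy from after one call to the read function. It is a simplified sketch based on the comment, not the kernel code itself: struct copy_plan and plan_copy are hypothetical names, and the overflow checks the kernel performs are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What the caller does after one call to the read function. */
struct copy_plan {
    const char *src;   /* address that would be handed to copy_to_user() */
    long nbytes;       /* bytes to copy; 0 means nothing to copy */
    long advance;      /* how far the file position moves */
};

/*
 * Hypothetical helper modelling the three conventions.
 *   page:  buffer that was passed to the read function
 *   start: value the read function left in *start
 *   n:     return value of the read function
 *   pos:   file offset that was requested
 *   count: number of bytes the reader asked for
 */
static struct copy_plan plan_copy(const char *page, const char *start,
                                  long n, long pos, long count)
{
    struct copy_plan plan = { NULL, 0, 0 };

    assert(n >= 0 && pos >= 0 && count >= 0);

    if (start == NULL) {
        /* Method 0: the data for offset pos sits at page + pos. */
        n -= pos;
        if (n <= 0)
            return plan;            /* nothing new at this offset */
        if (n > count)
            n = count;
        plan.src = page + pos;
        plan.nbytes = n;
        plan.advance = n;
    } else if ((uintptr_t)start < (uintptr_t)page) {
        /* Method 1: *start is a small integer, the data sits at page. */
        plan.src = page;
        plan.nbytes = n;            /* blocks are returned whole */
        plan.advance = (long)(uintptr_t)start;
    } else {
        /* Method 2: the data sits at *start, inside the page. */
        if (n > count)
            n = count;
        plan.src = start;
        plan.nbytes = n;
        plan.advance = n;
    }
    return plan;
}
```

In this model, a method 0 call that returns 100 for a requested offset of 40 yields 60 bytes copied from page + 40, while a method 2 call copies from wherever *start points.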

So:

what do the parameters in static int scull_read_p_mem(char *buf, char **start, off_t offset, int count, int *eof, void *data) mean?

  • buf - destination buffer to copy the result to
  • start - the magic pointer explained in the comment above used to speed up proc reading.
  • offset - offset in the file to read from
  • count - count of bytes to read
  • eof - a pointer to an int that must be set to nonzero once the whole file has been read
  • data - user context, passed as the last parameter of the create_proc_read_entry function.

There are a number of questions I have regarding the actual implementation itself (which can be found below):

scullp_proc_offset adjusts len and offset relative to the buf buffer. If offset != 0, scull_read_p_mem must not return data from the first byte of the file but from some byte offset into it. Because the function is written lazily, the sprintf calls execute anyway and regenerate the output from the beginning, so the already-read part has to be "kind of shifted" out of the buffer.

what does this function do? - As far as I can see, it is a roundabout way of tracking how many bytes have already been delivered to the user and how many still need to be copied.

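what is the purpose of this? - It skips output that an earlier read already delivered. While the requested offset lies at or beyond the bytes generated so far, those bytes are dropped (*len is reset to 0) and *offset is reduced by the same amount. Because this branch runs only when *offset >= *len, *offset cannot become negative here. The comment above the function, /* FIXME this should use seq_file */, says there is something left to fix. I think the idea is to return the information about exactly one scull_p_devices[i] per call.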

why do you need to change "start"? - This follows from the previous point. If *offset is nonzero and smaller than *len, the data for the requested offset begins at buf + *offset, so *start is set to that address to tell the kernel where to copy from. Note that *start = buf was already set at the top of scull_read_p_mem, so the kernel will do copy_to_user(..., *start, len).
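A worked trace may help here. The sketch below copies the helper into user space and drives it with a hypothetical loop, emit_chunks, that appends fixed 10-byte chunks in place of the sprintf calls in scull_read_p_mem. The chunk size and the fill characters are assumptions chosen only to make the trace easy to follow.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>   /* off_t */

/* Same logic as scullp_proc_offset in the question. */
static void scullp_proc_offset(char *buf, char **start, off_t *offset, int *len)
{
    if (*offset == 0)
        return;
    if (*offset >= *len) {      /* still skipping output already read */
        *offset -= *len;
        *len = 0;
    } else {                    /* requested offset lies inside this chunk */
        *start = buf + *offset;
        *offset = 0;
    }
}

/*
 * Hypothetical driver loop: append nchunks chunks of 10 bytes each and call
 * scullp_proc_offset after every chunk, the way scull_read_p_mem calls it
 * after every device. Returns the final len.
 */
static int emit_chunks(char *buf, char **start, off_t offset, int nchunks)
{
    int i, len = 0;

    *start = buf;
    for (i = 0; i < nchunks; i++) {
        memset(buf + len, 'A' + i, 10);   /* stand-in for the sprintf calls */
        len += 10;
        scullp_proc_offset(buf, start, &offset, &len);
    }
    return len;
}
```

With four chunks and a requested offset of 25, the first two chunks are generated and discarded (offset goes 25, 15, 5 while len is reset to 0 each time), the third chunk lands at the beginning of buf and *start moves to buf + 5, and the fourth chunk is appended normally. The function ends with len == 20 and *start pointing at the sixth byte of the third chunk.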

0
votes

buf points to the base of a memory page, where the read result should be stored.

offset indicates where in the virtual file the read should begin.

start, as a return value, can mean a few things:

  1. if *start == NULL, the result must be written at buf + offset
  2. if *start == some integer less than the address of buf, the result must be written at buf
  3. if *start == some address within buf, the result must be written at *start

The function should be interpreted as follows.

// Context:
// 1. start has been set to the beginning of buf:
//     *start = buf;
// 2. Some text has been written into buf and len has been incremented.
static void scullp_proc_offset(char *buf, char **start, off_t *offset, int *len)
{
    // If offset is zero, we are filling a fresh buf
    // (or we have finished catching up, as shown below).
    // We can keep doing so until we reach the buffer size.
    if (*offset == 0)
        return;

    // Otherwise, we still have some offset to catch up to.
    if (*offset >= *len) {
        // The offset is still at or beyond everything written so far.
        // So we reduce offset (we have done some catching up)
        // and reset len (everything so far is old data).
        *offset -= *len;
        *len = 0;
    }
    else {
        // Or the offset falls inside what we have just written.
        // So we set start to tell the kernel the data begins at this address
        // (case 2 in fs/proc/generic.c#L76)
        // and reset offset (we are done catching up).
        *start = buf + *offset;
        *offset = 0;
        // This branch might lack handling of len: len still counts from buf,
        // not from *start. It may still work if the output has the same
        // structure between two calls.
    }
}

In conclusion, the author uses this function so that writing always starts at buf, but len only starts counting once the previous offset has been caught up with.
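The remark about len in the else branch can be made concrete. Once *start has moved to buf + *offset, len still counts bytes from buf, so the value returned by scull_read_p_mem includes the skipped bytes. One possible adjustment, offered only as a sketch and not as the book's code, is to compute the number of valid bytes from *start just before returning. bytes_from_start is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of valid bytes between start and buf + len.
 * buf:   beginning of the buffer that was filled
 * start: address inside buf where the requested data begins
 * len:   total bytes written, counted from buf
 */
static int bytes_from_start(const char *buf, const char *start, int len)
{
    ptrdiff_t skipped = start - buf;   /* bytes in front of the requested data */

    assert(skipped >= 0);
    if (skipped >= len)
        return 0;                      /* nothing valid at or after start */
    return len - (int)skipped;
}
```

Under this assumption, the read function would end with return bytes_from_start(buf, *start, len); instead of return len;. The writer loop must keep using the unadjusted len, because the sprintf calls rely on buf + len as the next write position.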