2
votes

I have a small issue with an K&R example (sort line example, page 108).

I do not understand the behaviour I see when I uncomment the line in readlines which removes the newline character added when reading input with getline.

int main()
{
    int nlines;

    if ((nlines = readlines(lineptr, MAXLINES)) >= 0) {
        my_qsort(lineptr, 0, nlines-1);
        writelines(lineptr, nlines);
        return 0;
    } else {
        printf("error: input too big \n");
        return 1;
    }
}

int readlines(char *lineptr[], int maxlines)
{
    int len, nlines;

    char *p, line[MAXLEN];

    nlines = 0;
    while ((len = my_getline(line, MAXLEN)) > 0)
        if (nlines >= maxlines || (p = alloc(len)) == NULL)
             return -1;
        else {
            line[len-1] = '\0'; //delete newline. 
            my_strcpy(p, line);
            lineptr[nlines++] = p;
        }
    return nlines;
}

void writelines(char *lineptr[], int nlines)
{
    while (nlines-- > 0)
        printf("%s\n", *lineptr++);
}

For example, if I then pipe in the following:

linje1
linje2
linje3
linje4

then writelines will output:

linje1
linje2
linje3
linje4

linje2
linje3
linje4

linje3
linje4

linje4
"and one last newline..."

From which I deduce that lineptr[0] points to all the lines. lineptr[1] points to all but the first line, ... , lineptr[3] points just to "linje4"

I do not understand how we get this behaviour from storing lines as "linje1\n", instead of "linje1".

Clarification: in writelines (when lineptr points to the start of the array) how can the call printf("%s", *lineptr) print all the lines?

Edit 2:

Ah, I see, but here is the getline function from K&R

int my_getline(char s[], int lim)
{
    int c, i;

    for (i=0; i < lim-1 && (c=getchar()) != EOF && c != '\n'; i++)
        s[i] = c;
    if (c == '\n') {
        s[i] = c;
        ++i;
    }
    s[i] = '\0';
    return i;
}

And I was sure that would always give me a null-terminated string, regardless of whether it ended with a newline or not?

and here is K&R's alloc:

#define ALLOCSIZE 10000

static char allocbuf[ALLOCSIZE]; // Storeage for alloc
static char *allocp = allocbuf; // Next free position

char *alloc(int n) // Return pointer to n characters
{
    if (allocbuf + ALLOCSIZE - allocp >= n) { // it fits
        allocp += n;
        return allocp - n;
    } else
        return 0;
}

Edit 3: Thank you for all the comments. But the entire program is typed up exactly as in K&R and works perfectly (I have compared the output with grep), so all the peripheral functions do as intended (my_strcpy for example works exactly like strcpy and copies the string up to and including the null terminator). The alloc function is just a pointer to K&R's big char array.

What I still don't understand is:

C reads in some lines of text, copies it, which stores line i somewhere in memory, and have lineptr[i] point to that memory location:

The lines are read with my_getline, which reads in the entire line (including newline character) and then terminates the string with the nullcharacter.

If I skip the line[len-1] = '\0'; step, readlines then stores a pointer to a copy of this line in lineptr[i]. And in memory, I thought the string (for i=1) looked like this "linje1\n\0"

But as @DanJAB pointed out, the nullcharacter is most likely missing, so the string is stored as "linje1\n", and so when writelines prints (via the pointer in the first entry in lineptr) this line, it prints everything following this in memory, since the nullcharacter is missing, which happens to be the rest of the lines.

But what I just can't wrap my head around is why is line[len-1] = '\0'; then evidently is needed for the string (i=1) to be stored as "linje1\0", when my_getline always returns a nullterminated string?

Thanks again, and sorry for any potential unclarities.

Final edit The entire problem was with alloc(len) not allocating space for the final nullcharacter! Thank you for helping me out.

2
perhaps the issue with the null terminator being missing otherwise is in another function. Perhaps my_strcpy is not doing it correctly. - DanJAB
@DanJAB No I get exactly the same if I replace that function with the library version. Thanks though. - luffe
What type/value is MAXLEN? - chux - Reinstate Monica
A basic debugging technique is to use print statements. For example, after the line: line[len-1] = '\0'; //delete newline., you could add: printf("line: <<%s>>\n", line); to check that the newline has been zapped. Of course, if there wasn't a newline, you've zapped some other last character, but that shouldn't be an issue with your data. - Jonathan Leffler

2 Answers

2
votes

If you are talking about the line line[len-1] = '\0'; It's not deleting a new line, it's replacing it with a null terminator. This means that if you don't have that line then you don't have the that thing that marks the end of the string, therefore when you print it, you also get whatever follows it in memory (the next strings).

2
votes

1) With char line[MAXLEN]; ... my_getline(line, MAXLEN) ... my_getline(char s[], int lim), lim is the size of the buffer.

But the function my_getline() is designed that lim is the maximum string length.
C string length is 1 less than the minimum size needed for that char array to resides in.

Use char line[MAXLEN+1]; or alternatively change my_getline(line, MAXLEN) code to i < lim-2

2) The result of my_getline(line, MAXLEN) can be "" (but the len > 0 test takes care of that) , further the line may not end with a '\n'.

line[len-1] = '\0'; //delete newline. 

Better to use

if (len > 0 && line[len-1] == '\n') {
  line[len-1] = '\0'; //delete newline. 
}

3) p = alloc(len) is inadequate. Use p = alloc(len+1u)

4) Recommend commenting out my_qsort(lineptr, 0, nlines-1); until all other code is working.

5) All this makes me suspect the unposted my_strcpy()/my_qsort() too.

Code may have additional problems, but what is posted in not compilable.