1
votes

I am trying to read a line from stdin. As far as I understand, we should never use gets(), as its man page says:

Never use gets(). Because it is impossible to tell without knowing the data in advance how many characters gets() will read, and because gets() will continue to store characters past the end of the buffer, it is extremely dangerous to use. It has been used to break computer security. Use fgets() instead.

It suggests using fgets() instead. The problem with fgets() is that we don't know the size of the user input in advance, and fgets() reads at most one less than size bytes from the stream, as the man page says:

fgets() reads in at most one less than size characters from stream and stores them into the buffer pointed to by s. Reading stops after an EOF or a newline. If a newline is read, it is stored into the buffer. A terminating null byte ('\0') is stored after the last character in the buffer.

Another approach is POSIX getline(), which uses realloc to grow the buffer, so we can read a line of arbitrary length from the input stream, as the man page says:

Alternatively, before calling getline(), *lineptr can contain a pointer to a malloc(3)-allocated buffer *n bytes in size. If the buffer is not large enough to hold the line, getline() resizes it with realloc(3), updating *lineptr and *n as necessary.

Finally, there is another approach, which is to use an obstack, as the libc manual says:

Aside from this one constraint of order of freeing, obstacks are totally general: an obstack can contain any number of objects of any size. They are implemented with macros, so allocation is usually very fast as long as the objects are usually small. And the only space overhead per object is the padding needed to start each object on a suitable boundary...

So we can use an obstack for objects of any size, and allocation is very fast with a small space overhead, which is not a big deal. I wrote this code to read an input string without knowing its length in advance.

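#include <stdio.h>
#include <stdlib.h>
#include <obstack.h>
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free
int main(void){
        int c;  /* int rather than unsigned char, so EOF can be detected */
        struct obstack * mystack;
        mystack = malloc(sizeof(struct obstack));
        if(mystack == NULL)
                return 1;
        obstack_init(mystack);
        c = fgetc(stdin);
        while(c!=EOF && c!='\r' && c!='\n'){
                obstack_1grow(mystack,c);
                c = fgetc(stdin);
        }
        printf("the size of the stack is: %zu\n",(size_t)obstack_object_size(mystack));
        obstack_1grow(mystack,'\0');  /* terminate the string before printing it with %s */
        printf("the input is: %s\n",(char *)obstack_finish(mystack));
        obstack_free(mystack,NULL);  /* release everything held by the obstack */
        free(mystack);
        return 0;
}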

So my question is: is it safe to use obstack like this? Is it like using POSIX getline? Am I missing something here? Are there any drawbacks? Why shouldn't I use it? Thanks in advance.

2
If you have access to the POSIX getline function, use it. Otherwise it's not so hard to implement it yourself, reading characters in a loop (like you do) and then realloc memory as needed. No need to use non-standard functionality like obstack. – Some programmer dude
"reading characters in a loop (like you do)" -- or IMHO better, read chunks with fgets() and increase the buffer size if the last character read wasn't a newline. – user2371524
@Someprogrammerdude, I know I can use POSIX getline and I also know how to implement getline. I just want to know what the drawbacks of using obstack are, apart from it being non-standard, which I believe it is not, since it is in GNU libc (at least I don't care, since I use Ubuntu and Fedora). – Sourena Maroofi
There are absolutely no drawbacks to fgets as compared to gets. gets was removed altogether in C11. It doesn't exist any more. There is no such function. – Antti Haapala

2 Answers

2
votes

fgets has no drawbacks over gets. It just forces you to acknowledge that you must know the size of the buffer. gets instead requires you to somehow magically know beforehand the length of the input a (possibly malicious) user is going to feed into your program. That is why gets was removed from the C programming language. It is now non-standard, while fgets is standard and portable.

As for knowing the length of the line beforehand, POSIX says that a utility must be prepared to handle lines that fit in buffers of LINE_MAX size (LINE_MAX is defined in <limits.h>). Thus you can do:

#include <limits.h>   /* LINE_MAX */
char line[LINE_MAX];
while (fgets(line, sizeof line, fp) != NULL)
    /* process line here */;

and any file that produces problems with that is not a standard text file. In practice everything will be mostly fine if you just don't blindly assume that the last character in the buffer is always '\n' (which it isn't).
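To make the "don't blindly assume a trailing '\n'" point concrete, here is a minimal sketch of a checked wrapper around fgets. The helper name read_line_checked and its return codes are my own illustration, not part of any standard API.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a standard function).
 * Reads one line from fp into buf, which holds cap bytes.
 * Returns  1 if a complete line was read (trailing newline stripped,
 *            or the final line of the stream without a newline),
 *          0 if no newline was seen and EOF was not reached, i.e. the
 *            line was longer than the buffer and the rest is unread,
 *         -1 on EOF or read error with nothing read.
 * Caveat: a final line without a newline that fills the buffer exactly
 * is reported as 0, because fgets stops before it can notice EOF. */
int read_line_checked(FILE *fp, char *buf, size_t cap)
{
    if (fgets(buf, (int)cap, fp) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 1;
    }
    return feof(fp) ? 1 : 0;
}
```

With a buffer declared as char line[LINE_MAX], you would call read_line_checked(fp, line, sizeof line) in a loop until it returns -1, and treat a return value of 0 as "this is not a standard text file" in whatever way suits your program.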


getline is a POSIX standard function. obstack is a GNU libc extension that is not portable. getline was built for efficient reading of lines from files; obstack was not, it was built to be generic. With obstack, the string is not properly contiguous in memory / in its final place, until you call obstack_finish.

Use getline if on POSIX, use fgets in programs that need to be maximally portable; look for an emulation of getline for non-POSIX platforms built on fgets.
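For the POSIX case, a minimal sketch of getline usage could look like the following. The wrapper name read_line_posix is hypothetical; only getline itself is the POSIX API, and stripping the newline is my choice, not something getline does.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* make getline() visible in strict modes */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>            /* ssize_t */

/* Hypothetical wrapper (not a standard function).
 * Returns the next line of fp as a malloc'd string without the trailing
 * newline, or NULL on EOF or error. The caller must free() the result. */
char *read_line_posix(FILE *fp)
{
    char *line = NULL;   /* NULL and 0 ask getline to allocate for us */
    size_t cap = 0;

    ssize_t n = getline(&line, &cap, fp);
    if (n == -1) {
        free(line);      /* the buffer may be allocated even on failure */
        return NULL;
    }
    if (n > 0 && line[n - 1] == '\n')
        line[n - 1] = '\0';
    return line;
}
```

If you read many lines, it is usually better to call getline directly in a loop and reuse the same buffer and size across calls, freeing it once at the end; the wrapper above allocates a fresh buffer per line for simplicity.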

2
votes

Why shouldn't I use it?

Well, you shouldn't use getline() if you care about portability. You should use getline() if you're specifically targeting only POSIX systems.

As for obstacks, they're specific to the GNU C library, which might already be a strong reason to avoid them (it further restricts portability). Also, they're not meant to be used for this purpose.

If you aim for portability, just use fgets(). It's not too complicated to write a function similar to getline() based on fgets() -- here's an example:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNKSIZE 1024

char *readline(FILE *f)
{
    size_t bufsize = CHUNKSIZE;
    char *buf = malloc(bufsize);
    if (!buf) return 0;
    buf[0] = '\0'; // an immediate EOF then yields an empty string, not uninitialized memory

    char *pos = buf;
    size_t len = 0;

    while (fgets(pos, CHUNKSIZE, f))
    {
        char *nl = strchr(pos, '\n');
        if (nl)
        {
            // newline found, replace with string terminator
            *nl = '\0';
            char *tmp = realloc(buf, len + strlen(pos) + 1);
            if (tmp) return tmp;
            return buf;
        }

        // no newline, increase buffer size
        len += strlen(pos);
        char *tmp = realloc(buf, len + CHUNKSIZE);
        if (!tmp)
        {
            free(buf);
            return 0;
        }
        buf = tmp;
        pos = buf + len;
    }

    // handle case when input ends without a newline
    char *tmp = realloc(buf, len + 1);
    if (tmp) return tmp;
    return buf;
}

int main(void)
{
    char *input = readline(stdin);
    if (!input)
    {
        fputs("Error reading input!\n", stderr);
        return 1;
    }
    puts(input);
    free(input);
    return 0;
}

This one removes the newline if it was found and returns a newly allocated buffer (which the caller has to free()). Adapt to your needs. It could be improved by increasing the buffer size only when the buffer was filled completely, with just a bit more code ...
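As a rough sketch of that improvement, the variant below grows the buffer only when fgets has filled it completely. The name readline_grow, the initial capacity of 128 and the doubling strategy are my own assumptions for illustration. Unlike the function above, it returns NULL when EOF is hit before any character was read, so the caller can tell end of input from an empty line.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of readline() (name and sizes are illustrative).
 * Returns a malloc'd line without the trailing newline, or NULL on
 * allocation failure or when EOF/error occurs before anything was read.
 * Limitation: the (int) cast assumes lines well below INT_MAX bytes. */
char *readline_grow(FILE *f)
{
    size_t cap = 128;   /* arbitrary initial capacity */
    size_t len = 0;     /* characters stored so far */
    char *buf = malloc(cap);
    if (!buf) return NULL;

    /* fgets writes at most cap - len bytes, terminator included */
    while (fgets(buf + len, (int)(cap - len), f))
    {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n')
        {
            buf[--len] = '\0';   /* strip the newline */
            return buf;
        }
        if (len + 1 == cap)      /* buffer completely filled: double it */
        {
            char *tmp = realloc(buf, cap * 2);
            if (!tmp)
            {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        /* otherwise fgets stopped at EOF; the next call returns NULL */
    }

    if (len == 0)                /* EOF or error before any character */
    {
        free(buf);
        return NULL;
    }
    return buf;                  /* last line without a trailing newline */
}
```

It is used the same way as readline() above: call it with a FILE pointer, check for NULL, and free() the result when done.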