
I learned C a few years ago using K&R 2nd edition (ANSI C). I've been reviewing my notes while learning more modern C from two other books.

I notice that K&R never use scanf in the book, except in the one section where they introduce it. They mainly use a getline function that they write in the book and revise later, once they introduce pointers. Their getline is different from the getline available with gcc (the POSIX one declared in <stdio.h>), which caused me some problems until I changed the name of my getline to ggetline.

Reviewing my notes, I found this quote:

This simplification is convenient and superficially attractive, and it works, as far as it goes. The problem is that scanf does not work well in more complicated situations. In section 7.1, we said that calls to putchar and printf could be interleaved. The same is not always true of scanf: you can have baffling problems if you try to intermix calls to scanf with calls to getchar or getline. Worse, it turns out that scanf's error handling is inadequate for many purposes. It tells you whether a conversion succeeded or not (more precisely, it tells you how many conversions succeeded), but it doesn't tell you anything more than that (unless you ask very carefully). Like atoi and atof, scanf stops reading characters when it's processing a %d or %f input and it finds a non-numeric character. Suppose you've prompted the user to enter a number, and the user accidentally types the letter 'x'. scanf might return 0, indicating that it couldn't convert a number, but the unconvertable text (the 'x') remains on the input stream unless you figure out some other way to remove it.

For these reasons (and several others, which I won't bother to mention) it's generally recommended that scanf not be used for unstructured input such as user prompts. It's much better to read entire lines with something like getline (as we've been doing all along) and then process the line somehow. If the line is supposed to be a single number, you can use atoi or atof to convert it. If the line has more complicated structure, you can use sscanf (which we'll meet in a minute) to parse it. (It's better to use sscanf than scanf because when sscanf fails, you have complete control over what you do next. When scanf fails, on the other hand, you're at the mercy of where in the input stream it has left you.)

At first I thought this quote was from K&R, but I cannot find it in the book. Then I realized that it's from lecture notes I got online, from someone who taught a course years ago using the K&R book.

lecture notes

I know that the K&R book is 30 years old now, so it is dated in some ways.

This quote is very old, so I was wondering whether scanf still has this behavior or whether it has changed.

Does scanf still leave stuff in the input stream when it fails? For example, from the quote above:

Suppose you've prompted the user to enter a number, and the user accidentally types the letter 'x'. scanf might return 0, indicating that it couldn't convert a number, but the unconvertable text (the 'x') remains on the input stream.

Is the following still true?

putchar and printf could be interleaved. The same is not always true of scanf: you can have baffling problems if you try to intermix calls to scanf with calls to getchar or getline.

Has scanf changed much since the quotes above were written? Or are they still true today?

The reason I ask is that in the newer books I am reading, no one mentions these issues.

The behaviour of scanf() has not changed significantly; to change it would be to break working code (lots of working code). – Jonathan Leffler

1 Answer


scanf() is evil - use fgets() and then parse.


In more detail: it is not that scanf() is completely bad.

1) The format specifiers are often used in a weak manner

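char buf[100];
scanf("%s", buf);   // bad - no width limit, long input overflows buf
scanf("%99s", buf); // better - at most 99 characters plus the terminating '\0'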

2) The return value is often mistakenly left unchecked

scanf("%99[^\n]", buf); // what if the user entered only "\n"? Nothing is stored in buf
puts(buf);              // so this may print an uninitialized buffer

3) When input is not as expected, it is not clear what remains in stdin.

if (scanf("%d %d %d", &i, &j, &k) != 3) {
  // OK, now what is in `stdin`?
}
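To make point 3 concrete, here is a minimal sketch rather than a definitive test. It assumes tmpfile() works on your system and uses it as a stand-in for stdin so the run is repeatable; the helper name scan_three_and_show_rest is hypothetical, not part of any library. It performs the same "%d %d %d" conversion and then reports whatever was left unread.

```c
#include <stdio.h>

/* Hypothetical demo helper: feed `input` through fscanf("%d %d %d") and
   copy whatever is left on the rest of that line into `rest`.
   Returns the fscanf() result, or EOF if no temporary file could be opened. */
int scan_three_and_show_rest(const char *input, char *rest, size_t rest_size)
{
    int i = 0, j = 0, k = 0;
    int converted;
    FILE *fp = tmpfile();          /* stands in for stdin */

    if (rest_size > 0)
        rest[0] = '\0';
    if (fp == NULL)
        return EOF;

    fputs(input, fp);
    rewind(fp);                    /* switch the stream from writing to reading */

    /* Always keep the return value: it is the only status scanf gives you. */
    converted = fscanf(fp, "%d %d %d", &i, &j, &k);

    /* Whatever fscanf() refused to consume is still waiting in the stream. */
    if (rest_size > 0 && fgets(rest, (int)rest_size, fp) == NULL)
        rest[0] = '\0';

    fclose(fp);
    return converted;
}
```

With the input "1 2 x 4\n" the helper returns 2 and rest holds "x 4\n": the offending text is still in the stream and is the first thing the next read will see.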

you can have baffling problems if you try to intermix calls to scanf with calls to getchar or getline.

Yes. Many scanf() calls leave a trailing '\n' in stdin, which is then read as an empty line by getline() or fgets(). scanf() is not for reading lines; getline() and fgets() are much better suited to that.
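A small sketch of that interleaving problem, under the same assumption that tmpfile() can stand in for stdin. The helper number_then_line and its discard_rest flag are made up for this illustration. It reads a number with fscanf(), optionally throws away the rest of that line, and then reads a line with fgets().

```c
#include <stdio.h>

/* Hypothetical demo helper: read an int with fscanf(), then a line with fgets().
   If discard_rest is nonzero, the remainder of the number's line is consumed first.
   Returns 1 if the number was read, 0 otherwise. */
int number_then_line(const char *input, int discard_rest,
                     int *number, char *line, size_t line_size)
{
    FILE *fp = tmpfile();          /* stands in for stdin */

    if (line_size > 0)
        line[0] = '\0';
    if (fp == NULL)
        return 0;

    fputs(input, fp);
    rewind(fp);

    if (fscanf(fp, "%d", number) != 1) {
        fclose(fp);
        return 0;
    }

    if (discard_rest) {
        int c;
        while ((c = getc(fp)) != '\n' && c != EOF)
            ;                      /* skip to the end of the current line */
    }

    /* Without the skip above, this sees only the '\n' that "%d" left behind. */
    if (line_size > 0 && fgets(line, (int)line_size, fp) == NULL)
        line[0] = '\0';

    fclose(fp);
    return 1;
}
```

For the input "42\nhello\n", calling it with discard_rest set to 0 gives line equal to "\n" (the "empty line"), while discard_rest set to 1 gives "hello\n".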


Has scanf changed much since the quotes above were written?

Only so much change can happen without messing up the existing code base, as @Jonathan Leffler commented.

scanf() remains troublesome. scanf() is unable to accept an argument (after the format) to indicate how many characters to accept into a char * destination.

Some systems have added additional format options to help.
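Within standard C, one common workaround for the missing length argument is to bake the width into the format string at compile time. The sketch below is only an illustration: WORD_MAX, STR and read_word are names invented here, and it uses sscanf() on a string so it can be run without any input.

```c
#include <stdio.h>

#define WORD_MAX 7                 /* longest word accepted, excluding the '\0' */
#define STR_(x) #x
#define STR(x)  STR_(x)            /* expands WORD_MAX first, then stringifies it: "7" */

/* Hypothetical helper: copy the first word of `text` into `dest`,
   which must have room for WORD_MAX + 1 chars.
   Returns 1 if a word was read, 0 otherwise. */
int read_word(const char *text, char *dest)
{
    /* The format becomes "%7s": the limit lives in the format string
       because scanf cannot take it as a separate argument. */
    return sscanf(text, "%" STR(WORD_MAX) "s", dest) == 1;
}
```

Given "abcdefghijkl", read_word() stores only "abcdefg". The limit and the buffer size still have to be kept in step by hand, which is part of why this remains troublesome.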

A fundamental issue is this:

User input is evil. It is more robust to read the text in one step, then qualify it, parse it, and assess the result, than to try to do all of this in one function call.
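One way to sketch that "read, then qualify, then parse" approach for a line that should hold a single integer is shown below. It is a minimal example, not the only reasonable design: parse_int_line and read_int are hypothetical names, the 100-byte buffer is an arbitrary choice, and strtol() is used in place of atoi() because it reports errors.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: convert a whole line to an int.
   Returns 1 on success, 0 if the line is not exactly one in-range integer. */
int parse_int_line(const char *line, int *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(line, &end, 10);
    if (end == line)
        return 0;                       /* no digits at all, e.g. "x" */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;                       /* does not fit in an int */
    while (isspace((unsigned char)*end))
        end++;                          /* allow trailing white space and '\n' */
    if (*end != '\0')
        return 0;                       /* junk after the number, e.g. "12x" */
    *out = (int)value;
    return 1;
}

/* Step 1: read one line. Step 2: qualify and parse it.
   Returns 1 on success, 0 on bad input, EOF on end of file or read error. */
int read_int(FILE *in, int *out)
{
    char buf[100];

    if (fgets(buf, sizeof buf, in) == NULL)
        return EOF;
    return parse_int_line(buf, out);
}
```

A caller would loop on read_int(stdin, &n), re-prompting while it returns 0. Because the whole line was consumed by fgets(), a bad entry such as "x" is not left behind for the next read. Lines longer than the buffer are split across calls here; a fuller version would detect a missing '\n' and discard the remainder.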


Security

The weaknesses of scanf(), and coders' tendency to use it poorly, have been a gold mine for hackers.

IMO, C lacks a robust user input function set.