6
votes

I am trying to understand the relationship between scanf and the input buffer. I use scanf with the following format string:

int z1,z2;
scanf("%d %d", &z1,&z2);

I am trying to understand why I can enter as much whitespace as I like (Enter, blanks, tabs) after I type in a number like 54 and press Enter.

As far as I understand every key which I press is put in the input buffer until I press Enter.

So if I type in 54 and press Enter the input buffer contains 3 elements, two digits and the line break. So my buffer looks like [5][4][\n]

Now scanf/formatstring is evaluated from left to right. So the first %d matches to 54, 54 is stored in z1.

Because of the whitespace in the format string the line break (\n) caused from pressing the first enter is "consumed".

So after the evaluation of the first %d and the whitespace (\n) the buffer is empty again.

Now scanf tries to evaluate the second (and last) %d in the format string. Because the buffer is now empty scanf waits for further user input (user input = reads from stdin in my case keyboard).

So the buffer state/action sequence is

buffer empty -> call of scanf -> scanf blocks for user input --> user input is: 54 Enter --> buffer contains: [5][4][\n] -> evaluation of first %d --> buffer contains [\n] -> evaluation of whitespace --> buffer empty --> scanf blocks for user input (because of the evaluation of second and last %d) --> ...

Did I understand this correctly? (Sorry, English is not my native language.)

regards

2
If you use a terminal, then the terminal's handling of input plays a role as well; typically the program would only be sent completed lines of input, but that is not a must. It's easier to discuss the program's behaviour with input redirection from a file, as in prog < prepared-inp.txt, which works similarly in a Windows command shell and the various Linux shells. – Peter - Reinstate Monica
Also note that a "%d%d" format string would behave exactly the same; parsing the first integer stops when the first non-digit is encountered (which includes white space such as newline), and the (second) %d format rightly skips all white space it encounters because it's only interested in numbers. White space in the format string is only significant for %c and %[...] (which would otherwise assign it to their corresponding arguments). – Peter - Reinstate Monica
I also do not like the "buffer" paradigm to begin with because there may not be any buffer, or there may be any number of buffers (in the keyboard, in the remote terminal/computer, in your access point, in your network card, in your program). The last one is probably what you mean (what you could change with setbuf()), but you could disable that! The interesting thing is the sequence of characters which getchar() sees. Where exactly they come from and whether they were buffered somewhere is secondary. – Peter - Reinstate Monica
@PeterSchneider Whitespace is also significant for %n. – Spikatrix
@CoolGuy You mean whitespace in the input is [not] skipped -- and hence [not] counted before the value is assigned --, depending on the presence of whitespace before %n in the format? True. Of course ;-). – Peter - Reinstate Monica

2 Answers

3
votes

As far as I understand every key which I press is put in the input buffer until I press Enter.

Correct. Pressing Enter flushes the typed data into stdin (the standard input stream). Note that it also sends \n into stdin.

So if I type in 54 and press Enter the input buffer contains 3 elements, two digits and the line break. So my buffer looks like [5][4][\n]

Yes.

Now scanf/formatstring is evaluated from left to right. So the first %d matches to 54, 54 is stored in z1.

Right.

Because of the whitespace in the format string the line break (\n) caused from pressing the first enter is "consumed".

Correct.

So after the evaluation of the first %d and the whitespace (\n) the buffer is empty again.

Yes.

Now scanf tries to evaluate the second (and last) %d in the format string

Not quite.

The space between the two %d is a whitespace character, and a whitespace character in the format string of scanf instructs scanf to scan and discard all whitespace characters, if any, until the first non-whitespace character. This can be seen in n1570, the committee draft of the C11 standard:

7.21.6.2 The fscanf function

[...]

  5. A directive composed of white-space character(s) is executed by reading input up to the first non-white-space character (which remains unread), or until no more characters can be read. The directive never fails.

This means that scanf is still executing the whitespace directive between the two %ds, because it hasn't encountered a non-whitespace character yet.
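The effect of the whitespace directive can be made visible with %n, which stores the number of characters consumed so far. The following is only a sketch: it uses sscanf on a string instead of scanf on a terminal so that the result is reproducible, and the function name trace_two_ints and its parameters are made up for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper, not part of any library.
 * Parses two ints from `input` with the question's "%d %d" format and
 * records, via %n, how many characters had been consumed
 *   - after the first %d,
 *   - after the whitespace directive,
 *   - after the second %d.
 * Returns the number of assigned ints (%n does not count as an assignment).
 * Offsets that were never reached stay at -1. */
int trace_two_ints(const char *input, int *z1, int *z2,
                   int *after_first, int *after_space, int *after_second)
{
    *after_first = *after_space = *after_second = -1;
    return sscanf(input, "%d%n %n%d%n",
                  z1, after_first, after_space, z2, after_second);
}
```

Called with the string "54\n \t\n7", this should return 2 with the offsets 2, 6 and 7: the first %d stops in front of the first \n, the single space in the format then swallows all four whitespace characters, and only then does the second %d start. On a terminal the difference is that, once the available whitespace has been consumed, scanf waits for more input instead of reaching the end of a string.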

Because the buffer is now empty scanf waits for further user input (user input = reads from stdin in my case keyboard).

Yes.

So,

buffer empty -> call of scanf -> scanf blocks for user input --> user input is: 54 Enter --> buffer contains: [5][4][\n] -> evaluation of first %d --> buffer contains [\n] -> evaluation of whitespace --> buffer empty --> scanf blocks for user input (because of the evaluation of second and last %d) --> ...

should be

"Buffer empty -> call of scanf -> scanf blocks for user input --> user input is: 54\n --> buffer contains: 54\n -> evaluation of first %d --> buffer contains \n -> evaluation of whitespace --> buffer empty --> scanf blocks for user input (because of the evaluation of the whitespace) --> ..."


Note that scanf will behave the same way whether there are many whitespace characters or none between the %ds (before a %d), as %d already skips leading whitespace characters. In fact, the only format specifiers for which whitespace characters are significant are %c, %[ and %n, as seen in n1570:

7.21.6.2 The fscanf function

[...]

  8. Input white-space characters (as specified by the isspace function) are skipped, unless the specification includes a [, c, or n specifier.
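A small sketch of that difference, again using sscanf on a string so the outcome does not depend on a terminal. The helper names read_int_then_char and read_two_ints_no_space are invented for this example.

```c
#include <stdio.h>

/* Hypothetical helper: reads an int followed by one char.
 * with_space != 0 uses "%d %c" (whitespace before the char is skipped),
 * with_space == 0 uses "%d%c"  (the very next character is stored,
 *                               even if it is whitespace). */
int read_int_then_char(const char *input, int with_space, int *num, char *ch)
{
    return with_space ? sscanf(input, "%d %c", num, ch)
                      : sscanf(input, "%d%c", num, ch);
}

/* Hypothetical helper: "%d%d" without any whitespace in the format.
 * The second %d skips leading whitespace on its own. */
int read_two_ints_no_space(const char *input, int *z1, int *z2)
{
    return sscanf(input, "%d%d", z1, z2);
}
```

For the input "54\nx", the "%d%c" variant stores the newline in ch, while "%d %c" stores 'x'. For two integers, "%d%d" reads "54\n \t7" just like "%d %d" would, giving 54 and 7.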

1
votes

Pretty much.

scanf reads from the input buffer (stdin).

In the Windows cmd.exe terminal, pressing Enter flushes the line you typed into the input buffer, which is when your first variable gets filled.

scanf then waits for more input to fill the second variable.
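To observe this without typing into a terminal, the same format can be applied to any stream. This is a minimal sketch under the assumption that the input comes from a FILE * (stdin, a redirected file, or a temporary file); read_pair is a hypothetical name, not a library function.

```c
#include <stdio.h>

/* Hypothetical helper: reads two ints from `in` with the question's format.
 * Return value, as for fscanf:
 *   2   both ints were read,
 *   1   input ended (or did not match) after the first int,
 *   0   the first item was not a number,
 *   EOF input ended before anything could be converted. */
int read_pair(FILE *in, int *z1, int *z2)
{
    return fscanf(in, "%d %d", z1, z2);
}
```

Called as read_pair(stdin, &z1, &z2), it behaves like the scanf call in the question. If the stream holds only "54\n" and then ends, the call returns 1 instead of waiting, because there is no user who could still type the second number.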