18
votes

The following question was given in a college programming contest. We were asked to guess the output and/or explain its working. Needless to say, none of us succeeded.

main(_){write(read(0,&_,1)&&main());}

A quick search led me to this exact question, asked on codegolf.stackexchange.com:

https://codegolf.stackexchange.com/a/1336/4085

There, it's explained what it does (reverse stdin and place it on stdout), but not how.

I also found some help in this question: "Three arguments to main, and other obfuscating tricks", but it still does not explain how main(_), &_ and &&main() work.

My question is, how do these constructs work? Are they something I should know about, as in, are they still relevant?

I would be grateful for any pointers (to resource links, etc.), if not outright answers.

2
That program won't compile in C++. Removing the C++ tag. - Robᵩ
@Robᵩ Ah thank you. I was careless. - RaunakS
Even in C, that program invokes undefined behavior multiple ways. The result is predictable only for specific compilers targeting specific types of CPUs (even on codegolf, this program only does something interesting at a specific optimization level). Correct answers to "What does this program do?" include "It depends," "Whatever it wants," and "It gets you fired." - Robᵩ
@Robᵩ Or, in my case, it gets me the exit door at a contest. Still, I would like to know how it works. Can I use a debugger, or any tool in an IDE (I'm using codeblocks) to get some idea? - RaunakS
No, RaunakS, it gets a contest the exit door in your life. You really don't want to associate with people who think that this is a valid programming question. - Robᵩ

2 Answers

26
votes

What does this program do?

main(_){write(read(0,&_,1)&&main());}

Before we analyze it, let's prettify it:

main(_) {
    write ( read(0, &_, 1) && main() );
}

First, you should know that _ is a valid variable name, albeit an ugly one. Let's change it:

main(argc) {
    write( read(0, &argc, 1) && main() );
}

Next, realize that in old-style C (before C99, and never in C++) the return type of a function and the type of a parameter are optional and default to int:

int main(int argc) {
    write( read(0, &argc, 1) && main() );
}

Next, understand how return values work. For certain CPU types, the return value is always stored in the same registers (EAX on x86, for example). Thus, if you omit a return statement, the return value is likely going to be whatever the most recent function returned.

int main(int argc) {
    int result = write( read(0, &argc, 1) && main() );
    return result;
}

The call to read is more or less self-evident: it reads 1 byte from standard input (file descriptor 0) into the memory located at &argc. It returns 1 if a byte was read, 0 at end of input, and -1 on error. Note that -1 is non-zero, so it counts as "true" just like a successful read.
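As a hedged illustration of those return values, here is a minimal sketch of a well-defined one-byte read. The helper name read_one_byte is hypothetical and not part of the original program; it simply forwards to read(2) with all three arguments spelled out.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: read exactly one byte from fd into *out.
 * Returns 1 if a byte was read, 0 at end of input, and -1 on error,
 * which is what read(2) returns for a 1-byte request. */
int read_one_byte(int fd, char *out)
{
    assert(out != NULL);
    ssize_t n = read(fd, out, 1);  /* fd, destination buffer, byte count */
    return (int)n;
}
```

Calling read_one_byte(0, &c) corresponds to the read(0, &_, 1) in the puzzle, except that the destination is a char rather than the first byte of an int.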

&& is the logical "and" operator. It evaluates its right-hand side if and only if its left-hand side is "true" (technically, any non-zero value). The result of the && expression is an int which is always 1 (for "true") or 0 (for "false").
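The short-circuit rule can be seen in isolation with a small sketch. The names rhs, rhs_calls and logical_and are made up for illustration; the counter only exists to show whether the right-hand side ran.

```c
/* Counts how many times rhs() has been evaluated (illustrative only). */
int rhs_calls = 0;

int rhs(void)
{
    rhs_calls++;
    return 42;  /* any non-zero value counts as "true" */
}

/* Returns lhs && rhs(): rhs() runs only when lhs is non-zero,
 * and the result is always normalized to 0 or 1. */
int logical_and(int lhs)
{
    return lhs && rhs();
}
```

With this sketch, logical_and(0) yields 0 without ever calling rhs(), while logical_and(7) calls rhs() once and yields 1 rather than 42. In the puzzle, the recursive main() plays the role of rhs().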

In this case, the right-hand-side invokes main with no arguments. Calling main with no arguments after declaring it with 1 argument is undefined behavior. Nevertheless, it often works, as long as you don't care about the initial value of the argc parameter.

The result of the && is then passed to write(). So, our code now looks like:

int main(int argc) {
    int read_result = read(0, &argc, 1) && main();
    int result = write(read_result);
    return result;
}

Hmm. A quick look at the man pages reveals that write takes three arguments, not one. Another case of undefined behavior. Just like calling main with too few arguments, we cannot predict what write will receive for its 2nd and 3rd arguments. On typical computers, they will get something, but we can't know for sure what. (On atypical computers, strange things can happen.) The author is relying upon write receiving whatever was previously left in the argument slots (registers or the stack, depending on the platform), and upon those leftovers being the 2nd and 3rd arguments to read.
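For contrast, a well-defined call passes all three arguments explicitly. The following is only a sketch; write_one_byte is a hypothetical helper name, not something the original program defines.

```c
#include <unistd.h>

/* Hypothetical helper: write the single byte c to fd.
 * Returns 1 on success and -1 on error, which is what write(2)
 * returns for a 1-byte request. */
int write_one_byte(int fd, char c)
{
    ssize_t n = write(fd, &c, 1);  /* fd, source buffer, byte count */
    return (int)n;
}
```

Here nothing depends on what an earlier call happened to leave behind: the buffer address and the count are supplied by the caller every time.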

int main(int argc) {
    int read_result = read(0, &argc, 1) && main();
    int result = write(read_result, &argc, 1);
    return result;
}

Fixing the invalid call to main, adding the header, and expanding the &&, we have:

#include <unistd.h>
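int main(int argc, char **argv) {
    int result;
    result = read(0, &argc, 1);                   /* 0 only at end of input */
    if (result) result = main(argc, argv) != 0;   /* && yields only 0 or 1 */
    result = write(result, &argc, 1);             /* fd 0 at end of input, fd 1 otherwise */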
    return result;
}
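If the goal is simply "reverse the input using recursion", the same idea can be written without relying on any undefined behavior. This is a sketch under my own assumptions, not the contest author's code: reverse_fd is a hypothetical name, and it takes the file descriptors as parameters so it can be exercised on something other than stdin and stdout.

```c
#include <unistd.h>

/* Hypothetical helper: copy everything readable from in_fd to out_fd
 * in reverse byte order. Each recursive call keeps one byte in its own
 * stack frame, so very large inputs can exhaust the stack.
 * Returns 0 on success and -1 on a read or write error. */
int reverse_fd(int in_fd, int out_fd)
{
    char c;
    ssize_t n = read(in_fd, &c, 1);

    if (n < 0)
        return -1;                 /* read error */
    if (n == 0)
        return 0;                  /* end of input: start unwinding */
    if (reverse_fd(in_fd, out_fd) != 0)
        return -1;                 /* propagate errors from deeper calls */

    /* Runs on the way back up, so the last byte read is written first. */
    return write(out_fd, &c, 1) == 1 ? 0 : -1;
}
```

A main that calls reverse_fd(0, 1) would reverse standard input onto standard output. Unlike the original, it writes nothing at end of input and always uses descriptor 1 for output.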


Conclusions

This program won't work as expected on many computers. Even if you use the same computer as the original author, it might not work on a different operating system. Even if you use the same computer and the same operating system, it won't work with many compilers. Even if you use the same computer, compiler, and operating system, it might not work if you change the compiler's command-line flags.

As I said in the comments, the question does not have a valid answer. If you found a contest organizer or contest judge that says otherwise, don't invite them to your next contest.

9
votes

Ok, _ is just a variable declared in early K&R C syntax with a default type of int. It functions as temporary storage.

The program will try to read one byte from standard input. If there is input, it will call main recursively, and each recursive call reads the next byte.

At the end of input, read(2) will return 0, the expression will return 0, the write(2) system call will execute, and the call chain will probably unwind.

I say "probably" here because from this point on the results are highly implementation-dependent. The other parameters to write(2) are missing, but something will be in the registers and on the stack, so something will be passed into the kernel. The same undefined behavior applies to the return value from the various recursive activations of main.

On my x86_64 Mac, the program reads standard input until EOF and then exits, writing nothing at all.