What belongs in an educational tool to demonstrate the unwarranted assumptions people make in C/C++?

122

votes

I'd like to prepare a little educational tool for SO which should help beginners (and intermediate) programmers to recognize and challenge their unwarranted assumptions in C, C++ and their platforms.

Examples:

"integers wrap around"
"everyone has ASCII"
"I can store a function pointer in a void*"

I figured that a small test program could be run on various platforms, which runs the "plausible" assumptions which are, from our experience in SO, usually made by many inexperienced/semiexperienced mainstream developers and record the ways they break on diverse machines.

The goal of this is not to prove that it is "safe" to do something (which would be impossible to do, the tests prove only anything if they break), but instead to demonstrate to even the most uncomprehending individual how the most inconspicuous expression break on a different machine, if it has a undefined or implementation defined behavior..

To achieve this I would like to ask you:

How can this idea be improved?
Which tests would be good and how should they look like?
Would you run the tests on the platforms you can get your hands on and post the results, so that we end up with a database of platforms, how they differ and why this difference is allowed?

Here's the current version for the test toy:

#include <stdio.h>
#include <limits.h>
#include <stdlib.h>
#include <stddef.h>
int count=0;
int total=0;
void expect(const char *info, const char *expr)
{
    printf("..%s\n   but '%s' is false.\n",info,expr);
    fflush(stdout);
    count++;
}
#define EXPECT(INFO,EXPR) if (total++,!(EXPR)) expect(INFO,#EXPR)

/* stack check..How can I do this better? */
ptrdiff_t check_grow(int k, int *p)
{
    if (p==0) p=&k;
    if (k==0) return &k-p;
    else return check_grow(k-1,p);
}
#define BITS_PER_INT (sizeof(int)*CHAR_BIT)

int bits_per_int=BITS_PER_INT;
int int_max=INT_MAX;
int int_min=INT_MIN;

/* for 21 - left to right */
int ltr_result=0;
unsigned ltr_fun(int k)
{
    ltr_result=ltr_result*10+k;
    return 1;
}

int main()
{
    printf("We like to think that:\n");
    /* characters */
    EXPECT("00 we have ASCII",('A'==65));
    EXPECT("01 A-Z is in a block",('Z'-'A')+1==26);
    EXPECT("02 big letters come before small letters",('A'<'a'));
    EXPECT("03 a char is 8 bits",CHAR_BIT==8);
    EXPECT("04 a char is signed",CHAR_MIN==SCHAR_MIN);

    /* integers */
    EXPECT("05 int has the size of pointers",sizeof(int)==sizeof(void*));
    /* not true for Windows-64 */
    EXPECT("05a long has at least the size of pointers",sizeof(long)>=sizeof(void*));

    EXPECT("06 integers are 2-complement and wrap around",(int_max+1)==(int_min));
    EXPECT("07 integers are 2-complement and *always* wrap around",(INT_MAX+1)==(INT_MIN));
    EXPECT("08 overshifting is okay",(1<<bits_per_int)==0);
    EXPECT("09 overshifting is *always* okay",(1<<BITS_PER_INT)==0);
    {
        int t;
        EXPECT("09a minus shifts backwards",(t=-1,(15<<t)==7));
    }
    /* pointers */
    /* Suggested by jalf */
    EXPECT("10 void* can store function pointers",sizeof(void*)>=sizeof(void(*)()));
    /* execution */
    EXPECT("11 Detecting how the stack grows is easy",check_grow(5,0)!=0);
    EXPECT("12 the stack grows downwards",check_grow(5,0)<0);

    {
        int t;
        /* suggested by jk */
        EXPECT("13 The smallest bits always come first",(t=0x1234,0x34==*(char*)&t));
    }
    {
        /* Suggested by S.Lott */
        int a[2]={0,0};
        int i=0;
        EXPECT("14 i++ is strictly left to right",(i=0,a[i++]=i,a[0]==1));
    }
    {
        struct {
            char c;
            int i;
        } char_int;
        EXPECT("15 structs are packed",sizeof(char_int)==(sizeof(char)+sizeof(int)));
    }
    {
        EXPECT("16 malloc()=NULL means out of memory",(malloc(0)!=NULL));
    }

    /* suggested by David Thornley */
    EXPECT("17 size_t is unsigned int",sizeof(size_t)==sizeof(unsigned int));
    /* this is true for C99, but not for C90. */
    EXPECT("18 a%b has the same sign as a",((-10%3)==-1) && ((10%-3)==1));

    /* suggested by nos */
    EXPECT("19-1 char<short",sizeof(char)<sizeof(short));
    EXPECT("19-2 short<int",sizeof(short)<sizeof(int));
    EXPECT("19-3 int<long",sizeof(int)<sizeof(long));
    EXPECT("20 ptrdiff_t and size_t have the same size",(sizeof(ptrdiff_t)==sizeof(size_t)));
#if 0
    {
        /* suggested by R. */
        /* this crashed on TC 3.0++, compact. */
        char buf[10];
        EXPECT("21 You can use snprintf to append a string",
               (snprintf(buf,10,"OK"),snprintf(buf,10,"%s!!",buf),strcmp(buf,"OK!!")==0));
    }
#endif

    EXPECT("21 Evaluation is left to right",
           (ltr_fun(1)*ltr_fun(2)*ltr_fun(3)*ltr_fun(4),ltr_result==1234));

    {
    #ifdef __STDC_IEC_559__
    int STDC_IEC_559_is_defined=1;
    #else 
    /* This either means, there is no FP support
     *or* the compiler is not C99 enough to define  __STDC_IEC_559__
     *or* the FP support is not IEEE compliant. */
    int STDC_IEC_559_is_defined=0;
    #endif
    EXPECT("22 floating point is always IEEE",STDC_IEC_559_is_defined);
    }

    printf("From what I can say with my puny test cases, you are %d%% mainstream\n",100-(100*count)/total);
    return 0;
}

Oh, and I made this community wiki right from the start because I figured that people want to edit my blabber when they read this.

UPDATE Thanks for your input. I've added a few cases from your answers and will see if I can set up a github for this like Greg suggested.

UPDATE: I've created a github repo for this, the file is "gotcha.c":

http://github.com/lutherblissett/disenchanter

Please answer here with patches or new ideas, so they can be discussed or clarified here. I will merge them into gotcha.c then.

c++c cross-platformportability

Consider the medium model in DOS. Functions can be stored in multiple segments, so a function pointer is 32 bits long. But your data is stored in a single segment only, therefore data pointers are only 16 bits long. Since void* is a data pointer, it's 16 bits wide, so you can't fit a function pointer in one. See c-jump.com/CIS77/ASM/Directives/D77_0030_models.htm. - David Given

Perhaps you could throw this code up on github.com or something and then people could easily contribute patches. - Greg Hewgill

A lot of things here should help: stackoverflow.com/questions/367633/… - Martin York

POSIX requires that function pointers have the same representation as void * and can be converted (with a cast) without loss of information. One of the reasons for this is that dlsym() returns a void * but is intended for both data and function pointers. Therefore it may not be so bad to depend on this. - jilles

@tristopia: Point 15 is here, because many beginners are often surprised to learn that data is not packed continuously but instead aligned to certain boundaries. They're puzzled when they change the member order and get different object sizes. Also, packing is the default mode with many contemporary micro controller or embedded devices. My AVR Atmega and TurboC/MSDOS output is packed too. MSDOS is still used in industrial applications. - Nordic Mainframe

92

votes

The order of evaluation of subexpressions, including

the arguments of a function call and
operands of operators (e.g., +, -, =, * , /), with the exception of:
- the binary logical operators (&& and ||),
- the ternary conditional operator (?:), and
- the comma operator (,)

is Unspecified

For example

  int Hello()
  {
       return printf("Hello"); /* printf() returns the number of 
                                  characters successfully printed by it
                               */
  }

  int World()
  {
       return printf("World !");
  }

  int main()
  {

      int a = Hello() + World(); //might print Hello World! or World! Hello
      /**             ^
                      | 
                Functions can be called in either order
      **/
      return 0;
  }

38

votes

sdcc 29.7/ucSim/Z80

We like to think that:
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..19-2 short<int
   but 'sizeof(short)<sizeof(int)' is false.
..22 floating point is always IEEE
   but 'STDC_IEC_559_is_defined' is false.
..25 pointer arithmetic works outside arrays
   but '(diff=&var.int2-&var.int1, &var.int1+diff==&var.int2)' is false.
From what I can say with my puny test cases, you are Stop at 0x0013f3: (106) Invalid instruction 0x00dd

printf crashes. "O_O"

gcc 4.4@x86_64-suse-linux

We like to think that:
..05 int has the size of pointers
but 'sizeof(int)==sizeof(void*)' is false.
..08 overshifting is okay
but '(1<<bits_per_int)==0' is false.
..09a minus shifts backwards
but '(t=-1,(15<<t)==7)' is false.
..14 i++ is strictly left to right
but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..17 size_t is unsigned int
but 'sizeof(size_t)==sizeof(unsigned int)' is false.
..26 sizeof() does not evaluate its arguments
but '(i=10,sizeof(char[((i=20),10)]),i==10)' is false.
From what I can say with my puny test cases, you are 79% mainstream

gcc 4.4@x86_64-suse-linux(-O2)

We like to think that:
..05 int has the size of pointers
but 'sizeof(int)==sizeof(void*)' is false.
..08 overshifting is okay
but '(1<<bits_per_int)==0' is false.
..14 i++ is strictly left to right
but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..17 size_t is unsigned int
but 'sizeof(size_t)==sizeof(unsigned int)' is false.
..26 sizeof() does not evaluate its arguments
but '(i=10,sizeof(char[((i=20),10)]),i==10)' is false.
From what I can say with my puny test cases, you are 82% mainstream

clang 2.7@x86_64-suse-linux

We like to think that:
..05 int has the size of pointers
but 'sizeof(int)==sizeof(void*)' is false.
..08 overshifting is okay
but '(1<<bits_per_int)==0' is false.
..09a minus shifts backwards
but '(t=-1,(15<<t)==7)' is false.
..14 i++ is strictly left to right
but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..17 size_t is unsigned int
but 'sizeof(size_t)==sizeof(unsigned int)' is false.
..21a Function Arguments are evaluated right to left
but '(gobble_args(0,ltr_fun(1),ltr_fun(2),ltr_fun(3),ltr_fun(4)),ltr_result==4321)' is false.
ltr_result is 1234 in this case
..25a pointer arithmetic works outside arrays
but '(diff=&p1-&p2, &p2+diff==&p1)' is false.
..26 sizeof() does not evaluate its arguments
but '(i=10,sizeof(char[((i=20),10)]),i==10)' is false.
From what I can say with my puny test cases, you are 72% mainstream

open64 4.2.3@x86_64-suse-linux

We like to think that:
..05 int has the size of pointers
but 'sizeof(int)==sizeof(void*)' is false.
..08 overshifting is okay
but '(1<<bits_per_int)==0' is false.
..09a minus shifts backwards
but '(t=-1,(15<<t)==7)' is false.
..15 structs are packed
but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..17 size_t is unsigned int
but 'sizeof(size_t)==sizeof(unsigned int)' is false.
..21a Function Arguments are evaluated right to left
but '(gobble_args(0,ltr_fun(1),ltr_fun(2),ltr_fun(3),ltr_fun(4)),ltr_result==4321)' is false.
ltr_result is 1234 in this case
..25a pointer arithmetic works outside arrays
but '(diff=&p1-&p2, &p2+diff==&p1)' is false.
..26 sizeof() does not evaluate its arguments
but '(i=10,sizeof(char[((i=20),10)]),i==10)' is false.
From what I can say with my puny test cases, you are 75% mainstream

intel 11.1@x86_64-suse-linux

We like to think that:
..05 int has the size of pointers
but 'sizeof(int)==sizeof(void*)' is false.
..08 overshifting is okay
but '(1<<bits_per_int)==0' is false.
..09a minus shifts backwards
but '(t=-1,(15<<t)==7)' is false.
..14 i++ is strictly left to right
but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..17 size_t is unsigned int
but 'sizeof(size_t)==sizeof(unsigned int)' is false.
..21a Function Arguments are evaluated right to left
but '(gobble_args(0,ltr_fun(1),ltr_fun(2),ltr_fun(3),ltr_fun(4)),ltr_result==4321)' is false.
ltr_result is 1234 in this case
..26 sizeof() does not evaluate its arguments
but '(i=10,sizeof(char[((i=20),10)]),i==10)' is false.
From what I can say with my puny test cases, you are 75% mainstream

Turbo C++/DOS/Small Memory

We like to think that:
..09a minus shifts backwards
but '(t=-1,(15<<t)==7)' is false.
..16 malloc()=NULL means out of memory
but '(malloc(0)!=NULL)' is false.
..19-2 short<int
but 'sizeof(short)<sizeof(int)' is false.
..22 floating point is always IEEE
but 'STDC_IEC_559_is_defined' is false.
..25 pointer arithmetic works outside arrays
but '(diff=&var.int2-&var.int1, &var.int1+diff==&var.int2)' is false.
..25a pointer arithmetic works outside arrays
but '(diff=&p1-&p2, &p2+diff==&p1)' is false.
From what I can say with my puny test cases, you are 81% mainstream

Turbo C++/DOS/Medium Memory

We like to think that:
..09a minus shifts backwards
but '(t=-1,(15<<t)==7)' is false.
..10 void* can store function pointers
but 'sizeof(void*)>=sizeof(void(*)())' is false.
..16 malloc()=NULL means out of memory
but '(malloc(0)!=NULL)' is false.
..19-2 short<int
but 'sizeof(short)<sizeof(int)' is false.
..22 floating point is always IEEE
but 'STDC_IEC_559_is_defined' is false.
..25 pointer arithmetic works outside arrays
but '(diff=&var.int2-&var.int1, &var.int1+diff==&var.int2)' is false.
..25a pointer arithmetic works outside arrays
but '(diff=&p1-&p2, &p2+diff==&p1)' is false.
From what I can say with my puny test cases, you are 78% mainstream

Turbo C++/DOS/Compact Memory

We like to think that:
..05 int has the size of pointers
but 'sizeof(int)==sizeof(void*)' is false.
..09a minus shifts backwards
but '(t=-1,(15<<t)==7)' is false.
..16 malloc()=NULL means out of memory
but '(malloc(0)!=NULL)' is false.
..19-2 short<int
but 'sizeof(short)<sizeof(int)' is false.
..20 ptrdiff_t and size_t have the same size
but '(sizeof(ptrdiff_t)==sizeof(size_t))' is false.
..22 floating point is always IEEE
but 'STDC_IEC_559_is_defined' is false.
..25 pointer arithmetic works outside arrays
but '(diff=&var.int2-&var.int1, &var.int1+diff==&var.int2)' is false.
..25a pointer arithmetic works outside arrays
but '(diff=&p1-&p2, &p2+diff==&p1)' is false.
From what I can say with my puny test cases, you are 75% mainstream

cl65@Commodore PET (vice emulator)

I'll be updating these later:

Borland C++ Builder 6.0 on Windows XP

..04 a char is signed
   but 'CHAR_MIN==SCHAR_MIN' is false.
..08 overshifting is okay
   but '(1<<bits_per_int)==0' is false.
..09 overshifting is *always* okay
   but '(1<<BITS_PER_INT)==0' is false.
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..16 malloc()=NULL means out of memory
   but '(malloc(0)!=NULL)' is false.
..19-3 int<long
   but 'sizeof(int)<sizeof(long)' is false.
..22 floating point is always IEEE
   but 'STDC_IEC_559_is_defined' is false.
From what I can say with my puny test cases, you are 71% mainstream

Visual Studio Express 2010 C++ CLR, Windows 7 64bit

(must be compiled as C++ because the CLR compiler does not support pure C)

We like to think that:
..08 overshifting is okay
   but '(1<<bits_per_int)==0' is false.
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..14 i++ is structly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..19-3 int<long
   but 'sizeof(int)<sizeof(long)' is false.
..22 floating point is always IEEE
   but 'STDC_IEC_559_is_defined' is false.
From what I can say with my puny test cases, you are 78% mainstream

MINGW64 (gcc-4.5.2 prerelase)

-- http://mingw-w64.sourceforge.net/

We like to think that:
..05 int has the size of pointers
   but 'sizeof(int)==sizeof(void*)' is false.
..05a long has at least the size of pointers
   but 'sizeof(long)>=sizeof(void*)' is false.
..08 overshifting is okay
   but '(1<<bits_per_int)==0' is false.
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..14 i++ is structly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..17 size_t is unsigned int
   but 'sizeof(size_t)==sizeof(unsigned int)' is false.
..19-3 int<long
   but 'sizeof(int)<sizeof(long)' is false.
..22 floating point is always IEEE
   but 'STDC_IEC_559_is_defined' is false.
From what I can say with my puny test cases, you are 67% mainstream

64 bit Windows uses the LLP64 model: Both int and long are defined as 32-bit, which means that neither is long enough for a pointer.

avr-gcc 4.3.2 / ATmega168 (Arduino Diecimila)

The failed assumptions are:

..14 i++ is structly left to right
..16 malloc()=NULL means out of memory
..19-2 short<int
..21 Evaluation is left to right
..22 floating point is always IEEE

The Atmega168 has a 16 bit PC, but code and data are in separate address spaces. Larger Atmegas have a 22 bit PC!.

gcc 4.2.1 on MacOSX 10.6, compiled with -arch ppc

We like to think that:
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..13 The smallest bits come always first
   but '(t=0x1234,0x34==*(char*)&t)' is false.
..14 i++ is structly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..19-3 int<long
   but 'sizeof(int)<sizeof(long)' is false.
..22 floating point is always IEEE
   but 'STDC_IEC_559_is_defined' is false.
From what I can say with my puny test cases, you are 78% mainstream

26

votes

A long time ago, I was teaching C from a textbook that had

printf("sizeof(int)=%d\n", sizeof(int));

as a sample question. It failed for a student, because sizeof yields values of type size_t, not int, int on this implementation was 16 bits and size_t was 32, and it was big-endian. (The platform was Lightspeed C on 680x0-based Macintoshes. I said it was a long time ago.)

16

votes

You need to include the ++ and -- assumptions people make.

a[i++]= i;

For example, is syntactically legal, but produces varying results depending on too many things to reason out.

Any statement that has ++ (or --) and a variable which occurs more than once is a problem.

8

votes

Very interesting!

Other things I can think of it might be useful to check for:

do function pointers and data pointers exist in the same address space? (Breaks in Harvard architecture machines like DOS small mode. Don't know how you'd test for it, though.)
if you take a NULL data pointer and cast it to the appropriate integer type, does it have the numeric value 0? (Breaks on some really ancient machines --- see http://c-faq.com/null/machexamp.html.) Ditto with function pointer. Also, they may be different values.
does incrementing a pointer past the end of its corresponding storage object, and then back again, cause sensible results? (I don't know of any machines this actually breaks on, but I believe the C spec does not allow you to even think about pointers that don't point to either (a) the contents of an array or (b) the element immediately after the array or (c) NULL. See http://c-faq.com/aryptr/non0based.html.)
does comparing two pointers to different storage objects with < and > produce consistent results? (I can imagine this breaking on exotic segment-based machines; the spec forbids such comparisons, so the compiler would be entitled to compare the offset part of the pointer only, and not the segment part.)

Hmm. I'll try and think of some more.

Edit: Added some clarifying links to the excellent C FAQ.

5

votes

I think you should make an effort to distinguish between two very different classes of "incorrect" assumptions. A good half (right shift and sign extension, ASCII-compatible encoding, memory is linear, data and function pointers are compatible, etc.) are pretty reasonable assumptions for most C coders to make, and might even be included as part of the standard if C were being designed today and if we didn't have legacy IBM junk grandfathered-in. The other half (things related to memory aliasing, behavior of library functions when input and output memory overlap, 32-bit assumptions like that pointers fit in int or that you can use malloc without a prototype, that calling convention is identical for variadic and non-variadic functions, ...) either conflict with optimizations modern compilers want to perform or with migration to 64-bit machines or other new technology.

5

votes

Here's a fun one: What's wrong with this function?

float sum(unsigned int n, ...)
{
    float v = 0;
    va_list ap;
    va_start(ap, n);
    while (n--)
        v += va_arg(ap, float);
    va_end(ap);
    return v;
}

[Answer (rot13): Inevnqvp nethzragf borl gur byq X&E cebzbgvba ehyrf, juvpu zrnaf lbh pnaabg hfr 'sybng' (be 'pune' be 'fubeg') va in_net! Naq gur pbzcvyre vf erdhverq abg gb gerng guvf nf n pbzcvyr-gvzr reebe. (TPP qbrf rzvg n jneavat, gubhtu.)]

5

votes

EXPECT("## pow() gives exact results for integer arguments", pow(2, 4) == 16);

Another one is about text mode in fopen. Most programmers assume that either text and binary are the same (Unix) or that text mode adds \r characters (Windows). But C has been ported to systems that use fixed-width records, on which fputc('\n', file) on a text file means to add spaces or something until the file size is a multiple of the record length.

And here are my results:

gcc (Ubuntu 4.4.3-4ubuntu5) 4.4.3 on x86-64

We like to think that:
..05 int has the size of pointers
   but 'sizeof(int)==sizeof(void*)' is false.
..08 overshifting is okay
   but '(1<<bits_per_int)==0' is false.
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..14 i++ is strictly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..17 size_t is unsigned int
   but 'sizeof(size_t)==sizeof(unsigned int)' is false.
From what I can say with my puny test cases, you are 78% mainstream

4

votes

Some of them can't easily be tested from inside C because the program is likely to crash on the implementations where the assumption doesn't hold.

"It's ok to do anything with a pointer-valued variable. It only needs to contain a valid pointer value if you dereference it."

void noop(void *p); /* A no-op function that the compiler doesn't know to optimize away */
int main () {
    char *p = malloc(1);
    free(p);
    noop(p); /* may crash in implementations that verify pointer accesses */
    noop(p - 42000); /* and if not the previous instruction, maybe this one */
}

Same with integral and floating point types (other than unsigned char), which are allowed to have trap representations.

"Integer calculations wrap around. So this program prints a large negative integer."

#include <stdio.h>
int main () {
    printf("%d\n", INT_MAX+1); /* may crash due to signed integer overflow */
    return 0;
}

(C89 only.) "It's ok to fall off the end of main."

#include <stdio.h>
int main () {
    puts("Hello.");
} /* The status code is 7 on many implementations. */

4

votes

Well the classic portability assumptions not meantioned yet are

assumptions about size of integral types
endianness

4

votes

Discretization errors due to floating point representation. For example, if you use the standard formula to solve quadratic equations, or finite differences to approximate derivatives, or the standard formula to calculate variances, precision will be lost due to the calculation of differences between similiar numbers. The Gauß algorithm to solve linear systems is bad because rounding errors accumulate, thus one uses QR or LU decomposition, Cholesky decomposition, SVD, etc. Addition of floating point numbers is not associative. There are denormal, infinite and NaN values. a + b − a ≠ b.
Strings: Difference between characters, code points, and code units. How Unicode is implemented on the various operating systems; Unicode encodings. Opening a file with an arbitrary Unicode file name is not possible with C++ in a portable way.
Race conditions, even without threading: if you test whether a file exists, the result could become invalid at any time.
ERROR_SUCCESS = 0

4

votes

Include a check for integer sizes. Most people assume that an int is bigger than a short is bigger than a char. However, these might all be false: sizeof(char) < sizeof(int); sizeof(short) < sizeof(int); sizeof(char) < sizeof(short)

This code might fail (crashes to unaligned access)

unsigned char buf[64];

int i = 234;
int *p = &buf[1];
*p = i;
i = *p;

3

votes

A couple of things about built-in data types:

char and signed char are actually two distinct types (unlike int and signed int which refer to the same signed integer type).
signed integers are not required to use two's complement. Ones's complement and sign+magnitude are also valid representations of negative numbers. This makes bit operations involving negative numbers implementation-defined.
If you assign an out-of-range integer to a signed integer variable, the behaviour is implementation-defined.
In C90, -3/5 could return 0 or -1. Rounding towards zero in case one operand was negative is only guaranteed in C99 upwards and C++0x upwards.
There are no exact size guarantees for the built-in types. The standard only covers minimal requirements such as an int has at least 16 bits, a long has at least 32 bits, a long long has at least 64 bits. A float can at least represent 6 most significant decimal digits correctly. A double can at least represent 10 most significant decimal digits correctly.
IEEE 754 is not mandatory for representing floating point numbers.

Admittedly, on most machines we'll have two's complement and IEEE 754 floats.

3

votes

How about this one:

No data pointer can ever be the same as a valid function pointer.

This is TRUE for all flat models, MS-DOS TINY, LARGE, and HUGE models, false for MS-DOS SMALL model, and almost always false for MEDIUM and COMPACT models (depends on load address, you will need a really old DOS to make it true).

I can't write a test for this

And worse: pointers casted to ptrdiff_t may be compared. This not true for MS-DOS LARGE model (the only difference between LARGE and HUGE is HUGE adds compiler code to normalize pointers).

I can't write a test because the environment where this bombs hard won't allocate a buffer greater than 64K so the code that demonstrates it would crash on other platforms.

This particular test would pass on one now-defunct system (notice it depends on the internals of malloc):

  char *ptr1 = malloc(16);
  char *ptr2 = malloc(16);
  if ((ptrdiff_t)ptr2 - 0x20000 == (ptrdiff_t)ptr1)
      printf("We like to think that unrelated pointers are equality comparable when cast to the appropriate integer, but they're not.");

3

votes

EDIT: Updated to the last version of the program

Solaris-SPARC

gcc 3.4.6 in 32 bit

We like to think that:
..08 overshifting is okay
   but '(1<<bits_per_int)==0' is false.
..09 overshifting is *always* okay
   but '(1<<BITS_PER_INT)==0' is false.
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..13 The smallest bits always come first
   but '(t=0x1234,0x34==*(char*)&t)' is false.
..14 i++ is strictly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..19-3 int<long
   but 'sizeof(int)<sizeof(long)' is false.
..22 floating point is always IEEE
   but 'STDC_IEC_559_is_defined' is false.
From what I can say with my puny test cases, you are 72% mainstream

gcc 3.4.6 in 64 bit

We like to think that:
..05 int has the size of pointers
   but 'sizeof(int)==sizeof(void*)' is false.
..08 overshifting is okay
   but '(1<<bits_per_int)==0' is false.
..09 overshifting is *always* okay
   but '(1<<BITS_PER_INT)==0' is false.
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..13 The smallest bits always come first
   but '(t=0x1234,0x34==*(char*)&t)' is false.
..14 i++ is strictly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..17 size_t is unsigned int
   but 'sizeof(size_t)==sizeof(unsigned int)' is false.
..22 floating point is always IEEE
   but 'STDC_IEC_559_is_defined' is false.
From what I can say with my puny test cases, you are 68% mainstream

and with SUNStudio 11 32 bit

We like to think that:
..08 overshifting is okay
   but '(1<<bits_per_int)==0' is false.
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..13 The smallest bits always come first
   but '(t=0x1234,0x34==*(char*)&t)' is false.
..14 i++ is strictly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..19-3 int<long
   but 'sizeof(int)<sizeof(long)' is false.
From what I can say with my puny test cases, you are 79% mainstream

and with SUNStudio 11 64 bit

We like to think that:
..05 int has the size of pointers
   but 'sizeof(int)==sizeof(void*)' is false.
..08 overshifting is okay
   but '(1<<bits_per_int)==0' is false.
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..13 The smallest bits always come first
   but '(t=0x1234,0x34==*(char*)&t)' is false.
..14 i++ is strictly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..17 size_t is unsigned int
   but 'sizeof(size_t)==sizeof(unsigned int)' is false.
From what I can say with my puny test cases, you are 75% mainstream

2

votes

You can use text-mode (fopen("filename", "r")) to read any sort of text file.

While this should in theory work just fine, if you also use ftell() in your code, and your text file has UNIX-style line-endings, in some versions of the Windows standard library, ftell() will often return invalid values. The solution is to use binary mode instead (fopen("filename", "rb")).

1

votes

gcc 3.3.2 on AIX 5.3 (yeah, we need to update gcc)

We like to think that:
..04 a char is signed
   but 'CHAR_MIN==SCHAR_MIN' is false.
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..13 The smallest bits come always first
   but '(t=0x1234,0x34==*(char*)&t)' is false.
..14 i++ is structly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..16 malloc()=NULL means out of memory
   but '(malloc(0)!=NULL)' is false.
..19-3 int<long
   but 'sizeof(int)<sizeof(long)' is false.
..22 floating point is always IEEE
   but 'STDC_IEC_559_is_defined' is false.
From what I can say with my puny test cases, you are 71% mainstream

1

votes

An assumption that some may do in C++ is that a struct is limited to what it can do in C. The fact is that, in C++, a struct is like a class except that it has everything public by default.

C++ struct:

struct Foo
{
  int number1_;  //this is public by default


//this is valid in C++:    
private: 
  void Testing1();
  int number2_;

protected:
  void Testing2();
};

1

votes

Standard math functions on different systems don't give identical results.

1

votes

Visual Studio Express 2010 on 32-bit x86.

Z:\sandbox>cl testtoy.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.30319.01 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

testtoy.c
testtoy.c(54) : warning C4293: '<<' : shift count negative or too big, undefined
 behavior
Microsoft (R) Incremental Linker Version 10.00.30319.01
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:testtoy.exe
testtoy.obj

Z:\sandbox>testtoy.exe
We like to think that:
..08 overshifting is okay
   but '(1<<bits_per_int)==0' is false.
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..14 i++ is structly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..19-3 int<long
   but 'sizeof(int)<sizeof(long)' is false.
..22 floating point is always IEEE
   but 'STDC_IEC_559_is_defined' is false.
From what I can say with my puny test cases, you are 78% mainstream

1

votes

Via Codepad.org (C++: g++ 4.1.2 flags: -O -std=c++98 -pedantic-errors -Wfatal-errors -Werror -Wall -Wextra -Wno-missing-field-initializers -Wwrite-strings -Wno-deprecated -Wno-unused -Wno-non-virtual-dtor -Wno-variadic-macros -fmessage-length=0 -ftemplate-depth-128 -fno-merge-constants -fno-nonansi-builtins -fno-gnu-keywords -fno-elide-constructors -fstrict-aliasing -fstack-protector-all -Winvalid-pch) .

Note that Codepad did not have stddef.h. I removed test 9 due to codepad using warnings as errors. I also renamed the count variable since it was already defined for some reason.

We like to think that:
..08 overshifting is okay
   but '(1<<bits_per_int)==0' is false.
..14 i++ is structly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..19-3 int<long
   but 'sizeof(int)<sizeof(long)' is false.
From what I can say with my puny test cases, you are 84% mainstream

1

votes

How about right-shifting by excessive amounts--is that allowed by the standard, or worth testing?

Does Standard C specify the behavior of the following program:

void print_string(char *st)
{
  char ch;
  while((ch = *st++) != 0)
    putch(ch);  /* Assume this is defined */
}
int main(void)
{
  print_string("Hello");
  return 0;
}

On at least one compiler I use, that code will fail unless the argument to print_string is a "char const *". Does the standard permit such a restriction?

Some systems allow one to produce pointers to unaligned 'int's and others don't. Might be worth testing.

0

votes

FYI, For those who have to translate their C skills to Java, here are a few gotchas.

EXPECT("03 a char is 8 bits",CHAR_BIT==8);
EXPECT("04 a char is signed",CHAR_MIN==SCHAR_MIN);

In Java, char is 16-bit and signed. byte is 8-bit and signed.

/* not true for Windows-64 */
EXPECT("05a long has at least the size of pointers",sizeof(long)>=sizeof(void*));

long is always 64-bit, references can be 32-bit or 64-bit (if you have more than an app with more than 32 GB) 64-bit JVMs typically use 32-bit references.

EXPECT("08 overshifting is okay",(1<<bits_per_int)==0);
EXPECT("09 overshifting is *always* okay",(1<<BITS_PER_INT)==0);

The shift is masked so that i << 64 == i == i << -64, i << 63 == i << -1

EXPECT("13 The smallest bits always come first",(t=0x1234,0x34==*(char*)&t));

ByteOrder.nativeOrder() can be BIG_ENDIAN or LITTLE_ENDIAN

EXPECT("14 i++ is strictly left to right",(i=0,a[i++]=i,a[0]==1));

i = i++ never changes i

/* suggested by David Thornley */
EXPECT("17 size_t is unsigned int",sizeof(size_t)==sizeof(unsigned int));

The size of collections and arrays is always 32-bit regardless of whether the JVM is 32-bit or 64-bit.

EXPECT("19-1 char<short",sizeof(char)<sizeof(short));
EXPECT("19-2 short<int",sizeof(short)<sizeof(int));
EXPECT("19-3 int<long",sizeof(int)<sizeof(long));

char is 16-bit, short is 16-bit, int is 32-bit and long is 64-bit.

What belongs in an educational tool to demonstrate the unwarranted assumptions people make in C/C++?

23 Answers