5. Common pitfalls when using arrays.
5.1 Pitfall: Trusting type-unsafe linking.
OK, you’ve been told, or have found out yourself, that globals (namespace
scope variables that can be accessed outside the translation unit) are
Evil™. But did you know how truly Evil™ they are? Consider the
program below, consisting of two files [main.cpp] and [numbers.cpp]:
// [main.cpp]
#include <iostream>

extern int* numbers;

int main()
{
    using namespace std;
    for( int i = 0; i < 42; ++i )
    {
        cout << (i > 0? ", " : "") << numbers[i];
    }
    cout << endl;
}

// [numbers.cpp]
int numbers[42] = {1, 2, 3, 4, 5, 6, 7, 8, 9};
In Windows 7 this compiles and links fine with both MinGW g++ 4.4.1 and
Visual C++ 10.0.
Since the types don't match, the program crashes when you run it.
In-the-formal explanation: the program has Undefined Behavior (UB), and instead
of crashing it can therefore just hang, or perhaps do nothing, or it
can send threatening e-mails to the presidents of the USA, Russia, India,
China and Switzerland, and make Nasal Daemons fly out of your nose.
In-practice explanation: in main.cpp the array is treated as a pointer, placed
at the same address as the array. For a 32-bit executable this means that the first
int value in the array is treated as a pointer. I.e., in main.cpp the numbers
variable contains, or appears to contain, (int*)1. This causes the
program to access memory down at the very bottom of the address space, which is
conventionally reserved and trap-causing. Result: you get a crash.
The compilers are fully within their rights to not diagnose this error,
because C++11 §3.5/10 says, about the requirement of compatible types
for the declarations,
[N3290 §3.5/10]
A violation of this rule on type identity does not require a diagnostic.
The same paragraph details the variation that is allowed:
… declarations for an array object can specify array types that
differ by the presence or absence of a major array bound (8.3.4).
This allowed variation does not include declaring a name as an array in one
translation unit, and as a pointer in another translation unit.
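One common way to avoid this pitfall is to put the declaration in a header that is included both by the defining file and by every file that uses the variable, so that the compiler sees declaration and definition together and can diagnose a mismatch. Below is a minimal single-file sketch of that arrangement. The header name numbers.h and the function sum_of_numbers are hypothetical, made up for illustration:

```cpp
// --- numbers.h (hypothetical shared header) ---
extern int numbers[42];         // Declares an array, matching the definition.

// --- numbers.cpp, which would #include "numbers.h" ---
int numbers[42] = {1, 2, 3, 4, 5, 6, 7, 8, 9};

// --- a client file, which would also #include "numbers.h" ---
int sum_of_numbers()            // Hypothetical helper, for illustration only.
{
    int sum = 0;
    for( int i = 0; i < 42; ++i ) { sum += numbers[i]; }
    return sum;                 // 1 + 2 + ... + 9, plus 33 zeros.
}
```

With this arrangement, changing the header's declaration to extern int* numbers; would give a conflicting-declaration error when numbers.cpp is compiled, instead of a silent mismatch at link time.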
5.2 Pitfall: Doing premature optimization (memset & friends).
Not written yet
5.3 Pitfall: Using the C idiom to get number of elements.
With deep C experience it’s natural to write …
#define N_ITEMS( array ) (sizeof( array )/sizeof( array[0] ))
Since an array decays to pointer to first element where needed, the
expression sizeof(a)/sizeof(a[0]) can also be written as
sizeof(a)/sizeof(*a). It means the same, and no matter how it’s
written it is the C idiom for finding the number of elements of an array.
Main pitfall: the C idiom is not typesafe. For example, the code
…
#include <stdio.h>

#define N_ITEMS( array ) (sizeof( array )/sizeof( *array ))

void display( int const a[7] )
{
    int const n = N_ITEMS( a ); // Oops.
    printf( "%d elements.\n", n );
}

int main()
{
    int const moohaha[] = {1, 2, 3, 4, 5, 6, 7};

    // The cast makes the argument type match the %d format specifier.
    printf( "%d elements, calling display...\n", (int) N_ITEMS( moohaha ) );
    display( moohaha );
}
passes a pointer to N_ITEMS, and therefore most likely produces a wrong
result. Compiled as a 32-bit executable in Windows 7 it produces …
7 elements, calling display...
1 elements.
- The compiler rewrites int const a[7] to just int const a[].
- The compiler rewrites int const a[] to int const* a.
- N_ITEMS is therefore invoked with a pointer.
- For a 32-bit executable sizeof(array) (size of a pointer) is then 4.
- sizeof(*array) is equivalent to sizeof(int), which for a 32-bit executable is also 4.
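The parameter adjustment described in the list above can be checked directly by the compiler. Here is a small sketch, assuming C++11 or later for static_assert and decltype. The function name parameter_is_pointer is made up for this illustration:

```cpp
#include <type_traits>

// These two spellings denote one and the same function type.
static_assert(
    std::is_same< void( int const[7] ), void( int const* ) >::value,
    "an array parameter is adjusted to a pointer parameter"
    );

// Hypothetical helper: reports whether its formal argument is a pointer.
bool parameter_is_pointer( int const a[7] )
{
    (void) a;   // Only the declared type of a is inspected.
    return std::is_same< decltype( a ), int const* >::value;
}
```

Calling parameter_is_pointer with an array of any length, not just 7 elements, compiles fine, which shows that the 7 in the parameter declaration is ignored.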
In order to detect this error at run time you can do …
#include <assert.h>
#include <typeinfo>

#define N_ITEMS( array ) (                                  \
    assert((                                                \
        "N_ITEMS requires an actual array as argument",     \
        typeid( array ) != typeid( &*array )                \
        )),                                                 \
    sizeof( array )/sizeof( *array )                        \
    )
7 elements, calling display...
Assertion failed: ( "N_ITEMS requires an actual array as argument", typeid( a ) != typeid( &*a ) ), file runtime_detection.cpp, line 16
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
The runtime error detection is better than no detection, but it wastes a little
processor time, and perhaps much more programmer time. Detection at compile time
is better! And if you're happy to not support arrays of local types with C++98,
then you can get it like this:
#include <stddef.h>
typedef ptrdiff_t Size;
template< class Type, Size n >
Size n_items( Type (&)[n] ) { return n; }
#define N_ITEMS( array ) n_items( array )
Compiling this definition substituted into the first complete program, with g++,
I got …
M:\count> g++ compile_time_detection.cpp
compile_time_detection.cpp: In function 'void display(const int*)':
compile_time_detection.cpp:14: error: no matching function for call to 'n_items(const int*&)'
M:\count> _
How it works: the array is passed by reference to n_items, and so it does
not decay to pointer to first element, and the function can just return the
number of elements specified by the type.
With C++11 you can use this also for arrays of local type, and it's the type safe
C++ idiom for finding the number of elements of an array.
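As a small illustration of the local type case, here is a sketch that assumes a C++11 or later compiler. The names count_local_points and Point are invented for the example:

```cpp
#include <stddef.h>

typedef ptrdiff_t Size;

template< class Type, Size n >
Size n_items( Type (&)[n] ) { return n; }

// Hypothetical example function, for illustration only.
Size count_local_points()
{
    struct Point { int x, y; };     // A local type.
    Point const points[] = { {1, 2}, {3, 4}, {5, 6} };

    // A local type as template argument is OK with C++11 and later,
    // but not with C++98 or C++03.
    return n_items( points );
}
```

Here the array has three elements, so the function returns 3. Passing a pointer instead of an array fails to compile, as in the g++ output above.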
5.4 C++11 & C++14 pitfall: Using a constexpr array size function.
With C++11 and later it's natural, but as you'll see dangerous, to
replace the C++03 function
typedef ptrdiff_t Size;
template< class Type, Size n >
Size n_items( Type (&)[n] ) { return n; }
with
using Size = ptrdiff_t;
template< class Type, Size n >
constexpr auto n_items( Type (&)[n] ) -> Size { return n; }
where the significant change is the use of constexpr, which allows
this function to produce a compile time constant.
For example, in contrast to the C++03 function, such a compile time constant
can be used to declare an array of the same size as another:
// Example 1
void foo()
{
    int const x[] = {3, 1, 4, 1, 5, 9, 2, 6, 5, 4};
    constexpr Size n = n_items( x );
    int y[n] = {};
    // Using y here.
}
But consider this code using the constexpr version:
// Example 2
template< class Collection >
void foo( Collection const& c )
{
    constexpr int n = n_items( c ); // Not in C++14!
    // Use c here
}

auto main() -> int
{
    int x[42];
    foo( x );
}
The pitfall: as of July 2015 the above compiles with MinGW-64 5.1.0 with
-pedantic-errors, and, testing with the online compilers at gcc.godbolt.org/,
also with clang 3.0 and clang 3.2, but not with clang 3.3, 3.4.1, 3.5.0, 3.5.1,
3.6 (rc1) or 3.7 (experimental). And important for the Windows platform, it does
not compile with Visual C++ 2015. The reason is a C++11/C++14 statement about use
of references in constexpr expressions:
A conditional-expression e is a core constant expression unless the evaluation
of e, following the rules of the abstract machine (1.9), would evaluate one of the
following expressions:
⋮
- an id-expression that refers to a variable or data member of reference type
  unless the reference has a preceding initialization and either
  - it is initialized with a constant expression or
  - it is a non-static data member of an object whose lifetime began within
    the evaluation of e;
One can always write the more verbose
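// Example 3 -- limited
#include <stddef.h>
#include <type_traits>      // std::extent

using Size = ptrdiff_t;

template< class Collection >
void foo( Collection const& c )
{
    // decltype( c ) is a reference type, for which std::extent yields 0,
    // so the array type Collection itself is queried here.
    constexpr Size n = std::extent< Collection >::value;
    // Use c here
}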
… but this fails when Collection is not a raw array.
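To make "fails" concrete: std::extent does not produce a compile error for a non-array type, it just yields 0, so the size silently comes out wrong. The checks below are a sketch of that behavior, assuming C++11 or later:

```cpp
#include <array>
#include <type_traits>

// A raw array type: the bound is reported.
static_assert( std::extent< int[42] >::value == 42, "raw array" );

// Not a raw array: std::extent yields 0 instead of 43.
static_assert( std::extent< std::array<int, 43> >::value == 0, "std::array" );

// A reference to an array is not an array type either, so this is also 0.
static_assert( std::extent< int (&)[42] >::value == 0, "reference to array" );
```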
To deal with collections that can be non-arrays one needs the overloadability of an
n_items function, but also, for compile time use one needs a compile time
representation of the array size. And the classic C++03 solution, which works fine
also in C++11 and C++14, is to let the function report its result not as a value
but via its function result type. For example like this:
// Example 4 - OK (not ideal, but portable and safe)
#include <array>
#include <stddef.h>

using Size = ptrdiff_t;

template< Size n >
struct Size_carrier
{
    char sizer[n];
};

template< class Type, Size n >
auto static_n_items( Type (&)[n] )
    -> Size_carrier<n>;
// No implementation, is used only at compile time.

template< class Type, size_t n >        // size_t for g++
auto static_n_items( std::array<Type, n> const& )
    -> Size_carrier<n>;
// No implementation, is used only at compile time.

#define STATIC_N_ITEMS( c ) \
    static_cast<Size>( sizeof( static_n_items( c ).sizer ) )

template< class Collection >
void foo( Collection const& c )
{
    constexpr Size n = STATIC_N_ITEMS( c );
    // Use c here
    (void) c;
}

auto main() -> int
{
    int x[42];
    std::array<int, 43> y;
    foo( x );
    foo( y );
}
About the choice of return type for static_n_items: this code doesn't use
std::integral_constant because with std::integral_constant the result is
represented directly as a constexpr value, reintroducing the original problem.
Instead of a Size_carrier class one can let the function directly return a
reference to an array. However, not everybody is familiar with that syntax.
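For those who want to see that syntax, here is a sketch of the reference-to-array variant, assuming C++11 or later. It keeps the names static_n_items and STATIC_N_ITEMS from Example 4, while the function count_via_reference is made up for this illustration:

```cpp
#include <array>
#include <stddef.h>

using Size = ptrdiff_t;

// Returns a reference to an array of n chars, so that sizeof applied to
// a call yields n. No implementation, is used only at compile time.
template< class Type, Size n >
auto static_n_items( Type (&)[n] )
    -> char (&)[n];

template< class Type, size_t n >
auto static_n_items( std::array<Type, n> const& )
    -> char (&)[n];

#define STATIC_N_ITEMS( c ) \
    static_cast<Size>( sizeof( static_n_items( c ) ) )

// Hypothetical example function, for illustration only.
template< class Collection >
Size count_via_reference( Collection const& c )
{
    // c appears only as an unevaluated sizeof operand, so the reference
    // is never evaluated in the constant expression.
    constexpr Size n = STATIC_N_ITEMS( c );
    (void) c;
    return n;
}
```

The only difference from Example 4 is that sizeof is applied to the returned array reference itself, instead of to a sizer member of a carrier class.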
About the naming: part of this solution to the constexpr-invalid-due-to-reference
problem is to make the choice of compile time constant explicit.
Hopefully the oops-there-was-a-reference-involved-in-your-constexpr issue will be
fixed with C++17, but until then a macro like the STATIC_N_ITEMS above yields
portability, e.g. to the clang and Visual C++ compilers, retaining type safety.
Related: macros do not respect scopes, so to avoid name collisions it can be a
good idea to use a name prefix, e.g. MYLIB_STATIC_N_ITEMS.
std::arrays, std::vectors and gsl::spans - I would frankly expect an FAQ on how to use arrays in C++ to say "By now, you can start considering just, well, not using them." – einpoklum