4
votes

I'm re-learning C++, and I need to use memory-mapped files. I decided to use Boost, since it seems to be a solid library.

I created a memory-mapped file that maps to an array of doubles and wrote to the first double in the array. On disk, the file contained data in the first eight bytes (the double I wrote), and the rest was zeroed. This surprised me, because when I obtain a pointer to a memory location in C++, I generally have to assume that it contains garbage.

Do I have any guarantee that newly created memory-mapped files will be zeroed (at least on Linux)? I couldn't find any reference for that.

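#include <boost/iostreams/device/mapped_file.hpp>
#include <boost/test/unit_test.hpp>

BOOST_AUTO_TEST_CASE(OpenMMapFile) {
    boost::iostreams::mapped_file file;
    boost::iostreams::mapped_file_params params;

    params.path = "/tmp/mmaptest-1";
    params.mode = std::ios::in | std::ios::out;
    // A non-zero new_file_size makes Boost create the file with this size.
    params.new_file_size = 10 * sizeof(double);

    file.open(params);

    // data() returns char*; view the mapping as an array of doubles.
    double* data = reinterpret_cast<double*>(file.data());
    data[0] = 12;

    file.close();
}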

Here is the file contents:

cat /tmp/mmaptest-1 | base64
AAAAAAAAKEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=

EDIT

As @Zan pointed out, Boost actually uses ftruncate to resize memory-mapped files, so zeroing is guaranteed (at least on Linux).

2
On Unix/Linux the OS gives a security guarantee that data is not leaked: when a file or an area of memory is allocated, a process must not be able to see what a previous file or process left on the disk or in memory. The easiest way to meet this guarantee is to zero the memory, so that is what the OS does. - ctrl-alt-delor
richard is comparing linux to Windows95? - user4590120

2 Answers

4
votes

A memory mapped file contains whatever was in the file.

If it is a new file, it has been extended to the right size and the extension will contain zeros. Extending a file is usually done with the ftruncate function.

The ftruncate manpage says:

   If  the  file  previously  was larger than this size, the extra data is
   lost.  If the file previously was shorter,  it  is  extended,  and  the
   extended part reads as null bytes ('\0').

So yes, zeros are guaranteed.
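
If you want to check this without Boost, here is a minimal sketch using the plain POSIX calls (open, ftruncate, mmap). It assumes Linux or another POSIX system; the helper name `fresh_mapping_is_zeroed` is made up for illustration, and this is not necessarily how Boost does it internally.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>      // open
#include <sys/mman.h>   // mmap, munmap
#include <sys/types.h>  // off_t
#include <unistd.h>     // ftruncate, close

// Hypothetical helper: creates (or truncates) the file at `path`, extends it
// with ftruncate so it can hold `count` doubles, maps it, and reports whether
// every byte of the fresh mapping reads as zero.
// Returns false if any system call fails.
bool fresh_mapping_is_zeroed(const char* path, std::size_t count) {
    assert(count > 0);
    const std::size_t bytes = count * sizeof(double);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1) return false;

    // The file is empty at this point, so ftruncate extends it; per the
    // manpage the extended part reads as null bytes.
    if (ftruncate(fd, static_cast<off_t>(bytes)) == -1) {
        close(fd);
        return false;
    }

    void* addr = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (addr == MAP_FAILED) return false;

    const unsigned char* p = static_cast<const unsigned char*>(addr);
    bool all_zero = true;
    for (std::size_t i = 0; i < bytes; ++i) {
        if (p[i] != 0) {
            all_zero = false;
            break;
        }
    }

    munmap(addr, bytes);
    return all_zero;
}
```

Calling `fresh_mapping_is_zeroed("/tmp/mmaptest-2", 10)` (the path is just an example) should return true on any system that follows the manpage text quoted above.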

-1
votes

I think Boost is zeroing the file for you so that the mapped address space is really backed by disk space rather than by a sparse file. This is slow, especially if you want to create a large address space up front that might never be fully used, just so that you can allocate many objects in it. They do this because there is no usable way on Unix systems to handle running out of disk space when writing to a memory-mapped sparse file (ignoring, for the moment, such sick solutions as setjmp/longjmp). You are still exposed if some other process truncates the file on disk, in which case the aforementioned problem rears its head again.

Unfortunately, they also do this (allocating disk space that matches the size of the address space instead of using a sparse file) on Windows, where structured exception handling exists.
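
To make the sparse versus allocated distinction concrete, here is a rough sketch of the two ways to give a new file its length on a POSIX system. It is not taken from Boost's source; the helper name `create_backing_file` is made up, and whether ftruncate really produces a sparse file depends on the file system.

```cpp
#include <fcntl.h>      // open, posix_fallocate
#include <sys/stat.h>   // fstat
#include <sys/types.h>  // off_t
#include <unistd.h>     // ftruncate, close

// Hypothetical helper: creates `path` and gives it a length of `bytes`.
// With reserve == true it asks the file system to allocate the blocks now
// (posix_fallocate); otherwise it only sets the length (ftruncate), which
// many file systems store as a sparse file.
// Returns the size reported by fstat, or -1 on failure.
long long create_backing_file(const char* path, off_t bytes, bool reserve) {
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1) return -1;

    // posix_fallocate returns an error number directly, ftruncate returns -1;
    // in both cases zero means success.
    int rc = reserve ? posix_fallocate(fd, 0, bytes) : ftruncate(fd, bytes);
    if (rc != 0) {
        close(fd);
        return -1;
    }

    struct stat st;
    if (fstat(fd, &st) == -1) {
        close(fd);
        return -1;
    }
    close(fd);
    return static_cast<long long>(st.st_size);
}
```

Both variants report the same length and both read back as zeros. The difference is that with `reserve == true` an out-of-space condition shows up as an error from posix_fallocate at creation time instead of later, while writing through the mapping.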