11
votes

The simple code below, which runs a malloc()/free() sequence in 100 threads, crashes on every Windows OS I have tried to run it on.

Any help would be greatly appreciated.

Maybe using some compiler directive can help?

We build the executable in VS2017 in Release/x64; it crashes on every Windows platform I tried, after several minutes of running.

I tried building with VS2015 as well, but it didn't help.

The same code on Linux works fine.

Actually, the problem is more serious than it looks: our server code crashes several times a day in a production environment for no apparent reason (once the number of user calls exceeds a certain value). We tried to nail down the issue and created the simplest solution that reproduces the problem.

Archive with VS project is here.

VS says that the command line is:

/Yu"stdafx.h" /GS /GL /W3 /Gy /Zc:wchar_t /Zi /Gm- /O2 /sdl 
/Fd"x64\Release\vc140.pdb" /Zc:inline /fp:precise /D "NDEBUG"
/D "_CONSOLE" /D "_UNICODE" /D "UNICODE" /errorReport:prompt /WX- /Zc:forScope /Gd
/Oi /MD /Fa"x64\Release\" /EHsc /nologo /Fo"x64\Release\" /Fp"x64\Release\MallocTest.pch" 

Code:

#include "stdafx.h"
#include <iostream>
#include <thread>
#include <conio.h>
#include <cstdlib>  // malloc, free

using namespace std;

#define MAX_THREADS 100

void task(void) {
    while (true) {
        char *buffer;
        buffer = (char *)malloc(4096);
        if (buffer == NULL) {
            cout << "malloc error" << endl;
        }
        free(buffer);
    }
}

int main(int argc, char** argv) {    
    thread some_threads[MAX_THREADS];

    for (int i = 0; i < MAX_THREADS; i++) {
        some_threads[i] = thread(task);
    }

    for (int i = 0; i < MAX_THREADS; i++) {
        some_threads[i].join();
    }

    _getch();
    return 0;
}
2
Are you linking with a thread-safe version of the runtime library? – molbdnilo
@molbdnilo "Are you linking with a thread-safe version of the runtime library?" I guess yes, because there is the /MD flag in the command line: /Yu"stdafx.h" /GS /GL /W3 /Gy /Zc:wchar_t /Zi /Gm- /O2 /sdl /Fd"x64\Release\vc140.pdb" /Zc:inline /fp:precise /D "NDEBUG" /D "_CONSOLE" /D "_UNICODE" /D "UNICODE" /errorReport:prompt /WX- /Zc:forScope /Gd /Oi /MD /Fa"x64\Release\" /EHsc /nologo /Fo"x64\Release\" /Fp"x64\Release\MallocTest.pch" – peterg
I was able to reproduce this. The error happens even if new is used (it actually still calls the same _malloc_base). The call stack ends with ntdll.dll!RtlpLowFragHeapAllocFromContext(); ntdll.dll!RtlpAllocateHeapInternal(); ucrtbase.dll!_malloc_base() – user7860670
It's definitely not a duplicate of stackoverflow.com/questions/4826479/…. I could reproduce this using the multithreaded CRT (BTW, on my VS2017 there is no single-threaded CRT at all). It's a crash in ntdll; call stack: ntdll.dll!RtlpLowFragHeapAllocFromContext(); ntdll.dll!RtlpAllocateHeapInternal(); ucrtbase.dll!_malloc_base() – Jabberwocky
You made my day. Reproduced the issue with VC2017 and 100 threads. It shows an access violation in HeapAlloc (somewhere in ntdll.dll)... There is some stupid error... or you've found a bug in MS C++ or MS Windows... – Pavlo K

2 Answers

2
votes

This is an issue with the Windows Low Fragmentation Heap. It was fixed in OS build 19041 (the May 2020 update).

1
votes

Nothing in your remarkably small MCVE indicates a programming error: malloc() and free() are supposed to be thread safe, and so are the methods invoked on cout. The program is not designed to ever stop, so it appears to be a fine stress test for malloc() in a multi-threaded context.

Note however that if malloc() fails, it is questionable to try and report the error to cout, which might make further calls to malloc() for buffering. Reporting the error to cerr or making cout unbuffered is advisable. In any case malloc() failure should not cause a crash, even in the stream methods.
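As a minimal sketch of that advice, one malloc()/free() round could report failure through the C stdio stderr stream instead of cout. The helper name alloc_free_once is hypothetical and not part of the original program; the only assumption is the standard guarantee that stderr is not fully buffered at startup.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical helper (not in the original post): performs one
// malloc()/free() round and reports a failure without using cout.
bool alloc_free_once(std::size_t size) {
    char *buffer = static_cast<char *>(std::malloc(size));
    if (buffer == nullptr) {
        // stderr is not fully buffered by default, so this message does
        // not wait in a large stream buffer the way cout output may.
        std::fputs("malloc error\n", stderr);
        return false;
    }
    std::free(buffer);
    return true;
}
```

The body of task() could then be reduced to a loop that calls alloc_free_once(4096) and stops when it returns false. This only changes how a failure is reported; it does not address the crash itself.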

It looks like you found a bug in the runtime library you link to on the VS target platform. It would be interesting to track the memory usage of the program up to the crash. A steady increase in memory usage would indicate some problems in the runtime library too. The program never allocates more than MAX_THREADS blocks of 4K at a time, so the memory use should remain quite low, below 2MB, including the overhead associated with the thread based caches used by modern implementations of malloc().
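To make the memory estimate concrete, here is a rough sketch that computes the upper bound on live payload bytes and runs a bounded variant of the stress loop so that it terminates. The names kMaxThreads, kBlockSize, peak_payload_bytes and run_bounded_stress are hypothetical, and the iteration count is an arbitrary parameter; a short bounded run is not expected to reproduce a crash that takes minutes to appear, and the function does not measure the real memory usage of the process.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <thread>
#include <vector>

constexpr std::size_t kMaxThreads = 100;  // MAX_THREADS in the question
constexpr std::size_t kBlockSize = 4096;  // size passed to malloc()

// Each thread holds at most one block at a time, so the live payload is
// bounded by 100 * 4096 = 409600 bytes (400 KiB), allocator overhead excluded.
constexpr std::size_t peak_payload_bytes() {
    return kMaxThreads * kBlockSize;
}

// Hypothetical bounded variant of task(): every thread runs a fixed number
// of malloc()/free() rounds. Returns the number of failed allocations.
std::size_t run_bounded_stress(std::size_t threads, std::size_t iterations) {
    std::atomic<std::size_t> failures{0};
    std::vector<std::thread> pool;
    pool.reserve(threads);
    for (std::size_t t = 0; t < threads; ++t) {
        pool.emplace_back([&failures, iterations] {
            for (std::size_t i = 0; i < iterations; ++i) {
                void *p = std::malloc(kBlockSize);
                if (p == nullptr) {
                    failures.fetch_add(1);
                    continue;
                }
                // Volatile write so the allocation is not optimized away.
                static_cast<volatile char *>(p)[0] = 1;
                std::free(p);
            }
        });
    }
    for (auto &th : pool) {
        th.join();
    }
    return failures.load();
}
```

Calling run_bounded_stress(kMaxThreads, n) for increasing n while watching the process in a tool such as Task Manager is one way to check whether memory usage stays near the 400 KiB payload bound or keeps growing.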