Why is make_shared slower than shared_ptr<T>(new T)?

Question

Every article I have read says that make_shared is more efficient than shared_ptr<T>(new T) because it performs one memory allocation instead of two. But I tried this code:

#include <cstddef>
#include <cstdio>
#include <ctime>

#include <memory>
#include <vector>

// Number of shared pointers created (and destroyed) per measurement.
static const size_t N = 1L << 25;

int main(void) {

    // Variant 1: separate allocations for the object and the control block.
    clock_t start = clock();
    for ( size_t rcx = 0; rcx < N; rcx++ ) {
        auto tmp = std::shared_ptr<std::vector<size_t>>( new std::vector<size_t>( 1024 ) );
    }
    clock_t end = clock();
    printf("shared_ptr with new: %lf\n", ((double)end - start) / CLOCKS_PER_SEC);

    // Variant 2: make_shared, which can combine object and control block.
    start = clock();
    for ( size_t rcx = 0; rcx < N; rcx++ ) {
        auto tmp = std::make_shared<std::vector<size_t>>( 1024 );
    }
    end = clock();
    printf("make_shared: %lf\n", ((double)end - start) / CLOCKS_PER_SEC);

    return 0;
}

I compiled it with:

g++ --std=c++14 -O2 test.cpp -o test

and got this result:

shared_ptr with new: 10.502945

make_shared: 18.581738

Same for boost::shared_ptr:

shared_ptr with new: 10.778537

make_shared: 18.962444

This question has an answer saying that LLVM's libc++ is broken, but I use libstdc++ from GNU. So why is make_shared slower?

P.S. With -O3 optimization I got this result:

shared_ptr with new: 5.482464

make_shared: 4.249722

The same holds for boost::shared_ptr.

I ran your code, and make_shared was faster with both -O2 and -O3 on a Linux server. What are your architecture, OS, and versions of GCC, libstdc++, and glibc? — Daniel Langr
With msvc make_shared is also faster. I get shared_ptr with new: 12.624000 make_shared: 10.969000 release/x64/cl version 19.23.28106.4 --- with release/x86 it's even faster: shared_ptr with new: 10.610000 make_shared: 7.620000 — churill
shared_ptr with new is slower here: quick-bench.com/Ih7HfLwsYmhJpBW-Lu8VDrDO8Y4 — S.M.
Have you tried letting the make_shared loop run before the new loop? — xskxzr

J W · Accepted Answer · 2019-11-26T08:22:49

Running a program on one computer and measuring its execution time tells you nothing about the program's performance in general, only about its performance on your device under the conditions at that moment. The result depends on, for example, your OS, compiler, library versions, other programs running on your device, and even your user name.
Since your program accesses main memory, it is possible that in your particular case the program happens to be laid out so that memory access is faster. But the "performance" could look totally different if you terminate or start other software, change your user, or the OS reorganizes main memory.
So if you want credible data, you should at least run the program on different devices and under different conditions. I would also recommend taking a look at profiler software.
