1
votes

Below is a code snippet from the actual code base. Please assume that host, port and ioc are all available and initialized.

// Connection establisher class
class CSSLConn: public std::enable_shared_from_this<CSSLConn>
{
   :  
   : 
};

// Class that maintains the thread pool
class CHttpClient
{
    // Vector holding the connection objects
    std::vector<std::shared_ptr<CSSLConn>> m_sslConnObj{};

    // Called from main to create the connections
    void Initialize(int nThreads)
    {
        for (int x = 0; x < nThreads; ++x)
        {
            worker_threads_.create_thread(boost::bind(&CHttpClient::WorkerThread, this));
        }

        // Give the connections time to get established
        std::this_thread::sleep_for(std::chrono::milliseconds(200));
    }

    // Thread function run by each thread in the pool
    void WorkerThread()
    {
        auto client = std::make_shared<CSSLConn>(ioc, m_ctx, sHost, sPort);
        client->connect(sHost.c_str(), sPort.c_str());
        m_sslConnObj.push_back(client);
        ioc->run();
    }
};

About 1 run in 20, I hit heap corruption while the thread pool is being created, mostly in the second thread. I suspect std::vector, because it allocates memory in push_back, but I am not sure it is the real culprit. I keep the vector of connections so that I can disconnect them all at the end.

Sometimes the crash happens in the "boost::system::error_code background_getaddrinfo" function, but all the values look good there.

I am not very familiar with Boost. If anyone knows how to debug this better, or how I can see what is going on inside Boost, that would help me a lot.

1
Run with valgrind, helgrind, ASAN, UBSAN, TSAN. You probably have data races. That's UB 100% of the time, not just the cases where you see it. - sehe

1 Answer

2
votes
  1. I see multiple shared objects being accessed from multiple threads without any synchronization in place:

    void WorkerThread() {
        auto client = std::make_shared<CSSLConn>(ioc, m_ctx, sHost, sPort);
        client->connect(sHost.c_str(), sPort.c_str());
        m_sslConnObj.push_back(client);
        ioc->run();
    }
    

    Here m_sslConnObj is modified without any locking. So unless that's somehow a thread-safe container type, that's already Undefined Behaviour.

  2. The same goes for other shared state such as m_ctx, which every worker thread passes to its own CSSLConn without any locking.

  3. It is also a code smell that you are posting async work from separate threads, as if that meant anything. The io_context is run from all the threads, so you might as well just run the threads and create all clients from the main thread. They will still be serviced from all the worker threads.

  4. Just noticed, it's also weird that CSSLConn takes sHost and sPort as constructor parameters, but then when you call connect() on them you pass them /again/ (but differently).
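
To make points 1 and 3 concrete, here is a minimal sketch using only the standard library. It is not the asker's code: FakeConn, ConnRegistry, registerFromWorkers and createOnMainThread are hypothetical names, FakeConn merely stands in for CSSLConn, and the lambda in the second function stands in for ioc->run(). The first function keeps the one-connection-per-worker layout but guards the vector with a mutex; the second fills the vector on the main thread before any worker starts, so the workers only ever read it.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for CSSLConn (assumption: the real class wraps an SSL stream).
struct FakeConn {
    std::string host, port;
    FakeConn(std::string h, std::string p) : host(std::move(h)), port(std::move(p)) {}
};

// Vector of connections whose every access is serialized by a mutex.
class ConnRegistry {
public:
    void add(std::shared_ptr<FakeConn> c) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_conns.push_back(std::move(c));
    }
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_conns.size();
    }
private:
    mutable std::mutex m_mutex;
    std::vector<std::shared_ptr<FakeConn>> m_conns;
};

// Point 1: workers still register their own connection, but under the lock.
std::size_t registerFromWorkers(int nThreads) {
    assert(nThreads > 0);
    ConnRegistry registry;
    std::vector<std::thread> workers;
    for (int i = 0; i < nThreads; ++i) {
        workers.emplace_back([&registry] {
            registry.add(std::make_shared<FakeConn>("example.invalid", "443"));
        });
    }
    for (auto& t : workers) t.join();
    return registry.size();
}

// Point 3: create every connection on the main thread, then start the workers.
// The vector is never modified once the threads exist, so no lock is needed.
std::size_t createOnMainThread(int nThreads) {
    assert(nThreads > 0);
    std::vector<std::shared_ptr<FakeConn>> conns;
    for (int i = 0; i < nThreads; ++i)
        conns.push_back(std::make_shared<FakeConn>("example.invalid", "443"));

    std::atomic<std::size_t> seen{0};
    std::vector<std::thread> workers;
    for (int i = 0; i < nThreads; ++i) {
        // Read-only access to conns; in the real code this would be ioc->run().
        workers.emplace_back([&conns, &seen] { seen += conns.size(); });
    }
    for (auto& t : workers) t.join();
    return seen.load();
}
```

Calling registerFromWorkers(4) should leave four connections in the registry. The second layout is usually the simpler one to reason about, because the only shared mutable state left is whatever the workers themselves touch.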