4
votes

I have a C++ application that sends data to a Python function over shared memory. This works great for simple types such as doubles and floats, using ctypes on the Python side. Now I need to add a cv::Mat to the transfer.

My code currently is:

//h

#include <iostream>
#include <opencv2/core.hpp>
#include <opencv2/highgui.hpp>


struct TransferData
{
   double score;
   float other;  
   int num;
   int w;
   int h;
   int channels;
   uchar* data;

};

#define C_OFF 1000
void fill(TransferData* data, int run, uchar* frame, int w, int h, int channels)
{
   data->score = C_OFF + 1.0;
   data->other = C_OFF + 2.0;
   data->num = C_OFF + 3;   
   data->w = w;
   data->h = h;
   data->channels = channels;
   data->data = frame;
}

//.cpp

namespace py = pybind11;

using namespace boost::interprocess;

int main()
{

    //python setup
    Py_SetProgramName(L"PYTHON");
    py::scoped_interpreter guard{};
    py::module py_test = py::module::import("Transfer_py");


    // Create Data
    windows_shared_memory shmem(create_only, "TransferDataSHMEM",
        read_write, sizeof(TransferData));

    mapped_region region(shmem, read_write);
    std::memset(region.get_address(), 0, sizeof(TransferData));

    TransferData* data = reinterpret_cast<TransferData*>(region.get_address());


    //loop
    for (int i = 0; i < 10; i++)
    {
        int64 t0 = cv::getTickCount();

        std::cout << "C++ Program - Filling Data" << std::endl;

        cv::Mat frame = cv::imread("input.jpg");

        fill(data, i, frame.data, frame.cols, frame.rows, frame.channels());

        //run the python function   

        //process
        py::object result = py_test.attr("datathrough")();


        int64 t1 = cv::getTickCount();
        double secs = (t1 - t0) / cv::getTickFrequency();

        std::cout << "took " << secs * 1000 << " ms" << std::endl;
    }

    std::cin.get();
}

// Python - transfer data class (TransferData.py)

import ctypes


class TransferData(ctypes.Structure):
    # Field order and types must match the C++ TransferData struct layout
    _fields_ = [
        ('score', ctypes.c_double),
        ('other', ctypes.c_float),
        ('num', ctypes.c_int),
        ('w', ctypes.c_int),
        ('h', ctypes.c_int),
        ('channels', ctypes.c_int),
        ('frame', ctypes.c_void_p)
    ]


PY_OFF = 2000


def fill(data):
    data.score = PY_OFF + 1.0
    data.other = PY_OFF + 2.0
    data.num = PY_OFF + 3

// Main Python function (Transfer_py.py)

import TransferData
import sys
import mmap
import ctypes




def datathrough():
    shmem = mmap.mmap(-1, ctypes.sizeof(TransferData.TransferData), "TransferDataSHMEM")
    data = TransferData.TransferData.from_buffer(shmem)
    print('Python Program - Getting Data')   
    print('Python Program - Filling Data')
    TransferData.fill(data)

How can I add the cv::Mat frame data on the Python side? I am sending it as a uchar* from C++, and as I understand it, I need it as a numpy array to end up with an OpenCV image in Python. What is the correct approach to go from 'width, height, channels, frameData' to an OpenCV image (numpy array) on the Python side?

I am using shared memory because speed is a factor; I have tested the Python API approach, and it is much too slow for my needs.

Given that it's all in a single process, shared memory seems rather redundant. The OpenCV Python bindings use the Python API to map between cv::Mat on the C++ side and numpy arrays on the Python side -- mostly just bookkeeping, with the underlying buffer being shared. I'm curious what your Python API approach looked like -- more likely an implementation issue caused it to perform poorly. – Dan Mašek
Thanks for your reply. Is it possible to pass cv::Mat data at high speed between C++ and Python? I could not find an example anywhere. – anti
Give me some time to figure out a pybind11-based implementation. | google.com/search?q=cv::Mat+to+numpy+site:stackoverflow.com and there are even a couple of GitHub repos with converters. The implementation of the OpenCV Python bindings has the code too, it's just a bit hard to grok. – Dan Mašek
Thank you! That would be great. – anti
Little proof of concept: pastebin.com/N312Twqz | Still needs some more research, cleanup and generalization before I'll write an answer, though. – Dan Mašek

1 Answer

4
votes

The general idea (as used in the OpenCV Python bindings) is to create a numpy ndarray that shares its data buffer with the Mat object, and pass that to the Python function.

Note: At this point, I'll limit the example to continuous matrices only.
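
If a non-continuous Mat (e.g. an ROI view) could ever reach the conversion, one simple way to handle it is to fall back to a continuous deep copy first, at the cost of that copy. A minimal sketch, using a hypothetical helper name:

#include <opencv2/core.hpp>

// Return the Mat itself (a cheap, reference-counted shallow copy) if its
// data is already continuous, otherwise a continuous deep copy.
cv::Mat ensure_continuous(const cv::Mat& m)
{
    return m.isContinuous() ? m : m.clone();
}

Passing ensure_continuous(m) into the conversion below lifts the restriction, trading zero-copy for correctness on such inputs.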

We can take advantage of the pybind11::array class.

  • We need to determine the appropriate dtype for the numpy array to use. This is a simple 1-to-1 mapping, which we can do using a switch:

    py::dtype determine_np_dtype(int depth)
    {
        switch (depth) {
        case CV_8U: return py::dtype::of<uint8_t>();
        case CV_8S: return py::dtype::of<int8_t>();
        case CV_16U: return py::dtype::of<uint16_t>();
        case CV_16S: return py::dtype::of<int16_t>();
        case CV_32S: return py::dtype::of<int32_t>();
        case CV_32F: return py::dtype::of<float>();
        case CV_64F: return py::dtype::of<double>();
        default:
            throw std::invalid_argument("Unsupported data type.");
        }
    }
    
  • Determine the shape for the numpy array. To make this behave similarly to OpenCV, let's have it map 1-channel Mats to 2D numpy arrays, and multi-channel Mats to 3D numpy arrays.

    std::vector<std::size_t> determine_shape(cv::Mat& m)
    {
        if (m.channels() == 1) {
            return {
                static_cast<size_t>(m.rows)
                , static_cast<size_t>(m.cols)
            };
        }
    
        return {
            static_cast<size_t>(m.rows)
            , static_cast<size_t>(m.cols)
            , static_cast<size_t>(m.channels())
        };
    }
    
  • Provide means of extending the shared buffer's lifetime to the lifetime of the numpy array. We can create a pybind11::capsule around a shallow copy of the source Mat -- due to the way the object is implemented, this effectively increases its reference count for the required amount of time.

    py::capsule make_capsule(cv::Mat& m)
    {
        return py::capsule(new cv::Mat(m)
            , [](void *v) { delete reinterpret_cast<cv::Mat*>(v); }
            );
    }
    

Now, we can perform the conversion.

py::array mat_to_nparray(cv::Mat& m)
{
    if (!m.isContinuous()) {
        throw std::invalid_argument("Only continuous Mats supported.");
    }

    return py::array(determine_np_dtype(m.depth())
        , determine_shape(m)
        , m.data
        , make_capsule(m));
}
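
A quick way to convince yourself that the resulting array wraps the Mat's pixel buffer rather than copying it is to compare the data pointers. A minimal check, assuming mat_to_nparray from above is in scope and the interpreter is already running:

#include <cassert>

cv::Mat m(4, 4, CV_8UC3, cv::Scalar(1, 2, 3));
py::array a = mat_to_nparray(m);
// Both point at the same underlying buffer -- no pixels were copied.
assert(a.data() == static_cast<const void*>(m.data));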

Let's assume we have a Python function like

def foo(arr):
    print(arr.shape)

captured in a pybind11 object fun. Then, to call this function from C++ with a Mat as the source, we'd do something like this:

cv::Mat img; // Initialize this somehow

auto result = fun(mat_to_nparray(img));

Sample Program

#include <pybind11/pybind11.h>
#include <pybind11/embed.h>
#include <pybind11/numpy.h>
#include <pybind11/stl.h>

#include <opencv2/opencv.hpp>

#include <iostream>

namespace py = pybind11;

// The 4 functions from above go here...

int main()
{
    // Start the interpreter and keep it alive
    py::scoped_interpreter guard{};

    try {
        auto locals = py::dict{};

        py::exec(R"(
            import numpy as np

            def test_cpp_to_py(arr):
                return (arr[0,0,0], 2.0, 30)
        )");

        auto test_cpp_to_py = py::globals()["test_cpp_to_py"];


        for (int i = 0; i < 10; i++) {
            int64 t0 = cv::getTickCount();

            cv::Mat img(cv::Mat::zeros(1024, 1024, CV_8UC3) + cv::Scalar(1, 1, 1));

            int64 t1 = cv::getTickCount();

            auto result = test_cpp_to_py(mat_to_nparray(img));

            int64 t2 = cv::getTickCount();

            double delta0 = (t1 - t0) / cv::getTickFrequency() * 1000;
            double delta1 = (t2 - t1) / cv::getTickFrequency() * 1000;

            std::cout << "* " << delta0 << " ms | " << delta1 << " ms" << std::endl;
        }        
    } catch (py::error_already_set& e) {
        std::cerr << e.what() << "\n";
    }

    return 0;
}

Console Output

* 4.56413 ms | 0.225657 ms
* 3.95923 ms | 0.0736127 ms
* 3.80335 ms | 0.0438603 ms
* 3.99262 ms | 0.0577587 ms
* 3.82262 ms | 0.0572 ms
* 3.72373 ms | 0.0394603 ms
* 3.74014 ms | 0.0405079 ms
* 3.80621 ms | 0.054546 ms
* 3.72177 ms | 0.0386222 ms
* 3.70683 ms | 0.0373651 ms