1
votes

I am running a very large meta-simulation in which I sweep over two hyperparameters (let's say x and y), and for each pair of hyperparameters (x_i, y_j) I run a modest-sized subsimulation. Thus:

for x=1:I
    for y=1:J
        subsimulation(x,y)
    end
end

However, about 50% of each subsimulation's data is common to every other subsimulation, i.e. subsimulation(x_1,y_1).commondata=subsimulation(x_2,y_2).commondata.

This is very relevant, since so far the total simulation results file size is ~10 GB! Obviously, I want to save the common subsimulation data only once to save space. However, the obvious solution, saving it in one place, would break my plotting function, since it directly calls subsimulation(x,y).commondata.

I was wondering whether I could do something like subsimulation(x,y).commondata=% pointer to 1 location in memory %

If that can't work, what about this less elegant solution:

subsimulation(x,y).commondata='variable name' %string

and then adding

if ~isstruct(subsimulation(x,y).commondata)
    % The field holds a variable name, so resolve it to the actual data
    subsimulation(x,y).commondata = eval(subsimulation(x,y).commondata);
end

Which solution do you guys think is best?

Thanks DankMasterDan

2
As long as you do not modify subsimulation(x,y).commondata, the data is not copied. - Oleg

2 Answers

2
votes

You could do this fairly easily by defining a handle class. See also the documentation.

An example:

classdef SimulationCommonData < handle
    properties
        someData
    end

    methods
        function this = SimulationCommonData(someData)
            % Constructor
            this.someData = someData;
        end
    end
end

Then use it like this:

commonData = SimulationCommonData(something);
subsimulation(x, y).commondata = commonData;
subsimulation(x, y+1).commondata = commonData; 
% These now point to the same reference (handle)
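
To see the reference behaviour without creating a class file first, here is a minimal sketch that uses containers.Map, which is a built-in handle class, in place of SimulationCommonData. The key name 'grid', the 2-by-2 size and the values are hypothetical stand-ins, not part of the original setup.

```matlab
% containers.Map is a handle class, so every copy of the variable
% refers to the same underlying object.
shared = containers.Map();
shared('grid') = linspace(0, 1, 5);   % hypothetical common data

% Hypothetical 2-by-2 sweep; every element gets the same handle.
subsimulation = struct('commondata', cell(2, 2));
for x = 1:2
    for y = 1:2
        subsimulation(x, y).commondata = shared;
    end
end

% Change the data through one element...
first = subsimulation(1, 1).commondata;
first('grid') = zeros(1, 5);

% ...and read it back through another element.
last = subsimulation(2, 2).commondata;
disp(last('grid'));
```

Because all elements hold the same handle, the change made through subsimulation(1,1) also shows up in subsimulation(2,2). That is the main trade-off compared with plain value data: an accidental write affects every subsimulation.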
1
votes

As per my comment, as long as you do not modify the common data, you can pass it as a third input and the array is still not copied in memory on each iteration (a very good read is "Internal Matlab memory optimizations"). This image will clarify:

[Image: plot of memory usage while the loop runs]

As you can see, the first jump in memory is due to the creation of common and the second one to the allocation of the output c. If the data were copied on each iteration, you would have seen many more memory fluctuations. For instance, a third jump, then a decrease, then back up again and so on...

The code follows (I added a pause between iterations to make it clearer that no big jumps occur during the loop):

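function out = foo(a,b,common)
% common is only read here, never modified, so no copy is made
out = a+b+common;
end

% common (the large shared array) must already exist in the workspace
for ii = 1:10
    c = foo(ii,ii+1,common);
    pause(2);  % pause so each iteration is visible in the memory trace
end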
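
The same lazy-copy rule applies to struct fields, which is the situation in the question. A minimal sketch, assuming a small array as a stand-in for the large common data (the size and values are hypothetical):

```matlab
common = ones(1, 5);        % hypothetical stand-in for the large shared array

s = struct('commondata', cell(1, 2));
s(1).commondata = common;   % no data copied yet: the field shares the array
s(2).commondata = common;   % still shared

s(2).commondata(1) = 99;    % the write triggers a private copy for s(2) only

disp(common(1));            % original is untouched
disp(s(1).commondata(1));   % s(1) is untouched as well
disp(s(2).commondata(1));   % only s(2) sees the change
```

So as long as the fields are only read, assigning the same array to many subsimulations costs no extra memory. Unlike a handle, a write to one field never leaks into the others.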