1
votes

I wish to create one vector of data points with a mean of 50 and a standard deviation of 1. Then, I wish to create a second vector of data points again with a mean of 50 and a standard deviation of 1, and with a correlation of 0.3 with the first vector. The number of data points doesn't really matter but ideally I would have 100.

The method described in "Generating two correlated random vectors" does not answer my question because, due to random sampling, the sample means and SDs deviate too much from the desired values.


2 Answers

1
votes

I worked out a way, though it is ugly. I would still welcome an answer that details a more elegant method to get what I want.

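z = 0;

while z < 1
    mu = 50;
    sigma = 1;
    M = mu + sigma*randn(100,2);   % 100 draws per column, mean 50, SD 1
    R = [1 0.3; 0.3 1];            % target correlation matrix
    L = chol(R);                   % upper-triangular factor, L'*L = R
    M = M*L;                       % correlate the columns (mu is already added, so the mean of column 2 shifts)
    x = M(:,1);
    y = M(:,2);
    r = corr(x,y);                 % sample correlation, computed once
    % accept the draw only if the sample correlation and SDs are close to target
    if (r < 0.301 && r > 0.299) && (std(x) < 1.01 && std(x) > 0.99) && (std(y) < 1.01 && std(y) > 0.99)
        z = 1;
    end
end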

I then calculated the mean of vector y and how much higher than 50 it was, and subtracted that difference from every element of y so that its mean was reduced to 50.
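One possible way to avoid the retry loop, offered as a sketch rather than a definitive method: center and whiten the raw draws so that their sample covariance is exactly the identity, then apply the Cholesky factor of the target correlation matrix and add the mean last. The names n, rho and Z are illustrative, and the sketch assumes both vectors share the same sigma. It uses only base MATLAB functions.

```matlab
n = 100; mu = 50; sigma = 1; rho = 0.3;   % illustrative names for the targets
R = [1 rho; rho 1];                        % target correlation matrix

Z = randn(n, 2);                           % raw standard-normal draws
Z = Z - repmat(mean(Z, 1), n, 1);          % center: column sample means are exactly 0
Z = Z / chol(cov(Z));                      % whiten: sample covariance becomes the identity

M = mu + sigma * (Z * chol(R));            % impose correlation, then scale and shift
x = M(:,1);
y = M(:,2);

c = corrcoef(x, y);                        % c(1,2) is the sample correlation
```

With this construction the sample means, SDs and correlation match the targets up to floating-point error. The trade-off is that the sample moments are forced, so the points are no longer independent draws in the strict sense.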

0
votes

You can create both vectors together; I don't understand why you define them separately. This is the concept of a multivariate distribution (just to be sure we are using the same jargon). Anyway, I think you are almost there already. Here is what I consider the simplest way to do it:

Method 1:

Use the MATLAB function mvnrnd. (Remember that mvnrnd takes the covariance matrix, which can be calculated from the correlation and the variances.)
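A minimal sketch of Method 1, assuming the Statistics and Machine Learning Toolbox is available for mvnrnd. The names sd and Sigma are illustrative. The covariance matrix is built from the correlation matrix by scaling with the standard deviations.

```matlab
mu    = [50 50];                    % means of the two vectors
sd    = [1 1];                      % standard deviations (illustrative name)
R     = [1 0.3; 0.3 1];             % target correlation matrix
Sigma = diag(sd) * R * diag(sd);    % covariance matrix from correlation and SDs

M = mvnrnd(mu, Sigma, 100);         % 100 rows, one column per vector
x = M(:,1);
y = M(:,2);
```

Note that the sample mean, SD and correlation of x and y will only approximate the targets, because the values are random draws.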

Method 2:

I am not entirely sure, but I think this is very close to what you are already doing. My only doubt concerns the condition if (corr(x,y) < 0.301 & corr(x,y) > 0.299) & (std(x) < 1.01 & std(x) > 0.99) & (std(y) < 1.01 & std(y) > 0.99): I don't understand why you need it. See the section "Drawing values from the distribution" in the Wikipedia article on the multivariate normal distribution.
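A rough sketch of Method 2 in base MATLAB, following the recipe of transforming standard-normal draws with a factor of the covariance matrix. The names U, Z and n are illustrative. The mean is added after the transform, so it is not altered by the multiplication.

```matlab
mu    = [50 50];
Sigma = [1 0.3; 0.3 1];         % covariance matrix (equals the correlation matrix when SD = 1)
n     = 100;

U = chol(Sigma);                % upper triangular, U'*U = Sigma
Z = randn(n, 2);                % independent standard-normal draws
M = repmat(mu, n, 1) + Z * U;   % rows are draws with mean mu and covariance Sigma
x = M(:,1);
y = M(:,2);
```

As with mvnrnd, the sample statistics of x and y match the targets only approximately.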