When using xcorr in MATLAB to cross-correlate two related data sets, everything works as expected: I see a correlation peak and the reported lag is correct. However, when I use xcorr to cross-correlate unrelated data sets where both contain one cluster of "spikes", I see a correlation peak and the reported lag is the distance between the two spikes.
In this image:
x is a random data series. y is also a random data series. Both x and y have a block of 30 consecutive random peaks inserted into the series. In theory, there should be no correlation between the two data sets since they are very different. However, the third plot shows a very strong correlation between them. The code used to generate this figure is at the bottom of this post.
I've tried to filter the spikes using a few different mechanisms (rolling RMS power, etc.) before performing the xcorr. This has worked in some cases but not all. I feel like I need a different approach to the problem, maybe an alternative to xcorr. I do understand why x and y cross-correlate using xcorr. Is there another cross-correlation tool that I can use? Note that x and y will never be exactly the same; they will only ever be approximately the same, but in normal operation it's not the spikes that should make them correlate.
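For illustration, one kind of spike pre-filter can be sketched as follows. This is not necessarily one of the mechanisms already tried; it is a minimal sketch that clamps each series to a band around its median, with the band width taken from the median absolute deviation (MAD). The helper names robust_scale and clip_spikes and the threshold k = 3 are assumptions made for this example, not MATLAB built-ins or recommended values.

```matlab
% Sketch only: robust_scale and clip_spikes are hypothetical helpers.
% Robust scale estimate: scaled median absolute deviation (MAD).
robust_scale = @(s) 1.4826 * median(abs(s - median(s)));
% Clamp every sample to median +/- k robust standard deviations.
clip_spikes = @(s, k) min(max(s, median(s) - k * robust_scale(s)), ...
                          median(s) + k * robust_scale(s));

% Same kind of test data as in the question: uniform noise plus one spike block each.
x = rand(1, 3000) - 0.5;
y = rand(1, 3000) - 0.5;
x(460:489) = (rand(1, 30) - 0.5) * 12;
y(120:149) = (rand(1, 30) - 0.5) * 12;

k = 3;  % assumed threshold; needs tuning for real data
x_clipped = clip_spikes(x, k);
y_clipped = clip_spikes(y, k);

[c_raw, lags] = xcorr(x, y);
c_clipped = xcorr(x_clipped, y_clipped);

% Compare the largest correlation magnitude before and after clipping.
fprintf('max |c| raw: %.1f, clipped: %.1f\n', max(abs(c_raw)), max(abs(c_clipped)));
```

Clamping only limits the spike amplitude, so some residue remains above the noise floor; whether that is enough probably depends on how large the genuine correlation between x and y is in normal operation.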
Any suggestions on how to tell if x and y correlate while also ignoring the "spikes"?
Here is my example code:
x = rand(1, 3000);
x = x - 0.5;
y = rand(1, 3000);
y = y - 0.5;
% insert the impulses into the data
impulse_width = 30;
impulse_max_height = 6;
x_impulse_start = 460;
y_impulse_start = 120;
rand_insert_x = rand(1, impulse_width);
rand_insert_x = (rand_insert_x - 0.5) * 2 * impulse_max_height;
rand_insert_y = rand(1, impulse_width);
rand_insert_y = (rand_insert_y - 0.5) * 2 * impulse_max_height;
x(1,x_impulse_start:x_impulse_start + impulse_width - 1) = rand_insert_x;
y(1,y_impulse_start:y_impulse_start + impulse_width - 1) = rand_insert_y;
subplot(3, 1, 1);
plot(x);
ylim([-impulse_max_height impulse_max_height]);
title('random data series: x');
subplot(3, 1, 2);
plot(y);
ylim([-impulse_max_height impulse_max_height]);
title('random data series: y');
[c, l] = xcorr(x, y);
subplot(3, 1, 3);
plot(l, c);
title('correlation using xcorr');
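The position of the peak in the third plot can be traced to the offset between the two impulse blocks. The following is a simplified sketch, not the code above: it assumes flat, noise-free pulses of height 1 instead of random spikes, so that the peak location is exact. The variable names x_flat, y_flat and peak_lag are introduced only for this example.

```matlab
% Simplified, deterministic variant: two flat pulses and no background noise.
n = 3000;
impulse_width = 30;
x_impulse_start = 460;
y_impulse_start = 120;

x_flat = zeros(1, n);
y_flat = zeros(1, n);
x_flat(x_impulse_start:x_impulse_start + impulse_width - 1) = 1;
y_flat(y_impulse_start:y_impulse_start + impulse_width - 1) = 1;

[c, lags] = xcorr(x_flat, y_flat);

% The maximum occurs at the lag where the two pulses overlap completely.
[~, idx] = max(c);
peak_lag = lags(idx);
fprintf('peak lag: %d, start offset: %d\n', peak_lag, x_impulse_start - y_impulse_start);
```

With random-signed spikes as in the question, the largest value tends to fall within roughly impulse_width lags of this offset rather than exactly on it, because partial overlaps of the two blocks can sum to more than the full overlap.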
xcorr is doing a good job, as it is aligning those quite similar signals together! – Ander Biguri