Matlab: Comparing two vectors with different length and different values?

Question

Let's say I have two vectors A and B with different lengths (length(A) is not equal to length(B)), and the values in A are not the same as those in B. I want to compare each value of B with the values of A, where "compare" means checking whether B(i) is almost equal to a value in A(1:end), for example B(i)-Tolerance < A(i) < B(i)+Tolerance.

How can I do this without a for loop, since the data is huge?

I know ismember(F), intersect, repmat and find, but none of those functions really helps me.

So you're only comparing A(i) with B(i)? Why not post the existing for loop code and people might be able to suggest improvement from there. — weston
Here is a solution for ismember with a tolerance. It is about twice as slow as the solution posted by @ondav but does handle the tolerance more accurately. mathworks.com/matlabcentral/fileexchange/23294-ismemberf/… — Dennis Jaheruddin

ondrejdee · Accepted Answer · 2013-07-12T11:29:47

You may try a solution along these lines:

tol = 0.1;
N = 1000000;

a = randn(1, N)*1000;      % random test data
b = a + tol*rand(1, N);    % each b(i) lies within tol of a(i)

% Assign every value to a bin of width tol.
a_bin = floor(a/tol);
b_bin = floor(b/tol);

% b(i) counts as matched if its bin, or a neighbouring bin, holds an element of a.
result = ismember(b_bin, a_bin) | ...
         ismember(b_bin, a_bin-1) | ...
         ismember(b_bin, a_bin+1);

find(result==0) % should be an empty matrix

The idea is to discretize a and b into bins of width tol. Then, for each element of b, you ask whether it falls in the same bin as any element of a, or in the bin immediately to the left or right of one.
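As a small illustration of the binning, here is the same test on a handful of values. The numbers are made up for this example and are not from the question.

```matlab
% Toy data, chosen so that no value sits on a bin boundary.
tol = 0.1;
a = [0.12 0.57 3.14];
b = [0.18 0.95 3.21 0.49];

a_bin = floor(a/tol);   % bins 1, 5, 31
b_bin = floor(b/tol);   % bins 1, 9, 32, 4

% An element of b matches if its bin equals, or is adjacent to, a bin of a.
result = ismember(b_bin, a_bin) | ...
         ismember(b_bin, a_bin-1) | ...
         ismember(b_bin, a_bin+1);

disp(result)            % 1 0 1 1
```

Only 0.95 is rejected: its bin (9) is neither a bin of a nor adjacent to one, while 0.18, 3.21 and 0.49 each land in or next to a bin occupied by an element of a.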

Advantages: I believe ismember is clever inside, first sorting the elements of a and then performing a sublinear (log(N)) search for each element of b. This is unlike approaches that explicitly compute the difference between each element of b and every element of a, where the cost per element of b is linear in the number of elements in a.

Comparison: for N=100000 this runs in 0.04 s on my machine, compared to 20 s using linear search (timed using Alan's nice and concise tf = arrayfun(@(bi) any(abs(a - bi) < tol), b); solution).
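A sketch of how the binned lookup could be cross-checked against the brute-force version on a smaller input. The size, the seed, the data scale and the names no_misses and n_extra are choices made for this example; no timings are claimed here.

```matlab
tol = 0.1;
N = 2000;                 % small enough for the brute-force pass
rng(0);                   % fixed seed, arbitrary choice
a = randn(1, N)*10;
b = randn(1, N)*10;       % independent of a, so some elements will not match

% Brute force: scan all of a for every element of b.
tf = arrayfun(@(bi) any(abs(a - bi) < tol), b);

% Binned lookup, as in the answer.
a_bin = floor(a/tol);
b_bin = floor(b/tol);
result = ismember(b_bin, a_bin) | ...
         ismember(b_bin, a_bin-1) | ...
         ismember(b_bin, a_bin+1);

no_misses = all(result(tf));    % every brute-force match is also a binned match
n_extra   = nnz(result & ~tf);  % binned matches that are farther than tol apart
```

In exact arithmetic the binned test cannot miss a pair that is closer than tol, so no_misses is expected to be true; n_extra counts the additional matches caused by the looser effective tolerance.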

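Disadvantages: the effective tolerance is not exactly tol. Every b within tol of some a is found, but a b that is farther away (up to just under 2*tol, depending on where the two values fall within their bins) can also be reported as a match. Whether you can live with that depends on your task (if the only concern is floating point comparison, you can).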
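A minimal example of this looseness, with values picked by hand for illustration: the two numbers are more than tol apart, yet they land in adjacent bins and are reported as a match.

```matlab
tol = 0.1;
a = 0.01;                 % bin 0
b = 0.15;                 % bin 1, although it is 0.14 away from a

a_bin = floor(a/tol);
b_bin = floor(b/tol);

binned = ismember(b_bin, a_bin) | ...
         ismember(b_bin, a_bin-1) | ...
         ismember(b_bin, a_bin+1);   % true: adjacent bins
exact  = any(abs(a - b) < tol);      % false: 0.14 is not below tol
```

If the exact tolerance matters, one option is to treat the binned result as a cheap pre-filter and run an exact comparison only on the elements it flags.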

Note: whether this is a viable approach depends on the ranges of a and b and on the value of tol. If a and b can be very big and tol is very small, a_bin and b_bin will not be able to resolve individual bins (then you would have to work with integral types, again checking carefully that their ranges suffice). The solution with loops is the safer one, but if you really need speed, you can invest in optimizing the presented idea. Another option, of course, would be to write a mex extension.
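One way to check up front whether double precision can still resolve the bins is to compare the largest bin index against flintmax, which in recent MATLAB releases returns 2^53, the limit below which consecutive integers are exactly representable as doubles. This is only a sketch with made-up data, and the helper name bins_resolvable is hypothetical.

```matlab
% Hypothetical helper: true if the largest bin index, plus one for the
% neighbouring bin, is still an exactly representable integer in a double.
bins_resolvable = @(a, b, tol) max(abs([a(:); b(:)]))/tol + 1 < flintmax;

ok_small = bins_resolvable([1e3 -2.5e4], [7 8 9], 1e-6);  % largest index about 2.5e10
ok_large = bins_resolvable(1e12, 1e12, 1e-6);             % largest index about 1e18
```

Here ok_small is true and ok_large is false. When the check fails, the input values themselves are typically too coarse to be compared at that tolerance, so falling back to a different method is the safer choice.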
