My question is about finding a much faster alternative to what ismember() does in MATLAB.
Here is my problem:
M [92786253*1] (a: roughly 100M rows)
x [749*1] (b: # of rows can vary from 100 to 10K)
I want to find the rows in b (x) that also exist in a (M), that is, the row indices of a. This operation needs to be repeated about 10M times for different versions of b.
The Normal Approach:
tic
ind1 = ismember(M,x);
toc
Elapsed time is 0.515627 seconds.
The Fast Approach:
tic
n = 1;
ind2 = find(any(all(bsxfun(@eq,reshape(x.',1,n,[]),M),2),3));
toc
Error using bsxfun
Requested 92786253x1x749 (64.7GB) array exceeds maximum array size preference.
Creation of arrays greater than this limit may take a long time and cause MATLAB to become unresponsive.
See array size limit or preference panel for more information.
Error in demo_ismember_fast (line 23)
ind2 = find(any(all(bsxfun(@eq,reshape(x.',1,n,[]),M),2),3))
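As a sanity check on the size in the error message, here is a minimal sketch of the arithmetic, assuming the intermediate array from bsxfun(@eq, ...) is logical (1 byte per element) and that MATLAB reports the size in units of 2^30 bytes. The variable names are only for illustration.

```matlab
% Size of the logical temporary that bsxfun(@eq, ...) tries to allocate.
nM = 92786253;               % rows in M
nx = 749;                    % rows in x
bytes = nM * nx * 1;         % assumption: logical array, 1 byte per element
gib = bytes / 2^30;          % convert to units of 2^30 bytes
fprintf('%.1f GB\n', gib);   % prints 64.7 GB
```

So the temporary grows linearly with the number of rows in x, which is why the same call works for small inputs but not here.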
The second approach is usually 15-20 times faster than the normal one; however, in this case I cannot use it because of the memory limitation. Are there any suggestions on how to speed up this operation? Thanks for sharing your expert opinions with me!
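For reference, one way I can think of to stay under the array size limit is to run the same bsxfun comparison over blocks of x and OR the partial results together. This is only a sketch on small stand-in data: blockSize is a hypothetical tuning knob, and I have not measured whether this is still faster than ismember() at the real sizes.

```matlab
% Small stand-in data (hypothetical); the real M has roughly 100M rows.
M = [5; 3; 9; 3; 7; 1];
x = [3; 7; 8];

blockSize = 2;                        % hypothetical: values of x compared per pass
mask = false(size(M, 1), 1);          % accumulated membership mask for rows of M
for k = 1:blockSize:numel(x)
    xb = x(k:min(k + blockSize - 1, numel(x)));
    % The temporary is numel(M)-by-numel(xb) instead of numel(M)-by-numel(x).
    mask = mask | any(bsxfun(@eq, M, xb.'), 2);
end
ind2 = find(mask);                    % row indices of M whose value occurs in x
```

On this toy input, ind2 matches find(ismember(M, x)). The peak temporary is numel(M)*blockSize bytes, so blockSize trades memory against the number of passes over M.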
ismember(). - YAS
M or x sorted? - Divakar