I have a large data set with two arrays, say `x` and `y`. Each array has over 1 million data points. Is there a simple way to make a scatter plot of only 2000 of these points that is still representative of the entire set?
I'm thinking along the lines of creating another array `r`, with `r = max(x)*rand(2000,1)`, to get a random sample of the `x` array. Is there a way to then find where a value in `r` is equal to, or close to, a value in `x`? The matches wouldn't have to be at the same index, just anywhere in the array. We could then plot the `y` values associated with those matched `x` values against `r`.
I'm just not sure how to code this. Is there a better way than doing this?
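
For illustration, one simpler alternative to matching values would be to sample random indices directly and plot the corresponding `x` and `y` pairs. The following is only a minimal sketch, assuming `x` and `y` have the same length; the placeholder data and the names `n` and `idx` are made up for the example and are not part of my actual data set.

```matlab
% Placeholder data standing in for the real arrays (assumption)
x = rand(1e6, 1);
y = 2*x + 0.1*randn(1e6, 1);

n = 2000;                       % number of points to plot
idx = randperm(numel(x), n);    % n distinct random indices into x and y

% Plot the same randomly chosen subset of both arrays
scatter(x(idx), y(idx), '.');
```

Because each point is equally likely to be picked, dense regions of the data contribute proportionally more points to the subset than sparse ones.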