How to select the values greater than the mean in an array?

Question

I want to apply feature selection to a dataset (lung.mat). After loading the data, I computed the mean Jaccard distance between each feature and the others. Then I sorted the distances in descending order into B1, selected, for example, the first 25 features, and saved the resulting matrix in databs1. Now I want to select the features whose distance values are greater than the mean of the array B1.

close all;
clc
load lung.mat
data=lung;
[n,m]=size(data);
for i=1:m-1
    for j=i+1:m
        t1(i,j)=fjaccard(data(:,i),data(:,j));
        b1=sum(t1)/(m-1);
    end
end
[B1,indB1]=sort(b1,'descend');
databs1=data(:,indB1(1:25));
databs1=[databs1,data(:,m)]; %jaccard
save('databs1.mat');

I would be grateful for advice on how to do this for B1: select the values of B1 that are greater than the mean of B1, that is, cut off the values smaller than the mean. I used this line:

B1(B1>mean(B1(:)))

After running it, B1 still has as many columns as the full dataset. For example, lung.mat has 57 features, and after this line B1 still has 57 columns. I expected this line to cut B1 down to the features whose values are greater than the mean of B1.

b1 is overwritten every loop iteration, so only the result of the last loop iteration is used. You should move that calculation out of the loops. You should also pre-allocate t1 before the loops: t1 = zeros(m-1,m). This will speed up your computation significantly. — Cris Luengo
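The restructuring suggested in this comment could look roughly like the sketch below. Because neither lung.mat nor fjaccard is shown in the question, the random binary data and the anonymous fjaccard function here are stand-ins assumed purely for illustration; only the pre-allocation and the position of the b1 line reflect the comment.

```matlab
% Stand-in data; an assumption, not the asker's lung.mat.
rng(0);                          % reproducible random example
data = randi([0 1], 20, 6);      % hypothetical binary dataset
[n, m] = size(data);

% Hypothetical replacement for the asker's fjaccard (not shown in the question):
% Jaccard distance between two binary column vectors.
fjaccard = @(x, y) 1 - sum(x & y) / max(sum(x | y), 1);

t1 = zeros(m-1, m);              % pre-allocate before the loops
for i = 1:m-1
    for j = i+1:m
        t1(i,j) = fjaccard(data(:,i), data(:,j));
    end
end
b1 = sum(t1) / (m-1);            % computed once, after the loops
```

The rest of the script (sorting b1 and selecting columns) would stay as in the question.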

user2305193 · Accepted Answer · 2019-09-23T14:46:37

The general answer to your question is below (judging by your code, this already seems clear to you):

a=randi(10,1,10) %example data: 1x10 vector of random integers from 1 to 10
a>mean(a) %logical array marking the elements larger than the mean
a(a>mean(a)) %select the elements of a that are larger than the mean

a =

     1     9    10     7     8     8     4     7     2     8


ans =

  1×10 logical array

   0   1   1   1   1   1   0   1   0   1


ans =

     9    10     7     8     8     7     8
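Note that an indexing expression such as the one in the question only displays the selection as ans; B1 changes only if the result is assigned back. A minimal sketch, using small made-up values for B1 and indB1 (both hypothetical, chosen so the mean is easy to check), that also keeps the index vector in step with the filtered values:

```matlab
B1 = [9 3 7 1 5];                 % hypothetical distance values (mean is 5)
indB1 = [4 1 5 2 3];              % hypothetical matching feature indices
keep = B1 > mean(B1);             % logical mask of values above the mean
indB1 = indB1(keep);              % filter the indices with the same mask
B1 = B1(keep);                    % assign back, otherwise B1 is unchanged
```

Here B1 becomes [9 7] and indB1 becomes [4 5]. With the question's variables, data(:,indB1) would then presumably pick out the corresponding feature columns.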
