I have a dataset with n observations and, say, two variables, X1 and X2. I am trying to classify each observation based on a set of conditions on its (X1, X2) values. For example, the dataset looks like this:
df:

    Index   X1   X2
    1      0.2  0.8
    2      0.6  0.2
    3      0.2  0.1
    4      0.9  0.3
and the groups are defined by
- Group 1: X1<0.5 & X2>=0.5
- Group 2: X1>=0.5 & X2>=0.5
- Group 3: X1<0.5 & X2<0.5
- Group 4: X1>=0.5 & X2<0.5
I'd like to generate the following dataframe.
expected result:

    Index   X1   X2  Group
    1      0.2  0.8      1
    2      0.6  0.2      4
    3      0.2  0.1      3
    4      0.9  0.3      4
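
To make the example reproducible, here is a minimal sketch that builds the input and produces the expected `Group` column. It assumes pandas and NumPy are available; `np.select` is only one possible way to apply the rules, and the names `conditions` and `labels` are illustrative, not part of the original data.

```python
import numpy as np
import pandas as pd

# Example data from above; "Index" is kept as the DataFrame index.
df = pd.DataFrame(
    {"X1": [0.2, 0.6, 0.2, 0.9], "X2": [0.8, 0.2, 0.1, 0.3]},
    index=pd.Index([1, 2, 3, 4], name="Index"),
)

# One boolean mask per group, in the same order as the group definitions.
conditions = [
    (df["X1"] < 0.5) & (df["X2"] >= 0.5),   # Group 1
    (df["X1"] >= 0.5) & (df["X2"] >= 0.5),  # Group 2
    (df["X1"] < 0.5) & (df["X2"] < 0.5),    # Group 3
    (df["X1"] >= 0.5) & (df["X2"] < 0.5),   # Group 4
]
labels = [1, 2, 3, 4]

# Rows matching no condition (e.g. NaN values) get 0; that default is an assumption.
df["Group"] = np.select(conditions, labels, default=0)
print(df)
```

The masks are evaluated on whole columns at once, so no explicit Python loop over rows is needed in this sketch.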
Also, would it be better or faster to work with NumPy arrays for this type of problem?