
I have a dataframe with 79 variables. Of those 79 variables, one contains the country and another contains the size of the country. How can I find the largest and smallest country from those two columns? I'm a novice and confused.

For example, the dataframe looks like this:

Name      Size(m^2)  Criteria   Type
A         1200
B         1300
C         1400
D         1600
E         1900
F         2000

My desired output would be: Largest Area: F, Smallest Area: A

It would be easier to help if you created a small reproducible example along with the expected output. Read about how to give a reproducible example. - Ronak Shah
As Ronak said, a small example would be helpful, since I am confused by your question. You said you have 79 variables, but you want to find the largest value from 2 columns? Do you mean that 79 is the number of observations (rows) rather than variables? - Vinson Ciawandy
Use pandas or numpy to find the argmax and argmin of the size column, then pass those indices to the Name column. - Koo

2 Answers


As @Ronak said, you should post an example of your data or the dput() output of part of it. However, you can simply do:

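# example data: 10 countries with random sizes
df <- data.frame(country = letters[1:10], size = runif(10))
df$country[df$size == max(df$size)]  # largest country
df$country[df$size == min(df$size)]  # smallest country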
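
The same idea can be sketched on the table from the question using base R's which.max() and which.min(), which return the row index of the first extreme value. This is only an illustration: the data frame below is a hypothetical reconstruction of the asker's example, and the column names Name and Size are assumptions (the original header reads "Size(m^2)").

```r
# Hypothetical reconstruction of the example table from the question;
# the column names Name and Size are assumed for illustration.
df <- data.frame(
  Name = c("A", "B", "C", "D", "E", "F"),
  Size = c(1200, 1300, 1400, 1600, 1900, 2000)
)

# which.max()/which.min() give the row index of the first largest/smallest value
largest  <- df$Name[which.max(df$Size)]
smallest <- df$Name[which.min(df$Size)]

cat("Largest Area :", largest, "\n")
cat("Smallest Area :", smallest, "\n")
```

Unlike the comparison with max(), this returns a single row even when several countries tie for the extreme value, which may or may not be what you want.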

The number of columns or rows in the dataset is largely irrelevant to the question.

Using dplyr you can try this:

library(tmap) #for country and size data
library(sf) # used to get rid of geometry in the sample set
library(dplyr) # this is the package needed to answer the question

# create a minimal sample dataset of 50 countries

data("World") # dataset including country names and areas

set.seed(1234)

df_countries <- 
  World %>%
  st_drop_geometry() %>% 
  select(name, area) %>% 
  slice_sample(n = 50)

# to answer the question: keep the rows with the largest and smallest area
df_countries %>%
  filter(area == max(area) | area == min(area)) %>%
  arrange(desc(area))

#>         name           area
#> 1     Canada 9093510 [km^2]
#> 2 Luxembourg    2590 [km^2]

Created on 2021-05-03 by the reprex package (v2.0.0)
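
As a possible alternative to the filter() call, dplyr (version 1.0.0 or later) also offers slice_max() and slice_min(). The sketch below is not part of the original answer; it uses a hypothetical reconstruction of the asker's table rather than the tmap data, and the column names Name and Size are assumptions.

```r
library(dplyr)

# Hypothetical reconstruction of the example table from the question;
# the column names Name and Size are assumed for illustration.
df <- data.frame(
  Name = c("A", "B", "C", "D", "E", "F"),
  Size = c(1200, 1300, 1400, 1600, 1900, 2000)
)

# Take the single largest and single smallest row, then stack them
extremes <- bind_rows(
  slice_max(df, Size, n = 1),
  slice_min(df, Size, n = 1)
)

print(extremes)
```

By default both functions keep ties, so more than two rows can come back if several countries share the largest or smallest size; pass with_ties = FALSE to force exactly one row each.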