0
votes

I am working with a data frame like the one shown in this image, but with 380 rows in total.

Not sure if this will help but let's say I am working on the dataframe:

df <- data.frame(c(-10:-1),c(-5:4),c(1:10))

and I would like to extract any rows that contain the number -5 in either the first or the second column.

In the shared image, I want to extract the rows that contain "Arsenal" in either the "HomeTeam" or the "AwayTeam" column, but I do not know how to do so.

This is my attempt using grep()

However it shows the message below:

"Error: Can't subset columns that don't exist. x The locations 12, 39, 45, 78, 98, etc. don't exist. i There are only 7 columns."

where the mentioned locations are exactly the rows I need...

I wanted to try other filtering functions, such as those in dplyr, but I couldn't understand how they work, and I am not even sure they fit what I want to do.

2
Welcome to Stack Overflow. Please make this question reproducible by including code and example data in a plain text format - for example the output from dput(yourdata). We cannot copy/paste data from images. - neilfws
Try league1819[grepl('Arsenal', league1819$HomeTeam) | grepl('Arsenal', league1819$AwayTeam), ] - Karthik S

2 Answers

1
votes

Using your df <- data.frame(c(-10:-1),c(-5:4),c(1:10)) example, and since you're (potentially) already using tidyverse, you can get the rows you want with dplyr's filter():

if(!require(tidyverse)) install.packages('tidyverse'); library(tidyverse) #to load the package, just in case you haven't already!
df <- data.frame(c(-10:-1),c(-5:4),c(1:10))
colnames(df) <- c("col1", "col2", "col3")
df %>% filter(col1 == -5 | col2 == -5) # compare against the number -5 rather than the string "-5"

or if you want rows with -5 in both columns, you can use:

df %>% filter(col1 == -5 & col2 == -5) # compare against the number -5 rather than the string "-5"

instead. For your leagues question, I'd do:

sample_Arsenal <- league1819 %>% filter(HomeTeam %in% "Arsenal" | AwayTeam %in% "Arsenal")
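
For comparison, here is a minimal base R sketch of the same "either" and "both" filters on the toy data frame from the question. The column names col1, col2, col3 are the ones assigned above; the variable names either and both are only illustrative.

```r
# Toy data frame from the question, with explicit column names
df <- data.frame(col1 = -10:-1, col2 = -5:4, col3 = 1:10)

# Rows where -5 appears in col1 OR col2; note the comma before the closing bracket
either <- df[df$col1 == -5 | df$col2 == -5, ]

# Rows where -5 appears in col1 AND col2
both <- df[df$col1 == -5 & df$col2 == -5, ]

print(either)  # rows 1 and 6 of df
print(both)    # zero rows: no row of this toy data has -5 in both columns
```

In this particular example the "both" filter returns an empty data frame, because -5 sits in row 6 of col1 but in row 1 of col2.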
0
votes

You can use grepl:

sampleArsenal <- subset(league1819, grepl('Arsenal', HomeTeam) |
                                    grepl('Arsenal', AwayTeam))

Or if you want to try dplyr:

library(dplyr)
library(stringr)

league1819 %>% 
   filter(str_detect(HomeTeam, 'Arsenal') | str_detect(AwayTeam, 'Arsenal'))
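
As for the error in the question: the original attempt is not shown, so this is only a guess, but "Can't subset columns that don't exist" is what you would typically see when row positions are passed to single-bracket indexing without a comma, which selects columns rather than rows. A small sketch with a made-up stand-in for league1819 (the data frame matches and its rows are hypothetical):

```r
# Hypothetical stand-in for league1819; the rows are invented for illustration
matches <- data.frame(
  HomeTeam = c("Arsenal", "Chelsea", "Everton", "Fulham"),
  AwayTeam = c("Chelsea", "Arsenal", "Fulham", "Everton"),
  stringsAsFactors = FALSE
)

# Row positions where either column mentions Arsenal
idx <- which(grepl("Arsenal", matches$HomeTeam) | grepl("Arsenal", matches$AwayTeam))

# matches[idx]   (no comma) would treat idx as COLUMN positions
# matches[idx, ] (with comma) treats idx as ROW positions
arsenal_rows <- matches[idx, ]
print(arsenal_rows)
```

With 380 rows and only 7 columns, row positions such as 12 or 39 cannot be valid column positions, which would match the message quoted in the question.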