In R, how to remove everything before the last slash

8

votes

I have a character vector, say:

x <- c('test/test/my', 'et/tom/cat', 'set/eat/is', 'sk / handsome')

I'd like to remove everything up to and including the last slash. The result should look like:

my cat is handsome

I found this code online, which gives me everything before the last slash:

gsub('(.*)/\\w+', '\\1', x)
[1] "test/test"     "et/tom"        "set/eat"       "sk / handsome"

How can I change this code so that it returns the part of the string after the last slash instead?

Thanks

r regex string gsub

7

votes

You can use basename:

paste(trimws(basename(x)),collapse=" ")
# [1] "my cat is handsome"
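
As a rough sketch of why this works: basename treats each string as a file path and returns the part after the final separator, so the fourth element keeps its leading space until trimws removes it. The intermediate variable names below are only for illustration.

```r
x <- c('test/test/my', 'et/tom/cat', 'set/eat/is', 'sk / handsome')

# basename() treats each string as a path and keeps the last component;
# the fourth element still carries its leading space at this point
last_part <- basename(x)

# trimws() strips leading and trailing whitespace
trimmed <- trimws(last_part)

# collapse the pieces into one space-separated string
result <- paste(trimmed, collapse = " ")
print(result)
```

Because basename follows file-path rules, inputs that end in a slash may not behave the same way as the regex-based answers, so this is probably best kept for strings that look like the sample data.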

6

votes

Using strsplit

> sapply(strsplit(x, "/\\s*"), tail, 1)
[1] "my"       "cat"      "is"       "handsome"

Another way for gsub

> gsub("(.*/\\s*(.*$))", "\\2", x) # without 'unwanted' spaces
[1] "my"       "cat"      "is"       "handsome"

Using str_extract from stringr package

> library(stringr)
> str_extract(x, "\\w+$") # without 'unwanted' spaces
[1] "my"       "cat"      "is"       "handsome"

4

votes

You can basically just move where the parentheses are in the regex you already found:

gsub('.*/ ?(\\w+)', '\\1', x)
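
For reference, here is a small sketch of that call on the sample data, next to a shorter variant that is my own assumption rather than part of the code above: since `.*` is greedy, `.*/` runs up to the last slash, so the prefix can simply be deleted with sub instead of capturing the tail.

```r
x <- c('test/test/my', 'et/tom/cat', 'set/eat/is', 'sk / handsome')

# capture the word after the last slash (allowing one optional space)
res_capture <- gsub('.*/ ?(\\w+)', '\\1', x)

# alternative sketch: greedy .* extends to the LAST slash;
# delete that prefix together with any whitespace that follows it
res_delete <- sub('.*/\\s*', '', x)

print(res_capture)
print(res_delete)
```

The deleting variant does not require the tail to consist of word characters, which may or may not be what you want.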

1

votes

You could use

x <- c('test/test/my', 'et/tom/cat', 'set/eat/is', 'sk / handsome')
gsub('^(?:[^/]*/)*\\s*(.*)', '\\1', x)

Which yields

[1] "my"       "cat"      "is"       "handsome"

To get a single string, you could paste the results together:

(paste0(gsub('^(?:[^/]*/)*\\s*(.*)', '\\1', x), collapse = " "))

^            # start of the string
(?:[^/]*/)*  # a run of non-slash characters followed by a slash, repeated 0+ times
\\s*         # optional whitespace
(.*)         # capture the rest of the string

The whole match is replaced by \\1, i.e. the content of the first capture group.
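
To check what the capture group actually holds, one option is to extract it directly rather than substituting. This is only a sketch: it assumes an R version where regexec accepts perl = TRUE, and the names pattern, m and captured are illustrative.

```r
x <- c('test/test/my', 'et/tom/cat', 'set/eat/is', 'sk / handsome')
pattern <- '^(?:[^/]*/)*\\s*(.*)'

# regexec() reports the full match plus each capture group;
# the non-capturing (?:...) group does not get its own entry
m <- regmatches(x, regexec(pattern, x, perl = TRUE))

# each list element is c(full match, group 1); keep group 1
captured <- vapply(m, function(parts) parts[2], character(1))
print(captured)
```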
