6
votes

Use gsub remove all string before first white space in R

In this example, we try to remove everything before a space with sub(".*? (.+)", "\\1", D$name). I'm looking for something really similar but I'm not really familiar with regex.

I want to delete everything before the first numeric character, but without removing that character itself.

For example with:

x <- c("lala65lolo","papa3hihi","george365meumeu")

I want:

> "65lolo", "3hihi", "365meumeu"
2
Try sub("[^0-9]+", "", x) - akrun
@akrun I didn't want you to delete your answer. I just wanted you to be more careful. If the input string is "3dajdaj", your solution doesn't work, but can be easily generalized. Just that. - nicola
It is true that I was wondering whether to use \\D+ or [^0-9]+, but after I posted the answer, a similar answer got posted. So I thought I would keep it, and then you commented. Anyway, it is okay - akrun
Sorry, on reflection my example was a little too simplistic! Don't worry, thanks for your help, I have my answer! - user3083101

2 Answers

6
votes

You may use

> x <- c("lala65lolo","papa3hihi","george365meumeu")
> sub("^\\D+", "", x)
[1] "65lolo"    "3hihi"     "365meumeu"

Or, to remove the leading non-digits only when a digit actually follows them:

sub("^\\D+(\\d)", "\\1", x)

The pattern matches

  • ^ - start of string
  • \\D+ - one or more chars other than digit
  • (\\d) - Capturing group 1: a digit (the \1 in the replacement pattern restores the digit captured in this group).
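
The difference between the unanchored pattern from the comments and the two anchored patterns shows up on edge cases. Here is a minimal sketch, assuming two extra inputs that are not in the question ("3dajdaj", which already starts with a digit, and "nodigits", which contains none):

```r
# Hypothetical test inputs: the last two are edge cases added for illustration
x <- c("lala65lolo", "3dajdaj", "nodigits")

# Unanchored: removes the first run of non-digits wherever it occurs
unanchored <- sub("[^0-9]+", "", x)
print(unanchored)
## [1] "65lolo" "3"      ""

# Anchored: removes non-digits only at the start of the string
anchored <- sub("^\\D+", "", x)
print(anchored)
## [1] "65lolo"  "3dajdaj" ""

# Anchored and guarded: only matches when a digit follows the non-digits,
# so a string without any digit is returned unchanged
guarded <- sub("^\\D+(\\d)", "\\1", x)
print(guarded)
## [1] "65lolo"   "3dajdaj"  "nodigits"
```

Which variant is appropriate depends on whether a string without digits should come back empty or untouched.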

In a similar way, you may achieve the following:

  • sub("^\\s+", "", x) - remove all text up to the first non-whitespace char
  • sub("^\\W+", "", x) - remove all text up to the first word char
  • sub("^[^-]+", "", x) - remove all text up to the first hyphen (if there is any), etc.
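
As a quick check of the three variations above, here is a small sketch on made-up inputs (the strings are purely illustrative and not from the question):

```r
# Leading whitespace is removed; the text starts at the first non-whitespace char
res_space <- sub("^\\s+", "", "   padded text")
print(res_space)
## [1] "padded text"

# Leading non-word characters (punctuation, spaces) are removed
res_word <- sub("^\\W+", "", "--> word")
print(res_word)
## [1] "word"

# Everything before the first hyphen is removed; the hyphen itself is kept
res_hyphen <- sub("^[^-]+", "", "key-value-pair")
print(res_hyphen)
## [1] "-value-pair"

# Caveat: with no hyphen at all, the pattern matches the whole string
res_none <- sub("^[^-]+", "", "nohyphen")
print(res_none)
## [1] ""
```

If strings without the delimiter should be left alone, the capture-group form shown earlier can probably be adapted, for example `sub("^[^-]+(-)", "\\1", x)`.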
7
votes

From R 3.6 onwards (currently the R devel version), trimws has a new whitespace argument that specifies what is regarded as whitespace. In this case, that is any non-digit character:

trimws(x, "left", "\\D")
## [1] "65lolo"    "3hihi"     "365meumeu"
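
The same argument also works with the other values of which. A short sketch, assuming R 3.6.0 or later and using named arguments for readability:

```r
x <- c("lala65lolo", "papa3hihi", "george365meumeu")

# Strip non-digits from the left only (same call as above, with named arguments)
left <- trimws(x, which = "left", whitespace = "\\D")
print(left)
## [1] "65lolo"    "3hihi"     "365meumeu"

# Strip non-digits from the right only
right <- trimws(x, which = "right", whitespace = "\\D")
print(right)
## [1] "lala65"    "papa3"     "george365"

# Strip non-digits from both ends
both <- trimws(x, which = "both", whitespace = "\\D")
print(both)
## [1] "65"  "3"   "365"
```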