R/Power-Query: Replace value with matching

Question

I am trying to cleaning up some data in a huge dataset. One column holds values for the Sales aamount. Example values could be like those:

Clean Data:

Sales Potential

230

120

300

However, at some points there appear something like this:

Dirty Data

0, 0, 0, 0, 0

4, 0, 0, 0

0, 0, 480

0, 200, 0

In the first case of the dirty data the cell shoul only contains a zero: 0 In all other cases I would like to extract, if there is any non-zero number, this number and replace the cell with this value or add a new cleanded-column.

So the dirty data cleaned up:

Cleaned Data:

0

4

480

200

My approach was using RegExpressions in R as I am loading the data into Power-BI using Power-Query. I tried to find a pattern where I extract the value I am looking for and place it in a new column. However, my resutls looks like nothing. Is there maybe a much simpler approach to achieve this in R?

Code so far: library(stringr) OutputRegEx <- data.frame(MyDataset) Splitter = function(x) substr(str_extract(x,'[1-9]'),1,7) OutputRegEx[["RegExAuswertung"]] <- apply(OutputRegEx[43],1, function(x) Splitter(x) )

horseyride horseyride · Accepted Answer · 2020-06-08T20:04:58

In Powerquery, insert a custom column with below formula

=List.Max(List.Transform(Text.Split(Text.From([Sales Potential]),","), each Number.FromText(_)))

The formula splits everything on commas, puts into a list, converts the list from text into numbers, then takes the maximum number from the list.

R/Power-Query: Replace value with matching

Clean Data:

Dirty Data

Cleaned Data:

3 Answers