0
votes

I have a similar problem but a bit more complex as this one : Power Query: Function to search a column for a list of keywords and return only rows with at least one match and this one : https://community.powerbi.com/t5/Desktop/Power-query-Add-column-with-list-of-keywords-found-in-text/td-p/83109

I have a Database with a lot of columns of which one is a free-text description string. On another Excel Sheet in the workbook, I've set up a Matching table to categorize the rows based on lists of keywords like this :

  • category | keywords
  • pets | dog, cat, rabbit,...
  • cars | Porsche, BMW, Dodge,...
  • ...

The goal is to put a custom column in my database that will return the hereabove category (or categories ?) based on which listed keywords it can find in the description field.

I think the solution above and the one from ImkeF are not so far but I didn't find a way to turn it into a successful Query for my case. (I'm good at Excel but quite a noob to M and programming Queries...)

3

3 Answers

1
votes

oriented on the obove posted links:

M-Code for tbl_category: the keywords (separated with comma) will be split into rows

let
    Source = Excel.CurrentWorkbook(){[Name="tbl_category"]}[Content],
    #"Replaced Value" = Table.ReplaceValue(Source," ","",Replacer.ReplaceText,{"keywords"}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Replaced Value", {{"keywords", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "keywords"),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"keywords", type text}})
in
    #"Changed Type1"

M-Code for tbl_text. Here will be add a Custom Column called "Category":

let
    Source = Excel.CurrentWorkbook(){[Name="tbl_text"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Text", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Category", (Earlier) => Table.SelectRows(tbl_category, 
     each Text.Contains(Record.Field(Earlier, "Text"), Record.Field(_, "keywords"), Comparer.OrdinalIgnoreCase))),
    #"Expanded Category" = Table.ExpandTableColumn(#"Added Custom", "Category", {"Category"}, {"Category"})
in
    #"Expanded Category"

enter image description here

1
votes

Ok,

I've finally found how to build a query to suits my needs based on your steps above!

Note : I used "Row Labels" to replace the column header of the 1st tbl_category column for clarity.

My solution is not as neat as I would like (I had to create a second custom column because of my lack of knowledge on how to nest the two steps so they act on the same cell) but it works perfectly!

So thanks again for your help Chris... without your leads I woudn't have found this maze exit!

here the 2nd code modified:

let
    Source = Excel.CurrentWorkbook(){[Name="tbl_text"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Text", type text}}), 
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Category", 
       (Earlier) => Table.SelectRows(tbl_category, 
       each Text.Contains(Record.Field(Earlier, "Text"), Record.Field(_, "keywords"),
       Comparer.OrdinalIgnoreCase))),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom",
       each Text.Combine(Table.ToList(Table.Transpose(
       Table.Distinct(Table.SelectColumns([Category],{"Row Labels"}))),
       Combiner.CombineTextByDelimiter(",")), ", ")),
in
    #"Added Custom1"

Greetz

0
votes

Just for the record,

Once applied to real data the query was not working anymore... giving the error "We cannot convert the value null to type Text." the solution was as easy as removing "null" cells (blank cells that were categories for which no keywords were yet identified) first!

M-Code for tbl_category:

let
    Source = Excel.CurrentWorkbook(){[Name="tbl_category"]}[Content],

    #"Filtered Rows" = Table.SelectRows(Source, each ([keywords] <> null)),

    #"Replaced Value" = Table.ReplaceValue(#"Filtered Rows"," ","",Replacer.ReplaceText,{"keywords"}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Replaced Value", {{"keywords", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "keywords"),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"keywords", type text}})
in
    #"Changed Type1"

Greetz