0
votes

Say I have table1 with 3 columns:

Source  Category   Type
DOG     Carnivore  mammal
CAT     Carnivore  mammal
GOAT    Herbivore  mammal
LIZARD  Carnivore  Reptile

I have table2 that contains a lot of text/characters that I don't care about but does have a substring found in table 1's source column:

Description                                             Type (New Column)
nonsensetext345 ka dfne DOG ke 344ab 09/06
unneededtextandnumbers GOAT MoreIrrelavantTextBlahBLah

How do I add another column in table2 that searches the Description for a matching substring in the source column of table1 and return the associated type?

2
I cannot figure out how to format this better. It looks nothing like it did when I added it.pablosyko
This link gives an overview of formatting in Markdown. Hilariously, when posting this comment, I got it backward.greggyb
Got the formatting corrected.pablosyko

2 Answers

3
votes
//Table1
let
    Source = #table(
    {"Source", "Category", "Type"}, 
    {{"DOG", "Carnivore", "mammal"},
     {"CAT", "Carnivore", "mammal"},
     {"GOAT", "Herbivore", "mammal"},
     {"LIZARD", "Carnivore", "Reptile"}})
in
    Source
//Table2
let
    Source = #table(
        {"Description"},
        {{"nonsensetext345 ka dfne DOG ke 344ab 09/06"},
         {"unneededtextandnumbers GOAT MoreIrrelavantTextBlahBLah"}}),
    Type =
        Table.AddColumn(
            Source,
            "Type",
            each 
                // first we grab the description into a variable
                let currentRowDescription = [Description]
                in
                    // Get only rows from Table1 that match a condition
                    Table.SelectRows(
                        Table1,
                        // The condition is Text.Contains(currentRowDescription, [Source])
                        // This means that we are taking only the rows from Table1
                        // whose value in Table1[Source] is found in the string
                        // currentRowDescription.
                        // Based on that table, we take the values in the field, [Type].
                        each Text.Contains(currentRowDescription, [Source]))[Type]),
    // we can't guarantee only one match, so we're expanding all matches here.
    #"Expanded Type" = Table.ExpandListColumn(Type, "Type")
in
    #"Expanded Type"

Above was M, heavily commented. We can take a very similar approach in DAX:

Type (dax) = 
// again, we grab the value of the current row's description
VAR CurrentRowDescription = 'Table2'[Description]
// Again, we filter 'TAble1' based on searching for the values in 'Table1'[Source]
// in the current row's description.
VAR MatchingRowsFromTable1 =
  FILTER (
    'Table1',
    // FIND searches a string (arg1) for a substring (arg2) from an optional starting
    // position (arg3), returning arg4 when the substring isn't found. It returns the
    // starting index of the substring. Thus if it's > 0 there's a match.
    FIND ( 'Table1'[Source], CurrentRowDescription,, 0 ) > 0
  )
// DAX needs exactly one scalar value to add as a column. Here we are defensive with
// SAMPLE which will return exactly 1 row
VAR Only1MatchingRow = SAMPLE ( 1, MatchingRowsFromTable1, 'Table1'[Type], ASC )
RETURN
  // Select just the type column.
  SELECTCOLUMNS ( Only1MatchingRow, "Type", 'Table1'[Type] )
0
votes

Select * from table1 t1 join table2 t2 on t2.description like '%'+t1.source+'%'