0
votes

I'm looking to match a list of keywords in Columns (A, B, C) and categorize(Column D) each link with found strings within the keywords list.

  • Match link strings with keywords list for categorization
  • If there are 2 keywords for category, both keywords must match in order to categorize
  • article word separators will always be "-"

I'm thinking google apps script would be best to categorize it efficiently.

Here's the spreadsheet: https://docs.google.com/spreadsheets/d/1W4VAqUDVdJ6EJqzhkMlYRnUOSSM-kdhj3MfuZsu17qM/edit#gid=0

enter image description here

So far, I have this formula which sort of works sometimes but is slow and inconsistent:

=ARRAYFORMULA(IFERROR(REGEXEXTRACT(
 IFERROR(VLOOKUP(INDEX(IFERROR(REGEXEXTRACT(IFERROR(SPLIT(H2:H, "-")), 
 TEXTJOIN("|", 1, A2:C))),,1), {{A2:A;B2:B;C2:C},{D2:D;D2:D;D2:D}}, 2, 0))&
 IFERROR(VLOOKUP(INDEX(IFERROR(REGEXEXTRACT(IFERROR(SPLIT(H2:H, "-")), 
 TEXTJOIN("|", 1, A2:C))),,2), {{A2:A;B2:B;C2:C},{D2:D;D2:D;D2:D}}, 2, 0))&
 IFERROR(VLOOKUP(INDEX(IFERROR(REGEXEXTRACT(IFERROR(SPLIT(H2:H, "-")), 
 TEXTJOIN("|", 1, A2:C))),,3), {{A2:A;B2:B;C2:C},{D2:D;D2:D;D2:D}}, 2, 0)), 
 TEXTJOIN("|", 1, UNIQUE(IFERROR({
 VLOOKUP(INDEX(IFERROR(REGEXEXTRACT(IFERROR(SPLIT(H2:H, "-")), 
 TEXTJOIN("|", 1, A2:C))),,1), {{A2:A;B2:B;C2:C},{D2:D;D2:D;D2:D}}, 2, 0);
 VLOOKUP(INDEX(IFERROR(REGEXEXTRACT(IFERROR(SPLIT(H2:H, "-")), 
 TEXTJOIN("|", 1, A2:C))),,2), {{A2:A;B2:B;C2:C},{D2:D;D2:D;D2:D}}, 2, 0);
 VLOOKUP(INDEX(IFERROR(REGEXEXTRACT(IFERROR(SPLIT(H2:H, "-")), 
 TEXTJOIN("|", 1, A2:C))),,3), {{A2:A;B2:B;C2:C},{D2:D;D2:D;D2:D}}, 2, 0)}))))))

How can this be achieved with Google App Scripts? Any help is appreciated, thanks!

2

2 Answers

3
votes

Not a array formula, but I believe this will be faster:

I2:

=TEXTJOIN(",",1,QUERY(A:D,"Select D where A is not null "&JOIN(" and '"&H2&"' contains "," ","A" ,"B"," C"), 0))
  • QUERY to check if H contains A,B and C and return corresponding D.
  • TEXTJOIN to join the results, if multiple are present.

Drag fill down

3
votes
  • You want to achieve the result of image in your question using Google Apps Script.

If my understanding is correct, how about this sample script? Please think of this as just one of several answers. The flow of this script is as follows.

Flow:

  1. Retrieve values from Spreadsheet.
  2. Create an object.
    • This is used for searching values from CategoryByKeywords.
  3. Create the result array using the object.
    • Here, the values of MatchedCategory is created using the object.
    • The values of Keywords are searched using indexOf from the last segment of the URL.
  4. Put the result array to the Spreadsheet.
    • In the sample script, the values are put to the column "K".

Sample script:

When you use this script, please copy and paste the script to the script editor of your shared Spreadsheet, and run the function of myFunction().

function myFunction() {
  // Retrieve values from Spreadsheet.
  var sheet = SpreadsheetApp.getActiveSheet();
  var values = sheet.getRange("A2:H" + sheet.getLastRow()).getValues();

  // Create an object.
  var obj = values.reduce(function(o1, e, i) {
    var p = e.splice(-1)[0].split("/").splice(-1)[0];
    if (p) o1.params.push(p);
    var v = e.splice(0, 4);
    var k = v.splice(-1);
    if (k[0]) {
      var obj = {};
      obj[k[0]] = v.filter(String);
      o1.values.push(obj);
    }
    return o1;
  }, {params: [], values: []});

  // Create the result array using the object.
  var res = obj.params.map(function(e) {
    return obj.values.reduce(function(ar, f, j) {
      var key = Object.keys(f)[0];
      if (f[key].every(function(g) {return e.indexOf(g) > -1})) ar.push(key);
      if (j == obj.values.length - 1 && ar.length == 0) {
        ar.push("");
      } else if (ar.length > 1) {
        ar = [ar.join(",")];
      }
      return ar;
    }, []);
  });

  // Put the result array to the column "K".
  sheet.getRange(2, 11, res.length, 1).setValues(res);
}

Note:

  • This script is for your shared Spreadsheet. When you use other designed Spreadsheet, this script might occur an error. Please be careful this.

References:

I thought that I will study from your this question. So I tried to think of the solution. If I misunderstood your question and this was not the result you want, I apologize.