I want to scan the words in a Google Doc from left to right and, for certain keywords, replace the first occurrence with a URL or wrap it in a BBCode-like tag.

I cannot use the findText API because this is not a simple regex search: the matching is complex and involves many if/else conditions driven by business logic.

Here is how I want to solve this:

let document = DocumentApp.getActiveDocument().getBody();
let paragraph = document.getParagraphs()[0];
let contents = paragraph.getText();
// makeAllTheNecessaryReplacemens has all the business logic to identify which keywords need to be changed
let newContents = makeAllTheNecessaryReplacemens(contents);
paragraph.setText(newContents);

The problem here is that the text style gets wiped out, and makeAllTheNecessaryReplacemens cannot add hyperlinks because it only produces a plain string.

Please suggest a way to do this.

1 Answer

Proposed function

/**
 * This is a wrapper around the attribute setter functions.
 * It allows setting one attribute at a time,
 * based on a complete attributes object obtained
 * from another element. This makes it far more
 * reliable.
 */
const attributeKey = {
  FONT_SIZE : (o,s,e,a) => o.setFontSize(s,e,a),
  STRIKETHROUGH : (o,s,e,a) => o.setStrikethrough(s,e,a),
  FOREGROUND_COLOR : (o,s,e,a) => o.setForegroundColor(s,e,a),
  LINK_URL : (o,s,e,a) => o.setLinkUrl(s,e,a),
  UNDERLINE : (o,s,e,a) => o.setUnderline(s,e,a),
  BOLD : (o,s,e,a) => o.setBold(s,e,a),
  ITALIC : (o,s,e,a) => o.setItalic(s,e,a),
  BACKGROUND_COLOR : (o,s,e,a) => o.setBackgroundColor(s,e,a),
  FONT_FAMILY : (o,s,e,a) => o.setFontFamily(s,e,a)
}

/**
 * Replace textToReplace with replacementText
 * Will retain formatting and hyperlinks
 */
function replaceTextPlus(textToReplace, replacementText) {

  // Initializing
  let body = DocumentApp.getActiveDocument().getBody();
  let searchResult = body.findText(textToReplace);

  while (searchResult != null) {

    // Getting info about result
    let foundElement = searchResult.getElement();
    let start = searchResult.getStartOffset();
    let end = searchResult.getEndOffsetInclusive();

    // This returns a complete attributes object
    // Many attributes have null as a value
    let attributes = foundElement.getAttributes(start);

    // Replacing text
    foundElement.deleteText(start, end);
    foundElement.insertText(start, replacementText);

    // Setting new end index
    let newEnd = start + replacementText.length - 1;

    // Set attributes for new text skipping over null values
    // This requires the constant defined at the top.
    for (let a in attributes) {
      if (attributes[a] != null) {
        attributeKey[a](foundElement, start, newEnd, attributes[a]);
      }
    }

    // Modifies the actual searchResult so that the next findText
    // starts at the NEW end index.
    try {
      let rangeBuilder = DocumentApp.getActiveDocument().newRange();
      rangeBuilder.addElement(foundElement, start, newEnd);
      searchResult = rangeBuilder.getRangeElements()[0];
    } catch (e) {
      Logger.log("End of Document");
      return null;
    }

    // searches for next result
    searchResult = body.findText(textToReplace, searchResult);
  }
}

Extending the findText API

This function relies on the findText API, but it adds in a few more steps.

  1. Find the text.
  2. Get the element containing the text.
  3. Get the start and end indices of the text.
  4. Get the attributes of the text (font, color, hyperlink, etc.).
  5. Replace the text.
  6. Update the end index.
  7. Use the old attributes to update the new text.

You call it like this:

replaceTextPlus("Bing", "Google")
replaceTextPlus("occurrences", "happenings")
replaceTextPlus("text", "prefixedtext")
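For the question's case of turning only the first occurrence of a keyword into a link, the same findText result can be reused without replacing any text. The following is a minimal sketch, not a tested solution: linkFirstOccurrence and escapeForFindText are hypothetical names, and the body is passed in as a parameter. Because findText interprets its argument as a regular expression, the sketch escapes the keyword first.

```javascript
// Hypothetical helper: escape regex metacharacters so that findText,
// which treats its argument as a regular expression, matches literally.
function escapeForFindText(keyword) {
  return keyword.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: link only the first occurrence of `keyword` in `body`.
// Returns true if a match was found and linked, false otherwise.
function linkFirstOccurrence(body, keyword, url) {
  const match = body.findText(escapeForFindText(keyword));
  if (match === null) {
    return false;
  }
  // No text is deleted or inserted, so the attribute-copying step is not needed.
  const text = match.getElement().asText();
  text.setLinkUrl(match.getStartOffset(), match.getEndOffsetInclusive(), url);
  return true;
}
```

In a script bound to a document, this could be called as linkFirstOccurrence(DocumentApp.getActiveDocument().getBody(), "Google", "https://www.google.com"). Any business logic that decides which keywords qualify would run before this call.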

How to set the formatting and link attributes

This relies on the attributes object returned by getAttributes, which looks something like this:

{
    FOREGROUND_COLOR=#ff0000,
    LINK_URL=null,
    FONT_SIZE=null,
    ITALIC=true,
    STRIKETHROUGH=null,
    FONT_FAMILY=null,
    BOLD=null,
    UNDERLINE=true,
    BACKGROUND_COLOR=null
}

I tried to use setAttributes, but it was very unreliable: it almost always resulted in some formatting loss.

To fix this, I made an object, attributeKey, that wraps the individual attribute setter functions so that they can be called from this loop:

for (let a in attributes) {
  if (attributes[a] != null) {
    attributeKey[a](foundElement, start, newEnd, attributes[a]);
  }
}

This allows null values to be skipped, which seems to have solved the unreliability problem. Perhaps the update buffer gets confused when it is given many values at once.

Limitations

This function only reads the formatting of the first character of the found word. If the same word has different formatting within itself, for example "Hello" with a mix of normal, bold and italic letters, the whole replacement word will take the formatting of the first letter. This could potentially be fixed by identifying the word and iterating over every single letter.
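As a starting point for that fix, the formatting of the matched range could be read per run rather than per letter. The following is a rough sketch under assumptions: getFormattingRuns is a hypothetical helper, text is the Text element returned by the search result, and getTextAttributeIndices is assumed to return an array of the offsets where a new formatting run begins. How to map these runs onto a replacement of a different length is left open.

```javascript
// Hypothetical helper: split the matched range [start, end] (end inclusive)
// into runs of uniform formatting and capture the attributes of each run.
function getFormattingRuns(text, start, end) {
  // Offsets strictly inside the range at which the formatting changes.
  const boundaries = text.getTextAttributeIndices()
    .filter((i) => i > start && i <= end);
  const runStarts = [start, ...boundaries];
  return runStarts.map((runStart, n) => ({
    start: runStart,
    // A run ends just before the next one starts, or at the end of the match.
    end: n + 1 < runStarts.length ? runStarts[n + 1] - 1 : end,
    attributes: text.getAttributes(runStart)
  }));
}
```

Each returned run could then be applied with the attributeKey loop shown earlier, using the run's own start and end instead of the indices of the whole word.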
