4
votes

Is there a method in Google Apps Script that returns the word count from a Google Document?

Let's say I'm writing a report that has a strict limit on word count. It's quite precise: exactly 1.8k - 2k words (and yes, it's not just a single case, but many...)

In Microsoft Office Word there was a handy status bar at the bottom of the page that automatically updated the word count for me, so I tried to make one using Google Apps Script.

Writing a function that pulls the whole text out of the current document and then recounts the words several times a minute seems like nonsense to me. It's completely inefficient and makes the CPU run for nothing, but I couldn't find a word-count function in the Docs reference.

Ctrl+Shift+C opens a pop-up that shows it, which means a function that returns the total word count of a Google Document definitely exists...

But I can't find it! Sigh... I spent a few hours digging through Google, but I simply cannot find it. Please help!

3 Answers

5
votes

Wrote a little snippet that might help.

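function myFunction() {
  var text = DocumentApp.getActiveDocument().getBody().getText();
  // Trim first so leading/trailing whitespace does not create empty entries.
  var trimmed = text.trim();
  // An empty document would otherwise split into [""] and report 1 word.
  var words = trimmed.length === 0 ? [] : trimmed.split(/\s+/);
  Logger.log(words.length);
}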
0
votes

I understand that the request is for a built-in function, which I looked for as well but couldn't find anywhere in the documentation. I had to use polling. I started with a script like Amit's, but found that I was never matching Google's word count. This is what I had to do to get it to work. I know this can't be efficient, but it now matches the Google Docs count most of the time. What I had to do was clean/rebuild the string first, then count it.

function countWords() {
    var s = DocumentApp.getActiveDocument().getBody().getText();
    //this function kept returning "1" when the doc was blank 
    //so this is how I stopped having it return 1.       
    if (s.length === 0) 
        return 0;
    //A simple \n replacement didn't work, neither did \s not sure why
    s = s.replace(/\r\n|\r|\n/g, " ");
    //In cases where you have "...last word.First word..." 
    //it doesn't count the two words around the period.
    //so I replace all punctuation with a space
    var punctuationless = s.replace(/[.,\/#!$%\^&\*;:{}=\-_`~()"?“”]/g," ");
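    //Finally, trim it down to single spaces (not sure this even matters)
    var finalString = punctuationless.replace(/\s{2,}/g," ").trim();
    //A doc with only punctuation or whitespace would otherwise count as 1
    if (finalString.length === 0)
        return 0;
    //Actually count it
    var count = finalString.split(/\s+/).length;
    return count;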
}
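
The clean-then-count steps above can also be pulled into a function that takes a plain string, which makes the edge cases (blank text, punctuation-only text, "word.Word") easy to check without opening a document. This is only a sketch: `countWordsInText` and `countWordsInActiveDocument` are names made up for this example, and the punctuation list is copied from the function above rather than from anything Google documents, so it may still disagree with the built-in counter.

```javascript
// Hypothetical helper (not part of any Google API): counts words in a plain string.
function countWordsInText(text) {
  var cleaned = text
    .replace(/[.,\/#!$%\^&\*;:{}=\-_`~()"?“”]/g, " ") // punctuation acts as a separator
    .trim();
  if (cleaned.length === 0) {
    return 0; // blank, whitespace-only, or punctuation-only text has no words
  }
  return cleaned.split(/\s+/).length; // \s+ also matches \r and \n
}

// Hypothetical wrapper for use inside Apps Script; only runs when called.
function countWordsInActiveDocument() {
  var text = DocumentApp.getActiveDocument().getBody().getText();
  return countWordsInText(text);
}
```

Keeping the document access in its own wrapper means the counting logic can be exercised with ordinary strings, for example `countWordsInText("last word.First word")`.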
0
votes

I think this function probably covers most cases for word count with English characters. If I overlooked something, please comment.

function testTheFunction(){
  var myDoc = DocumentApp.openByUrl('https://docs.google.com/document/d/?????/edit');
  Logger.log(countWordsInDocument(myDoc));    
}

function countWordsInDocument(theDoc){
  var theText = theDoc.getBody().getText();
  var theRegex = /[A-Za-z]/; // or include other ranges for other languages or numbers
  var wordStarted = false;
  var theCount = 0;
  for(var i=0;i<theText.length;i++){
    var theLetter = theText.slice(i,i+1);
    if(theRegex.test(theLetter)){
      if(!wordStarted){
        wordStarted=true;
        theCount++;
      }
    }else if(wordStarted){
      wordStarted=false;
    }
  }
  return theCount;
}
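
The loop above effectively counts maximal runs of letters, so the same number should come out of a single global regex match. A minimal sketch, assuming the same ASCII-only definition of a letter; `countLetterRuns` is a hypothetical name used only for this example.

```javascript
// Hypothetical helper: counts maximal runs of ASCII letters in a string.
function countLetterRuns(text) {
  var runs = text.match(/[A-Za-z]+/g); // match() returns null when nothing matches
  return runs ? runs.length : 0;
}
```

Like the loop, this treats anything that is not a letter as a word boundary, so "don't" counts as two words and a number such as "2024" counts as none. Extending the character class (for example with digits or an apostrophe) changes that behaviour in both versions.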