
I'm trying to use the VBA code from a similar question in this forum to redact text highlighted in a specific color, but I would like to keep the document layout, which means only replacing the words, but not the spaces and paragraph breaks in the document. Alternatively, I would be happy if we could identify the line breaks and put a space there.

In the end, the document should not have large sections of unbroken text where words and spaces alike were replaced by XXXXXXXX and highlighted black. Instead, the text would look more like XX X XXXX XXX X, but all of it should still be highlighted in black.

In other words, the text "Mary had a little lamb." would be redacted to "XXXX XXX X XXXXXX XXXXX" rather than XXXXXXXXXXXXXXXXXXXXXXXX.

I've tried changing the "If flag Then" section to check for Unicode 32 (space) instead of the carriage return (Unicode 13), but that doesn't seem to work.

Many thanks.

    If flag Then
        If Selection.Range.HighlightColorIndex = wdTurquoise Then
            ' Create replacement string
            ' If last character is a carriage return (unicode 13), then keep that carriage return
            OldText = Selection.Text
            OldLastChar = Right(OldText, 1)
            NewLastChar = ReplaceChar
            If OldLastChar Like String(1, 13) Then NewLastChar = String(1, 13)
            NewText = String(Len(OldText) - 1, ReplaceChar) & NewLastChar

            ' Replace text, black block
            Selection.Text = NewText
            Selection.Font.ColorIndex = wdBlack
            Selection.Font.Underline = False
            Selection.Range.HighlightColorIndex = wdBlack
            Selection.Collapse wdCollapseEnd
        End If
    End If

You need a wildcard search where the search text is something like '([0-9A-Za-z])' and the replacement text is 'X'. - freeflow

1 Answer

@freeflow has given you an answer in his comment on your post, but if you take that approach, the wildcard search should also include all potential punctuation characters, excluding blank spaces.
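
If you would rather stay with the loop in your post than switch to a wildcard search, one way to keep the layout is to build the replacement string one character at a time, copying spaces, tabs, and line or paragraph breaks through unchanged and masking everything else (letters, digits, and punctuation). The following is only a minimal sketch: the function name `MaskKeepingWhitespace` and the list of character codes to preserve are my own assumptions, not part of the original macro, and I have not tested it against tables or fields.

```vba
' Hypothetical helper (not part of the original macro).
' Returns OldText with every character replaced by ReplaceChar,
' except whitespace and break characters, which are copied unchanged.
Function MaskKeepingWhitespace(ByVal OldText As String, _
                               ByVal ReplaceChar As String) As String
    Dim i As Long
    Dim ch As String
    Dim result As String

    For i = 1 To Len(OldText)
        ch = Mid$(OldText, i, 1)
        Select Case AscW(ch)
            ' 9 = tab, 11 = manual line break, 12 = page break,
            ' 13 = paragraph mark, 32 = space, 160 = non-breaking space
            Case 9, 11, 12, 13, 32, 160
                result = result & ch
            Case Else
                ' Use only the first character of ReplaceChar so the
                ' masked text keeps the same length as the original
                result = result & Left$(ReplaceChar, 1)
        End Select
    Next i

    MaskKeepingWhitespace = result
End Function
```

In the posted code, the lines that compute `OldLastChar`, `NewLastChar`, and `NewText` could then be reduced to `NewText = MaskKeepingWhitespace(OldText, ReplaceChar)`. With `ReplaceChar = "X"`, the text "Mary had a little lamb." becomes "XXXX XXX X XXXXXX XXXXX".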

With that said, I recommend that you not exclude punctuation characters or the spaces between words from the redaction. I’m recommending that because the purpose of redaction is to eliminate the possibility of someone comprehending what the redacted portion of the document originally contained. If you provide them clues, such as how many words are in the sentence ... they can guess, and sometimes be quite accurate, because of the surrounding non-redacted text.

Of course, that’s just my opinion.

To maintain document formatting, I suggest that you not use letters such as “X” as replacement characters, because “X” is a wide character. I’ve found it better to use a symbol, and I recommend Wingdings character 127. It’s of average width and does a good job of balancing out sentence length ... but for added assurance I also recommend that you apply a Font.Spacing of -1 to your replacement, which will tighten up each redacted sentence even more.

In redacting, just be aware that maintaining the document formatting, no matter what your replacement character strategy might be, is very difficult. I’ve spent a lot of time experimenting with this, and what I’ve shared here is what I do in my own redaction add-in. I don’t redact paragraph marks. I redact the entire highlighted string, including spaces and punctuation, using Wingdings character 127 with Font.Spacing set to -1, and the font color is the same as whatever color I’m using to highlight the redaction.

If you are interested in seeing my add-in, do a Web search on AuthorTec Redactor.