I am using MS Word to edit text that I convert to structured HTML using VBA.
The text is written out using Document.SaveAs2 with Encoding:=msoEncodingUTF8.
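For reference, the save step looks roughly like the sketch below. The file path, the procedure name and the wdFormatText file format are placeholders for illustration; only the Encoding argument is as described above.

```vba
' Sketch of the save step. Only Encoding:=msoEncodingUTF8 comes from the
' description above; the path and FileFormat are illustrative assumptions.
Sub SaveAsUtf8Text()
    ActiveDocument.SaveAs2 _
        FileName:="C:\Temp\output.txt", _
        FileFormat:=wdFormatText, _
        Encoding:=msoEncodingUTF8
End Sub
```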
Today I found that the trademark symbol [Edit: inserted using the Insert Symbol capability; Insert tab, Symbols group, Symbol button] was appearing in the text files as "(tm)".
I then tried passing the code page number directly, Encoding:=65001 (this is the numeric value of msoEncodingUTF8, so it should behave identically). In one case it seemed to work, but I could not reproduce the result.
I also learned that, because Word predates Unicode, it might use a private code page for certain characters, so I entered the Unicode code point directly and pressed Alt+X. The TM symbol displayed correctly in the document but was still not written to the text file as the symbol.
Whilst I have been able to work around the problem by replacing TM with the HTML "& trade ;" (extra spaces to prevent it getting rendered as the symbol!), I am concerned about the potential for other encoding failures.
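The workaround amounts to something like the following sketch (the procedure name is made up for illustration). It assumes the symbol is stored in the document as the Unicode character U+2122; a symbol stored under a different code in a symbol font would not be matched by this search.

```vba
' Sketch of the workaround: replace every trademark character with the
' HTML entity before saving. Assumes the symbol is stored as U+2122.
Sub ReplaceTrademarkWithEntity()
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = ChrW(&H2122)             ' the trademark character
        .Replacement.Text = "&trade;"    ' HTML named entity
        .Forward = True
        .Wrap = wdFindStop
        .MatchWildcards = False
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

This would have to be repeated for every character that turns out to be affected, which is why I would prefer to understand the underlying cause.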
Can anyone shed any light on the cause(s) of this issue or offer an effective resolution/mitigation?
System config: Word 2010; Windows 7 64-bit.