9
votes

I export Word docx from markdown using Pandoc.

By default, everything seems to be marked as English in the docx file. So I tried to override this, either with a command-line option, pandoc -s -S images.md -o images.docx -V lang=de, or in the YAML header:

---
subtitle: <%= @report.name %>
toc-title: <%= t('.toc_title') %>
lang: de
---

But neither seems to work: all content in the exported docx file is underlined in red by the spell checker, which tells me that the words are not found in English.

How can I override the language?

Update

I tried specifying the language in the docx file itself, by selecting all text (Cmd+A, I'm on OS X) and clicking the language button at the bottom left.

I also tried Tools -> Language.

Neither had any effect, though.

Update

Interestingly, when exporting to HTML, the language is set correctly in the lang attribute of the <html> element.

2
I think you need to use the "--reference-docx" option, as discussed here. Create a reference docx file, and then override the language there. - Sergio Correia
I already tried this. But I'm not 100% sure where to specify the language in the docx file. I simply selected all text and clicked on the language button at the bottom left. But maybe there's a general language option for the full document? - Joshua Muheim
I have set the language through Tools -> Language in Word 365 on OS X. Didn't solve the problem. - Joshua Muheim
reference-docx can only set styles and a few properties (margins, page size, header, and footer), but language is not one of them <pandoc.org/MANUAL.html#options-affecting-specific-writers>; a workaround is to write a doc macro that does that, and post-process your file. - scoa
Agree with scoa, it seems that some post-processing is the only way for now. That said, it's an issue that has been discussed already on GitHub. It shouldn't be that hard to fix (after all, docx is just a zip with xml files inside), but of course that's easier said than done. - Sergio Correia
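
Since a docx is just a zip with xml files inside, the post-processing mentioned above could look roughly like the following. This is only a sketch under assumptions, not something pandoc provides: the function name set_docx_language is made up for illustration, it assumes the language tags live in w:lang elements (with the usual "w:" prefix) in word/styles.xml and word/document.xml, and it only rewrites tags that already exist. If the file contains no w:lang element at all, it changes nothing and returns 0.

```python
import re
import zipfile

# Matches the w:val attribute of a <w:lang .../> element.
# Assumption: the WordprocessingML namespace uses the usual "w:" prefix.
LANG_VAL = re.compile(r'(<w:lang\b[^>]*?\bw:val=")[^"]*(")')

# Parts assumed to carry language tags: style defaults and per-run overrides.
PARTS = ("word/styles.xml", "word/document.xml")


def set_docx_language(src_path, dst_path, lang="de-DE"):
    """Copy a docx, rewriting each w:lang w:val in PARTS.

    Returns the number of tags that were rewritten (hypothetical helper).
    """
    changed = 0
    with zipfile.ZipFile(src_path) as src, \
         zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename in PARTS:
                text = data.decode("utf-8")
                # Swap only the attribute value; leave everything else untouched.
                text, n = LANG_VAL.subn(
                    lambda m: m.group(1) + lang + m.group(2), text)
                changed += n
                data = text.encode("utf-8")
            # Every other archive member is copied through unchanged.
            dst.writestr(item, data)
    return changed
```

Usage would be something like set_docx_language("images.docx", "images-de.docx", "de-DE"). A regular expression is used here instead of an XML parser so that the rest of the markup, including namespace prefixes, stays byte-for-byte as it was written.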

2 Answers

4
votes

There is currently no way to set the language of a doc, docx, or odt document output by pandoc. A pandoc GitHub issue discusses this problem (noted in the comments by @Sergio Correia).

Indeed, localization in other formats goes through templates, but reference files, the doc, docx, and odt equivalent of a template, only set a few selected styles and properties. For instance, the pandoc README says of reference-docx:

The contents of the reference docx are ignored, but its stylesheets and document properties (including margins, page size, header, and footer) are used in the new docx.

2
votes

I have just checked again, and with Pandoc v2.9.2.1 the language seems to be set correctly:

english docx

german docx

Hooray!! Thanks, Pandoc community! <3

It would be interesting to know when exactly this was added, though (I couldn't find a mention in https://pandoc.org/changelog.txt).
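
To check the result without opening Word, one option is to look inside the generated file. The sketch below is an assumption-laden helper, not part of pandoc: docx_languages is a made-up name, and it assumes the language is recorded in w:lang elements (with the usual "w:" prefix) in word/styles.xml or word/document.xml. It simply counts the w:val values it finds.

```python
import re
import zipfile
from collections import Counter

# Captures the w:val attribute of each <w:lang .../> element.
# Assumption: the WordprocessingML namespace uses the usual "w:" prefix.
LANG_VAL = re.compile(r'<w:lang\b[^>]*?\bw:val="([^"]*)"')


def docx_languages(path):
    """Return a Counter of w:lang values found in the docx (hypothetical helper)."""
    counts = Counter()
    with zipfile.ZipFile(path) as docx:
        names = set(docx.namelist())
        for part in ("word/styles.xml", "word/document.xml"):
            if part in names:
                xml = docx.read(part).decode("utf-8")
                counts.update(LANG_VAL.findall(xml))
    return counts
```

Calling docx_languages("images.docx") on a file exported with lang: de and on one exported without it should make the difference visible, if the assumption about where the language is stored holds for your pandoc version.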