I'm building a Rails app and I'm looking for a way to convert database entries containing HTML and inline MathJax math (TeX) to LaTeX for PDF creation.
I found questions similar to mine:
- Convert html mathjax to markdown with pandoc
- How to convert HTML with mathjax into latex using pandoc?
and I see two options here:
- Create a Haskell executable which leaves stuff like `\(y=f(x)\)` alone when converting HTML to LaTeX
- Write a Ruby method which does the following things:
  - Take the `string` and split it into an `array` with a regex (`string.split(regex)`)
  - Loop through the created `array`, check each part against `regex`, and convert the parts which do not include inline math to LaTeX with `PandocRuby.html(string).to_latex`
  - Concatenate everything back together (`array.join`)
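The Ruby steps above can be sketched roughly as follows. This is only a sketch under assumptions: the helper name `html_with_math_to_latex`, the `converter` argument and the `INLINE_MATH` pattern are hypothetical, and a stand-in lambda replaces the real `PandocRuby.html(...).to_latex` call so the snippet runs without pandoc installed.

```ruby
# Hypothetical helper; the name and signature are illustrative only.
# The pattern matches \( ... \) lazily. The capture group makes
# String#split keep the math chunks in the resulting array.
INLINE_MATH = /(\\\(.*?\\\))/m

def html_with_math_to_latex(string, converter)
  string.split(INLINE_MATH).map { |part|
    # Math chunks are already TeX, so they pass through untouched;
    # everything else is handed to the converter.
    part.match?(INLINE_MATH) ? part : converter.call(part)
  }.join
end

# Stand-in converter so the sketch runs on its own. In the app this
# would presumably be: ->(html) { PandocRuby.html(html).to_latex }
stub = ->(html) { html.gsub(/<em>(.*?)<\/em>/) { "\\emph{#{$1}}" } }

puts html_with_math_to_latex('<em>text</em> \(a+b=c\) text', stub)
```

One thing to watch for with the real converter: pandoc probably normalizes whitespace at the start and end of each chunk, so the spaces around the math may have to be restored by hand.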
I would prefer the Ruby method solution because I'm hosting my application on Heroku and I don't like to check binaries into git.
(Note: the `pandoc` binary is implemented this way: http://www.petekeen.net/introduction-to-heroku-buildpacks)
So my question is: what should the `regex` look like to split the `string` by `\(math\)`?
E.g. `string` can look like this: `text \(y=f(x) \iff \log_{10}(b)\) and \(a+b=c\) text`
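To make the question concrete, here is one candidate pattern tried against that sample string. It is a sketch, not a vetted answer: the method name `split_inline_math` is made up, and the pattern assumes that math never contains a nested `\)` and that the delimiters are always balanced.

```ruby
# Candidate pattern (an assumption): a literal \( followed by a lazy
# body up to the first literal \). The /m flag lets the body span
# newlines, and the capture group keeps the math in the split result.
INLINE_MATH = /(\\\(.*?\\\))/m

# Hypothetical helper that splits a string into text and math chunks.
def split_inline_math(string)
  string.split(INLINE_MATH)
end

sample = 'text \(y=f(x) \iff \log_{10}(b)\) and \(a+b=c\) text'

# With a single capture group, text and math alternate in the result,
# so the odd indices hold the math chunks.
split_inline_math(sample).each_with_index do |part, i|
  puts "#{i.odd? ? 'math' : 'text'}: #{part}"
end
```

The lazy `.*?` matters here: a greedy `.*` would run from the first `\(` to the last `\)` and swallow the text between the two formulas.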
And for the sake of completeness, how should the Haskell script be written to leave `\(math\)` alone when converting to LaTeX, in case the Ruby method is not a possible solution?
`string.split(/(\\\(.*?\\\))/).each_slice(2).map { |a| [PandocRuby.html(a[0]).to_latex, PandocRuby.convert(a[1].to_s, {f: "html+tex_math_single_backslash", to: :latex})] }.join` works. – Daniel