
I know there are a lot of questions out there with this headline, but I can't glean my answer from them, so here goes.

I'm an experienced programmer, but fairly new to Clojure. I'm trying to parse an RTF file by converting it to an HTML file and then calling the HTML parser.

The converter I'm using (unrtf) always prints to stdout, so I need to capture the output and write the file myself.

  (defn parse-rtf
    "Use unrtf to parse a rtf file"
    [#^java.io.InputStream istream charset]
    (let [rtffile (File/createTempFile "parse" ".rtf" (File. "/vault/tmp/"))
          htmlfile (File/createTempFile "parse" ".ohtml" (File. "/vault/tmp/"))
          command (str "/usr/bin/unrtf "
                        (.getPath rtffile)
                  )
         ]
      (try
        (with-open [rtfout  (FileOutputStream. rtffile)]
          (IOUtils/copy istream rtfout))
        (let [ proc (.exec (Runtime/getRuntime) command)
               ostream (.getInputStream proc)
               result (.waitFor proc)]
          (if (> result 0)
            (
              (println "unrtf failed" command result)
              ; throwing an exception causes a parse failure to be logged
              (throw (Exception. (str "RTF to HTML conversion failed")))
            )
            (
              (with-open [htmlout  (FileOutputStream. htmlfile)]
                (IOUtils/copy ostream htmlout))
              ; since we now have html, run it through the html parser
              (parse-html (FileInputStream. htmlfile) charset)
           )
          )
        )
      (finally
        (.delete rtffile)
        (.delete htmlfile)
      )
      )))

The exception points to the line with

(IOUtils/copy ostream htmlout))

which really confuses me, since I used that form earlier (just after the try:) and it seems to be OK there. I can't see the difference.

Thanks for any help you can give.

2
You're using parentheses incorrectly. In particular, your if form is wrong. Parentheses are not curly braces. When you have an extra pair, you're invoking the inner expression as a function. You want (if condition something something-else) and you have (if condition (something) (something-else)). - Diego Basch

2 Answers

4
votes

As others have correctly pointed out, you can't just add extra parentheses to group forms together for code organization. Parentheses in a Clojure file delimit a list, and a list is evaluated as a call: the first form is evaluated and the result is invoked as a function, with the remaining forms as its arguments (unless the first form names a special form or macro such as if or let).
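A minimal sketch of that evaluation rule, independent of the code in the question (the names sum and outcome are made up for illustration). It shows that a second pair of parentheses around an expression tries to call the expression's value:

```clojure
;; A list is a call: the first form is evaluated and its value is invoked.
(def sum (+ 1 2))   ; + names a function, so this is an ordinary call

(def outcome
  (try
    ;; The inner (+ 1 2) yields 3; the outer parentheses then try to call 3
    ;; as a function, which fails because a number is not a function.
    ((+ 1 2))
    (catch ClassCastException _
      :not-a-function)))
```

Here sum is 3, while outcome ends up as :not-a-function because the extra pair of parentheses caused the cast to a function to fail.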

In this case you have the following:

(
  (with-open [htmlout  (FileOutputStream. htmlfile)]
    (IOUtils/copy ostream htmlout))
  ; since we now have html, run it through the html parser
  (parse-html (FileInputStream. htmlfile) charset)
)

The IOUtils/copy function has an integer return value (the number of bytes copied). This value is then returned when the surrounding with-open macro is evaluated. Since the with-open form is the first in a list, Clojure will then try to invoke the integer return value from IOUtils/copy as a function, resulting in the exception that you see.
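To illustrate that with-open evaluates to the value of its last body form, here is a small sketch that uses java.io.StringReader as a stand-in for the streams in the question (the name first-char is an assumption for illustration only):

```clojure
(import '(java.io StringReader))

;; with-open closes the reader and returns whatever its last body form returns.
(def first-char
  (with-open [r (StringReader. "abc")]
    (.read r)))   ; .read returns the first character as an integer code
```

first-char is now the integer 97 (the code for "a"). Placing a with-open form like this at the head of a list would therefore try to call that integer, just as the answer describes for the byte count returned by IOUtils/copy.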

To evaluate multiple forms for side-effects without invoking the result from the first one, wrap them in a do form; this is a special form that evaluates each expression and returns the result of the final expression, discarding the result from all others. Many core macros and special forms such as let, when, and with-open (among many others) accept multiple expressions and evaluate them in an implicit do.
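As a sketch of how the if in the question could be restructured with do, the following uses a hypothetical helper, check-result, whose argument stands in for the exit code of the unrtf process. The println messages and the :parsed return value are placeholders for the real file copying and the call to parse-html:

```clojure
(defn check-result
  "Hypothetical helper mirroring the if in parse-rtf.
  `result` plays the role of the process exit code."
  [result]
  (if (> result 0)
    ;; do groups the two forms into a single 'then' branch.
    (do
      (println "unrtf failed" result)
      (throw (Exception. "RTF to HTML conversion failed")))
    ;; do evaluates each form in order and returns the value of the last one.
    (do
      (println "conversion succeeded")
      :parsed)))
```

Calling (check-result 0) prints the success message and returns :parsed, while (check-result 1) prints the failure message and throws the exception. Neither branch tries to invoke the return value of its first form.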

2
votes

I didn't try to run your code, I just had a look at it. After the if (> result 0) you have ((println ...)(throw ...)) without a do. The extra pair of parens causes the value returned by the inner form to be treated as a function and invoked.

Try including a do, like this: (do (println ...) (throw ...))