I'm trying to read a large text file and count occurrences of specific errors. For example, given the following sample text:
something
bla
error123
foo
test
error123
line
junk
error55
more
stuff
I want to end up with the following (I don't really care which data structure is used, although I am thinking of a map):
error123 - 2
error55 - 1
Here is what I have tried so far:
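(require '[clojure.java.io :as io])

;; Returns the line if it contains "error", otherwise nil.
(defn find-error [line]
  (if (re-find #"error" line)
    line))

;; Applies func to every line of the file; doall forces the lazy seq
;; to be realized before with-open closes the reader.
(defn read-big-file [func filename]
  (with-open [rdr (io/reader filename)]
    (doall (map func (line-seq rdr)))))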
Calling it like this:
(read-big-file find-error "sample.txt")
returns:
(nil nil "error123" nil nil "error123" nil nil "error55" nil nil)
Next, I tried to remove the nil values and group like items:
(group-by identity (remove #(= nil %) (read-big-file find-error "sample.txt")))
which returns:
{"error123" ["error123" "error123"], "error55" ["error55"]}
This is getting close to the desired output, although it may not be efficient, since `doall` holds a result for every line of the file in memory. How can I get the counts now? Also, as someone new to Clojure and functional programming, I would appreciate any suggestions on how I might improve this. Thanks!
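One direction that might work, offered as a sketch under assumptions rather than a verified solution: `frequencies` from `clojure.core` returns a map from each distinct item to the number of times it appears, so filtering the matching lines and passing them to it could produce the counts directly. The names `count-errors` and `counts-from-groups` are hypothetical, and the sketch assumes the whole matching line is the key, as in the sample above.

```clojure
(require '[clojure.java.io :as io])

;; Hypothetical helper: counts lines containing "error" in the given file.
;; frequencies consumes the sequence eagerly, so the lines are read
;; while the reader is still open and no doall is needed.
(defn count-errors [filename]
  (with-open [rdr (io/reader filename)]
    (->> (line-seq rdr)
         (filter #(re-find #"error" %))
         frequencies)))

;; Hypothetical helper: turns a group-by result such as
;; {"error123" ["error123" "error123"]} into a map of counts.
(defn counts-from-groups [grouped]
  (into {} (map (fn [[k v]] [k (count v)])) grouped))
```

With the sample text saved as "sample.txt", `(count-errors "sample.txt")` should return a map equal to `{"error123" 2, "error55" 1}`. Because only the matching lines reach `frequencies`, this version avoids building the intermediate list of nils.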