I have 500 directories, each containing 1000 files (each about 3-4k lines). I want to run the same Clojure program (already written) on each of these files. I have 4 octa-core servers. What is a good way to distribute the processing across these cores? Cascalog (Hadoop + Clojure)?
Basically, the program reads a file, uses a 3rd-party Java jar to do the computation, and inserts the results into a DB.
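For context, this is the rough shape of the per-file step (a minimal sketch; `third-party-compute` and `insert-results!` are placeholders for the actual jar call and DB insert in my program):

```clojure
(ns worker.core
  (:require [clojure.java.io :as io]))

;; Stand-ins for the real jar entry point and DB insert, defined elsewhere.
(declare third-party-compute insert-results!)

(defn process-file [path]
  (let [text   (slurp (io/file path))        ; files are only 3-4k lines, so slurping is fine
        result (third-party-compute text)]   ; computation happens inside the 3rd-party Java jar
    (insert-results! result)))               ; one DB insert per file, no reads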
Note that:

1. Being able to use 3rd-party libraries/jars is mandatory.
2. There is no querying of any sort.
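For reference, the naive single-server baseline I would compare any Hadoop/Cascalog setup against is just a fixed thread pool sized to the 8 cores, fed one file at a time (again only a sketch; `process-file` is the placeholder per-file step above):

```clojure
(ns worker.runner
  (:require [clojure.java.io :as io])
  (:import (java.util.concurrent Executors TimeUnit)))

(declare process-file)   ; from the sketch above

(defn run-all [root-dir]
  (let [pool  (Executors/newFixedThreadPool 8)                  ; one worker per core
        files (filter #(.isFile ^java.io.File %)
                      (file-seq (io/file root-dir)))]
    (doseq [f files]
      (.submit pool ^Runnable (fn [] (process-file f))))        ; queue each file
    (.shutdown pool)
    (.awaitTermination pool Long/MAX_VALUE TimeUnit/SECONDS)))  ; block until all files are done
```

The open question is whether something like this per server (with the 500 directories split 4 ways by hand) is good enough, or whether a framework like Cascalog is worth it here.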