I'm developing some simulation software in Clojure that will need to process lots of vector data (basically originating as offsets into arrays of Java floats, length typically in 10-10000 range). Large numbers of these vectors will need to go through various processing steps - e.g. normalising the vectors, concatenating together two streams of vectors, calculating a moving average etc.

Rather than doing everything in an imperative style, I was hoping to create a more functional-style Clojure solution that would do the following:

  • allow any vector function to be turned into a pluggable module, e.g. (def module-a (make-module some-function))
  • allow these modules to be composed in pipelines, e.g. (def combined-module (combine-in-series module-a module-b)) would feed the output of module-a into the input of module-b
  • allow auxiliary functions to access state stored within a given module, e.g. (get-moving-average some-moving-average-module), which would need to work even if some-moving-average-module is embedded deep within a combined pipeline
  • hide any boilerplate code behind the scenes, e.g. allocating sufficiently large temporary arrays for vector calculation.

Does this sound like a sensible approach?

If so, any implementation hints or libraries that might help?

2 Answers

In a functional language, everything is dataflow. You can use functions as your module concept.

To address each of your use-cases:

  • A pluggable module is a Clojure function that takes a single argument: the state of your data vector, e.g. (def module-a some-function). To allow for easy extension by modules, I suggest using a Clojure map as your state, where one field is your array of floats.
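  • Composing modules is function composition, e.g. (def combined-module (comp module-b module-a)). Clojure's comp applies its arguments right to left, so module-a runs first and its output is fed into module-b.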
  • Auxiliary functions are accessor functions that extract state from your data, e.g. if your data is a Clojure map with a :moving-average field, then the keyword :moving-average is your accessor function. State is not stored in modules; it travels with the data.
  • Boilerplate code is hidden in the implementation of your functions, which can be declared anywhere, possibly in another file and namespace.
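
As a rough illustration of the points above, here is a minimal sketch. The names normalise, moving-average, pipeline and the :data key are hypothetical, chosen only for this example, and it uses persistent Clojure vectors rather than Java float arrays to keep it short:

```clojure
;; Hypothetical sketch: the state is a map holding the vector under :data.
;; All names here are illustrative assumptions, not part of any library.

(defn normalise
  "Module: scales :data to unit Euclidean length.
  Returns the state unchanged if the norm is zero."
  [state]
  (let [v    (:data state)
        norm (Math/sqrt (reduce + (map #(* % %) v)))]
    (if (zero? norm)
      state
      (assoc state :data (mapv #(/ % norm) v)))))

(defn moving-average
  "Returns a module that stores the moving average of :data over a
  window of n elements under the :moving-average key."
  [n]
  (fn [state]
    (assoc state :moving-average
           (mapv #(/ (reduce + %) n)
                 (partition n 1 (:data state))))))

;; comp applies right to left, so normalise runs first.
(def pipeline (comp (moving-average 2) normalise))

(def result (pipeline {:data [3.0 4.0]}))

(:data result)           ; the normalised vector
(:moving-average result) ; the keyword itself is the accessor
```

Because the moving average is written into the map rather than held inside the module, the accessor works no matter how deep in the pipeline the module sits. If you need real float arrays for performance, the same shape should still apply, with the array stored under :data and any temporary allocation done inside each module function.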