2
votes

I'm pretty new to Clojure and have the dataset below, which I parsed from an XML document and displayed in an Excel file:

({:Total 28, :p3percent 89.28571428571429, :p2percent 0.0, :p1percent 10.71428571428571, :APP "A", :p1 3, :p2 0, :p3 25}
 {:Total 40, :p3percent 92.5, :p2percent 0.0, :p1percent 7.5, :APP "b", :p1 3, :p2 0, :p3 37} 
 {:Total 64, :p3percent 93.75, :p2percent 0.0, :p1percent 6.25, :APP "c", :p1 4, :p2 0, :p3 60} 
 {:Total 128, :p3percent 83.59375, :p2percent 12.5, :p1percent 3.90625, :APP "d", :p1 5, :p2 16, :p3 107}
 {:Total 6, :p3percent 83.33333333333333, :p2percent 16.66666666666667, :p1percent 0.0, :APP "e", :p1 0, :p2 1, :p3 5}
 {:Total 8, :p3percent 87.5, :p2percent 12.5, :p1percent 0.0, :APP "f", :p1 0, :p2 1, :p3 7})

I want to sum up/average the values for each key and create a new entry in the dataset, with APP key "Total", so that all the summed/averaged values appear in the last row. This can easily be done in Excel, but I want to do it in Clojure first.

I know how to get the sum for a single key, e.g. (apply + (map :p1 dataset)), but how do I write a function that iterates through the dataset and adds the totals as an extra row?

Thanks

D


3 Answers

1
votes

Try the merge-with function; check the second example from the link.
I think it's what you want: it will help you create the final "total" row. The only problem is that one of your fields is not a number, so you can pass merge-with a special function that ignores strings.
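A minimal sketch of that suggestion, assuming a trimmed-down dataset. The names rows, add-numbers and total-row are made up for illustration and do not come from the linked example:

```clojure
;; Two sample rows with the same shape as the question's data (fewer keys).
(def rows
  [{:APP "A" :Total 28 :p1 3}
   {:APP "b" :Total 40 :p1 3}])

(defn add-numbers
  "Adds a and b when both are numbers; otherwise keeps the later value b."
  [a b]
  (if (and (number? a) (number? b))
    (+ a b)
    b))

;; merge-with combines values key by key; the string :APP is then overwritten.
(def total-row
  (assoc (apply merge-with add-numbers rows) :APP "Total"))
;; total-row => {:APP "Total", :Total 68, :p1 6}
```

This only sums; averaged columns such as the percentages would need a separate pass.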

1
votes

I want to sum up/average the values for each key and create a new entry in the dataset, with APP key "Total" and then show all the summed/averaged values in the last row.

I believe the previous answers have misunderstood the question. If you want to create a new entry in your dataset which contains totals for some keys and averages for others, then my answer may help.


Start by defining sum and avg in terms of a collection of numbers. You can always improve the implementation of these functions later, so keep it simple for now.

(defn sum [coll] (reduce + coll))
;;(sum [1 2 3])
;;=> 6
(defn avg [coll] (/ (sum coll) (count coll)))
;;(avg [1 2 3])
;;=> 2
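One possible later improvement, sketched here as an assumption rather than as part of the original answer: with integer inputs, / returns a Ratio whenever the division is not exact, and an empty collection causes a divide-by-zero error. The hypothetical avg-double below coerces the result to a double and returns nil for empty input:

```clojure
;; Hypothetical variant of avg; the name and the nil-on-empty choice are assumptions.
(defn avg-double
  [coll]
  (when (seq coll)
    (double (/ (reduce + coll) (count coll)))))

(avg-double [1 2])  ;=> 1.5 (plain / would give the Ratio 3/2)
(avg-double [])     ;=> nil
```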

To avoid repeating yourself, define a function for reducing your dataset.

(defn dataset-keys [d] (reduce #(into %1 (keys %2)) #{} d))

(defn reduce-dataset
  [f val dataset]
  (reduce (fn [m k] (assoc m k (f (map k dataset))))
          val
          (dataset-keys dataset)))

reduce-dataset expects val to be a map and dataset to be a collection of maps, like your dataset.

Use reduce-dataset to define totals and averages in terms of sum and avg.

(defn totals [dataset] (reduce-dataset sum {:APP "Totals"} dataset))
(defn averages [dataset] (reduce-dataset avg {:APP "Averages"} dataset))

Since you want the total of some keys in your dataset and the average of others, you'll need a way to select just those keys across the dataset.

(defn select-cols [dataset ks] (map #(select-keys % ks) dataset))

Now you have everything you need to calculate totals and averages selectively across your dataset.

(totals (select-cols your-dataset [:Total :p1 :p2 :p3]))
;;{:Total 274, :p2 18, :p3 241, :p1 15, :APP "Totals"}

(averages (select-cols your-dataset [:p1percent :p2percent :p3percent]))
;; {:p3percent 88.32713293650794, :p1percent 4.728422619047618, :p2percent 6.9444444444444455, :APP "Averages"}

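You can combine the results with your original dataset using conj. Note that conj appends to a vector but prepends to a list or seq, so if the dataset is a seq (as the parenthesized output in the question suggests), wrap it in vec to make the new rows come last.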

(conj dataset
      (totals (select-cols dataset [:Total :p1 :p2 :p3]))
      (averages (select-cols dataset [:p1percent :p2percent :p3percent])))

This adds two rows to the dataset, one for totals and one for averages. To add a single row, you can merge the results before conj'ing.

(conj dataset
      (merge (totals (select-cols dataset [:Total :p1 :p2 :p3]))
             (averages (select-cols dataset [:p1percent :p2percent :p3percent]))
             {:APP "Total/Avg"}))

In the case of conflicting keys, merge always uses the last value it sees, so in the example above the value of :APP will be "Total/Avg", not "Totals" or "Averages".
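Since the question asks for the totals in the last row, it may help to see where conj puts a new element for different collection types. This is a small sketch with made-up sample data (rows and total-row are illustrative names):

```clojure
;; A list, like the parenthesized dataset printed in the question.
(def rows '({:APP "A" :Total 28} {:APP "b" :Total 40}))
(def total-row {:APP "Total" :Total 68})

;; conj on a list or seq adds at the front.
(first (conj rows total-row))        ;=> {:APP "Total", :Total 68}

;; conj on a vector adds at the end.
(last (conj (vec rows) total-row))   ;=> {:APP "Total", :Total 68}

;; concat keeps the result a seq and also puts the row last.
(last (concat rows [total-row]))     ;=> {:APP "Total", :Total 68}
```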

0
votes

If you are just asking about producing a map that contains the totals or averages of the corresponding keys, reduce can do this nicely:

user> (pprint (reduce #(assoc %2 :Total (+ (:Total %2) (:Total %1))) 
                      {:Total 0} data))
{:p1 0,
 :p3 7,
 :p2percent 12.5,
 :p2 1,
 :p1percent 0.0,
 :APP "f",
 :p3percent 87.5,
 :Total 274}
nil

This can be wrapped up in a function that stores the sum under a new key and preserves the original key, which matters if you want to take the sum of a key and then its average as well:

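(defn sum-key [key new-key data]
  ;; Each step keeps the current row and carries the running sum under new-key.
  (reduce #(assoc %2 new-key (+ (%1 new-key) (%2 key))) {new-key 0} data))
#'user/sum-key
user> (sum-key :Total :Total-sum data)
{:p1 0, :p3 7, :p2percent 12.5, :p2 1, :p1percent 0.0, :APP "f", :p3percent 87.5, :Total-sum 274, :Total 8}

Because sum-key returns a single map rather than a dataset, its result cannot be threaded into another sum-key call with ->>. Instead, call it once per key and merge the results to summarize the keys you want (the key order in the printed map may differ):

user> (merge (sum-key :Total :Total-sum data)
             (sum-key :p1 :p1-sum data)
             (sum-key :p2 :p2-sum data))
{:p1 0, :p2-sum 18, :p1-sum 15,
 :p3 7, :p2percent 12.5,
 :p2 1, :p1percent 0.0,
 :APP "f", :p3percent 87.5,
 :Total-sum 274, :Total 8}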

If you want running totals then use reductions instead.
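A short sketch of the reductions idea, using made-up sample rows and a hypothetical :Total-sum key to hold each running total:

```clojure
(def rows
  [{:APP "A" :Total 28}
   {:APP "b" :Total 40}
   {:APP "c" :Total 64}])

;; reductions returns every intermediate result of the reduce.
(def running-totals
  (reductions + (map :Total rows)))
;; running-totals => (28 68 132)

;; Pair each row with its running total.
(def rows-with-running
  (map #(assoc %1 :Total-sum %2) rows running-totals))
;; (last rows-with-running) => {:APP "c", :Total 64, :Total-sum 132}
```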