I create the database with the following bash script:

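#!/bin/bash
# Create the database, then POST each ';'-separated JSON document from the resource file.
curl -X PUT http://127.0.0.1:5984/sales
IFS=';'
vals=$(cat sales_upload.json)
for i in $vals
do
    # Quote "$i" so each document is passed to curl as a single argument.
    curl -X POST http://127.0.0.1:5984/sales -H "Content-Type: application/json" -d "$i"
done
unset IFS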

and this resource file (sales_upload.json):

{
    "Type" : "customer",
    "LastName" : "Welsh", 
    "FirstName" : "Jim",
    "Address" : "340 West 50th Street, New York, NY",
    "TotalSpent" : 734.34
};
{
    "Type" : "customer",
    "LastName" : "Zuch", 
    "FirstName" : "Bo",
    "Address" : "116 10th Avenue, New York, NY",
    "TotalSpent" : 1102.47
};
{
    "Type" : "customer",
    "LastName" : "Libby", 
    "FirstName" : "Joe",
    "Address" : "611 Fifth Avenue, New York, NY",
    "TotalSpent" : 290.01
};
{
    "Type" : "customer",
    "LastName" : "Grant", 
    "FirstName" : "Sue",
    "Address" : "7 West 55th Street, Manhattan, NY",
    "TotalSpent" : 430.83
};
{
    "Type" : "salesman",
    "LastName" : "Green", 
    "FirstName" : "Gwen",
    "Level" : 1
};
{
    "_id" : "_design/logic",
    "language" : "javascript",
    "views" :
    {
        "customers": {
            "map" : "function(doc) { if (doc.Type == 'customer')  emit(null, {LastName: doc.LastName, FirstName: doc.FirstName, Address: doc.Address}) }"
        },
        "total_purchases": {
            "map" : "function(doc) { if (doc.Type == 'customer')  emit(null, doc.TotalSpent) }",
            "reduce" : "function(keys, values) { return sum(values) }"
        }
    }
}

When I call curl -X GET http://127.0.0.1:5984/sales/_design/logic/_view/total_purchases

I get:

{"rows":[ {"key":null,"value":2557.65} ]}

But if I change the first argument of emit in total_purchases, so that the map function calls emit(doc.LastName, doc.TotalSpent), I get:

{"rows":[ {"key":null,"value":2557.6499999999996} ]}

Why does this happen?

1 Answer

Your two results differ because you changed the view's map function. The first argument to emit is the key, and it determines how the view index is built. In the first case, every emitted value is stored under the null key. In the second, the values are spread across different keys, i.e. the customers' last names.

As a result, the internal B-tree that CouchDB builds differs between the two views. So why does that change the sum?

CouchDB uses incremental map/reduce. You can read about that here: http://damienkatz.net/2008/02/incremental_map.html

In that post, Damien makes this point:

To make incremental Map/Reduce possible, the Reduce function has the requirement that not only must it be referentially transparent, but it must also be commutative and associative for the array value input, to be able reduce on it's own output and get the same answer, like this:

f(Key, Values) == f(Key, [ f(Key, Values) ] )

This requirement of reduce functions allows CouchDB to store off intermediated reductions directly into inner nodes of btree indexes, and the view index updates and retrievals will have logarithmic cost. It also allows the indexes to be spread across machines and reduced at query time with logarithmic cost.

The incremental design makes it possible to use map/reduce to query huge partitioned clusters in realtime, instead of having to wait for a whole map/reduce job to complete or having stale, occasionally updated indexes,. The downside is it may be harder to write the Reduce function in an associative and commutative manner.
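The requirement quoted above can be checked on a small example. This is only a sketch: sumReduce is a hypothetical reduce function written in the CouchDB style (keys, values, rereduce), and the integer values are illustrative, chosen so that the comparison is exact.

```javascript
// Hypothetical reduce function in the CouchDB style: (keys, values, rereduce).
// For a plain sum, the reduce and rereduce steps are the same operation.
function sumReduce(keys, values, rereduce) {
  let total = 0;
  for (const v of values) total += v;
  return total;
}

// Illustrative integer values (not from the question) so equality is exact.
const values = [3, 5, 8, 13, 21];

// Reduce everything in one pass.
const direct = sumReduce(null, values, false);

// Reduce two chunks separately, then rereduce the partial results,
// roughly as an inner B-tree node would.
const partials = [
  sumReduce(null, values.slice(0, 2), false),
  sumReduce(null, values.slice(2), false),
];
const viaRereduce = sumReduce(null, partials, true);

// f(Key, Values) == f(Key, [ f(Key, Values) ])
const onOwnOutput = sumReduce(null, [direct], true);

console.log(direct, viaRereduce, onOwnOutput); // 50 50 50
```

With integers all three results agree exactly. With floats, the same identity holds only up to rounding.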

So I assume that in the first view, since all the values are under the same key, no intermediate reductions are stored, while in the second view, temporary sums are stored. You are then probably seeing a difference in how the floating-point numbers are rounded in these intermediate sums: floating-point addition is not exactly associative, so adding the same values in a different order or grouping can change the last digits. See here: Is floating point math broken?
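To see how sensitive the total is to summation order, here is a rough sketch that adds the four TotalSpent values from the question in every possible order. The helper names sumInOrder and permutations are made up for illustration, and the plain left-to-right loop is only an assumption about how a JavaScript reduce adds its values. It says nothing about the order CouchDB actually uses.

```javascript
// TotalSpent values from the question's customer documents.
const amounts = [734.34, 1102.47, 290.01, 430.83];

// Left-to-right sum (assumed to mirror how a JavaScript reduce adds values).
function sumInOrder(values) {
  let total = 0;
  for (const v of values) total += v;
  return total;
}

// Generate every ordering of a small array.
function permutations(items) {
  if (items.length <= 1) return [items.slice()];
  const result = [];
  items.forEach((item, i) => {
    const rest = items.slice(0, i).concat(items.slice(i + 1));
    for (const p of permutations(rest)) result.push([item, ...p]);
  });
  return result;
}

// Collect the distinct totals produced by the 24 possible orderings.
const distinctTotals = new Set(permutations(amounts).map(sumInOrder));
console.log([...distinctTotals]);
```

If more than one total is printed, the values differ only in the last digits, which is the kind of discrepancy shown in the question.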

Two recommendations may help. The first is to use the built-in reduce function, which is implemented in Erlang. See here:

http://wiki.apache.org/couchdb/Built-In_Reduce_Functions

The declaration is slightly different: "reduce": "_sum"
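As a sketch of what that could look like, here is the question's design document written as a JavaScript object, with the JavaScript reduce replaced by "_sum". The map functions are copied from the question (using the doc.LastName key variant for total_purchases). The variable name designDoc is only for illustration.

```javascript
// The question's design document, with the reduce of total_purchases
// replaced by the built-in "_sum".
const designDoc = {
  _id: "_design/logic",
  language: "javascript",
  views: {
    customers: {
      map: "function(doc) { if (doc.Type == 'customer')  emit(null, {LastName: doc.LastName, FirstName: doc.FirstName, Address: doc.Address}) }",
    },
    total_purchases: {
      map: "function(doc) { if (doc.Type == 'customer')  emit(doc.LastName, doc.TotalSpent) }",
      reduce: "_sum",
    },
  },
};

// Print the JSON body that would replace the last entry of the resource file.
console.log(JSON.stringify(designDoc, null, 4));
```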

Second, you could emit the amount as an integer (for example, cents instead of dollars), as described here: Is floating point math broken?
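A minimal sketch of the integer approach, assuming the amounts always have at most two decimal places. The values are the TotalSpent fields from the question. Converting with Math.round(a * 100) is one possible way to do it, which a map function could apply before calling emit.

```javascript
// TotalSpent values from the question's customer documents.
const amounts = [734.34, 1102.47, 290.01, 430.83];

// Convert dollars to whole cents (assumes at most two decimal places).
const cents = amounts.map((a) => Math.round(a * 100));

// Left-to-right sum, as a simple stand-in for a reduce function.
function sumInOrder(values) {
  let total = 0;
  for (const v of values) total += v;
  return total;
}

// Integer sums are exact, so the order of addition no longer matters.
const forward = sumInOrder(cents);
const backward = sumInOrder(cents.slice().reverse());

// Convert back to dollars only once, when displaying the result.
console.log(forward, backward, forward / 100); // 255765 255765 2557.65
```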

Hope this helps.