1
votes

I'm using Kettle v5.2, which supports the MongoDB aggregation pipeline in the MongoDB Input step. The query works for a small data set, but I need to add the allowDiskUse option to it, and I can't figure out how to do this in Pentaho. I tested the option in the mongo shell and it works as expected.

http://docs.mongodb.org/manual/reference/method/db.collection.aggregate/

http://wiki.pentaho.com/display/EAI/MongoDB+Input#MongoDBInput-queryaggpipeline

This works:

[ {$unwind: "$friends"}, {$group : { '_id' : '$friends.id', name: {'$first': '$friends.name'} ,count: {$sum:1} } } ,{$sort: {count: -1}}, {$limit: 100} ]

This doesn't:

[ {$unwind: "$friends"}, {$group : { '_id' : '$friends.id', name: {'$first': '$friends.name'} ,count: {$sum:1} } } ,{$sort: {count: -1}}, {$limit: 100} ] , {allowDiskUse: true}

3 Answers

1
votes

If you look at the class that parses the pipeline and follow the calls up, you can see that Pentaho uses the MongoDB Java driver's DBCollection class with a deprecated aggregate function instead of this one:

public Cursor aggregate(List<DBObject> pipeline,
                        AggregationOptions options)

So unfortunately options such as allowDiskUse are not available in the Pentaho MongoDB Input step.
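
For comparison, here is roughly what the working mongo shell call looks like. It is only a sketch: `users` is a hypothetical collection name, not one from the question. The point is that allowDiskUse is a separate second argument to aggregate(), not part of the pipeline array, so appending it after the closing bracket in the step's query box does not produce a valid pipeline.

```javascript
// The pipeline from the question: an array of stages and nothing else.
var pipeline = [
  { $unwind: "$friends" },
  { $group: { _id: "$friends.id", name: { $first: "$friends.name" }, count: { $sum: 1 } } },
  { $sort: { count: -1 } },
  { $limit: 100 }
];

// Options are a separate document, passed as the second argument.
var options = { allowDiskUse: true };

// `db` only exists inside the mongo shell; `users` is a hypothetical collection name.
if (typeof db !== "undefined") {
  db.users.aggregate(pipeline, options).forEach(printjson);
}
```

The two-argument Java overload quoted above has the same shape (pipeline first, options second), which is why a step that only calls the single-argument form has nowhere to put the option.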

0
votes

Have you tried checking the "Query is aggregation pipeline" box in the Query tab on the MongoDB Input step?

0
votes

Check the option "Query is aggregation pipeline".
