MongoDB: Getting counts using aggregate and group methods

An aggregation operation computes a single value from a collection of values. Counting how often a number occurs in a list of numbers is one example. MongoDB provides a rich set of aggregation operations that examine data sets and perform calculations on them.

MongoDB is an open source, document-oriented NoSQL database system written in C++. It stores data in BSON (Binary JSON) format, supports a dynamic schema and allows for dynamic queries. The MongoDB query language is expressed as JSON and differs from the SQL queries used in an RDBMS.

MongoDB 2.2 introduced a new aggregation framework modelled on the concept of data processing pipelines. Documents enter a multi-stage pipeline that transforms them into an aggregated result. This makes it easy to count the data records that satisfy certain conditions.
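The pipeline idea can be pictured with a small plain-JavaScript sketch. This is only an illustration of the concept, not how MongoDB implements it: the sample data and the names `runPipeline`, `matchInStock` and `countByItem` are hypothetical and do not come from MongoDB.

```javascript
// Hypothetical sample documents, for illustration only.
const docs = [
  { item: "pen", qty: 5 },
  { item: "ink", qty: 0 },
  { item: "pen", qty: 2 },
];

// Each stage takes an array of documents and returns a new array.
// Stage 1: keep only documents with a positive quantity (a filter stage).
const matchInStock = (input) => input.filter((d) => d.qty > 0);

// Stage 2: count the remaining documents per item (a grouping stage).
const countByItem = (input) => {
  const counts = new Map();
  for (const d of input) {
    counts.set(d.item, (counts.get(d.item) || 0) + 1);
  }
  return Array.from(counts, ([item, count]) => ({ _id: item, count }));
};

// Run the stages in order, feeding each stage's output into the next.
function runPipeline(input, stages) {
  return stages.reduce((current, stage) => stage(current), input);
}

const result = runPipeline(docs, [matchInStock, countByItem]);
console.log(result);
```

Each stage only sees the output of the previous one, which is why the order of stages matters.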

This article shows how to use MongoDB's aggregate and group methods to count sets of data records conditionally.

aggregate()

We can use the aggregate method in MongoDB to group documents by a field. A common use is getting document counts per group.

For example, consider a 'books' collection containing the following documents:

{"_id" : 1, "status" : "available", "title" : "The Time Machine", "author" : "H. G. Wells" }
{"_id" : 2, "status" : "available", "title" : "Harry Potter and the Philosopher's Stone", "author" : "J. K. Rowling"}
{"_id" : 3, "status" : "outofstock", "title" : "The Hobbit", "author" : "J. R. R. Tolkien"}
{"_id" : 4, "status" : "outofstock", "title" : "The War of the Worlds", "author" : "H. G. Wells"}

We can get the number of books by each author by running the following query.

db.books.aggregate(
  [
    {
      // Group documents by author and add 1 to the count for each document
      $group : {
        _id : "$author",
        "count" : { $sum : 1 }
      }
    }
  ]
)

This produces the following output:

{"_id" : "H. G. Wells", "count": 2}
{"_id" : "J. K. Rowling", "count": 1}
{"_id" : "J. R. R. Tolkien", "count": 1}
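To see what a `$group` stage with `$sum : 1` computes, here is a rough plain-JavaScript equivalent run over the same four documents. It is a sketch for intuition only: `countByField` is a hypothetical helper name, and the real stage runs inside the MongoDB server.

```javascript
// The four sample documents from the 'books' collection.
const books = [
  { _id: 1, status: "available", title: "The Time Machine", author: "H. G. Wells" },
  { _id: 2, status: "available", title: "Harry Potter and the Philosopher's Stone", author: "J. K. Rowling" },
  { _id: 3, status: "outofstock", title: "The Hobbit", author: "J. R. R. Tolkien" },
  { _id: 4, status: "outofstock", title: "The War of the Worlds", author: "H. G. Wells" },
];

// Mimics { $group: { _id: "$<field>", count: { $sum: 1 } } }:
// one bucket per distinct field value, incremented once per document.
function countByField(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    const key = doc[field];
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return Array.from(counts, ([key, count]) => ({ _id: key, count }));
}

const authorCounts = countByField(books, "author");
console.log(authorCounts);
```

The same helper could be called with `"status"` instead of `"author"` to count books per status.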

Further, we can use $cond on the status field to count the books in each status for every author.

db.books.aggregate(
  [
    {
      $group : {
        _id : "$author",
        // "$status" (with the $ prefix) refers to the field value;
        // a plain "status" would be compared as a literal string.
        "available" : {
          $sum : {
            $cond : { if: { $eq: ["$status", "available"] }, then: 1, else: 0 }
          }
        },
        "outofstock" : {
          $sum : {
            $cond : { if: { $eq: ["$status", "outofstock"] }, then: 1, else: 0 }
          }
        }
      }
    }
  ]
)

This produces the following output:

{"_id" : "H. G. Wells", "available": 1, "outofstock" : 1}
{"_id" : "J. K. Rowling", "available": 1, "outofstock" : 0}
{"_id" : "J. R. R. Tolkien", "available": 0, "outofstock" : 1}

group()

The same counts can also be obtained with the group method, as follows.

db.books.group(
  {
    key: { author: 1 },
    reduce: function ( curr, result ) {
      if (curr.status === 'available') {
        result.available++;
      } else if (curr.status === 'outofstock') {
        result.outofstock++;
      }
    },
    initial: { available: 0, outofstock: 0 }
  }
)

The call returns one result per author:

{"author" : "H. G. Wells", "available": 1, "outofstock" : 1}
{"author" : "J. K. Rowling", "available": 1, "outofstock" : 0}
{"author" : "J. R. R. Tolkien", "available": 0, "outofstock" : 1}

Limitations

The second query looks simpler, but it comes with some restrictions.

The MongoDB reference states:

"Because db.collection.group() uses javascript, it is subject to a number of performance limitations. For most cases the $group operator in the aggregation pipeline provides a suitable alternative with fewer restrictions."
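To make the quoted point concrete, the sketch below mimics in plain JavaScript how a group()-style call might combine `key`, `reduce` and `initial`: the user-supplied `reduce` function is invoked once for every document. `emulateGroup` is a hypothetical helper written for this illustration; it is not part of MongoDB and does not claim to reproduce the server's actual implementation.

```javascript
// The four sample documents from the 'books' collection.
const books = [
  { _id: 1, status: "available", title: "The Time Machine", author: "H. G. Wells" },
  { _id: 2, status: "available", title: "Harry Potter and the Philosopher's Stone", author: "J. K. Rowling" },
  { _id: 3, status: "outofstock", title: "The Hobbit", author: "J. R. R. Tolkien" },
  { _id: 4, status: "outofstock", title: "The War of the Worlds", author: "H. G. Wells" },
];

// Hypothetical helper that takes the same option names as the group() call above.
function emulateGroup(docs, { key, reduce, initial }) {
  const groups = new Map();
  const keyFields = Object.keys(key);
  for (const curr of docs) {
    // Build the group key from the requested fields.
    const keyDoc = {};
    for (const field of keyFields) keyDoc[field] = curr[field];
    const id = JSON.stringify(keyDoc);
    if (!groups.has(id)) {
      // Each group starts from its own shallow copy of `initial`
      // (enough here because the initial values are plain numbers).
      groups.set(id, { ...keyDoc, ...initial });
    }
    // The JavaScript reduce function runs once per document.
    reduce(curr, groups.get(id));
  }
  return [...groups.values()];
}

const grouped = emulateGroup(books, {
  key: { author: 1 },
  reduce: function (curr, result) {
    if (curr.status === "available") {
      result.available++;
    } else if (curr.status === "outofstock") {
      result.outofstock++;
    }
  },
  initial: { available: 0, outofstock: 0 },
});
console.log(grouped);
```

Because every document passes through a JavaScript callback, the work grows with the collection size and cannot be expressed as declarative operators the way a `$group` stage is.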

Conclusion

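MongoDB's aggregation framework is a powerful way to pipe data through multiple stages and compute aggregate values, whereas the group method is simple to use. Choose between these two methods depending on your requirements. Note that db.collection.group() was deprecated in MongoDB 3.4 and removed in MongoDB 4.2, so on current versions only the aggregation pipeline remains available.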

