The MongoDB aggregation pipeline works like a UNIX shell pipeline: an operation runs on some input, its output becomes the input for the next command, and so on. MongoDB's aggregation framework follows the same concept. Documents enter a multi-stage pipeline that transforms them into aggregated results.



A pipeline consists of stages. Each stage transforms the documents as they pass through the pipeline. Pipeline stages do not need to produce one output document for every input document; e.g., some stages may generate new documents or filter out documents. Pipeline stages can appear multiple times in the pipeline.
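To make the idea of chained stages concrete, here is a minimal plain-Python analogy. It is not MongoDB code: the function names and the sample documents are made up for illustration. It only shows how one stage can drop documents while another emits more documents than it receives.

```python
def keep_in_stock(docs):
    """Filtering stage: may emit fewer documents than it receives."""
    return (d for d in docs if d["qty"] > 0)


def expand_tags(docs):
    """Expanding stage: emits one document per element of the 'tags' array."""
    for d in docs:
        for tag in d["tags"]:
            yield {**d, "tags": tag}


def run_stages(docs, stages):
    """Feed the output of each stage into the next one, like a shell pipe."""
    for stage in stages:
        docs = stage(docs)
    return list(docs)


# Hypothetical input documents
docs = [
    {"item": "pen", "qty": 5, "tags": ["office", "school"]},
    {"item": "ink", "qty": 0, "tags": ["office"]},
]

result = run_stages(docs, [keep_in_stock, expand_tags])
for doc in result:
    print(doc)
```

Two documents go in, the first stage drops one, and the second stage turns the remaining one into two, so the number of documents changes at every step.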

MongoDB provides the db.collection.aggregate() method in the mongo shell and the aggregate command for running an aggregation pipeline.
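As a rough sketch, the same call from Python could look like the code below. It assumes the pymongo driver, whose Collection objects also expose an aggregate() method, and a hypothetical collection whose documents carry a "tags" array. The pipeline is just a list of stage documents, executed in list order.

```python
def build_tag_count_pipeline():
    """Return the stages as an ordinary list; list order is execution order."""
    return [
        {"$unwind": "$tags"},                                # one document per tag
        {"$group": {"_id": "$tags", "count": {"$sum": 1}}},  # count documents per tag
    ]


def count_tags(collection):
    """Run the pipeline and collect the results into a list.

    `collection` is assumed to be a pymongo Collection, for example
    MongoClient()["blog"]["articles"] (hypothetical database and collection names).
    """
    cursor = collection.aggregate(build_tag_count_pipeline())
    return list(cursor)
```

In the mongo shell the equivalent would be db.articles.aggregate([...]) with the same two stage documents.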

Some of the most commonly used stages are:

$group − Groups input documents by a specified identifier expression and applies the accumulator expression(s), if specified, to each group. Consumes all input documents and outputs one document per distinct group. The output documents contain only the identifier field and, if specified, the accumulated fields.

$unwind − Deconstructs an array field from the input documents to output a document for each element. Each output document replaces the array with an element value. This stage therefore increases the number of documents passed to the next stage.

$project − Reshapes each document by selecting specific fields, adding new fields, or removing existing fields. Outputs one document for each input document.

$lookup − Performs a left outer join to another collection in the same database to filter in documents from the “joined” collection for processing.
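The four stages above can be combined into one pipeline. The sketch below is only an illustration: the "orders" and "customers" collections and all field names are assumptions, not something MongoDB prescribes. Each order is assumed to hold a customer_id and an "items" array of sub-documents with a price.

```python
# Hypothetical pipeline meant to run against an "orders" collection.
order_totals_pipeline = [
    # $unwind: one output document per element of the "items" array
    {"$unwind": "$items"},
    # $group: one output document per customer, with accumulated fields
    {"$group": {
        "_id": "$customer_id",
        "total": {"$sum": "$items.price"},  # sum of item prices
        "lines": {"$sum": 1},               # number of item lines
    }},
    # $lookup: left outer join; matching customers land in the "customer" array
    {"$lookup": {
        "from": "customers",
        "localField": "_id",
        "foreignField": "_id",
        "as": "customer",
    }},
    # $project: keep, rename, and drop fields; one output per input document
    {"$project": {
        "_id": 0,
        "customer_id": "$_id",
        "total": 1,
        "customer.name": 1,
    }},
]

# Print the stage order to show how documents flow through the pipeline.
for stage in order_totals_pipeline:
    print(next(iter(stage)))
```

The $lookup comes after the $group here on purpose: $group keeps only the identifier and accumulated fields, so anything joined before it would be discarded.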

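Because aggregate returns a cursor, there is no limit on the total size of the result set, although each individual result document is still subject to the 16 MB BSON document size limit. Each pipeline stage is limited to 100 MB of RAM. To process larger data sets, set the allowDiskUse option so that stages can write temporary data to disk.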
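As a small sketch of how the option is passed, again assuming the pymongo driver (whose aggregate() accepts allowDiskUse as a keyword argument); the helper name is made up for this example:

```python
def aggregate_large(collection, pipeline):
    """Run a pipeline whose stages may need more memory than the per-stage limit.

    `collection` is assumed to be a pymongo Collection. allowDiskUse=True
    lets stages write temporary data to disk instead of failing.
    """
    return collection.aggregate(pipeline, allowDiskUse=True)
```

The returned cursor can be consumed with an ordinary for loop, so results are fetched in batches rather than loaded into memory all at once. In the mongo shell the option goes in the second argument: db.collection.aggregate(pipeline, { allowDiskUse: true }).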

Hope you liked this blog. Thanks.