Introduction to MongoDb with .NET part 44: a summary

July 27, 2016

Introduction

In the previous post we saw how to set the read and write preferences for our MongoDb operations in code. We can set these options at various levels: at the client, at the database or at the collection level. We can also specify our preferences directly in the connection string.
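As a quick reminder, here's a minimal sketch of those levels in the .NET driver; the connection string, database and collection names are placeholders invented for this example.

```csharp
using MongoDB.Bson;
using MongoDB.Driver;

// Preferences can go directly into the connection string...
var client = new MongoClient(
    "mongodb://localhost:27017/?readPreference=secondaryPreferred&w=majority");

// ...or be attached at the database level...
var db = client.GetDatabase("storedemo")
    .WithReadPreference(ReadPreference.SecondaryPreferred);

// ...or at the collection level. Each With* call returns a new
// handle with the given preference; the original is unchanged.
var collection = db.GetCollection<BsonDocument>("customers")
    .WithWriteConcern(WriteConcern.WMajority);
```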

The previous post was also the last in this series dedicated to MongoDb in .NET. We’ve come a long way and it’s time to summarise what we have learnt.

Summary

MongoDb is a document-based database that stores its data in BSON documents. At the time of writing this series it is the most popular NoSql database out there, used by a wide range of companies as their data store. The default choice for storing data in a .NET project has most often been SQL Server. While SQL Server is probably still the most popular choice for .NET developers, they can choose from other well-tested alternatives depending on their project needs. MongoDb is very easy to set up and start working with.

It is a very flexible storage mechanism that lacks a fixed schema, i.e. there are virtually no constraints. We can store just about any JSON in any collection. The most important advantages of MongoDb are the following:

- Dynamic data structure with flexible schemas: you don't need to define columns and tables, and you can store pretty much anything within the same collection.
- Data migrations become a lot easier: if you change your domain structure the documents will store the objects correspondingly, so you can force a change in the schema simply by changing your custom objects.
- MongoDb collections can represent our records in a much more object-oriented way than relational databases. Object graphs can be stored directly in a document, so if you extract a single item from a collection you immediately get its associated objects: orders with their order items, rock bands with their concerts, making it a breeze to perform operations on those linked objects.
- Due to the lack of constraints such as foreign keys, updating and deleting items is easier, e.g. there's no need for cascading deletes.
- Scalability: MongoDb is highly scalable. We can easily create database clusters with primary and secondary nodes to ensure that our data store is always available.
- It's free: you can pay MongoDb for enhanced assistance, but installing and using MongoDb at scale doesn't cost anything.
- Speed: MongoDb is very fast and efficient at querying and inserting items in a collection.

MongoDb also comes with a number of disadvantages:

- Lack of professional tools: with SQL Server you can use SSMS for some very advanced GUI-based database operations, such as database profiling, SQL jobs, a query editor, IntelliSense and a whole lot more. There's no equivalent for MongoDb.
- MongoDb doesn't support transactions.
- The lack of a schema is also a disadvantage: you cannot associate objects through keys and you cannot enforce a compulsory data structure with rules like "NOT NULL".
- There are no stored procedures and triggers.
- The business intelligence tools of MS SQL have no counterparts in MongoDb.

Things we have gone through

Querying

MongoDb has its own query language based on JSON and JavaScript. If you are familiar with those technologies then you'll quickly pick up the syntax. We've looked at the CRUD operations as well as aggregations. Aggregations can become quite complex with a lot of stages, and keeping the syntax correct, with opening and closing square brackets in the right places, can be challenging. You'll most often work with the database through the .NET driver and not directly in the Mongo shell. The shell is great for testing and optimising your queries and indexes, but most backend operations required by your application will happen through the driver, which offers an object-oriented way of interacting with the database.
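As a refresher, here's a minimal sketch of an insert and a typed find through the .NET driver. The Customer class and all the names in it are invented for the example.

```csharp
using MongoDB.Bson;
using MongoDB.Driver;

public class Customer
{
    public ObjectId Id { get; set; }
    public string Name { get; set; }
    public string City { get; set; }
}

public static class QueryDemo
{
    public static void Run()
    {
        var client = new MongoClient("mongodb://localhost:27017");
        var customers = client.GetDatabase("storedemo")
            .GetCollection<Customer>("customers");

        // Insert a document; the driver generates the Id automatically
        customers.InsertOne(new Customer { Name = "John", City = "Birmingham" });

        // A typed filter translates to { "City" : "Birmingham" } in the shell
        var filter = Builders<Customer>.Filter.Eq(c => c.City, "Birmingham");
        var fromBirmingham = customers.Find(filter).ToList();
    }
}
```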

Aggregations with the different stages are a very neat feature in MongoDb that can get you started with complex analysis of data. Aggregations are also an entry point into Big Data analysis in MongoDb. I personally think that the way aggregations are built with stages where one stage passes a transformed document to the next makes them easier to work with than their MS SQL equivalent.
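Here's a sketch of a small two-stage pipeline with the fluent aggregation API, reusing the hypothetical Customer class from the previous example: filter out documents without a city, then group and count by city.

```csharp
using MongoDB.Bson;
using MongoDB.Driver;

var customers = new MongoClient("mongodb://localhost:27017")
    .GetDatabase("storedemo").GetCollection<Customer>("customers");

// Equivalent shell pipeline:
// [ { $match: { City: { $ne: null } } },
//   { $group: { _id: "$City", count: { $sum: 1 } } } ]
var countsByCity = customers.Aggregate()
    .Match(Builders<Customer>.Filter.Ne(c => c.City, null))
    .Group(new BsonDocument
    {
        { "_id", "$City" },                      // the group key
        { "count", new BsonDocument("$sum", 1) } // documents per city
    })
    .ToList();
```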

Serialisation

C# objects can be used to “translate” between BSON documents and POCOs. We can decorate the C# objects with Mongo-related attributes that declare how a certain property must be serialised. An example is BsonElement, where we can specify what a property is called in the document, e.g. the C# Address property is serialised as “customer_address” in its JSON equivalent. It can be argued whether a C# object with Mongo attributes is still a real POCO that can be used as a pure domain object in Domain Driven Design; MongoDb attributes break the persistence ignorance principle of DDD. I would only use those decorated C# objects in the concrete repository as a middle translation layer between the domain objects and their MongoDb collections. However, if your project is not a good fit for DDD then you can obviously go ahead and decorate your POCO objects as you need.
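A short sketch of such a decorated class; the property names and the “customer_address” element name are only illustrative.

```csharp
using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;

public class CustomerDocument
{
    [BsonId] // maps to the _id field of the document
    public ObjectId Id { get; set; }

    // Stored as "customer_address" in the BSON document
    [BsonElement("customer_address")]
    public string Address { get; set; }

    // Left out of the document entirely when the value is null
    [BsonIgnoreIfNull]
    public string Nickname { get; set; }
}
```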

Indexes

MongoDb is quick and efficient as it is, but your queries can be made even faster and more efficient by creating the necessary indexes. MongoDb offers indexes in much the same way as relational databases do: we can create indexes on individual properties, array fields, text fields, properties within sub-documents etc. The query plan is a very helpful tool when you are trying to find the optimal index mix. The most important outputs in a query plan are the name of the index used, the number of documents examined and the number of documents returned. The goal is to read as few documents in the collection as possible.
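A sketch of creating a compound index through the driver, again assuming the hypothetical Customer class from earlier; recent 2.x drivers wrap the keys in a CreateIndexModel.

```csharp
using MongoDB.Driver;

var customers = new MongoClient("mongodb://localhost:27017")
    .GetDatabase("storedemo").GetCollection<Customer>("customers");

// Compound index on City ascending, then Name ascending
var keys = Builders<Customer>.IndexKeys
    .Ascending(c => c.City)
    .Ascending(c => c.Name);

customers.Indexes.CreateOne(new CreateIndexModel<Customer>(keys));

// In the shell, db.customers.find({ ... }).explain("executionStats")
// shows the index used and the documents examined vs. returned.
```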

The write concern

When we write to the database, i.e. insert or update a document, the new or updated document is at first persisted to memory, not to disk. The records in memory are flushed to disk somewhat later in an asynchronous fashion. The lag is not dramatic: it can be a second, a little more or a little less, depending on how much data there is in memory. However, the lag exists, and if the database server dies within that window then the data not yet persisted to disk will be wiped out of memory. It won’t magically be recovered after a server restart.

By default, when we send a write or an update to the server we get an acknowledgement back saying whether the operation was successful or not. If the acknowledgement says OK, it means that the new or updated document was persisted to memory; it is not a 100% guarantee that the new data was saved on disk as well. The write concern is abbreviated by “w” and has a default value of 1, which reflects the scenario we have just described, i.e. we want an acknowledgement of the persistence to memory. To be more exact, 1 means that we want the acknowledgement from one node in the database. If we have a single node in our database then 1 is the highest value we can specify. In a database cluster, also called a replica set, we can increase this value if we want the acknowledgement from 2 or more database nodes, which will of course take more time to complete.
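A sketch of raising the write concern on a collection handle; W2 asks for acknowledgement from two members of the replica set, and all names are placeholders from the earlier examples.

```csharp
using MongoDB.Driver;

var customers = new MongoClient("mongodb://localhost:27017")
    .GetDatabase("storedemo").GetCollection<Customer>("customers");

// Returns a new handle with w=2; the original keeps the default w=1
var safer = customers.WithWriteConcern(WriteConcern.W2);
safer.InsertOne(new Customer { Name = "Jane", City = "Leeds" });

// The same preference can also live in the connection string:
// mongodb://host1,host2/?replicaSet=rs0&w=2
```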

The write parameter in inserts and updates is accompanied by another section in memory for MongoDb, called the journal. The journal is a log where MongoDb registers all insert, update and delete operations that were sent to it from the client. The journal also lives in memory at first and is flushed to disk at short, regular intervals. By setting the journal flag, abbreviated by “j”, to true in the write concern, we can ask the server to acknowledge a write only after the operation has been committed to the journal on disk, which gives a stronger durability guarantee at the cost of some extra latency.
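And a sketch of such a journaled write, where the acknowledgement waits for the on-disk journal commit; the collection handle is the hypothetical one from the earlier examples.

```csharp
using MongoDB.Driver;

var customers = new MongoClient("mongodb://localhost:27017")
    .GetDatabase("storedemo").GetCollection<Customer>("customers");

// w=1 plus j=true: acknowledge only after the journal hits the disk
var journaled = customers.WithWriteConcern(WriteConcern.W1.With(journal: true));
journaled.InsertOne(new Customer { Name = "Ann", City = "York" });
```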
