One of the most difficult things when starting a new project is choosing the technology stack to use. Should you use a NoSQL database? Key-value, document-oriented or graph-based? What about using a service-oriented architecture? Microservices? What about a cache layer?

Which database to choose?
The reason such decisions are hard is that whatever we choose now will probably be hard to change in the future. And it’s very likely that we’ll have to make changes as the project grows. Why? Several reasons. First, requirements change over time: assumptions that were correct under the original set of requirements may no longer hold. An important principle of Agile development is to embrace this type of change and see it as something positive. Second, you can’t actually know whether a certain technology will fit your requirements until you start using it. “Not every problem is a nail and not every solution is a hammer.” Every technology has limitations, some of which are well known and some of which can only be learned by actually using the technology in a real environment. The final reason: people make mistakes.
Realizing that change is a natural part of software development is a very powerful thing. Once you’ve accepted that you will probably need to change things from your original design, you can design your application in such a way that making changes in one component has a minimal effect on the rest of the application. If you do this, two things happen. First, and not surprisingly, making those changes becomes easier, as you have prepared for them. Second, and more interestingly, getting started with your project becomes easier, as you have stopped worrying too much about choosing the perfect database or protocol. You can start with MySQL and later on, if needed, replace it with MongoDB or Cassandra.
So, what is the best way to minimize the effect of changing components in an application? Decoupling. We’ll see a few examples of how using decoupling can allow us to make changes to certain parts of our application without affecting the rest of the application. Some of the examples will be quite obvious, and some less so. Yet the message is the same: as long as your code is modular and parts are easy to change, you can tolerate making changes in the future.
An extra benefit of using decoupling is that it makes our code easier to understand and test.
Database Abstraction
This one is quite obvious, but strive to keep references to specific databases to a minimum. For example, if you’re using MongoDB for storing your users, you might do something like this in your controller class:
...
// countDocuments returns the number of matching documents as a long
long count = db.getCollection("users").countDocuments(and(eq("username", userName), eq("password", password)));
if (count == 1) {
    System.out.println("Valid user");
    ...
} else {
    System.out.println("Invalid user");
    ...
}
The more references to MongoDB we have spread through our code, especially in our business logic, the harder it will be to switch to a different database for storing our users. A better alternative would be to create a UserRepository interface that defines methods for retrieving users. We can start by implementing a specific MongoDBUserRepository class. If later on we decide to use a different database, we can just implement a new repository class and use it instead of the MongoDB one. The rest of our code wouldn’t need to be aware of any changes. Note that this approach has the extra benefit of making our code easier to test.
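As a rough illustration of the testing benefit, here is a minimal sketch of the interface together with an in-memory implementation. Only UserRepository and doesUserExist come from the text above; InMemoryUserRepository, LoginService, and addUser are hypothetical names introduced for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: every name except UserRepository and
// doesUserExist is an assumption made for this example.
class UserRepositorySketch {

    /** The only user-storage API the business logic gets to see. */
    interface UserRepository {
        boolean doesUserExist(String userName, String password);
    }

    /** In-memory implementation, handy for unit tests: no database needed. */
    static class InMemoryUserRepository implements UserRepository {
        private final Map<String, String> passwordsByUserName = new HashMap<>();

        void addUser(String userName, String password) {
            passwordsByUserName.put(userName, password);
        }

        @Override
        public boolean doesUserExist(String userName, String password) {
            return password != null && password.equals(passwordsByUserName.get(userName));
        }
    }

    /** Business logic depends on the interface, never on a concrete database. */
    static class LoginService {
        private final UserRepository userRepository;

        LoginService(UserRepository userRepository) {
            this.userRepository = userRepository;
        }

        String login(String userName, String password) {
            return userRepository.doesUserExist(userName, password) ? "Valid user" : "Invalid user";
        }
    }

    public static void main(String[] args) {
        InMemoryUserRepository repository = new InMemoryUserRepository();
        repository.addUser("alice", "s3cret");

        LoginService loginService = new LoginService(repository);
        System.out.println(loginService.login("alice", "s3cret"));
        System.out.println(loginService.login("alice", "wrong"));
    }
}
```

Because LoginService only receives a UserRepository through its constructor, a MongoDB-backed implementation such as the one below can be passed in without touching the login logic.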
class MongoDBUserRepository implements UserRepository {
    public boolean doesUserExist(String userName, String password) {
        // countDocuments returns the number of matching documents as a long
        return db.getCollection("users").countDocuments(and(eq("username", userName), eq("password", password))) == 1;
    }
    ...
}
...
if (userRepository.doesUserExist(userName, password)) {
    System.out.println("Valid user");
    ...
} else {
    System.out.println("Invalid user");
    ...
}
Hide Communication Protocols
Another thing that we’d like to hide from our code is the communication protocol: are we using a REST API to communicate with a remote host? Or maybe gRPC? Or WebSocket? Or maybe the computation is performed locally? Code that interacts directly with a protocol is harder to move to a different protocol, and it is also harder to test. Here is a real example that I saw not long ago of what not to do. Our client application was supposed to call a remote microservice to perform mesh repair operations. The microservice would receive a mesh as input and return a new, repaired mesh. This was the initial design:
...
const microserviceClient = new MicroserviceClient(REMOTE_IP);
const outputMesh = microserviceClient.repairMesh(inputMesh);
// do something with outputMesh
...
There are two main problems with this code. The first one is that it’s hard to unit-test: the code instantiates MicroserviceClient directly, so it’s hard to replace it with a mock. The second one is that if tomorrow we decide to stop using a microservice and start performing the mesh repair operation locally, or maybe using a third-party REST API, our code will be affected. A better approach would be to define a generic MeshRepairService interface and the corresponding classes that implement the interface. You can start with a MicroserviceMeshRepairService class. Then, if needed, you can create and use other implementations of the same interface, which would have no effect on the rest of your code.
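To make the testing argument concrete, here is a minimal sketch in Java (the language of the database example) of the interface, a fake implementation, and client code that receives the service from outside. MeshRepairService and repairMesh come from the text; the Mesh stand-in, FakeMeshRepairService, and MeshPipeline are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: Mesh, FakeMeshRepairService and MeshPipeline
// are assumptions made for this example.
class MeshRepairSketch {

    /** Hypothetical stand-in for a real mesh type: just a triangle count. */
    static final class Mesh {
        final int triangleCount;

        Mesh(int triangleCount) {
            this.triangleCount = triangleCount;
        }
    }

    /** Callers see only this interface, not the transport behind it. */
    interface MeshRepairService {
        Mesh repairMesh(Mesh mesh);
    }

    /** Test double: records its inputs and returns a canned result. */
    static class FakeMeshRepairService implements MeshRepairService {
        final List<Mesh> received = new ArrayList<>();
        private final Mesh cannedResult;

        FakeMeshRepairService(Mesh cannedResult) {
            this.cannedResult = cannedResult;
        }

        @Override
        public Mesh repairMesh(Mesh mesh) {
            received.add(mesh);
            return cannedResult;
        }
    }

    /** Client code gets the service injected instead of creating a client itself. */
    static class MeshPipeline {
        private final MeshRepairService meshRepairService;

        MeshPipeline(MeshRepairService meshRepairService) {
            this.meshRepairService = meshRepairService;
        }

        int repairedTriangleCount(Mesh inputMesh) {
            Mesh outputMesh = meshRepairService.repairMesh(inputMesh);
            return outputMesh.triangleCount;
        }
    }

    public static void main(String[] args) {
        FakeMeshRepairService fake = new FakeMeshRepairService(new Mesh(120));
        MeshPipeline pipeline = new MeshPipeline(fake);

        System.out.println(pipeline.repairedTriangleCount(new Mesh(100)));
        System.out.println("calls recorded: " + fake.received.size());
    }
}
```

The pipeline never learns whether the repair ran over the network or locally, so a microservice-backed or local implementation can be swapped in at the point where MeshPipeline is constructed.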
class MicroserviceMeshRepairService implements MeshRepairService {
    public repairMesh(mesh: Mesh): Mesh {
        ...
    }
    ...
}
...
const outputMesh = meshRepairService.repairMesh(inputMesh);
// do something with outputMesh
...
Hide Serialization Formats
This one might be less obvious than the previous ones, but it is just as important. There seems to be a tendency to put serialization logic inside the classes that are to be serialized. For example, if we have a class called User, we might do something like this:
class User {
    private String username;
    private String password;
    ...
    public byte[] serialize() {
        ...
    }
    public static User deserialize(byte[] input) {
        ...
        return new User(...);
    }
}
There are a few problems with this approach. The first one is that whenever you decide to change the serialization format, you’ll need to modify all your business logic classes. The second problem is that you can’t have more than one serialization format for the same class at the same time. For example, you might want to store your user data in your database using MessagePack, but send it as a Protocol Buffers message to a remote client. The serialization logic depends on the context in which the data will be sent or stored, so it’s better to leave it out of your business logic classes.
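One possible way to pull serialization out of the User class is a separate serializer interface with one implementation per format. The sketch below is only an illustration: UserSerializer, TextUserSerializer, and BinaryUserSerializer are hypothetical names, and the two toy formats stand in for real ones such as MessagePack or Protocol Buffers so that the example needs nothing beyond the standard library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: the serializer names and both formats are assumptions.
class UserSerializationSketch {

    /** Business class: holds data, knows nothing about wire or storage formats. */
    static final class User {
        private final String username;
        private final String password;

        User(String username, String password) {
            this.username = username;
            this.password = password;
        }

        String getUsername() { return username; }
        String getPassword() { return password; }
    }

    /** One implementation per format; callers pick the one their context needs. */
    interface UserSerializer {
        byte[] serialize(User user);
        User deserialize(byte[] input);
    }

    /** Toy text format: "username\npassword" encoded as UTF-8. */
    static class TextUserSerializer implements UserSerializer {
        @Override
        public byte[] serialize(User user) {
            String line = user.getUsername() + "\n" + user.getPassword();
            return line.getBytes(StandardCharsets.UTF_8);
        }

        @Override
        public User deserialize(byte[] input) {
            String[] parts = new String(input, StandardCharsets.UTF_8).split("\n", 2);
            return new User(parts[0], parts[1]);
        }
    }

    /** Toy binary format: two length-prefixed strings written with DataOutputStream. */
    static class BinaryUserSerializer implements UserSerializer {
        @Override
        public byte[] serialize(User user) {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeUTF(user.getUsername());
                out.writeUTF(user.getPassword());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return bytes.toByteArray();
        }

        @Override
        public User deserialize(byte[] input) {
            try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(input))) {
                return new User(in.readUTF(), in.readUTF());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    public static void main(String[] args) {
        User user = new User("alice", "s3cret");
        UserSerializer[] serializers = { new TextUserSerializer(), new BinaryUserSerializer() };
        for (UserSerializer serializer : serializers) {
            // Round-trip the same User through each format.
            User copy = serializer.deserialize(serializer.serialize(user));
            System.out.println(serializer.getClass().getSimpleName() + " -> " + copy.getUsername());
        }
    }
}
```

With this layout, the storage code and the network code can each hold their own UserSerializer, and adding a third format means adding a class rather than editing User.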
This post was inspired by a conversation I had with a former manager a few years ago, who told me that the initial choice of a database for a new project is not that critical, as long as the code is structured in a way that makes it easy to use a new database later on. So, if you’re starting a new project and feel overwhelmed by the enormous choice of databases, protocols and frameworks, just choose something, make sure your code is properly decoupled and start programming!