Best practices for using version control systems
Originally Posted On : 02 Aug 2015 Last Updated : 03 Aug 2016
Version control systems - Best practices so that you can sleep well
When was the last time you wanted to roll back to a particular version of your code but could not because of various issues? Have you ever pondered why such issues arise in the first place? Best practices help keep these issues at bay, but understanding the issues is more important than knowing the ways to fix them.
Best practices should be inside people’s heads, not just on documents.
The above quote is going to stay on all of my blog posts that deal with best practices. What is the point of having a practice if people don’t care to follow it?
A well-maintained version control system is like having the power of the Time Gem over your repository.
Well, at least something close to it, since you can’t go to the future but can very well control all aspects of the past.
I am used to Git, so I tend to give examples specific to it, but they can be translated to other systems as well.
So here is a list of things that you should adopt to get the most out of a VCS. They can be adopted by non-programmers as well.
- Use a distributed version control system
- Knowledge of the tools
- Agree upon a common workflow
- Don’t commit unnecessary files
- Avoid big commits, commit early and commit often
- Good commit messages, they are like history
- Keep commit history clean
- Prefer smaller repositories instead of one giant one
- Never mess with the internals of a VCS
- Follow up/be updated with the best practices of the VCS that you are using
Using a distributed VCS gives you many advantages over a centralized one. I am not going to write them all down, since that has already been dealt with elsewhere. Many open source projects, such as those at Apache, have already migrated to Git or one of the other distributed systems. Besides, migrating is not a Herculean task anymore, as many tools have come along to automate it.
Below are some links on why distributed is better. I see no disadvantage in using a distributed system over a centralized one. You can even treat a distributed version control system as a centralized one, but that would be a total abuse of its power.
Good reads on centralized vs distributed systems.
Different tools are used in different ways. I have used Git and I find the command line much easier to work with, while other developers might find IDE integration better. The important thing is that they understand their way through it and master it. I have faced situations where a developer was not comfortable with the IDE’s integration with the VCS, and practically all the files that he had been experimenting with got committed.
Workflows for version control are like coding standards for code. Agreeing upon a common one is critical to the quality of the repository, and it makes maintenance easier.
Here is an Atlassian workflow guide that describes workflows and how they are used in a typical production environment.
Another branching model explained well.
There is a long list of things that you don’t want to commit, but the following are a few commonly abused ones.
- Personal user settings - Commit these only if the preferences/settings file is shared across the team; otherwise keep them out.
- Binary files - Committing binary files is useless since you cannot see the diff/changes between one version and another.
- Compilation output/Dynamically generated files - Git has .gitignore, and other systems have similar mechanisms, for omitting folders/files from version control. Stop putting crap inside your repo; it is not a garbage bin.
- Formatting/Whitespace changes - Formatting rules should be settled when setting up the environment, not fixed up in commits. Most IDEs/code editors have these built in.
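As a rough illustration of the list above, here is a minimal Python sketch that flags files you probably do not want to commit. The pattern list and the `unwanted_files` helper are my own assumptions for illustration, and the matching uses simple shell-style globs from the standard library, which is not the same as the real .gitignore matching rules.

```python
from fnmatch import fnmatch

# Hypothetical patterns for illustration only; tailor these to your project.
UNWANTED_PATTERNS = ["*.class", "*.o", "*.pyc", "*.exe", "build/*", ".idea/*"]


def unwanted_files(paths, patterns=UNWANTED_PATTERNS):
    """Return the paths that match at least one unwanted pattern."""
    return [p for p in paths if any(fnmatch(p, pat) for pat in patterns)]


# Example: a made-up list of files about to be committed.
staged = ["src/app.py", "build/app.o", ".idea/workspace.xml", "README.md"]
print(unwanted_files(staged))
```

In practice you would let the VCS do this for you through its ignore file; the sketch only shows the idea of deciding up front what never belongs in the repo.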
A commit can be regarded as a single piece of work. There is nothing wrong with splitting a change into two commits if it makes sense to keep them separate. Commits are not just for pushing changes into the repository; they also help you retrieve or roll back work when necessary.
Commit messages are like documentation for code. They are always better to look at than gigantic diff files. They are like history: whenever something goes wrong, which happens often, this is the first place people look.
Some commit message anti-patterns are listed below.
- Issue fix
- fixed a bug
- now it works
- some changes
There’s a saying (source unknown), along the lines of “Write every commit message like the next person who reads it is an axe-wielding maniac who knows where you live”.
Do not turn people into axe murderers or motivate someone who is already an axe-murderer.
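To make the anti-patterns concrete, here is a small Python sketch that flags vague commit messages. The `is_vague` helper and the four-word minimum are assumptions of mine, not a standard rule; the list of bad messages comes straight from the examples above.

```python
# The anti-pattern messages listed above, lowercased for comparison.
VAGUE_MESSAGES = {"issue fix", "fixed a bug", "now it works", "some changes"}


def is_vague(message, min_words=4):
    """Flag a message whose first line is a known anti-pattern or very short.

    min_words is an arbitrary threshold chosen for this sketch.
    """
    lines = message.strip().splitlines()
    subject = lines[0].strip() if lines else ""
    normalized = subject.lower().rstrip(".!")
    return normalized in VAGUE_MESSAGES or len(subject.split()) < min_words


print(is_vague("fixed a bug"))
print(is_vague("Fix null check in login form validation"))
```

A check like this could run before a commit is accepted, but it is no substitute for actually describing what changed and why.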
Keeping the commit history clean goes hand in hand with good commit messages. Sometimes we end up with a lot of small commits, and it is better to prune them before sharing.
Git offers interactive rebasing to squash commits. Other systems have similar features.
Smaller repos are easier to maintain and are generally faster. One example is keeping the front-end part of a codebase in one repo and the backend in another. That was an obvious example, but you get the idea.
As the list item says, never mess with the internals of a VCS. You can experiment all you want in a personal fun project, but in a team it can lead to chaos.
Remember, there is always an established way of doing things, so use it instead of hacks and workarounds.
I have not covered all of them, and there might be certain things that are specific to the VCS that you are using.
- The free Pro Git book on the Git SCM site is pretty handy. It works as a good reference as well.
- Gitready is a web site which gives excellent tips and tricks.
- Git comprehensive guide from Udemy is an easy getting-started guide for Git.
- A free GUI tool from Atlassian for both Mercurial and Git, which is really good.
That’s all folks. Do let me know your thoughts using comments below.