Getting started with the Pre-commit framework
The pre-commit stage is an important part of the CI/CD story.
What is it?
Pre-commit is a Python-based framework for managing pre-commit hooks. It is configured using a YAML file and allows you to directly use open-source hooks in your project without reinventing the wheel. You can find plugins for things like linting, formatting, testing, security scanning, etc., saving a lot of time and effort.
To keep things simple, I’ll show you how Pre-commit works on a new project with a single YAML file. Our goal is to configure Pre-commit to catch YAML syntax errors before committing changes to the repo.
Installing pre-commit on Mac
We’ll start by installing Pre-commit on a Mac:
brew install pre-commit
pre-commit --version
At the time of writing, my version is pre-commit 2.20.0. If you’re using a different OS, you can find installation instructions in the official Pre-commit documentation.
Setting up a test project
For my test project, I’ll init a new Git directory, ~/testing, and place my.yaml in it. I’ll also stage the file in Git, since Pre-commit only works with files that Git is tracking.
mkdir ~/testing
cd ~/testing
git init
echo "# My List
groceries:
  - Milk
  - Eggs
  - Bread
  - Butter
# My dictionary
contact:
  name: Ayush Sharma
  email: myemail@example.com" > my.yaml
git add my.yaml
Configuring and running pre-commit
With that out of the way, it’s time to configure Pre-commit.
Start by creating a file called .pre-commit-config.yaml with the following contents:
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v1.2.3
    hooks:
      - id: check-yaml
The Pre-commit syntax looks simple, but it’s doing a lot of heavy lifting. Pre-commit lets you load and execute hooks from remote repositories. Developers publish their Pre-commit hooks in open-source repos, and each repo contains one or more hooks. In the config file, we use the repo key to load a particular repository, the rev key to pin the Git tag or commit of that repository, and the hooks key to execute specific hooks in it. In our case, we’re loading the pre-commit-hooks repo and executing the check-yaml hook in it. The repo contains several useful hooks, which you can call by specifying their IDs under hooks.
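As a rough sketch of calling more than one hook from the same repo, the snippet below adds end-of-file-fixer next to check-yaml. It assumes that hook ID is available at the pinned rev, and it writes the file into a scratch directory (demo_dir is just a name picked for this example) so your real config isn’t touched.

```shell
# Work in a scratch directory so an existing config is not overwritten.
demo_dir="$(mktemp -d)"
cd "$demo_dir"

# One repo entry, two hook IDs listed under its hooks key.
cat > .pre-commit-config.yaml <<'EOF'
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v1.2.3
    hooks:
      - id: check-yaml
      - id: end-of-file-fixer   # assumption: this hook exists at this rev
EOF

# List the hook IDs the config would run.
grep 'id:' .pre-commit-config.yaml
```

Each additional hook is one more "- id:" entry under the same hooks key; hooks from a different repository would need their own repo entry.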
With the config file ready, execute the hook using:
pre-commit run --all-files
If everything ran fine, you should see:
[INFO] Initializing environment for https://github.com/pre-commit/pre-commit-hooks.
[INFO] Installing environment for https://github.com/pre-commit/pre-commit-hooks.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
Check Yaml...............................................................Passed
Since our my.yaml is valid, our pre-commit check passed. Make my.yaml invalid by removing the # on the first comment line (or by breaking it some other way) and run the command again.
Check Yaml...............................................................Failed
- hook id: check-yaml
- exit code: 1
mapping values are not allowed in this context
in "my.yaml", line 2, column 10
And now check-yaml produces an error since the YAML is no longer valid.
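If you’d rather break the file from the command line than in an editor, something like the following should reproduce the same kind of failure. This is only a sketch: it works on a shortened scratch copy of my.yaml (demo_dir is a name made up for this example) rather than the real file.

```shell
# Work on a scratch copy so the real my.yaml is left alone.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
printf '%s\n' '# My List' 'groceries:' '  - Milk' > my.yaml

# Strip the leading "# " from line 1 only, turning the comment into plain text.
# Writing to a temp file sidesteps the differing `sed -i` syntax on macOS and Linux.
sed '1s/^# //' my.yaml > my.yaml.tmp && mv my.yaml.tmp my.yaml

# Line 1 is no longer a comment, so the colon on line 2 becomes a syntax error.
head -n 2 my.yaml
```

In the real repo, pre-commit run check-yaml --files my.yaml should re-run just this one hook against just this one file, which is handy while you fix things.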
Running pre-commit before Git commit
Running Pre-commit from the command line is fine, but we want it to run automatically before every commit. Here’s how to put the actual Git hook in place:
pre-commit install
After running the above command, check the file .git/hooks/pre-commit in your repo and you should see a call to pre-commit executing your new config file. You can also configure Pre-commit globally for all new repos so you don’t have to manually enable it for every new repo.
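A minimal sketch of the global setup, using Git’s init.templateDir setting together with Pre-commit’s init-templatedir command. The ~/.git-template path is an arbitrary choice for this example, and the guard around the commands is only there so the snippet does nothing on a machine without the tools.

```shell
# Hypothetical location for the Git template directory; any path works.
template_dir="$HOME/.git-template"

if command -v git >/dev/null 2>&1 && command -v pre-commit >/dev/null 2>&1; then
  # Tell Git to copy this template directory into every newly created repo.
  git config --global init.templateDir "$template_dir"

  # Ask Pre-commit to place its hook script inside the template directory.
  pre-commit init-templatedir "$template_dir"
else
  echo "git and pre-commit are both required; nothing was changed"
fi
```

Repos you create or clone after this should pick up the hook automatically. Existing repos still need pre-commit install.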
Configuring different stages
Just because Pre-commit is called that doesn’t mean that’s all you can do with it :) There are many ways to configure Pre-commit to run in different stages, such as during commits, during merges, when switching branches, etc. These stages can be configured during installation or in the configuration file.
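Here is a hedged sketch of both approaches, using the push stage as the example. The stage name push is what Pre-commit 2.x uses; newer releases are assumed to prefer pre-push for the same thing, so check the docs for your version. The scratch repo (demo_dir) is made up for this example.

```shell
# Scratch repo so the example does not touch an existing project.
demo_dir="$(mktemp -d)"
cd "$demo_dir"

# In the configuration file: limit check-yaml to the push stage.
cat > .pre-commit-config.yaml <<'EOF'
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v1.2.3
    hooks:
      - id: check-yaml
        stages: [push]   # assumption: 2.x stage name; newer versions use pre-push
EOF

# During installation: install Git hook scripts for both stages.
if command -v git >/dev/null 2>&1 && command -v pre-commit >/dev/null 2>&1; then
  git init -q
  pre-commit install --hook-type pre-commit --hook-type pre-push
fi
```

The stages key only decides when a hook is allowed to run. The matching Git hook script still has to be installed with --hook-type, otherwise Git never calls Pre-commit at that stage.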
Conclusion
There are several solutions for handling linting, formatting, security scanning, etc. You can either configure your IDE to handle them or use automation like Bitbucket Pipelines, GitHub Actions, etc. But not everyone on your team will use the same IDE. And automation only kicks in after you push your code, by which time it’s too late to clean things up.
Pre-commit, by using Git’s built-in hooks feature, is more portable than the other solutions and helps enforce a level of consistency in coding standards across your team. Since it’s configurable using a YAML file, you can update your CI standards and keep track of the hooks you’re using in the repo itself. Its plugin-based nature means you can reuse supported hooks in your project and not have to start from scratch every time. I’ve already started using Pre-commit for Terraform formatting, linting, and SAST scanning (more on that in another post). It’s refreshing to catch issues during development, while I’m in the flow of things, rather than wait for my build pipeline to report them minutes or hours later. Pre-commit might be worth exploring for the time savings alone.
Happy coding :)