Ayush Sharma ☕ + 🎧 + 🕹️ - Getting started with Pre-commit framework

The pre-commit stage is an important part of the CI/CD story.

What is it?

Pre-commit is a Python-based framework for managing pre-commit hooks. It is configured using a YAML file and allows you to directly use open-source hooks in your project without reinventing the wheel. You can find plugins for things like linting, formatting, testing, security scanning, etc., saving a lot of time and effort.

To keep things simple, I’ll show you how Pre-commit works on a new project with a single YAML file. Our goal is to configure Pre-commit to catch YAML syntax errors before committing changes to the repo.

Installing pre-commit on Mac

We’ll start by installing Pre-commit on a Mac:

brew install pre-commit
pre-commit --version

At the time of writing, my version is pre-commit 2.20.0. If you’re using a different OS, you can find installation instructions in the official Pre-commit documentation.

Setting up a test project

For my test project, I’ll initialize a new Git repository in ~/testing and place my.yaml in it. I’ll also stage the file, since Pre-commit only works with files that Git is tracking.

mkdir ~/testing
cd ~/testing
git init

echo "# My List
groceries:
  - Milk
  - Eggs
  - Bread
  - Butter

# My dictionary
contact:
  name: Ayush Sharma
  email: myemail@example.com" > my.yaml

git add my.yaml

Configuring and running pre-commit

With that out of the way, it’s time to configure Pre-commit.

Start by creating a file called .pre-commit-config.yaml with the following contents:

repos:
-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v1.2.3
    hooks:
    -   id: check-yaml

The Pre-commit syntax looks simple, but it’s doing a lot of heavy lifting. Pre-commit lets you load and execute hooks from remote repositories. Developers publish their Pre-commit hooks in open-source repos, and each repo contains one or more hooks. In the config file, we use the repo key to load a particular repository, the rev key to pin it to a specific tag or commit, and the hooks key to execute specific hooks from that repository. In our case, we’re loading the pre-commit-hooks repo and executing the check-yaml hook in it. The repo contains several other useful hooks, which you can call by specifying their IDs under hooks.
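To make the repo/hooks relationship concrete, here’s a minimal sketch of a config that calls three hooks from the same repository. The two extra IDs (end-of-file-fixer and trailing-whitespace) are hooks published in pre-commit-hooks, but picking them is my own choice for illustration. The scratch directory is also an assumption, used so that an existing config isn’t overwritten.

```shell
# Work in a scratch directory so an existing config is not overwritten.
demo_dir="$(mktemp -d)"
cd "$demo_dir" || exit 1

# One repo entry can list several hooks; each id names a hook published by that repo.
cat > .pre-commit-config.yaml <<'EOF'
repos:
-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v1.2.3
    hooks:
    -   id: check-yaml
    -   id: end-of-file-fixer
    -   id: trailing-whitespace
EOF

# Show the file that was written.
cat .pre-commit-config.yaml
```

Each hook is run in the order listed, so adding a check is usually a one-line change.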

With the config file ready, execute the hook using:

pre-commit run --all-files

If everything ran fine, you should see:

[INFO] Initializing environment for https://github.com/pre-commit/pre-commit-hooks.
[INFO] Installing environment for https://github.com/pre-commit/pre-commit-hooks.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
Check Yaml...............................................................Passed

Since our my.yaml is valid, the check passed. Now make my.yaml invalid by removing the # on the first comment line (or by breaking it some other way) and run the command again.

Check Yaml...............................................................Failed
- hook id: check-yaml
- exit code: 1

mapping values are not allowed in this context
  in "my.yaml", line 2, column 10

And now check-yaml produces an error since the YAML is no longer valid.
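If you’d rather break the file from the shell than in an editor, here’s a minimal sketch. It works in a scratch directory with a shortened copy of my.yaml (both are my assumptions for illustration, not part of the setup above) and writes through a temp file instead of using sed -i, since that flag behaves differently on macOS and Linux.

```shell
# Scratch directory with a shortened copy of the article's my.yaml.
demo_dir="$(mktemp -d)"
cd "$demo_dir" || exit 1
printf '%s\n' '# My List' 'groceries:' '  - Milk' '  - Eggs' > my.yaml

# Strip the leading "# " from line 1 only. "My List" then becomes bare text
# sitting in front of the "groceries:" key, which is not valid YAML.
sed '1s/^# //' my.yaml > my.yaml.tmp && mv my.yaml.tmp my.yaml

# Confirm the comment marker is gone.
head -n 1 my.yaml
```

Running pre-commit run --all-files against a file broken this way should reproduce the failure shown above.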

Running pre-commit before Git commit

Running Pre-commit from the command line is fine, but we want it to run automatically before each commit. Here’s how to put the actual Git hook in place:

pre-commit install

After running the above command, check the file .git/hooks/pre-commit in your repo and you should see a call to pre-commit executing your new config file. You can also configure Pre-commit globally for all new repos so you don’t have to manually enable it every time.
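One way to set up the global option is through Git’s template directory, using the init-templatedir subcommand that pre-commit provides. This is a sketch rather than the only approach: the ~/.git-template path is a hypothetical location I picked, and the commands change your global Git config, so adjust them before running.

```shell
# Hypothetical location for the template directory; any path works.
template_dir="$HOME/.git-template"

# Tell Git to copy this directory's contents into every repo
# created by `git init` or `git clone` from now on.
git config --global init.templateDir "$template_dir"

# Place pre-commit's hook script in the template directory
# (skipped with a message if pre-commit is not installed).
if command -v pre-commit >/dev/null 2>&1; then
  pre-commit init-templatedir "$template_dir"
else
  echo "pre-commit not found; install it first" >&2
fi
```

Repos that already exist are not affected, so they still need pre-commit install.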

Configuring different stages

Just because Pre-commit is called that doesn’t mean that’s all you can do with it :) Pre-commit can also run at other stages, such as during merges, before pushes, or when switching branches. You choose which Git hooks to set up during installation, and which stages each hook runs in through the configuration file.
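As a minimal sketch of both halves, the snippet below limits check-yaml to the push stage in the config and then installs the matching pre-push Git hook. The scratch repo is my assumption for illustration, and the stage name push matches the 2.20.0 release used in this post.

```shell
# Scratch repo so nothing in an existing project is touched.
demo_dir="$(mktemp -d)"
cd "$demo_dir" || exit 1
git init --quiet

# Config side: restrict the hook to the push stage.
cat > .pre-commit-config.yaml <<'EOF'
repos:
-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v1.2.3
    hooks:
    -   id: check-yaml
        stages: [push]
EOF

# Install side: add the pre-push Git hook
# (skipped if pre-commit is not installed).
if command -v pre-commit >/dev/null 2>&1; then
  pre-commit install --hook-type pre-push
fi
```

As far as I know, newer releases (3.2.0 onward) prefer the stage name pre-push and treat push as a legacy spelling, so check the documentation for the version you have installed.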

Conclusion

There are several solutions for handling linting, formatting, security scanning, etc. You can either configure your IDE to handle them or use automation like Bitbucket Pipelines, GitHub Actions, etc. But not everyone on your team will use the same IDE, and automation only kicks in after you push your code, by which time it’s too late to clean things up.

Pre-commit, by using Git’s built-in hooks feature, is more portable than the other solutions and helps enforce a level of consistency in coding standards across your team. Since it’s configured using a YAML file, you can update your CI standards and keep track of the hooks you’re using in the repo itself. Its plugin-based nature means you can reuse supported hooks in your project and not have to start from scratch every time. I’ve already started using Pre-commit for Terraform formatting, linting, and SAST scanning (more on that in another post). It’s refreshing to catch issues during development, while I’m in the flow of things, rather than wait for my build pipeline to report them minutes or hours later. Pre-commit might be worth exploring for the time savings alone.

Happy coding :)