Do you suffer from poorly written code? Is your codebase riddled with inconsistencies? Do you experience anxiety every time your code is being reviewed? If you answered ‘yes’ to any of these questions, static code analysis could help.
Static code analysis is the process of analyzing code before it is executed. It provides numerous advantages to developers, and integrating static code analyzers can supercharge your developer workflow.
Let’s take a deep dive to understand what static code analysis is, why you should be using it, when to start, and how you can quickly set it up in your project.
Of all the questions we just raised, this is probably the easiest to answer. As the name says, static code analysis is the analysis of code in a static or non-executing state. It is the automated equivalent of another developer reading and reviewing your code, except with a speed and consistency that no human could match.
You might be thinking, “If I write detailed tests of all my units and functional tests at a system level, and they all pass, my code is bug-free, right?” Yes, it is. Congratulations. But bug-free code is not the same as good code; there’s a lot more that goes into that. That is the domain where static analysis shines.
All types of tests, be it unit tests, functional tests, integration tests, visual tests, or regression tests, run the code and then compare the outcome against known expected-state outputs to see if everything works OK. Testing makes sure your code functions as expected. It treats your code as a black box, giving it input and verifying the output.
On the other hand, static code analysis examines aspects of the code such as readability, consistency, error handling, type checking, and alignment with best practices. Static analysis is not primarily concerned with whether your code provides the expected output but rather with how the code itself is written. It’s an analysis of the quality of source code, not its functionality.
To summarise, testing checks if your code works or not, whereas static analysis checks if it is written well or not. Testing and static analysis are complementary to each other, and you should ideally be employing a healthy mix of both in your projects.
Any tool that reads the source code, parses it, and suggests improvements is a static code analyzer. There are many tools that fall under the umbrella term of static code analyzers, from linters and formatters to vulnerability scanners and PR reviewers. Let’s go over the main reasons why you should use these in your workflow.
Ask any developer, and they’ll corroborate that code reviews are essential. A second pair of eyes can discover issues in your code you probably never could. They might quite possibly suggest better ways to accomplish the task too. Sometimes reading other people’s code can teach the reviewer about some obscure useful functionality that’s already built into the project. Both the reviewer and the reviewee (which might not be a real word but one I will use nonetheless) learn something in the process.
But what’s better than one person reviewing your code? How about every open-source developer reviewing it! Static analyzers are powered by a vast library of open-source rules, which means that everyone who has contributed to the tool has indirectly reviewed your code. This makes it very hard for subtle bugs, the kind a couple of human reviewers could miss, to slip by.
People make mistakes. Only 15% of codebases that install JSHint, a popular code-review tool for JavaScript, pass without issues. That just goes to show how vital it is to have some computer eyes review your code as well.
Example:
Consider this program for letting the user pick their favourite fruit. If you don’t choose, ‘Mango’ is the default.
let fruits = ['Apple', 'Banana', 'Cherry', 'Mango']
function getFruit(index) {
  index = index || 3 // Everybody likes mangoes
  return fruits[index]
}
This code works. For all inputs other than 0, that is. If you aren’t very thorough, your tests will also pass without a single hiccup.
getFruit() // Mango
getFruit(2) // Cherry
getFruit(0) // Mango (expected Apple!)
Turns out, you can’t choose an apple in this program because 0, like null and undefined, is a falsy value. You should have used the nullish coalescing operator (??) instead, and a suitably configured linter would have told you that.
let fruits = ['Apple', 'Banana', 'Cherry', 'Mango']
function getFruit(index) {
  index = index ?? 3 // Everybody likes mangoes
  return fruits[index]
}
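To see the difference in isolation, here is a small sketch that compares the two operators on the values mentioned above. The helper names withOr and withNullish are made up for this illustration and are not part of the fruit program.

```javascript
// Hypothetical helpers, for illustration only.
const withOr = (index) => index || 3;      // falls back on ANY falsy value
const withNullish = (index) => index ?? 3; // falls back only on null or undefined

console.log(withOr(0), withNullish(0));                 // 3 0
console.log(withOr(undefined), withNullish(undefined)); // 3 3
console.log(withOr(null), withNullish(null));           // 3 3
console.log(withOr(2), withNullish(2));                 // 2 2
```

Only the first line differs: 0 is a perfectly valid index, and || throws it away.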
Every developer writes code differently in their own personal style. But when many developers are working together, it is important that they write code in a consistent manner. That’s where a style guide comes in. Setting one up is the first step to writing consistent code, and its enforcement is extremely important when working with other developers.
Enforcing a style guide is not a manual feat. No developer can be expected to remember hundreds of rules and check every line against each of them. Why not make the computer do it?
Every language that I’ve ever worked in has a linter written for it. JavaScript has ESLint, Python has Black (strictly speaking, a formatter), and Ruby has RuboCop. These tools do the simple job of making sure your code follows the prescribed set of style rules. A few linters like RuboCop also enforce good practices such as atomic functions and better variable names. Such hints are very often helpful in detecting and fixing bugs before they cause issues in production.
Example:
Consider the following JavaScript snippet where you print a fruit name from a list. The list remains unchanged throughout the program.
var fruits = ['Apple', 'Banana', 'Cherry', 'Mango']
console.log(fruits[0])
ESLint, if so configured, can make sure you use constants wherever possible to avoid side-effects in your code. It’s a good practice but easy to miss if you don’t have a linter.
const fruits = ['Apple', 'Banana', 'Cherry', 'Mango']
console.log(fruits[0])
Enforcing the use of const and let, which are block-scoped, over var leads to programs that are easier to debug and is generally considered a good practice.
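As a rough sketch of what “so configured” might look like, the two core ESLint rules no-var and prefer-const cover this case. The fragment below assumes the .eslintrc.js config format, and the 'error' severity is just one choice; your project may prefer 'warn'.

```javascript
// .eslintrc.js (fragment; assumes the eslintrc config format)
const config = {
  rules: {
    'no-var': 'error',       // disallow var in favour of let and const
    'prefer-const': 'error', // flag let bindings that are never reassigned
  },
};

module.exports = config;
```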
Another thing developers love is testing their code, making sure it holds up for various inputs. Practices like test-driven development emphasize the importance of testing the code that you write. But writing tests takes time and effort. It is hard to anticipate every possible input and make sure your code holds up against it. Eventually, the tests become too numerous and take hours to complete on larger codebases.
Static code analyzers do not suffer from this issue. You don’t need to write the tests; you can import entire libraries of presets. Additionally, static analyzers run incredibly fast as there is no code execution involved! In fact, many linters integrate with the editor and highlight issues with the code in real-time as you type.
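To make the “no code execution” point concrete, here is a toy analyzer that reports var declarations by reading source text alone. It is only a sketch: the function name findVarDeclarations and the message wording are invented for this example, and real linters parse the code into a syntax tree rather than matching lines with a regular expression.

```javascript
// Toy analyzer: scans source text for `var` declarations without running it.
// A regex is a stand-in here; real linters work on a parsed syntax tree.
function findVarDeclarations(source) {
  const findings = [];
  source.split('\n').forEach((line, i) => {
    // Match `var` as a whole word at the start of a statement.
    if (/^\s*var\s+\w+/.test(line)) {
      findings.push({ line: i + 1, message: 'Unexpected var, use let or const instead.' });
    }
  });
  return findings;
}

// The sample program is plain text; it is never executed.
const sample = [
  "var fruits = ['Apple', 'Banana', 'Cherry', 'Mango']",
  'console.log(fruits[0])',
].join('\n');

console.log(findVarDeclarations(sample)); // one finding, on line 1
```

Because the work is just reading text, a check like this can run on every keystroke, which is what makes editor integration practical.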
Example:
Sometimes, real-time is just too fast.
Most static analyzers, especially linters and formatters, will not just point out issues but can also fix most of them for you. Tools like Black for Python and ESLint for JavaScript integrate with IDEs and can then automatically fix the edited files as soon as you save them.
This is extremely convenient because now, your code quality improves without you having to even consciously think about it. As developers, we’re spoilt for convenience, aren’t we?
Example:
ESLint has the --fix flag that fixes common issues like unnecessary semicolons, trailing spaces, and dangling commas.
Consider the same code snippet from the past few examples. (Here the · represents a space.)
var fruits = [
  'Apple',
  'Banana',
  'Cherry',··
  'Mango'
];
Run ESLint with the --fix flag and moments later you have this.
const fruits = [
  'Apple',
  'Banana',
  'Cherry',
  'Mango',
]
Much better!
A bill of materials is generally used in supply chain management to list the raw materials and components that go into a product. A similar bill of materials is needed for software as well.
When you build an app, you inevitably use frameworks and tools that were built by other developers. In turn, those frameworks use frameworks built by other developers. And before you know it, setting up a simple Vue.js app can put thousands of packages in your node_modules/ directory.
This is the scary reality we live in. Packages built on top of packages. Each giant is standing on the shoulders of another. Your app is only as strong as its weakest dependency. Vulnerability scanners are another set of static analyzers that check every dependency in your dependency tree against an extensive database of vulnerabilities and exploits. All packages that have a known vulnerability are reported and can be updated with a single command.
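The core idea can be sketched in a few lines. Everything in the example below is invented for illustration: the package names, the versions, the advisory list, and the function findVulnerable. Real scanners match version ranges against a maintained advisory database instead of comparing exact versions.

```javascript
// Hypothetical advisory database: package name -> list of vulnerable versions.
const advisories = {
  'example-parser': ['1.0.0', '1.0.1'],
};

// Hypothetical dependency tree, as nested objects.
const tree = {
  name: 'my-app',
  version: '0.1.0',
  dependencies: [
    {
      name: 'example-framework',
      version: '3.2.0',
      dependencies: [
        { name: 'example-parser', version: '1.0.1', dependencies: [] },
      ],
    },
    { name: 'example-dates', version: '2.0.0', dependencies: [] },
  ],
};

// Walk the tree depth-first and collect the path to every vulnerable package.
function findVulnerable(node, path = []) {
  const here = [...path, `${node.name}@${node.version}`];
  const isVulnerable = (advisories[node.name] || []).includes(node.version);
  const hits = isVulnerable ? [here.join(' > ')] : [];
  return node.dependencies.reduce(
    (acc, dep) => acc.concat(findVulnerable(dep, here)),
    hits,
  );
}

console.log(findVulnerable(tree));
// one hit: my-app@0.1.0 > example-framework@3.2.0 > example-parser@1.0.1
```

Note that the vulnerable package is not a direct dependency of the app, which is exactly why a manual check of package.json would miss it.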
Example:
GitHub provides dependency scanning with Dependabot. npm also provides a vulnerability scan using the npm audit command. Both Dependabot and npm audit offer the ability to automatically update vulnerable packages to their patched versions.
Manual code reviews waste a lot of time. The reviewer has to take time out of their own work, go through the code, and point out all the places where it could be improved, both in its logic and in tiny details such as incorrect formatting or deviation from conventions and style guides. Then the developer has to make all the suggested changes, and the process repeats.
Adding some linters, formatters, and spell checkers makes the entire process much more streamlined. How so, you ask? First, a pre-commit hook will ensure that code is properly linted and formatted before getting checked in to VCS. Second, project-level automation in the form of build pipelines or GitHub workflows will test the code quality on every commit and highlight issues on the PR itself. Third, the reviewer will be freed up to focus on the big picture because all the smaller things have already been handled before the PR makes it to a manual review.
No amount of code review by software can entirely replace manual review. But a static scan before a manual review helps both sides: it reduces the reviewer’s effort, and it lets the developer iterate on the smaller issues faster and more thoroughly than many rounds of manual review would.
Now. Yes, that’s correct. I did say right now. Any later than right now is too late. You would have reached step two of ‘The How’ if I didn’t have to convince you as much.
Setting up is easy. Since we’ve repeatedly been talking about ESLint here, let’s just set it up in a sample project.
Make a new directory for your project. Enter the directory and initialise a Node.js package in it. The npm init wizard asks you a series of questions. Once you’re done, you have a new Node.js package to work in.
$ mkdir wuphf.com
$ cd wuphf.com
$ npm init
Install ESLint. It couldn’t be simpler.
$ npm install eslint
Run the following command to bring up the ESLint wizard.
$ ./node_modules/.bin/eslint --init
This wizard asks a lot of questions about how you will be using ESLint in the project. Make sure to choose the Airbnb ruleset. When the setup is complete, you will have a file named .eslintrc.js in the directory.
This file declares that the project will be running on Node.js and that it builds on top of the rules defined in the Airbnb style guide. Since we’re writing a console application, I can customise the rules and turn off the one that warns against console statements (no-console).
module.exports = {
  env: {
    es2021: true,
    node: true,
  },
  extends: [
    'airbnb-base',
  ],
  parserOptions: {
    ecmaVersion: 12,
  },
  overrides: [
    {
      files: ['*.js'],
      rules: {
        'no-console': 'off',
      },
    },
  ],
};
Commit this file into version control.
There you have it! All JS files in the project can now be scanned by ESLint. I also recommend installing Husky to run a lint job before every commit so that you never check poor code in to your VCS.
DeepSource is a static code analyzer that can find issues in the codebase and automatically submit PRs to fix them. It can even evaluate incoming code in PRs and fix them too. It’s wonderful how well it integrates with GitHub, GitLab, and Bitbucket.
You can set up DeepSource in a project by dropping a single TOML file named .deepsource.toml in the root of the repo and it will pick the project up and start scanning. Most major languages are supported.
That is all. It’s really simple to statically analyze your code, and the benefits are so many that there’s no reason not to be doing it.
Have fun writing cleaner, safer, more readable, and more maintainable (simply put, better) code, and we’ll see you in the next one.