Rule Authoring 101
In this post, we’ll take a look at the process of writing new rules for Vale (the command-line tool) and Vale Server (the desktop application). The goal for this post is to supplement the existing documentation by covering possible areas of confusion and potential “gotchas.”
But first, we need to define a few terms that you’ll encounter throughout this post and other documentation:
- Vale: An open-source, command-line tool that brings code-like linting to prose.
- Check: Vale’s functionality is exposed through extensible “checks” that perform abstract tasks such as checking the length of certain text segments (such as sentences and paragraphs) or searching for a particular token in a text document.
- Rule: A “rule” is a check that has been given a concrete task — for example, ensuring that sentences contain no more than 25 words.
- Style: A “style” consists of multiple rules that come together to enforce the guidelines of a certain organization, guide, or application. Within a given style, rules are assigned an ID of the form <style name>.<rule name>.
- StylesPath: A directory containing all of a user’s styles.
- Vale Server: A commercial desktop application built on top of Vale that brings a refined experience for individual writers, including a fully managed StylesPath and multiple third-party app plugins.
Topic 1: Overcoming the lack of look-around assertions
Vale is written in Go and uses the standard regexp package to evaluate all of its regular expressions. In general, this should be inconsequential to rule authors: regexp supports the same syntax as Google’s RE2 library, which is similar to the regex syntax of most scripting languages.
The one exception is that regexp, like the aforementioned RE2 library, is non-backtracking:
As a matter of principle, RE2 does not support constructs for which only backtracking solutions are known to exist. Thus, backreferences and look-around assertions are not supported.
While this may seem like a significant loss (especially if we ignore the associated performance guarantees), there are a number of tricks to overcome the limitation.
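To make the limitation concrete, here is a minimal Go sketch that asks the standard regexp package whether it accepts a handful of patterns. The helper name compiles and the sample patterns are our own, chosen only for illustration; the look-around and backreference patterns fail to compile, while the plain literal is accepted.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether Go's regexp package accepts the pattern.
// (Hypothetical helper, written for this illustration only.)
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	patterns := []string{
		`dialog box`,      // plain literal
		`dialog(?= box)`,  // positive look-ahead
		`(?<=dialog )box`, // positive look-behind
		`dialog(?! box)`,  // negative look-ahead
		`(?<!dialog )box`, // negative look-behind
		`(\w+) \1`,        // backreference
	}
	for _, p := range patterns {
		fmt.Printf("%-20q accepted: %v\n", p, compiles(p))
	}
}
```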
The first variation of look-around consists of the positive look-ahead and look-behind assertions:
- look-ahead (x(?=y)): “I want to match x only if it precedes y.”
- look-behind ((?<=y)x): “I want to match x only if it is preceded by y.”
For Vale’s purposes, it’s rare to encounter a case that requires matching x but not y, but it’s possible by defining a message without any format specifiers.
This is definitely a contrived example but it illustrates the point: we’re matching x = dialog only if it precedes y = box without including y in our message to the user.
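Outside of Vale, the same idea can be sketched directly against Go’s regexp package: match x and y together, but report only a capture group around x. This is a rough illustration of the general trick, not Vale’s implementation; the function name lookAheadMatches and the sample sentence are our own assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

// dialogBeforeBox matches the whole phrase "dialog box" but captures
// only "dialog", standing in for the unsupported dialog(?= box).
var dialogBeforeBox = regexp.MustCompile(`\b(dialog) box\b`)

// lookAheadMatches returns the byte spans of "dialog" wherever it is
// immediately followed by " box". (Hypothetical helper.)
func lookAheadMatches(text string) [][2]int {
	var spans [][2]int
	for _, m := range dialogBeforeBox.FindAllStringSubmatchIndex(text, -1) {
		// m[0:2] is the full match; m[2:4] is capture group 1.
		spans = append(spans, [2]int{m[2], m[3]})
	}
	return spans
}

func main() {
	text := "Open the dialog box, then close the dialog."
	for _, s := range lookAheadMatches(text) {
		fmt.Printf("flagged %q at bytes %d-%d\n", text[s[0]:s[1]], s[0], s[1])
	}
}
```

Only the first “dialog” is reported here, because the second one is not followed by “box”.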
The second variation of look-around consists of the negative look-ahead and look-behind assertions:
- look-ahead (x(?!y)): “I want to match x only if it does not precede y.”
- look-behind ((?<!y)x): “I want to match x only if it is not preceded by y.”
For Vale’s purposes, these cases can be emulated using the
Using our example from above, we’re now matching x = dialog only when it does not precede y = box.
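For intuition, a negative look-ahead can also be emulated in plain Go by matching x on its own and then discarding any match that is followed by y. The sketch below shows that filter-after-match idea only; it is not how Vale handles the case internally, and the names negativeLookAheadMatches and followedByBox are our own.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// dialog finds every standalone occurrence of x.
	dialog = regexp.MustCompile(`\bdialog\b`)
	// followedByBox checks whether the remaining text starts with y.
	followedByBox = regexp.MustCompile(`^ box\b`)
)

// negativeLookAheadMatches returns the byte spans of "dialog" that are
// NOT followed by " box", standing in for dialog(?! box).
// (Hypothetical helper.)
func negativeLookAheadMatches(text string) [][2]int {
	var spans [][2]int
	for _, m := range dialog.FindAllStringIndex(text, -1) {
		if followedByBox.MatchString(text[m[1]:]) {
			continue // x precedes y, so skip this occurrence
		}
		spans = append(spans, [2]int{m[0], m[1]})
	}
	return spans
}

func main() {
	text := "Open the dialog box, then close the dialog."
	for _, s := range negativeLookAheadMatches(text) {
		fmt.Printf("flagged %q at bytes %d-%d\n", text[s[0]:s[1]], s[0], s[1])
	}
}
```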
Topic 2: Offering corrections with “actions” [Vale Server only]
While Vale is designed to only notify writers about style violations (typically in a CI or command-line environment), Vale Server is capable of also offering potential solutions.
It does this by offering a set of generic, built-in “actions” that can be extended in a rule’s YAML file, much like how a rule extends an existing check such as substitution. There are currently 5 available actions, which can do anything from suggesting grammatical changes (powered by LanguageTool) and offering spelling corrections to performing arbitrary in-place edits to a token.
Let’s look at a real use case:
In the above rule (taken from the Microsoft style), we define an action with the name edit and the parameter .?!. This allows the text editor clients (Atom, Sublime Text, and VS Code) to offer an in-editor solution for removing punctuation from headings.
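To get a feel for what such a fix amounts to, here is a small Go sketch of the edit itself: stripping any trailing characters from the set .?! off a flagged heading. This is purely our own stand-in and makes no claim about how Vale Server implements its actions; the function name removeTrailing and the sample heading are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// removeTrailing strips every trailing character found in cutset from
// the flagged text. (Hypothetical stand-in for an in-place edit; not
// Vale Server's actual code.)
func removeTrailing(match, cutset string) string {
	return strings.TrimRight(match, cutset)
}

func main() {
	heading := "Getting started."
	fmt.Printf("before: %q\nafter:  %q\n", heading, removeTrailing(heading, ".?!"))
}
```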