Abusing Flow comment syntax for great good
2018-05-23
Flow is a static type system for JavaScript. Code written with Flow looks like normal JavaScript with extra type declarations and type annotations:
type ShoppingCartEntry = {productName: string, count: number};
function totalCount(cart: ShoppingCartEntry[]): number {
  const counts: number[] = cart.map((entry) => entry.count);
  return counts.reduce((a, b) => a + b, 0);
}
These addenda of course mean that the code is not itself valid JavaScript. Flow provides a preprocessor to strip away these annotations, but also offers an alternative comment syntax in case using a preprocessor would be undesirable for some reason or other:
/*:: type ShoppingCartEntry = {productName: string, count: number}; */
function totalCount(cart /*: ShoppingCartEntry[] */) /*: number */ {
  const counts /*: number[] */ = cart.map((entry) => entry.count);
  return counts.reduce((a, b) => a + b, 0);
}
The semantics are simple: any block comment that starts with :: (a double colon) is treated as normal Flow code by the Flow parser, and for convenience any block comment that starts with : (a single colon) is treated as normal Flow code that starts with a literal colon, so that type annotations can be written like (17 /*: number */) instead of the more awkward (17 /*:: : number */).
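As a minimal sketch of the two comment forms side by side (the Point type and both function names are made up for illustration), the following file is plain JavaScript at runtime, while Flow should read both functions as having the same signature:

```javascript
/*:: type Point = {x: number, y: number}; */

// Single-colon form: Flow reads the parameter as `p: Point`.
function taxicabNorm(p /*: Point */) /*: number */ {
  return Math.abs(p.x) + Math.abs(p.y);
}

// Double-colon form: the same annotations, with each colon spelled out
// inside the comment.
function taxicabNormVerbose(p /*:: : Point */) /*:: : number */ {
  return Math.abs(p.x) + Math.abs(p.y);
}

console.log(taxicabNorm({x: 3, y: -4})); // prints 7
console.log(taxicabNormVerbose({x: 3, y: -4})); // prints 7
```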
This comment syntax is an entirely reasonable feature that we can abuse to create horrifying, devious contraptions. Sounds like fun!
(Note: All code in this post works on Flow v0.72. These techniques may well be patched in the future.)
Motivation: Incompleteness
Sometimes, we write code that is provably correct in a way that the type checker can’t infer. For instance, suppose that we have an array with elements of type Person | null (“either Person or null”), where Person is an object type with a string field called name. We want to retrieve the names of all the people in the array, ignoring the null elements. In plain JavaScript, we might write something like this:
/*:: type Person = {name: string, favoriteColor: string}; */
function peopleNames(maybePeople /*: (Person | null)[] */) /*: string[] */ {
  return maybePeople
    .filter((person) => person !== null)
    .map((person) => person.name);
}
A human can look at this code and easily see that it returns a valid list of strings. But Flow can’t, for a fully understandable reason. Flow knows that filter takes an array T[] and a predicate (T) => boolean, and returns a new array T[]. However, Flow doesn’t understand the relationship between the inputs and the output: in particular, that every element in the output satisfies the predicate. So, as far as Flow is concerned, the result of the call to filter might still contain null elements, and in that case the expression person.name would indeed be cause for alarm.
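To make the gap concrete, here is a hypothetical stand-in for filter, written with the signature described above. The name filterLike is invented for this sketch, and this is not Flow's actual library definition. Nothing in the type connects the predicate to the element type of the result, so the output is still typed as an array of T, nulls and all:

```javascript
// Hypothetical stand-in for Array.prototype.filter; not Flow's real libdef.
function filterLike/*:: <T> */(
  xs /*: T[] */,
  predicate /*: (T) => boolean */
) /*: T[] */ {
  const result = [];
  for (const x of xs) {
    if (predicate(x)) {
      result.push(x);
    }
  }
  return result;
}

// With T = Person | null, the declared return type is still (Person | null)[],
// even though no null can survive this particular predicate at runtime.
const survivors = filterLike(
  [{name: "Ada", favoriteColor: "green"}, null],
  (person) => person !== null
);
console.log(survivors.length); // prints 1
```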
In situations like these, it is tempting to reach for the any keyword: this is a magic type that is interconvertible with every type and for which all operations are permitted. In effect, it says that “anything goes” whenever a particular variable is involved. We can write:
function peopleNames(maybePeople: (Person | null)[]): string[] {
  return maybePeople
    .filter((person) => person !== null)
    .map((person) => (person: any).name); // cast through `any`!
}
But here we are losing valuable type safety. We lose the ability to catch many potential errors in our code: for instance, a typo like person.nmae would go completely undetected. We want to refine the type information, not throw it away.
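A small sketch of that failure mode, using the comment form of the cast so that it runs as plain JavaScript (the function name is made up, and the typo is deliberate):

```javascript
/*:: type Person = {name: string, favoriteColor: string}; */

function peopleNamesWithTypo(
  maybePeople /*: (Person | null)[] */
) /*: string[] */ {
  return maybePeople
    .filter((person) => person !== null)
    // Deliberate typo: `nmae`. Because `person` is cast to `any`, Flow has
    // nothing to check the property name against.
    .map((person) => (person /*: any */).nmae);
}

const names = peopleNamesWithTypo([
  {name: "Ada", favoriteColor: "green"},
  null,
]);
console.log(names); // prints [ undefined ]
```

The function is declared to return string[], yet at runtime it quietly produces undefined.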
We could give Flow a hint, by explicitly checking that each person in the filtered array is actually not null:
function peopleNames(maybePeople: (Person | null)[]): string[] {
  return maybePeople
    .filter((person) => person !== null)
    .map((person) => {
      // Explicit assertion just to appease the typechecker.
      if (person === null) {
        throw new Error("Unreachable!");
      }
      // If we get here, `person` is non-null, so this next line is fine.
      return person.name;
    });
}
Flow is now happy to treat the argument to map as a function taking Person | null and returning string, so this code type-checks and runs correctly. But this is not a great solution. Assertions like this make the code more verbose and harder to read, interrupting (ironically) a reader’s flow. Furthermore, writing code in anything other than the most natural way simply to appease tooling of any sort should always be a red flag: tools exist to help programmers, not hinder them, and if the tools are broken then they must be fixed.
Or: instead of fixing these tools, we can just lie to them.
White lies
Suppose that we had access to a function withoutNulls that gave a copy of its input array with all null elements removed. In that case, Flow would be satisfied by the following code:
function withoutNulls<T>(xs: (T | null)[]): T[] { /* implementation elided */ }
function peopleNames(maybePeople: (Person | null)[]): string[] {
  let people = maybePeople.filter((person) => person !== null);
  people = withoutNulls(people); // no-op
  return people.map((person) => person.name);
}
Of course, we don’t actually want to call this function, and ideally we don’t even want the function to exist.
In fact, Flow makes it easy for us to declare that a function exists without providing its implementation, because this is commonly needed to talk about external library functions and the like. We can start with the following:
declare function withoutNulls<T>(xs: (T | null)[]): T[];
function peopleNames(maybePeople: (Person | null)[]): string[] {
  let people = maybePeople.filter((person) => person !== null);
  people = withoutNulls(people); // now fails at runtime: no such function
  return people.map((person) => person.name);
}
Now, Flow is still happy, but our code will fail at runtime unless we actually provide an implementation of the withoutNulls function. We need Flow to think that we’re calling this function without actually having to do so.
Behold:
declare function withoutNulls<T>(xs: (T | null)[]): T[];
function peopleNames(maybePeople: (Person | null)[]): string[] {
  let people = maybePeople.filter((person) => person !== null);
  /*:: people = withoutNulls(people); */ // ta-da!
  return people.map((person) => person.name);
}
The comment syntax was designed to allow including Flow type annotations, declarations, and the like, but nothing stops us from including actual code! As far as Flow is concerned, the middle line of the function is just as real as the other two.
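For a file that runs with no preprocessing at all, the declaration has to hide in a comment too, since declare function is not JavaScript. Here is a sketch of the same trick written entirely in comment syntax. Its behavior under Flow is assumed to match what is described above for v0.72; other versions are not checked here.

```javascript
/*:: type Person = {name: string, favoriteColor: string}; */
/*:: declare function withoutNulls<T>(xs: (T | null)[]): T[]; */

function peopleNames(maybePeople /*: (Person | null)[] */) /*: string[] */ {
  let people = maybePeople.filter((person) => person !== null);
  /*:: people = withoutNulls(people); */ // seen by Flow, skipped at runtime
  return people.map((person) => person.name);
}

console.log(peopleNames([
  {name: "Ada", favoriteColor: "green"},
  null,
  {name: "Grace", favoriteColor: "blue"},
])); // prints [ 'Ada', 'Grace' ]
```

At runtime no withoutNulls function exists anywhere, and none is needed: the only call to it lives inside a comment.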
Now, for something a bit crazier.
Utter fabrications
Suppose that we have some code that requires a module of generated code: created at build time, say, or even at runtime. In JavaScript, it is perfectly fine to write
const frobnicateWidgets = require("./frobnicateWidgets");
as long as the module is available when the require expression is evaluated. But such an import is of course incompatible with any static analysis. In particular, Flow will yield an error (“Cannot resolve module”) when the module in question has not yet been generated.
We can’t use exactly the same trick as before, wherein we performed some assertions that only Flow could see. The problem is that Flow knows what require does: it loads a module. If we were in a context where require were a normal function of appropriate type, then this wouldn’t be a problem.
And we can make it so:
const frobnicateWidgets =
  /*:: ((require: any) => */ require("./frobnicateWidgets") /*:: )() */;
Here we see the return of any. Within the body of this lambda expression (which only exists in Flow’s eyes!) require is treated as a normal function that we call with a normal string to get back what we need.
We can even give the result a well-defined type so that code in the rest of the program continues to have statically strong types, instead of being polluted by the any:
type WidgetFrobnicator = (Widget) => void; // whatever the module signature is
const frobnicateWidgets: WidgetFrobnicator =
  /*:: ((require: any) => */ require("./frobnicateWidgets") /*:: )() */;
(This works because require, at type any, is treated as a function that also returns an any, which is then converted to a WidgetFrobnicator.)
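To try this end to end, here is a self-contained sketch that fakes the build step by writing a tiny module into a temporary directory at runtime and then loading it through the disguised require. The Widget shape, the generated module body, and the temp-directory layout are all assumptions made up for this example, and it assumes a CommonJS (Node) environment:

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical "build step": generate a module that did not exist when the
// type checker ran.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "generated-"));
const modulePath = path.join(dir, "frobnicateWidgets.js");
fs.writeFileSync(
  modulePath,
  "module.exports = (widget) => { widget.frobnicated = true; };\n"
);

/*:: type Widget = {frobnicated: boolean}; */ // assumed shape, for illustration
/*:: type WidgetFrobnicator = (Widget) => void; */

// Flow sees an immediately invoked lambda whose parameter shadows `require`;
// Node sees an ordinary call to the real `require`.
const frobnicateWidgets /*: WidgetFrobnicator */ =
  /*:: ((require: any) => */ require(modulePath) /*:: )() */;

const widget = {frobnicated: false};
frobnicateWidgets(widget);
console.log(widget.frobnicated); // prints true
```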
In the peopleNames example, we added some phantom statements to the body of a function. Here, we’re actually changing the structure of the AST. Dangerous? Perhaps. Brittle? Probably. Interesting? Certainly!
Conclusion
We have seen how to bend Flow to our will by splicing arbitrary code into its token stream.
Ridiculous as it seems, this method has some benefits. It’s more precise than using casts through any. Using this method, we lie to Flow in a very specific and explicit way, instead of declaring that “all bets are off” for a particular variable and anything that it touches. Indeed, the keyword any is itself a grand lie, just one that tends to be better documented and supported.
The observant reader may recall our motivating suggestion that an ideal solution should be unsurprising to readers and should be written like natural JavaScript code, and protest that we have failed on both these counts.
Such a reader is 100% correct, but is also no fun at parties, because this hack is way cooler than any “practical”, “enterprise-grade” solution—so there.