core_ideas (stanc.core

The Jane Street standard library

We use Jane Street's Core standard library. There are a few differences from the OCaml standard library which are instantly noticeable:

Most higher-order functions such as List.map take a named function argument.
This means a call like List.map square [1;2;3] will look like List.map ~f:square [1;2;3].
Core defaults to safe operations which return options rather than possibly erring.
In "normal" OCaml, List.hd has type 'a list -> 'a. A call List.hd [] will throw an exception. By contrast, in the Jane Street libraries, the same function has type 'a list -> 'a option.
Usually, a function with the suffix _exn recovers the original signature, e.g. List.hd_exn : 'a list -> 'a .

If for some reason you need functionality from the OCaml standard library that is not available in Jane Street (be sure to triple check), you can use the modules Stdlib and Caml to access the built-in versions. Currently there is only one such usage in the compiler, to use the standard definition of != in the Menhir parser.

There are a few other things we gain from these libraries. The most important idea to understand is deriving.

Deriving functions

If you look at a type declaration of something like Ast.typed_expression, you'll notice something curious after the declaration:

type typed_expression = (typed_expr_meta, fun_kind) expr_with
[@@deriving sexp, compare, map, hash, fold]

When using an editor that supports code completion, you may notice that the Ast module suggests functions which are not defined in the actual source text. This is because these functions are created at compile time by ppx_deriving.The above syntax [@@deriving ...] indicates which functions we would like to be generated.

These are very helpful - if a type derives hash, it can automatically be used as keys in Jane Street's hash tables, and a type which derives sexp can be serialized to Lisp-style S-Expressions. Most of our major types derive at least one function.

"Two Level Types"

The other curious thing about the types we define in our AST and MIR is that they use a trick known as "two level types". This allows re-using the same type structure with different amounts of metadata attached (e.g., before type checking we have an untyped_program which only has nodes and their source locations, after type checking we have typed_program which features nodes, locations, and types, in the same basic tree structure.

The way that this is implemented is a bit non-obvious at first. Essentially, many of the tree variant types are parameterized by something that ends up being a placeholder not for just metadata but for the recursive type including metadata, sometimes called the fixed point. So instead of recursively referencing expression, you would instead reference type parameter 'e, which will later be filled in with something like type expr_with_meta = metadata expression.

This takes some getting used to, and also can lead to some unhelpful type signatures in editors such as VSCode, because abbreviations are not always used in hover-over text. For example, Expr.Typed.t, the MIR's typed expression type, actually has a signature of Expr.Typed.Meta.t Expr.Fixed.t.

The `Fmt` library and pretty-printing

We extensively use the Fmt library for our pretty-printing and code generation. We have an existing guide on the wiki which covers the core ideas.

The Jane Street standard library

Deriving functions

"Two Level Types"

The Fmt library and pretty-printing

The `Fmt` library and pretty-printing