From ML to Rust to guji: a lineage of type systems and pattern matching

How algebraic data types, exhaustive pattern matching, and type-directed error handling travelled from OCaml's research roots through Rust's systems pragmatism into guji's text-first, one-obvious-way design.

Some ideas are so good that they keep getting reinvented until everyone agrees they were obvious all along. The algebraic data type — a value that is exactly one of several labelled shapes — together with exhaustive pattern matching is one of those ideas. It was born in the ML family, hardened in OCaml, smuggled into systems programming by Rust, and arrives, polished and opinionated, in guji. This is the story of that idea across three languages.

OCaml: the research bloodline

OCaml descends directly from ML, the Meta Language Robin Milner built in the early 1970s for the LCF theorem prover. Its lineage at INRIA runs Caml Light (1990) → Caml Special Light (1995) → Objective Caml 1.00, announced on 9 May 1996 by Xavier Leroy, Jérôme Vouillon, Damien Doligez, and Didier Rémy. From ML it inherited the two pillars that define this whole family: Hindley–Milner type inference, which lets the compiler reconstruct nearly every type without annotations, and algebraic data types read out via pattern matching.

type shape =
  | Circle of float
  | Rect   of float * float

let area = function
  | Circle r      -> 3.14159 *. r *. r
  | Rect (w, h)   -> w *. h

Two things are quietly radical here. First, no type annotations appear, yet area is fully, statically typed: inference does the bookkeeping. Second, the compiler checks the match for exhaustiveness: drop the Rect arm and you get a compile-time warning (warning 8) that a case is unhandled. The data and the code that takes it apart are kept honest by the type system. OCaml also leaned hard on option in place of null, and it carried over a powerful module system modelled on Standard ML's. Both ideas took the wider industry another two decades to adopt.

Rust: the idea goes to work

Rust took the ML family's algebraic toolbox out of the research lab and into systems programming. Graydon Hoare's project reached its 1.0 release on 15 May 2015, and its enum is a true sum type in the ML tradition — only the syntax puts on a C-shaped coat:

enum Shape {
    Circle(f64),
    Rect(f64, f64),
}

fn area(s: &Shape) -> f64 {
    match s {
        Shape::Circle(r)    => 3.14159 * r * r,
        Shape::Rect(w, h)   => w * h,
    }
}

match is still exhaustive — forget a variant and the program will not compile — and inference, though local rather than whole-program, still spares you most annotations. Rust's headline contribution, ownership and borrowing, is orthogonal to all this; what matters for our lineage is what Rust did with two ordinary library enums. Option<T> retired the null pointer, and Result<T, E> made fallibility a value you must acknowledge:

fn parse_age(s: &str) -> Result<i64, String> {
    let n: i64 = s.parse().map_err(|_| "not a number".to_string())?;
    if n >= 0 { Ok(n) } else { Err("age must be non-negative".into()) }
}

That trailing ? is the punchline. Applied to a Result, it unwraps an Ok or returns early from the enclosing function with the Err, turning the old chain of manual error checks into a flat, readable line. Rust proved that ML's type-driven discipline was not a luxury for proof assistants but a practical way to build fast software whose failure paths are spelled out in its types.
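To make the short-circuit concrete, here is a minimal sketch of roughly what ? stands for. It restates parse_age next to a hand-written equivalent; the name parse_age_manual is hypothetical and exists only for this comparison.

```rust
// parse_age as in the article, with an explicit turbofish on `parse`.
fn parse_age(s: &str) -> Result<i64, String> {
    let n = s.parse::<i64>().map_err(|_| "not a number".to_string())?;
    if n >= 0 { Ok(n) } else { Err("age must be non-negative".into()) }
}

// Hypothetical hand-written equivalent: the `match` does by hand
// what `?` does above (unwrap the Ok, or return early with the Err).
fn parse_age_manual(s: &str) -> Result<i64, String> {
    let n = match s.parse::<i64>() {
        Ok(v) => v,
        Err(_) => return Err("not a number".to_string()),
    };
    if n >= 0 { Ok(n) } else { Err("age must be non-negative".to_string()) }
}

fn main() {
    for input in ["42", "-1", "nope"] {
        println!("{:?} / {:?}", parse_age(input), parse_age_manual(input));
    }
}
```

The two functions should agree on every input; the only difference is how much of the plumbing is visible.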

guji: opinion as a feature

guji picks up the torch and adds a thesis: one obvious way. Perl revelled in plurality, although even Larry Wall, in a 1990 Usenet post, sounded a note of caution: "Although the Perl Slogan is There's More Than One Way to Do It, I hesitate to make 10 ways to do something." guji inverts the motto outright. For any given task there is meant to be exactly one idiomatic construct. The algebraic-data-type machinery survives the cut intact, because it earns its keep.

guji's enum and match show the family resemblance, sigils and all (bindings carry $, @, or % to declare their shape):

enum Shape {
    Circle($radius: Float)
    Rect($width: Float, $height: Float)
}

sub area($s: Shape): Float {
    match $s {
        Circle($r)   { 3.14159 * $r * $r }
        Rect($w, $h) { $w * $h }
    }
}

Run through the v0 evaluator, area(Circle(2.0)) yields 12.56636 and area(Rect(3.0, 4.0)) yields 12. As in Rust, and more strictly than OCaml, which only warns, a non-exhaustive match is rejected and the missed case is named. Guards ride along on the same arms:

sub classify($n: Int): Str {
    match $n {
        0            { "zero" }
        $x if $x < 0 { "negative" }
        _            { "positive" }
    }
}
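For comparison, the same shape can be sketched in Rust. This is an illustrative analogue written for this article's lineage, not guji code, and it assumes a 64-bit integer for Int.

```rust
// Rust analogue of the guji `classify` above (a sketch, not guji code).
fn classify(n: i64) -> &'static str {
    match n {
        0 => "zero",
        // A guard: this arm applies only when the condition holds.
        x if x < 0 => "negative",
        // Rust's exhaustiveness check does not reason about guards,
        // so a catch-all arm is still required here.
        _ => "positive",
    }
}

fn main() {
    for n in [-3, 0, 7] {
        println!("{} is {}", n, classify(n));
    }
}
```

Arms are tried top to bottom, so the literal 0 is handled before the guard is ever evaluated.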

Error handling reads almost like Rust's, because the good idea needs no improving. guji has no exceptions; absence and failure are the standard sum types Option[T] and Result[T, E], and the postfix ? propagates an early return:

sub parse_age($s: Str): Result[Int, Str] {
    $n = parse_int($s)?
    if $n >= 0 { Ok($n) } else { Err("age must be non-negative") }
}

Feed it "42" and you get ok: 42; feed it "nope" and the ? carries the parser's Err straight out as err: invalid integer: nope. Same railway, same type-checked guarantees; only the boilerplate is gone.

What makes guji more than a tidier Rust is where it points the family's tools. In OCaml and Rust, text processing is a library afterthought; in guji it is the signature primitive. Regular expressions are a built-in Regex type with a match operator ~~ that yields — what else — an Option[Match], so the very same pattern-matching reflex handles parsing:

match $line ~~ /(?<user>\w+)@(?<host>\w+)/ {
    Some($m) { print("user: { $m<user>.unwrap_or('?') }") }
    None     { print("no match") }
}

Run against 'ada@example.com', that prints user: ada. Above regexes sit first-class PEG grammars (grammar, rule, token), whose parse returns a Bush parse tree that you walk, naturally, with match. This is the ML family's discipline turned on the one problem those languages always left to libraries.
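Walking a parse tree with match is the same move in any member of the family. The sketch below shows it in Rust with a hypothetical Node type for a tiny arithmetic tree; it is not guji's tree type, and the tree is built by hand rather than by a real parser.

```rust
// Hypothetical parse-tree type for illustration; not guji's tree type.
enum Node {
    Num(i64),
    Add(Box<Node>, Box<Node>),
    Mul(Box<Node>, Box<Node>),
}

// Walk the tree with an exhaustive match: one arm per node shape.
// Adding a new variant to Node makes this function fail to compile
// until the new case is handled.
fn eval(node: &Node) -> i64 {
    match node {
        Node::Num(n) => *n,
        Node::Add(l, r) => eval(l) + eval(r),
        Node::Mul(l, r) => eval(l) * eval(r),
    }
}

fn main() {
    // Tree for 2 + 3 * 4, constructed by hand in place of a parser.
    let tree = Node::Add(
        Box::new(Node::Num(2)),
        Box::new(Node::Mul(Box::new(Node::Num(3)), Box::new(Node::Num(4)))),
    );
    println!("{}", eval(&tree));
}
```

The recursion mirrors the data: each variant that contains subtrees gets an arm that recurses into them, and the compiler checks that no variant is forgotten.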

The through-line

Three languages, thirty years, one idea refined at each step. OCaml proved that algebraic data types plus exhaustive matching plus inference make a coherent, provably sound core. Rust shipped that core into systems programming and showed the world that Option and Result could replace null pointers and exceptions in production. guji made it opinionated (immutable by default, one obvious way, the whole apparatus aimed squarely at text) and showed, in a v0 tree-walking evaluator you can run today, that the old ML guarantees still hold when you bend them toward parsing. The syntax keeps changing its clothes. The idea underneath, that the compiler should check you have handled every case, has been right the whole time.