## A long story about MLF, part 2: System F and principality

### System F

System F is a different type system featuring *explicit*
polymorphism: the System F term language contains explicit
indications of which terms are polymorphic and when we’re taking
monomorphic instances of them.

```
e ::= System F expressions
| x, y, z variables
| λ(x:σ) e λ-abstraction
| e₁ e₂ application
| let x = e₁ in e₂ local definition
| Λ(α) e type introduction
| e[σ] type elimination
```

The following example declares the polymorphic identity function and uses it on the constant 5, in ML and System F:

```
let id = λ(x) x in id 5 (* ML *)
let id : ∀(α)α→α = Λ(α) λ(x:α) x in id[int] 5 (* System F *)
```

System F terms, being fully explicit about polymorphism, are much heavier to write and read. They can, however, express richer polymorphism than ML:

```
λ(f : ∀(α) α→α) (f[int] 5, f[bool] true)
```

This function takes a polymorphic parameter: its type is
`(∀(α) α→α) → (int*bool)`. This was not possible in ML, where the
types on both sides of an arrow had to be monotypes. System F has
*higher-ranked types*:

```
σ ::= System F types
| α, β, γ type variables
| σ → σ function types
| ∀(α) σ type quantification
```
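
As a concrete point of comparison, here is a sketch of the function above in OCaml, which does not infer higher-rank types but does let you declare them through a record with a polymorphic field (the type name `id_fun` is mine, chosen for the example):

```ocaml
(* Rank-2 polymorphism in OCaml: the argument's polymorphism is
   declared via a record with a polymorphic field. *)
type id_fun = { f : 'a. 'a -> 'a }

(* Analogue of  λ(f : ∀(α) α→α) (f[int] 5, f[bool] true)  *)
let use (g : id_fun) = (g.f 5, g.f true)

let () = assert (use { f = (fun x -> x) } = (5, true))
```

The annotation on `g` is mandatory here: it plays the role of the explicit System F annotation on the parameter.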

`let`-bindings are usually not included in presentations of
System F. They’re useful here for homogeneity with the other term
languages presented — ML and MLF. It is interesting to note that,
similarly to ML, `let` cannot be macro-expanded as in the untyped
λ-calculus. It would be possible for an *annotated* let-binding:

```
let x:σ = e₁ in e₂ ⟶ (λ(x:σ) e₂) e₁
```

With the non-annotated `let`, however, the type σ is not syntactically
apparent. Note that this doesn’t introduce any implicitness or
non-trivial inference into the System F type system: System F terms have
unique types, and σ can thus be uniquely determined from `e₁`.
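
For contrast, in ML the obstacle is different: the macro-expanded form is not even typable, because a λ-bound variable is monomorphic. A quick OCaml illustration:

```ocaml
(* let-bound identifiers are generalized, so id can be used at
   several types in the body: *)
let pair = let id = fun x -> x in (id 5, id true)

(* The macro-expanded form is rejected by the typechecker, since
   the λ-bound id must have a single monotype:

     (fun id -> (id 5, id true)) (fun x -> x)
*)
let () = assert (pair = (5, true))
```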

### Inference and System F

Is it possible to keep the expressivity of System F, while not being so explicit about types and polymorphism?

If we drop type annotations completely from System F terms, including type introductions and eliminations, we get back exactly the ML terms as previously defined. But there are terms that have a System F type while not being typable in ML. Is there an algorithm to infer the System F type of a term? An easier question would be: given the System F type of a term, can we automatically recover the type annotations, introductions and eliminations?

The answer to these questions is *no*. Those two questions — which are
confusingly named *type reconstruction* and *type inference*, or
*typability* and *type checking* — are undecidable: this was
proved by J. B. Wells.

In other words, it is not possible to get the expressive power of
System F without using some type annotations. System F terms, using
explicit polymorphism, use *a lot* of type annotations that often feel
redundant. Is it possible to do better?

This question is an active research topic. Programming languages based on the ML type system have evolved and are looking for more expressivity. They include extensions of the ML type system that allow some restricted forms of higher-rank polymorphism, available to the user who correctly inserts type annotations at dedicated places. The idea is to bridge the gap between ML and System F by finding the right compromise between inference and polymorphism annotations.

#### Related work?

A comprehensive overview of the topic should necessarily include
a *Related Works* section, with an overview of the other branch of
research in this area — on a personal note, the *Related Works*
section of a paper is the second thing I read, just after the
introduction.

However, I am not familiar enough with the research in these areas to
feel confident about writing such a review. In case you’re interested,
you should read the *Related Works* section of the Recasting MLF
paper by Didier Le Botlan and Didier Rémy, or the more detailed
section of Boris Yakobowski’s PhD thesis. I was surprised by the
important involvement of Martin Odersky, whom I only knew as the main
designer of the Scala language.

### Principality in rich type systems

Rich type systems make type inference harder: more programs are
typable, which means more work for the inference
algorithm. However, no such monotonic relation holds for
principality: enriching the type system can make principality
*easier*.

The simply typed λ-calculus — which is essentially our presentation of
ML, restricted to monotypes only — has an easy type inference
algorithm based on first-order unification, but does not benefit from
principality. For example, the identity function (λ(x) x) admits any
monotype (τ→τ), none of which is more general^{א} than the others.

By enriching the type system with the polymorphic types of ML, we gain principality: the new type scheme ∀(α) α→α is able to express the commonality in all the derivable monotypes. Type quantification is exactly what we needed to express that, for a given term, some parts of the inferred types do not matter.
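
In OCaml, this is exactly what the inferred scheme for the identity shows: every admissible monotype is recovered by instantiating `'a`.

```ocaml
(* The inferred principal scheme is  val id : 'a -> 'a;
   each use below instantiates 'a at a different monotype. *)
let id = fun x -> x

let () = assert (id 5 = 5)
let () = assert (id true = true)
let () = assert (id "s" = "s")
```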

In other words, principality of type inference requires expressiveness
at the *type level*. During the type inference process on a term, one
often encounters different possibilities. If you cannot express the
alternatives in their full generality in the result type, you cannot
have principality.

### From System F to MLF

MLF is not a possibility in a continuum from ML to System F. It is an extension of System F, that regains principality by enriching the expressiveness of the type level.

Consider the following `choose`

function:

```
choose := (λ(x) λ(y) if true then x else y)
choose : ∀(α) α→α→α
```

If we pass it the identity function as argument, what is the type of the result?

```
id := λ(x) x
id : ∀(α) α→α
```

In ML, which does not allow polymorphism under arrows, the result has type:

```
σ₁ := ∀(β) (β→β) → (β→β)
```

But in a system as expressive as System F, there is a second possibility:

```
σ₂ := (∀(β) β→β) → (∀(β) β→β)
```

Both types are admissible, and neither is more general than the other. We cannot have principality with the System F type language.
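
As a sanity check, OCaml indeed infers the σ₁ shape — after an eta-expansion to work around the value restriction (`choose_id` is my name for the partial application):

```ocaml
let choose x y = if true then x else y  (* choose : 'a -> 'a -> 'a *)
let id x = x                            (* id : 'a -> 'a *)

(* Eta-expanded so the value restriction does not block
   generalization; inferred: ('a -> 'a) -> 'a -> 'a, i.e. σ₁ *)
let choose_id y = choose id y

(* choose returns its first argument, so  choose_id g = id *)
let () = assert (choose_id (fun n -> n + 1) 5 = 5)
```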

It is quite clear that σ₂ is not an instance of σ₁: by going from the
inner quantification (σ₂) to the outer quantification (σ₁), we have
lost the information that the polymorphism is *local*.

What information have we lost in σ₂? It is sound to consider (∀(β) β→β)→(int→int) as an instance of σ₂. It is not sound, however, for (int→int)→(int→int). Indeed, terms of type σ₂ are functions that may use their argument polymorphically; we cannot pass them an (int→int).

The information we have lost is that the argument and the result
types are equal. When the type was written ∀(α)α→α→α, that
*equality* was made explicit by the use of the same variable α for all
components of the type. If we knew, somehow, that the argument and
result types of a term of type (∀(β) β→β) → (∀(β) β→β) are equal,
then we could soundly instantiate it to (int→int)→(int→int).

MLF provides a new way to express equality information inside
types. The most general type for `choose id`, expressing both the
local polymorphism and the equality between argument and result,
is written:

```
∀(α ≥ ∀(β)β→β) α → α
```

א: In the explanation of principality, I claim that the STλC does
not have principality, as it doesn’t have a principal type for
e.g. `λx.x`. This is a somewhat controversial statement, which actually
depends on further details of the STλC. In particular, the atomic types
may be considered either as a set of predefined, constant types
(`int`, `bool`…), or as free type variables (`X`, `Y`…). In the
second case, one could define an instance relation that allows
instantiating free variables, and say that the type `X→X` is a most
general type for the expression `λx.x`.

I’m not sure there is a widely accepted consensus on that point. See
for example the intense debate on LtU. What I meant, anyway, is
that the most general type is not *internalized* in the language:
adding an explicit polymorphism construct `∀α.α→α` allows one to
express this generality *inside* the language of types. This is the
very point I’m trying to make: type inference needs the ability to
express any such variability directly inside the language of types.