A long story about MLF, part 2: System F and principality
System F is a different type system featuring explicit polymorphism: the System F term language contains explicit indications of which terms are polymorphic and when we’re taking monomorphic instances of them.
e ::= system F expressions | x, y, z variables | λ(x:σ) e λ-abstraction | e₁ e₂ application | let x = e₁ in e₂ local definition | Λ(α) e introduction | e[σ] elimination
The following example declares the polymorphic identity function and uses it on the constant 5, in ML and System F:
let id = λ(x) x in id 5 (* ML *) let id : ∀(α)α→α = Λ(α) λ(x:α) x in id[int] 5 (* System F *)
System F terms, being fully explicit about polymorphism, are much heavier to write and read. They can, however, express richer polymorphism than ML:
λ(f : ∀(α) α→α) (f[int] 5, f[bool] true)
This function takes a polymorphic parameter: its type is
(∀(α) α→α) → (int*bool). This was not possible in ML, were the types
on both sides of an arrow had to be monotypes. System F has
σ ::= System F types | α, β, γ type variables | σ → σ function types | ∀(α) σ type quantification
let-bindings are usually not included in presentations of
System F. They’re useful here for homogeneity with the other term
languages presented — ML and MLF. It is interesting to note that,
similarly to ML,
let cannot be macro-expanded as in untyped
λ-calculus. It would be possible for an annotated let-binding:
let x:σ = e₁ in e₂ ⟶ (λ(x:σ) e₂) e₁
With the non-annotated
let, however, the type σ is not syntaxically
apparent. Note that this doesn’t introduce any impliciteness or
non-trivial inference in System F type system: System F terms have
unique types, and σ can thus be uniquely determined from
Inference and System F
Is it possible to keep the expressivity of System F, while not being so explicit about types and polymorphism?
If we drop type annotations completely from System F terms, including type introduction and elimiation, we exactly get back the ML terms as previously defined. But there are terms which have a System F type, while they are not typable in ML. Is there an algorithm to infer the System F type of a term? An easier question would be: given the System F type of a term, can we automatically recover type annotations, introductions and eliminations?
The answer to these questions is no. Those two questions — which are confusingly named type reconstruction and type inference, or typability and type checking — are undecidable : this has been proved by J.B. Wells in the following article.
In other words, it is not possible to get the expressive power of System F without using some type annotations. System F terms, using explicit polymorphism, use a lot of type annotations that often feel redundant. Is it possible to do better?
This question is an active research topic. The programming languages based on the ML type system have evolved and are looking for more expressivity. They have included extensions of the ML type system, allowing some restricted forms of higher-order polymorphism, available to the user that correctly inserts type annotations at dedicated places. The idea is to bridge the gap between ML and System F, by finding the right compromise between inference and polymorphism annotations.
Related work ?
A comprehensive overview of the topic should necessarily include a Related Works section, with an overview of the other branch of research in this area — on a personal note, the Related Works section of a paper is the second thing I read, just after the introduction.
However, I am not familiar enough with the research of these areas to feel confident about writing such a review. In case you’re interested, you should read the Related Works section of the Recasting MLF paper by Didier Le Botlan and Didier Rémy, or the more detailed section of Boris Yakbowski's PhD thesis. I was surprised by the important involvement of Martin Odersky, which I only knew as the main conceptor of the Scala language.
Principality in rich type systems
Rich type systems make type inference harder: more program are typable, which means more work to do for the inference algorithm. However, such a monotonic relation does not hold for principality: enriching the type system can make principality easier.
The simply typed λ-calculus — which is essentially our presentation of ML, restricted to monotypes only — has an easy type inference algorithm based on first-order unification, but does not benefit from principality. For example, the identity function (λ(x) x) admits any monotype (τ→τ), none of them being more generalא than all others.
By enriching the type system with the polymorphic types of ML, we gain principality: the new type scheme ∀(α) α→α is able to express the commonality in all the derivable monotypes. Type quantification is exactly what we needed to express that, for a given term, some parts of the inferred types do not matter.
In other words, principality of type inference requires expressiveness at the type level. During the type inference process on a term, one often encounters different possibilities. If you cannot express the alternatives in their full generality in the result type, you cannot have principality.
From System F to MLF
MLF is not a possibility in a continuum from ML to System F. It is an extension of System F, that regains principality by enriching the expressiveness of the type level.
Consider the following
choose := (λ(x) λ(y) if true then x else y) choose : ∀(α) α→α→α
If we pass it the identity function as argument, what is the type of the result?
id := λ(x) x id : ∀(α) α→α
In ML, which does not allow polymorphism under arrows, the result has type:
σ₁ := ∀(β) (β→β) → (β→β)
But in system as expressive as System F, there is a second possibility:
σ₂ := (∀(β) β→β) → (∀(β) β→β)
Those two types are admissible, and none of them is more general from the other. We cannot have principality with System F type language.
It is quite clear than σ₂ is not an instance of σ₁: by going from an inner quantification (σ₂) to an outer quantification (σ₁), we have lost the information that the polymorphism is local.
What information have we lost in σ₂? It is sound to consider (∀(β) β→β)→(int→int) as an instance of σ₂. It is not sound, however, for (int→int)→(int→int). Indeed, terms with type σ₂ are functions that may use their argument polymorphically, we cannot pass them a (int→int).
The information we have lost is the information that the argument and the result types are equal. When the type was written ∀(α)α→α→α, that equality was made explicit by the use of the same variable α for all components of the type. If we knew, somehow, that the instance and argument types of a term of type (∀(β) β→β) → (∀(β) β→β) are equal, then we could soundly instantiate it to (int→int)→(int→int).
MLF provides a new way to express equality information inside
types. The most general type for
choose id, expressing both local
polymorphism and subtypes equality, is written:
∀(α ≥ ∀(β)β→β) α → α
א: In the explanation of principality, I claim that the STλC does
not have principality, as it doesn’t have a principal type for
λx.x. It is a somewhat controversial statement, that actually
depends on more details about STλC. In particular, the atomic types
may be considered either as a set of predefined, constant types
bool…), or as free type variables (
Y…). In the
second case, one could define an instance relation allowing to
instantiate free variables, and say that the type
X→X is a most
general type for the expression
I’m not sure there is a widely accepted consensus on that point. See
for example the intense debate on LtU. What I meant, anyway, is
that the most general type is not internalized in the language :
adding an explicit polymorphism construct
∀α.α→α allows one to
express this generality inside the language of types. This is the
very point I’m trying to make : type inference needs the ability to
express any such variability directly inside the language of types.