Type theory

Posted: 2016-07-31 , Modified: 2016-07-31

Tags: none

Hindley-Milner type system

The Hindley-Milner type system is a very nice point in the space of possible type systems because there is a reasonable algorithm to deduce the most general type of a program without type annotations, and the inference algorithm is sound and complete with respect to the typing rules.

To understand it, we need to understand what typing rules make sense. For example, application: \[ x:a,\quad f:a\to b\vdash f\,x:b.\] We also need some kind of specialization, \[ x : \forall a.\, F(a) \vdash x : F(a'),\] and we need lambda expressions too.

Ingredients

newtype Var = Var String deriving (Eq, Ord)
newtype TVar = TVar String deriving (Eq, Ord)

data Expr = ExprV Var -- x
	| App Expr Expr -- e_1 e_2
	| Lambda Var Expr -- \lambda x. e
	| Let Var Expr Expr -- let x = e_1 in e_2

data Mono = MonoV TVar -- a
	| TApp String [Mono] -- D t_1 ..., here D names a parametric type (type constructor) with some number of arguments
	deriving (Eq, Ord)

data Poly = PolyM Mono -- t
	| Forall TVar Poly -- \forall a . s
	deriving (Eq, Ord)

(Eq and Ord instances are derived so these types can go in Data.Set sets below.)

(Note: “->” is just a special case of TApp, e.g. the type a -> b is TApp "->" [a, b].)

Monotypes stand for exactly one type (ex. Int -> [Int]), while polytypes can be specialized to many different types (ex. a -> [a], where the forall is implicit).

We need to make a distinction between monotypes and polytypes because quantifiers may only appear at the outermost level: the body underneath the \(\forall\)s is always a monotype.
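As a quick illustration (a sketch using the data types above, treating "->" and "[]" as ordinary type-constructor names), the polytype \(\forall a.\, a \to [a]\) could be written:

idListType :: Poly
idListType = Forall (TVar "a")
	(PolyM (TApp "->" [MonoV (TVar "a"), TApp "[]" [MonoV (TVar "a")]]))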

We also need the notion of free variable. These are variables that have not been captured by a \(\forall\).

{-# LANGUAGE LambdaCase #-}

import qualified Data.Set as S

freeM :: Mono -> S.Set TVar
freeM = \case
	MonoV t -> S.singleton t
	TApp _ ts -> S.unions (map freeM ts)

freeP :: Poly -> S.Set TVar
freeP = \case
	PolyM m -> freeM m
	Forall a s -> S.delete a (freeP s)

(Warning: in Haskell source, all type variables are implicitly universally quantified, so free type variables do not appear in type signatures. See Ex. 1 on Wikipedia.)
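For example (an illustration with made-up variable names), the monotype \(a \to b\) has free variables \(\{a, b\}\), and quantifying over \(a\) removes it:

aToB :: Mono
aToB = TApp "->" [MonoV (TVar "a"), MonoV (TVar "b")]

-- freeM aToB                              == S.fromList [TVar "a", TVar "b"]
-- freeP (Forall (TVar "a") (PolyM aToB))  == S.singleton (TVar "b")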

Next we need the notion of a context, which says which expressions have which types; in particular, it records the types of the variables in scope. For example, inside a let we need to know the context in order to type the body.

data Bindings = Bind Var Poly -- x:s
	deriving (Eq, Ord)

type Context = S.Set Bindings

freeC :: Context -> S.Set TVar
freeC ga = S.unions (map (\case Bind _ s -> freeP s) (S.elems ga))

The polytypes form a partial order \(\si\sqsubseteq \si'\), where \(\si'\) is the more special one (an instance of \(\si\)). Ex. Map k v\(\sqsubseteq\)Map Int v\(\sqsubseteq\)Map Int Int.
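Concretely, the specialization relation is defined (as in the Wikipedia article) by \[ \frac{\tau' = \{\al_i \mapsto \tau_i\}\tau \quad \beta_i \nin \text{free}(\forall \al_1\dots\al_n.\tau)}{\forall \al_1\dots\al_n.\tau \sqsubseteq \forall \beta_1\dots\beta_m. \tau'}. \]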

Axioms

\[\begin{align} \frac{x:\si\in \Ga}{\Ga\vdash x:\si}&& \text{[Var]}\\ \frac{\Ga\vdash e_0:\tau \to \tau'\quad \Ga\vdash e_1:\tau}{\Ga\vdash e_0 \,e_1:\tau'}&& \text{[App]}\\ \frac{\Ga\cup \{ x:\tau\} \vdash e:\tau'}{\Ga \vdash \lambda x.e:\tau \to \tau'} &&\text{[Abs]}\\ \frac{\Ga \vdash e_0:\si\quad \Ga\cup \{x:\si\}\vdash e_1:\tau}{\Ga \vdash \text{let }x=e_0\text{ in }e_1:\tau}&& \text{[Let]}\\ \frac{\Ga \vdash e:\si'\quad \si'\sqsubseteq \si}{\Ga \vdash e:\si}&&\text{[Inst]}\\ \frac{\Ga \vdash e:\si\quad \al\nin \text{free}(\Ga)}{\Ga \vdash e:\forall \al.\si}&&\text{[Gen]}. \end{align}\]

Abs is abstraction and Inst is instantiation. Note that we add to the context when we go inside a lambda or a let. Gen and Inst together let us generalize a type and then specialize it again, possibly differently, using the information in the context.
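For example, if \(\Ga \vdash e : \al\to\al\) with \(\al \nin \text{free}(\Ga)\), then Gen gives \(\Ga \vdash e : \forall \al.\, \al\to\al\), and Inst (since \(\forall \al.\,\al\to\al \sqsubseteq \text{Int}\to\text{Int}\)) gives \(\Ga \vdash e : \text{Int}\to\text{Int}\).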

Subtlety: in a let, variables enter in polymorphic form and can be specialized differently at each use. Contrast \[ \la f. (f \,\text{true}, f \,0) \] with \[ \text{let } f = \la x. x\text{ in } (f\text{ true}, f \, 0). \] The first is not typable in HM, since a lambda-bound \(f\) gets a single monotype, which cannot be both \(\text{Bool}\to a\) and \(\text{Int}\to b\); the second is typable, since the let-bound \(f\) gets the polytype \(\forall a.\, a\to a\). This is why let is NOT just syntactic sugar for \((\la x.e_2)\,e_1\); it genuinely adds expressivity.
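To see this in GHC (a quick illustration; okLet is just a made-up name): the let version compiles, while the lambda version is rejected unless you reach for higher-rank types.

-- Typechecks: f is let-bound, so it is generalized to forall a. a -> a.
okLet :: (Bool, Int)
okLet = let f = \x -> x in (f True, f (0 :: Int))

-- Rejected: f is lambda-bound, so it must have a single monotype.
-- bad = (\f -> (f True, f (0 :: Int))) (\x -> x)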

Algorithm W

The algorithm is simple, but there are a lot of things you have to define first (ex. substitution, instantiation, unification).

First, define a unification algorithm. It takes two types (as ASTs) \(\si,\tau\) and returns a substitution (map) \(U\) with \(U\si = U\tau\), such that any substitution \(R\) unifying \(\si\) and \(\tau\) factors as \(R = S\circ U\) for some substitution \(S\). I.e., it gives the most general unifier. (Unify by making types more specific.)
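For example, unifying \(a\) with \(b \to c\) gives \(U = \{a \mapsto b\to c\}\); a less general unifier such as \(R = \{a\mapsto \text{Int}\to\text{Int},\, b\mapsto \text{Int},\, c\mapsto\text{Int}\}\) factors as \(R = S\circ U\) with \(S = \{b\mapsto\text{Int},\, c\mapsto\text{Int}\}\).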

Algorithm W: Given a context/type environment \(\ol p\) (map from strings to polytypes/schemes), and an expression \(e\), return a substitution and a typing for \(e\) and all subexpressions. (We will denote such a typing by \(\ol{e}_\si\) where \(\si\) is the type for \(e\), and \(\ol{\bullet}\) means that all subexpressions have been annotated.) If \(e\) is a variable \(x\), look up its polytype in the context and instantiate it with fresh type variables. If \(e\) is an application \(e_0\,e_1\), infer types \(\tau_0,\tau_1\) for \(e_0,e_1\) and unify \(\tau_0\) with \(\tau_1\to\beta\) for a fresh \(\beta\). If \(e\) is a lambda \(\la x.e'\), add \(x:\beta\) (fresh \(\beta\)) to the context and infer \(e'\). If \(e\) is a let \(x=e_0\) in \(e_1\), infer \(e_0\), generalize its type over the variables not free in the context, and infer \(e_1\) with \(x\) bound to the resulting polytype. (See the sketch below.)

Note we don’t really need to keep track of the intermediate typings, just the substitutions.
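Here is a minimal self-contained sketch of Algorithm W (my own sketch, not from a particular source): it re-declares simplified types with Strings as variable names instead of the data types above, uses a State counter for fresh type variables, and just calls error on failure.

{-# LANGUAGE LambdaCase #-}
import qualified Data.Map as M
import qualified Data.Set as S
import Control.Monad.State

data Expr = V String | App Expr Expr | Lam String Expr | Let String Expr Expr

data Ty = TV String | TCon String [Ty] deriving Eq  -- monotypes; a -> b is TCon "->" [a, b]
data Scheme = Scheme [String] Ty                    -- forall as . t

type Subst = M.Map String Ty
type Env   = M.Map String Scheme

applyT :: Subst -> Ty -> Ty
applyT s (TV a)      = M.findWithDefault (TV a) a s
applyT s (TCon c ts) = TCon c (map (applyT s) ts)

-- compose s1 s2 applies s2 first, then s1
compose :: Subst -> Subst -> Subst
compose s1 s2 = M.map (applyT s1) s2 `M.union` s1

ftvT :: Ty -> S.Set String
ftvT (TV a)      = S.singleton a
ftvT (TCon _ ts) = S.unions (map ftvT ts)

ftvEnv :: Env -> S.Set String
ftvEnv env = S.unions [ ftvT t S.\\ S.fromList as | Scheme as t <- M.elems env ]

applyEnv :: Subst -> Env -> Env
applyEnv s = M.map (\(Scheme as t) -> Scheme as (applyT (foldr M.delete s as) t))

-- fresh type variables, numbered by a counter in the State monad
fresh :: State Int Ty
fresh = do { n <- get; put (n + 1); return (TV ("t" ++ show n)) }

-- most general unifier (partial: calls error on failure)
mgu :: Ty -> Ty -> Subst
mgu (TV a) t = bind a t
mgu t (TV a) = bind a t
mgu (TCon c ts) (TCon c' ts')
  | c == c' && length ts == length ts' =
      foldl (\s (t1, t2) -> compose (mgu (applyT s t1) (applyT s t2)) s) M.empty (zip ts ts')
mgu _ _ = error "unification failure"

bind :: String -> Ty -> Subst
bind a t
  | t == TV a           = M.empty
  | a `S.member` ftvT t = error "occurs check"
  | otherwise           = M.singleton a t

instantiate :: Scheme -> State Int Ty
instantiate (Scheme as t) = do
  bs <- mapM (const fresh) as
  return (applyT (M.fromList (zip as bs)) t)

generalize :: Env -> Ty -> Scheme
generalize env t = Scheme (S.toList (ftvT t S.\\ ftvEnv env)) t

-- Algorithm W: returns a substitution and the inferred monotype
w :: Env -> Expr -> State Int (Subst, Ty)
w env = \case
  V x -> case M.lookup x env of
    Just sc -> do { t <- instantiate sc; return (M.empty, t) }
    Nothing -> error ("unbound variable " ++ x)
  App e0 e1 -> do
    (s0, t0) <- w env e0
    (s1, t1) <- w (applyEnv s0 env) e1
    b <- fresh
    let s2 = mgu (applyT s1 t0) (TCon "->" [t1, b])
    return (s2 `compose` s1 `compose` s0, applyT s2 b)
  Lam x e -> do
    b <- fresh
    (s, t) <- w (M.insert x (Scheme [] b) env) e
    return (s, TCon "->" [applyT s b, t])
  Let x e0 e1 -> do
    (s0, t0) <- w env e0
    let env' = applyEnv s0 env
    (s1, t1) <- w (M.insert x (generalize env' t0) env') e1
    return (s1 `compose` s0, t1)

-- Example: infer the type of  let id = \x -> x in id id
-- evalState (w M.empty (Let "id" (Lam "x" (V "x")) (App (V "id") (V "id")))) 0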

A subtle point I'm still trying to get clear (Ex. 1 on Wikipedia):

let bar [forall a. forall b. a -> (b -> a)] = \x ->
	let foo [forall b. b -> a] = \y -> x
	in foo
in bar

is the same as

\x -> (\y -> x)

right?

Bottom-up Algorithm W

data Constraint = EqC Mono Mono -- equality constraint: t1 == t2
	| InstM Mono (S.Set TVar) Mono -- implicit instance constraint: t1 <=_M t2
	| GenericInst Mono Poly -- explicit instance constraint: t <= s

Generate the constraint set bottom-up, recursing over the expression \(e\), with one case per constructor of Expr (variable, application, lambda, let).

Note that for the \(\le_M\) constraint, we need to keep a set of monomorphic type variables \(M\) (those introduced by enclosing lambdas) as we recurse down the tree. (Things bound by lambdas DO NOT generalize: in \(\la x. e\), \(x\) can't have two different types/interpretations in \(e\). Thus within the lambda body, \(x\)'s type variable is in the monomorphic set; you can't quantify over it with a \(\forall\).)
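For instance (a sketch using the constraint type above; appConstraint is just an illustrative name, not from the original), an application \(e_0\,e_1\) with \(e_0 : \tau_0\) and \(e_1 : \tau_1\) always contributes an equality constraint \(\tau_0 \equiv \tau_1 \to \beta\) for a fresh \(\beta\):

-- The equality constraint generated for an application, given the types of
-- the function and the argument and a fresh type variable.
appConstraint :: Mono -> Mono -> TVar -> Constraint
appConstraint t0 t1 beta = EqC t0 (TApp "->" [t1, MonoV beta])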

The bottom-up inference rules are different from the usual inference rules; see p. 10 for the algorithm.

Lambda cube

https://en.wikipedia.org/wiki/Lambda_cube

3 dimensions: terms depending on types (polymorphism), types depending on terms (dependent types), and types depending on types (type operators).

Front (types do not depend on types):

|              | None          | Dependent types |
|--------------|---------------|-----------------|
| Polymorphism | F, \(\la2\)   | \(\la P2\)      |
| None         | \(\la_{\to}\) | LF, \(\la P\)   |

Back: (types depending on types)

|              | None                         | Dependent types    |
|--------------|------------------------------|--------------------|
| Polymorphism | \(F_\om\), \(\la \om\)       | CIC, \(\la P\om\)  |
| None         | \(\la_\om\), \(\la\ul{\om}\) | \(\la P \ul{\om}\) |

Hindley-Milner is a subset of System F (in between \(\la_{\to}\) and \(F=\la 2\)). Haskell contains System F (GHC's intermediate language is an extension of it).
