\chapter{Polymorphic variants} \label{c:poly-variant}
%HEVEA\cutname{polyvariant.html}
{\it (Chapter written by Jacques Garrigue)}

Variants as presented in section~\ref{s:tut-recvariants} are a
powerful tool to build data structures and algorithms. However they
sometimes lack flexibility when used in modular programming. This is
due to the fact that every constructor is assigned to a unique type
when defined and used. Even if the same name appears in the definition
of multiple types, the constructor itself belongs to only one type.
Therefore, one cannot decide that a given constructor belongs to
multiple types, or consider a value of some type to belong to some
other type with more constructors.

With polymorphic variants, this original assumption is removed. That
is, a variant tag does not belong to any type in particular, the type
system will just check that it is an admissible value according to its
use. You need not define a type before using a variant tag. A variant
type will be inferred independently for each of its uses.

\section{s:polyvariant:basic-use}{Basic use}

In programs, polymorphic variants work like usual ones. You just have
to prefix their names with a backquote character "`".
\begin{caml_example}{toplevel}
[`On; `Off];;
`Number 1;;
let f = function `On -> 1 | `Off -> 0 | `Number n -> n;;
List.map f [`On; `Off];;
\end{caml_example}
"[>`Off|`On] list" means that to match this list, you should at
least be able to match "`Off" and "`On", without argument.
"[<`On|`Off|`Number of int]" means that "f" may be applied to "`Off",
"`On" (both without argument), or "`Number" $n$ where
$n$ is an integer.
The ">" and "<" inside the variant types show that they may still be
refined, either by defining more tags or by allowing less. As such, they
contain an implicit type variable. Because each of the variant types
appears only once in the whole type, their implicit type variables are
not shown.

The above variant types were polymorphic, allowing further refinement.
When writing type annotations, one will most often describe fixed
variant types, that is types that cannot be refined. This is
also the case for type abbreviations. Such types do not contain "<" or
">", but just an enumeration of the tags and their associated types,
just like in a normal datatype definition.
\begin{caml_example}{toplevel}
type 'a vlist = [`Nil | `Cons of 'a * 'a vlist];;
let rec map f : 'a vlist -> 'b vlist = function
  | `Nil -> `Nil
  | `Cons(a, l) -> `Cons(f a, map f l)
;;
\end{caml_example}

\section{s:polyvariant-advanced}{Advanced use}

Type-checking polymorphic variants is a subtle thing, and some
expressions may result in more complex type information.

\begin{caml_example}{toplevel}
let f = function `A -> `C | `B -> `D | x -> x;;
f `E;;
\end{caml_example}
Here we are seeing two phenomena. First, since this matching is open
(the last case catches any tag), we obtain the type "[> `A | `B]"
rather than "[< `A | `B]" in a closed matching. Then, since "x" is
returned as is, input and return types are identical. The notation "as
'a" denotes such type sharing. If we apply "f" to yet another tag
"`E", it gets added to the list.

\begin{caml_example}{toplevel}
let f1 = function `A x -> x = 1 | `B -> true | `C -> false
let f2 = function `A x -> x = "a" | `B -> true ;;
let f x = f1 x && f2 x;;
\end{caml_example}
Here "f1" and "f2" both accept the variant tags "`A" and "`B", but the
argument of "`A" is "int" for "f1" and "string" for "f2". In "f"'s
type "`C", only accepted by "f1", disappears, but both argument types
appear for "`A" as "int & string". This means that if we
pass the variant tag "`A" to "f", its argument should be {\em both}
"int" and "string". Since there is no such value, "f" cannot be
applied to "`A", and "`B" is the only accepted input.

Even if a value has a fixed variant type, one can still give it a
larger type through coercions. Coercions are normally written with
both the source type and the destination type, but in simple cases the
source type may be omitted.
\begin{caml_example}{toplevel}
type 'a wlist = [`Nil | `Cons of 'a * 'a wlist | `Snoc of 'a wlist * 'a];;
let wlist_of_vlist  l = (l : 'a vlist :> 'a wlist);;
let open_vlist l = (l : 'a vlist :> [> 'a vlist]);;
fun x -> (x :> [`A|`B|`C]);;
\end{caml_example}

You may also selectively coerce values through pattern matching.
\begin{caml_example}{toplevel}
let split_cases = function
  | `Nil | `Cons _ as x -> `A x
  | `Snoc _ as x -> `B x
;;
\end{caml_example}
When an or-pattern composed of variant tags is wrapped inside an
alias-pattern, the alias is given a type containing only the tags
enumerated in the or-pattern. This allows for many useful idioms, like
incremental definition of functions.

\begin{caml_example}{toplevel}
let num x = `Num x
let eval1 eval (`Num x) = x
let rec eval x = eval1 eval x ;;
let plus x y = `Plus(x,y)
let eval2 eval = function
  | `Plus(x,y) -> eval x + eval y
  | `Num _ as x -> eval1 eval x
let rec eval x = eval2 eval x ;;
\end{caml_example}

To make this even more comfortable, you may use type definitions as
abbreviations for or-patterns. That is, if you have defined "type
myvariant = [`Tag1 of int | `Tag2 of bool]", then the pattern "#myvariant" is
equivalent to writing "(`Tag1(_ : int) | `Tag2(_ : bool))".
\begin{caml_eval}
type myvariant = [`Tag1 of int | `Tag2 of bool];;
\end{caml_eval}

Such abbreviations may be used alone,
\begin{caml_example}{toplevel}
let f = function
  | #myvariant -> "myvariant"
  | `Tag3 -> "Tag3";;
\end{caml_example}
or combined with with aliases.
\begin{caml_example}{toplevel}
let g1 = function `Tag1 _ -> "Tag1" | `Tag2 _ -> "Tag2";;
let g = function
  | #myvariant as x -> g1 x
  | `Tag3 -> "Tag3";;
\end{caml_example}

\section{s:polyvariant-weaknesses}{Weaknesses of polymorphic variants}

After seeing the power of polymorphic variants, one may wonder why
they were added to core language variants, rather than replacing them.

The answer is twofold. The first aspect is that while being pretty
efficient, the lack of static type information allows for less
optimizations, and makes polymorphic variants slightly heavier than
core language ones. However noticeable differences would only
appear on huge data structures.

More important is the fact that polymorphic variants, while being
type-safe, result in a weaker type discipline. That is, core language
variants do actually much more than ensuring type-safety, they also
check that you use only declared constructors, that all constructors
present in a data-structure are compatible, and they enforce typing
constraints to their parameters.

For this reason, you must be more careful about making types explicit
when you use polymorphic variants. When you write a library, this is
easy since you can describe exact types in interfaces, but for simple
programs you are probably better off with core language variants.

Beware also that some idioms make trivial errors very hard to find.
For instance, the following code is probably wrong but the compiler
has no way to see it.
\begin{caml_example}{toplevel}
type abc = [`A | `B | `C] ;;
let f = function
  | `As -> "A"
  | #abc -> "other" ;;
let f : abc -> string = f ;;
\end{caml_example}
You can avoid such risks by annotating the definition itself.
\begin{caml_example}{toplevel}[error]
let f : abc -> string = function
  | `As -> "A"
  | #abc -> "other" ;;
\end{caml_example}
