Theory Typeclass_Hierarchy
theory Typeclass_Hierarchy
imports Setup
begin
section ‹Introduction›
text ‹
The {Isabelle/HOL} type-class hierarchy entered the stage
in a quite ancient era -- first related ∗‹NEWS› entries date
back to release {Isabelle99-1}. Since then, there have been
ongoing modifications and additions, leading to ({Isabelle2016})
more than 180 type-classes. This sheer complexity makes access
and understanding of that type-class hierarchy difficult and
involved, let alone maintenance.
The purpose of this primer is to shed some light on this,
not only on the mere ingredients, but also on the design
principles which have evolved and proven useful over time.
›
section ‹Foundations›
subsection ‹Locales and type classes›
text ‹
Some familiarity with the Isabelle module system is assumed:
defining locales and type-classes, interpreting locales and
instantiating type-classes, adding relationships between
locales (@{command sublocale}) and type-classes
(@{command subclass}). Handy introductions are the
respective tutorials \<^cite>‹"isabelle-locale"›
\<^cite>‹"isabelle-classes"›.
›
subsection ‹Strengths and restrictions of type classes›
text ‹
The primary motivation for using type classes in {Isabelle/HOL}
always have been numerical types, which form an inclusion chain:
\begin{center}
\<^typ>‹nat› ‹⊏› \<^typ>‹int› ‹⊏› \<^typ>‹rat›
‹⊏› \<^typ>‹real› ‹⊏› \<^typ>‹complex›
\end{center}
\noindent The inclusion ‹⊏› means that any value of the numerical
type to the left hand side mathematically can be transferred
to the numerical type on the right hand side.
How to accomplish this given the quite restrictive type system
of {Isabelle/HOL}? Paulson \<^cite>‹"paulson-numerical"› explains
that each numerical type has some characteristic properties
which define an characteristic algebraic structure; ‹⊏›
then corresponds to specialization of the corresponding
characteristic algebraic structures. These algebraic structures
are expressed using algebraic type classes and embeddings
of numerical types into them:
\begin{center}\begin{tabular}{lccc}
\<^term>‹of_nat› ‹::› & \<^typ>‹nat› & ‹⇒› & @{typ [source] "'a::semiring_1"} \\
& ‹⊓› & & ‹↑› \\
\<^term>‹of_int› ‹::› & \<^typ>‹int› & ‹⇒› & @{typ [source] "'a::ring_1"} \\
& ‹⊓› & & ‹↑› \\
\<^term>‹of_rat› ‹::› & \<^typ>‹rat› & ‹⇒› & @{typ [source] "'a::field_char_0"} \\
& ‹⊓› & & ‹↑› \\
\<^term>‹of_real› ‹::› & \<^typ>‹real› & ‹⇒› & @{typ [source] "'a::real_algebra_1"} \\
& ‹⊓› \\
& \<^typ>‹complex›
\end{tabular}\end{center}
\noindent ‹d ← c› means that ‹c› is subclass of ‹d›.
Hence each characteristic embedding ‹of_num› can transform
any value of type ‹num› to any numerical type further
up in the inclusion chain.
This canonical example exhibits key strengths of type classes:
▪ Sharing of operations and facts among different
types, hence also sharing of notation and names: there
is only one plus operation using infix syntax ‹+›,
only one zero written ‹0›, and neutrality
(@{thm (frugal_sorts) add_0_left [all, no_vars]} and
@{thm (frugal_sorts) add_0_right [all, no_vars]})
is referred to
uniformly by names @{fact add_0_left} and @{fact add_0_right}.
▪ Proof tool setups are shared implicitly:
@{fact add_0} and @{fact add_0_right} are simplification
rules by default.
▪ Hence existing proofs about particular numerical
types are often easy to generalize to algebraic structures,
given that they do not depend on specific properties of
those numerical types.
Considerable restrictions include:
▪ Type class operations are restricted to one
type parameter; this is insufficient e.g. for expressing
a unified power operation adequately (see \secref{sec:power}).
▪ Parameters are fixed over the whole type class
hierarchy and cannot be refined in specific situations:
think of integral domains with a predicate \<^term>‹is_unit›;
for natural numbers, this degenerates to the much simpler
@{term [source] "HOL.equal (1::nat)"} but facts
refer to \<^term>‹is_unit› nonetheless.
▪ Type classes are not apt for meta-theory. There
is no practically usable way to express that the units
of an integral domain form a multiplicative group using
type classes. But see session ‹HOL-Algebra›
which provides locales with an explicit carrier.
›
subsection ‹Navigating the hierarchy›
text ‹
An indispensable tool to inspect the class hierarchy is
@{command class_deps} which displays the graph of classes,
optionally showing the logical content for each class also.
Optional parameters restrict the graph to a particular segment
which is useful to get a graspable view. See
the Isar reference manual \<^cite>‹"isabelle-isar-ref"› for details.
›
section ‹The hierarchy›
subsection ‹Syntactic type classes \label{sec:syntactic-type-class}›
text ‹
At the top of the hierarchy there are a couple of syntactic type
classes, ie. classes with operations but with no axioms,
most notably:
▪ @{command class} \<^class>‹plus› with @{term [source] "(a::'a::plus) + b"}
▪ @{command class} \<^class>‹zero› with @{term [source] "0::'a::zero"}
▪ @{command class} \<^class>‹times› with @{term [source] "(a::'a::times) * b"}
▪ @{command class} \<^class>‹one› with @{term [source] "1::'a::one"}
\noindent Before the introduction of the @{command class} statement in
Isabelle \<^cite>‹"Haftmann-Wenzel:2006:classes"› it was impossible
to define operations with associated axioms in the same class,
hence there were always pairs of syntactic and logical type
classes. This restriction is lifted nowadays, but there are
still reasons to maintain syntactic type classes:
▪ Syntactic type classes allow generic notation to be used
regardless of a particular logical interpretation; e.g.
although multiplication ‹*› is usually associative,
there are examples where it is not (e.g. octonions), and
leaving ‹*› without axioms allows to re-use this
syntax by means of type class instantiation also for such
exotic examples.
▪ Type classes might share operations but not necessarily
axioms on them, e.g. \<^term>‹gcd› (see \secref{sec:gcd}).
Hence their common base is a syntactic type class.
\noindent However syntactic type classes should only be used with striking
cause. Otherwise there is risk for confusion if the notation
suggests properties which do not hold without particular
constraints. This can be illustrated using numerals
(see \secref{sec:numerals}): @{lemma "2 + 2 = 4" by simp} is
provable without further ado, and this also meets the typical
expectation towards a numeral notation; in more ancient releases
numerals were purely syntactic and \<^prop>‹2 + 2 = 4› was
not provable without particular type constraints.
›
subsection ‹Additive and multiplicative semigroups and monoids›
text ‹
In common literature, notation for semigroups and monoids
is either multiplicative ‹(*, 1)› or additive
‹(+, 0)› with underlying properties isomorphic.
In {Isabelle/HOL}, this is accomplished using the following
abstract setup:
▪ A \<^locale>‹semigroup› introduces an abstract binary
associative operation.
▪ A \<^locale>‹monoid› is an extension of \<^locale>‹semigroup›
with a neutral element.
▪ Both \<^locale>‹semigroup› and \<^locale>‹monoid› provide
dedicated syntax for their operations ‹(❙*, ❙1)›.
This syntax is not visible on the global theory level
but only for abstract reasoning inside the respective
locale.
▪ Concrete global syntax is added building on existing
syntactic type classes \secref{sec:syntactic-type-class}
using the following classes:
▪ @{command class} \<^class>‹semigroup_mult› = \<^class>‹times›
▪ @{command class} \<^class>‹monoid_mult› = \<^class>‹one› + \<^class>‹semigroup_mult›
▪ @{command class} \<^class>‹semigroup_add› = \<^class>‹plus›
▪ @{command class} \<^class>‹monoid_add› = \<^class>‹zero› + \<^class>‹semigroup_add›
Locales \<^locale>‹semigroup› and \<^locale>‹monoid› are
interpreted (using @{command sublocale}) into their
corresponding type classes, with prefixes ‹add›
and ‹mult›; hence facts derived in \<^locale>‹semigroup›
and \<^locale>‹monoid› are propagated simultaneously to
∗‹both› using a consistent naming policy, ie.
▪ @{fact semigroup.assoc}: @{thm (frugal_sorts) semigroup.assoc [all, no_vars]}
▪ @{fact mult.assoc}: @{thm (frugal_sorts) mult.assoc [all, no_vars]}
▪ @{fact add.assoc}: @{thm (frugal_sorts) add.assoc [all, no_vars]}
▪ @{fact monoid.right_neutral}: @{thm (frugal_sorts) monoid.right_neutral [all, no_vars]}
▪ @{fact mult.right_neutral}: @{thm (frugal_sorts) mult.right_neutral [all, no_vars]}
▪ @{fact add.right_neutral}: @{thm (frugal_sorts) add.right_neutral [all, no_vars]}
▪ Note that the syntax in \<^locale>‹semigroup› and \<^locale>‹monoid›
is bold; this avoids clashes when writing properties
inside one of these locales in presence of that global
concrete type class syntax.
\noindent That hierarchy extends in a straightforward manner
to abelian semigroups and commutative monoids\footnote{The
designation ∗‹abelian› is quite standard concerning
(semi)groups, but not for monoids}:
▪ Locales \<^locale>‹abel_semigroup› and \<^locale>‹comm_monoid›
add commutativity as property.
▪ Concrete syntax emerges through
▪ @{command class} \<^class>‹ab_semigroup_add› = \<^class>‹semigroup_add›
▪ @{command class} \<^class>‹ab_semigroup_mult› = \<^class>‹semigroup_mult›
▪ @{command class} \<^class>‹comm_monoid_add› = \<^class>‹zero› + \<^class>‹ab_semigroup_add›
▪ @{command class} \<^class>‹comm_monoid_mult› = \<^class>‹one› + \<^class>‹ab_semigroup_mult›
and corresponding interpretation of the locales above, yielding
▪ @{fact abel_semigroup.commute}: @{thm (frugal_sorts) abel_semigroup.commute [all, no_vars]}
▪ @{fact mult.commute}: @{thm (frugal_sorts) mult.commute [all, no_vars]}
▪ @{fact add.commute}: @{thm (frugal_sorts) add.commute [all, no_vars]}
›
paragraph ‹Named collection of theorems›
text ‹
Locale interpretation interacts smoothly with named collections of
theorems as introduced by command @{command named_theorems}. In our
example, rules concerning associativity and commutativity are no
simplification rules by default since they desired orientation may
vary depending on the situation. However, there is a collection
@{fact ac_simps} where facts @{fact abel_semigroup.assoc},
@{fact abel_semigroup.commute} and @{fact abel_semigroup.left_commute}
are declared as members. Due to interpretation, also
@{fact mult.assoc}, @{fact mult.commute} and @{fact mult.left_commute}
are also members of @{fact ac_simps}, as any corresponding facts
stemming from interpretation of \<^locale>‹abel_semigroup›.
Hence adding @{fact ac_simps} to the simplification rules for
a single method call uses all associativity and commutativity
rules known by means of interpretation.
›
subsection ‹Additive and multiplicative groups›
text ‹
The hierarchy for inverse group operations takes into account
that there are weaker algebraic structures with only a partially
inverse operation. E. g. the natural numbers have bounded
subtraction \<^term>‹m - (n::nat)› which is only an inverse
operation if \<^term>‹m ≥ (n::nat)›; unary minus ‹-›
is pointless on the natural numbers.
Hence for both additive and multiplicative notation there
are syntactic classes for inverse operations, both unary
and binary:
▪ @{command class} \<^class>‹minus› with @{term [source] "(a::'a::minus) - b"}
▪ @{command class} \<^class>‹uminus› with @{term [source] "- a::'a::uminus"}
▪ @{command class} \<^class>‹divide› with @{term [source] "(a::'a::divide) div b"}
▪ @{command class} \<^class>‹inverse› = \<^class>‹divide› with @{term [source] "inverse a::'a::inverse"}
\\ and @{term [source] "(a::'a::inverse) / b"}
\noindent Here \<^class>‹inverse› specializes the ``partial'' syntax
@{term [source] "a div b"} to the more specific
@{term [source] "a / b"}.
Semantic properties are added by
▪ @{command class} \<^class>‹cancel_ab_semigroup_add› = \<^class>‹ab_semigroup_add› + \<^class>‹minus›
▪ @{command class} \<^class>‹cancel_comm_monoid_add› = \<^class>‹cancel_ab_semigroup_add› + \<^class>‹comm_monoid_add›
\noindent which specify a minimal binary partially inverse operation as
▪ @{fact add_diff_cancel_left'}: @{thm (frugal_sorts) add_diff_cancel_left' [all, no_vars]}
▪ @{fact diff_diff_add}: @{thm (frugal_sorts) diff_diff_add [all, no_vars]}
\noindent which in turn allow to derive facts like
▪ @{fact add_left_imp_eq}: @{thm (frugal_sorts) add_left_imp_eq [all, no_vars]}
\noindent The total inverse operation is established as follows:
▪ Locale \<^locale>‹group› extends the abstract hierarchy with
the inverse operation.
▪ The concrete additive inverse operation emerges through
▪ @{command class} \<^class>‹group_add› = \<^class>‹minus› + \<^class>‹uminus› + \<^class>‹monoid_add› (in \<^theory>‹HOL.Groups›) \\
▪ @{command class} \<^class>‹ab_group_add› = \<^class>‹minus› + \<^class>‹uminus› + \<^class>‹comm_monoid_add› (in \<^theory>‹HOL.Groups›)
and corresponding interpretation of locale \<^locale>‹group›, yielding e.g.
▪ @{fact group.left_inverse}: @{thm (frugal_sorts) group.left_inverse [all, no_vars]}
▪ @{fact add.left_inverse}: @{thm (frugal_sorts) add.left_inverse [all, no_vars]}
\noindent There is no multiplicative counterpart. Why? In rings,
the multiplicative group excludes the zero element, hence
the inverse operation is not total. See further \secref{sec:rings}.
›
paragraph ‹Mitigating against redundancy by default simplification rules›
text ‹
Inverse operations imposes some redundancy on the type class
hierarchy: in a group with a total inverse operation, the
unary operation is simpler and more primitive than the binary
one; but we cannot eliminate the binary one in favour of
a mere syntactic abbreviation since the binary one is vital
to express a partial inverse operation.
This is mitigated by providing suitable default simplification
rules: expression involving the unary inverse operation are
simplified to binary inverse operation whenever appropriate.
The rationale is that simplification is a central device in
explorative proving, where proof obligation remaining after certain
default proof steps including simplification are inspected
to get an idea what is missing to finish a proof. When
preferable normal forms are encoded into
default simplification rules, proof obligations after simplification
are normalized and hence more proof-friendly.
›
end