theory Further
imports Setup
begin

section ‹Further issues \label{sec:further}›

subsection ‹Incorporating generated code directly into the system runtime -- ‹code_reflect››

subsubsection ‹Static embedding of generated code into the system runtime›

text ‹
  The @{ML_antiquotation code} antiquotation is lightweight, but the generated code
  is only accessible while the ML section is processed.  Sometimes this
  is not appropriate, especially if the generated code contains datatype
  declarations which are shared with other parts of the system.  In these
  cases, @{command_def code_reflect} can be used:
›

code_reflect %quote Sum_Type
  datatypes sum = Inl | Inr
  functions "Sum_Type.sum.projl" "Sum_Type.sum.projr"

text ‹
  \noindent @{command code_reflect} takes a structure name and
  references to datatypes and functions; for these, code is compiled
  into the named ML structure, and the \emph{Eval} target is modified
  such that subsequent code generation will reference these
  precompiled versions of the given datatypes and functions.  This
  also makes it possible to refer to the given datatypes and functions
  from arbitrary ML code.

  A typical example for @{command code_reflect} can be found in the
  \<^theory>‹HOL.Predicate› theory.
›
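
text ‹
  \noindent As a brief sketch (the ML names used here follow the
  declarations in the @{command code_reflect} invocation above), the
  reflected datatype can be used from an ordinary ML block:
›

ML %quote ‹
  (* build and take apart values of the reflected datatype in plain ML;
     the functions listed above are compiled into the same structure *)
  fun description (Sum_Type.Inl _) = "left"
    | description (Sum_Type.Inr _) = "right";

  val example = description (Sum_Type.Inl 42);
›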


subsubsection ‹Separate compilation›

text ‹
  For technical reasons it is sometimes necessary to separate
  generation and compilation of code which is supposed to be used in
  the system runtime.  For this, @{command code_reflect} with an
  optional ‹file_prefix› argument can be used:
›

code_reflect %quote Rat
  datatypes rat
  functions Fract
    "(plus :: rat ⇒ rat ⇒ rat)" "(minus :: rat ⇒ rat ⇒ rat)"
    "(times :: rat ⇒ rat ⇒ rat)" "(divide :: rat ⇒ rat ⇒ rat)"
  file_prefix rat

text ‹
  \noindent This generates the referenced code as logical files within the theory context,
  similar to @{command export_code}.
›
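
text ‹
  \noindent For comparison, a rough sketch of an explicit export of ‹Fract›
  with the same file prefix (the target language and module name are chosen
  here purely for illustration):
›

export_code %quote Fract in SML module_name Rat file_prefix rat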

subsection ‹Specialities of the ‹Scala› target language \label{sec:scala}›

text ‹
  ‹Scala› deviates from languages of the ML family in a couple
  of aspects; those which affect code generation mainly have to do with
  ‹Scala›'s type system:

  \begin{itemize}

    \item ‹Scala› prefers tupled syntax over curried syntax.

    \item ‹Scala› sacrifices Hindley-Milner type inference for a
      much richer type system with subtyping etc.  For this reason
      type arguments sometimes have to be given explicitly in square
      brackets (mimicking System F syntax).

    \item In contrast to ‹Haskell›, where most specialities of
      the type system are implemented using \emph{type classes},
      ‹Scala› provides a sophisticated system of \emph{implicit
      arguments}.

  \end{itemize}

  \noindent Concerning currying, the ‹Scala› serializer counts
  arguments in code equations to determine how many arguments
  shall be tupled; remaining arguments, and abstractions occurring in
  terms rather than in function definitions, are always curried.

  The second aspect affects user-defined adaptations with @{command
  code_printing}.  For regular terms, the ‹Scala› serializer prints
  all type arguments explicitly.  For user-defined term adaptations
  this is only possible for adaptations which take no arguments: here
  the type arguments are just appended.  Otherwise they are ignored;
  hence user-defined adaptations for polymorphic constants have to be
  designed very carefully to avoid ambiguity.

  Note also that name clashes can occur when generating ‹Scala› code
  to case-insensitive file systems; these can be avoided using the
  ‹(case_insensitive)› option for @{command export_code}.
›
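
text ‹
  \noindent To see these conventions in the generated output, the ‹Scala›
  code for a familiar polymorphic constant can be inspected directly; as a
  small sketch, for ‹List.map›:
›

text %quote ‹@{code_stmts List.map (Scala)}›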


subsection ‹Modules namespace›

text ‹
  When invoking the @{command export_code} command it is possible to
  leave out the @{keyword "module_name"} part; then code is
  distributed over different modules, where the module name space
  is roughly induced by the Isabelle theory name space.

  Sometimes the awkward situation occurs that dependencies
  between definitions introduce cyclic dependencies between modules,
  which in the ‹Haskell› world leaves you at the mercy of the
  ‹Haskell› implementation you are using, while for ‹SML› and
  ‹OCaml›, whose module systems do not permit cyclic dependencies,
  code generation is not possible at all.

  A solution is to declare module names explicitly.  Let us assume
  the three cyclically dependent modules are named \emph{A}, \emph{B}
  and \emph{C}.  Then, by stating
›

code_identifier %quote
  code_module A  (SML) ABC
| code_module B  (SML) ABC
| code_module C  (SML) ABC

text ‹
  \noindent we explicitly map all those modules onto \emph{ABC},
  resulting in an ad-hoc merge of these three modules at serialisation
  time.
›

subsection ‹Locales and interpretation›

text ‹
  A technical issue comes to the surface when generating code from
  specifications stemming from locale interpretation into global
  theories.

  Let us assume a locale specifying a power operation on arbitrary
  types:
›

locale %quote power =
  fixes power :: "'a ⇒ 'b ⇒ 'b"
  assumes power_commute: "power x ∘ power y = power y ∘ power x"
begin

text ‹
  \noindent Inside that locale we can lift ‹power› to exponent
  lists by means of a specification relative to that locale:
›

primrec %quote powers :: "'a list ⇒ 'b ⇒ 'b" where
  "powers [] = id"
| "powers (x # xs) = power x ∘ powers xs"

lemma %quote powers_append:
  "powers (xs @ ys) = powers xs  powers ys"
  by (induct xs) simp_all

lemma %quote powers_power:
  "powers xs  power x = power x  powers xs"
  by (induct xs)
    (simp_all del: o_apply id_apply add: comp_assoc,
      simp del: o_apply add: o_assoc power_commute)

lemma %quote powers_rev:
  "powers (rev xs) = powers xs"
    by (induct xs) (simp_all add: powers_append powers_power)

end %quote

text ‹
  After an interpretation of this locale (say, @{command_def
  global_interpretation} ‹fun_power:› @{term [source] "power (λn (f
  :: 'a ⇒ 'a). f ^^ n)"}), one could naively expect to have a constant
  ‹fun_power.powers :: nat list ⇒ ('a ⇒ 'a) ⇒ 'a ⇒ 'a› for which code
  can be generated.  But this is not the case: internally, the term
  ‹fun_power.powers› is an abbreviation for the foundational
  term @{term [source] "power.powers (λn (f :: 'a ⇒ 'a). f ^^ n)"}
  (see \<^cite>‹"isabelle-locale"› for the details behind this).

  Fortunately, a succinct solution is available: a dedicated
  rewrite definition:
›

global_interpretation %quote fun_power: power "(λn (f :: 'a ⇒ 'a). f ^^ n)"
  defines funpows = fun_power.powers
  by unfold_locales
    (simp_all add: fun_eq_iff funpow_mult mult.commute)

text ‹
  \noindent This amends the interpretation morphisms such that
  occurrences of the foundational term @{term [source] "power.powers (λn (f :: 'a ⇒ 'a). f ^^ n)"}
  are folded to a newly defined constant \<^const>‹funpows›.

  After this setup procedure, code generation can continue as usual:
›

text %quote ‹@{code_stmts funpows constant: Nat.funpow funpows (Haskell)}›


subsection ‹Parallel computation›

text ‹
  Theory ‹Parallel› in 🗀‹~~/src/HOL/Library› contains
  operations to exploit parallelism inside the Isabelle/ML
  runtime engine.
›

subsection ‹Imperative data structures›

text ‹
  If you regard imperative data structures as inevitable for a
  specific application, you should consider \emph{Imperative
  Functional Programming with Isabelle/HOL}
  \<^cite>‹"bulwahn-et-al:2008:imperative"›; the framework described there
  is available in session ‹Imperative_HOL›, together with a
  short primer document.
›

end