[wg-camlp4] On domain-specific foreign syntaxes

Alain Frisch alain at frisch.fr
Fri Feb 1 09:15:16 GMT 2013


On 02/01/2013 08:38 AM, Alain Frisch wrote:
> (In the Parsetree, we
> would have "Pexp_extension of expression * expression" instead of
> "Pexp_extension string * expression".)  This is to support passing
> arguments to the "expander".

After some more thought, I've changed my mind.

Following one of Hongbo's suggestions, let's try to think in terms of 
abstract syntax first to clarify the concepts.

I believe we need two "modalities" for generic placeholders in the AST: 
annotations and extensions.  An annotation annotates an existing 
fragment of the AST and can technically be ignored simply by dropping 
it.  An extension is something which does not have any built-in meaning 
and has to be expanded during the AST transformation (or maybe during 
type-checking to support OCaml Templates-like processing).  Here is a 
minimalistic way to support this in Parsetree:

  | Ptyp_annotation of core_type * annotation
  | Ptyp_extension of extension

  | Ppat_annotation of pattern * annotation
  | Ppat_extension of extension

  | Pmod_annotation of module_expr * annotation
  | Pmod_extension of extension

  | Ptype_annotation of type_kind * annotation
  | Ptype_extension of extension

  | Pexp_annotation of expression * annotation
  | Pexp_extension of extension

  | ...
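
To make the difference between the two modalities concrete, here is a minimal sketch on a toy expression type. This is not the real Parsetree: the constructor names (Int, Var, App, Annotation, Extension) and both functions are hypothetical, and annotations and extensions are assumed to carry bare expressions.

```ocaml
(* Toy AST, for illustration only; these names are not the Parsetree ones. *)
type expr =
  | Int of int
  | Var of string
  | App of expr * expr list
  | Annotation of expr * annotation   (* plays the role of Pexp_annotation *)
  | Extension of extension            (* plays the role of Pexp_extension *)
and annotation = expr
and extension = expr

(* An annotation can be ignored: dropping it keeps the annotated fragment. *)
let rec strip_annotations = function
  | (Int _ | Var _ | Extension _) as e -> e
  | App (f, args) -> App (strip_annotations f, List.map strip_annotations args)
  | Annotation (e, _) -> strip_annotations e

(* An extension cannot be ignored: something must expand it, so a later
   phase would want to know whether any is left in the annotated code. *)
let rec has_extension = function
  | Int _ | Var _ -> false
  | App (f, args) -> has_extension f || List.exists has_extension args
  | Annotation (e, _) -> has_extension e
  | Extension _ -> true
```

Note that strip_annotations has no sensible counterpart for Extension: there is nothing underneath to fall back to, which is exactly why the two cases need separate constructors.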


We can choose:

type annotation = expression
and extension = expression

or maybe:

type annotation = string * expression
and extension = string * expression

I think I prefer to use bare expressions and encode the "markers" in 
them, because this does not force us to hard-code the nature of 
markers. An argument in favor of keeping an explicit "string", though, 
is that it defines the "namespace" of the extension more explicitly.
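
As a rough sketch of what "encoding the marker in the expression" could mean, assume the marker is the head identifier of the payload. The toy expr type and the helper names (marker_of_bare, to_named) are hypothetical, and the head-identifier convention is only one possible choice.

```ocaml
(* Toy expression type, for illustration only. *)
type expr =
  | Ident of string
  | Const of int
  | Apply of expr * expr list

(* Option 1: the payload is a bare expression. *)
type extension_bare = expr

(* Option 2: the namespace is an explicit string next to the payload. *)
type extension_named = string * expr

(* With bare expressions, a tool recovers the marker by convention,
   here: the identifier in head position. *)
let marker_of_bare (e : extension_bare) : string option =
  match e with
  | Ident name | Apply (Ident name, _) -> Some name
  | _ -> None

(* Converting to the explicit form only works when the payload follows
   the convention and has exactly one argument. *)
let to_named (e : extension_bare) : extension_named option =
  match e with
  | Apply (Ident name, [arg]) -> Some (name, arg)
  | _ -> None
```

The bare form accepts payloads such as a lone constant that have no marker at all, which illustrates the trade-off: more freedom for tools, but no namespace guaranteed by the type.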


I don't believe we should add a further distinction between the 
annotations which can actually be ignored by the compiler and those on 
which the type-checker must complain.  This can be left to a choice of
syntax which combines an annotation and an extension.

Similarly, I'm not sure we should hard-code/enforce the fact that an AST 
mapper should only be able to expand "under" extensions.  Extensions 
could also be used as markers which can trigger "local enough" 
rewriting. (See examples "Bolt" and "PG'OCaml" below.)
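
A minimal sketch of such "local enough" rewriting, on a toy AST: the mapper matches the application node around a LOG marker, not only the extension node itself. The expr type, the expand function and the target function name log_debug are all hypothetical; nothing here reflects Bolt's actual API.

```ocaml
(* Toy AST, for illustration only. *)
type expr =
  | Var of string
  | Str of string
  | App of expr * expr list
  | Extension of expr

let rec expand = function
  (* The marker sits in function position; the rewriting consumes the
     enclosing application, i.e. it works "around" the extension. *)
  | App (Extension (Var "LOG"), args) ->
      App (Var "log_debug", List.map expand args)
  | App (f, args) -> App (expand f, List.map expand args)
  (* Any extension still present has no meaning: report it. *)
  | Extension _ -> failwith "unexpanded extension"
  | (Var _ | Str _) as e -> e
```

With this toy mapper, an input standing for (#LOG) "funct(%d)" n becomes an ordinary application, while an extension no rule knows about raises an error instead of being silently dropped.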


Just to fix ideas, let's give some examples, assuming the following 
syntax:

   (# e)       ->  extension
   ... (+ e)   ->  annotation (with light postfix syntax)
   (@ e) ...   ->  annotation (with light prefix syntax)

   (@(e) ...)  ->  annotation (explicit scope, prefix syntax)

and maybe a derived "non-ignorable annotation":

   ... (& e)  ===  ... (+ (# e))

and syntactic variants such as:

   let(+e) p = ... === (@ e)(let p = ...)

and also something which combines an extension and a quotation:

   (:id x[...]x)   === (# id {x{...}x})
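
The derived "non-ignorable annotation" ... (& e) needs no new AST node if it is read as ... (+ (# e)). Here is a small sketch on a toy AST; the type and the two function names (non_ignorable, requires_expansion) are hypothetical.

```ocaml
(* Toy AST, for illustration only. *)
type expr =
  | Var of string
  | Annotation of expr * expr   (* fragment, annotation payload *)
  | Extension of expr

(* ... (& e)  ===  ... (+ (# e)) *)
let non_ignorable fragment marker = Annotation (fragment, Extension marker)

(* A checker that looks inside annotation payloads will complain about a
   non-ignorable annotation that nobody expanded, while plain annotations
   pass through untouched. *)
let rec requires_expansion = function
  | Var _ -> false
  | Extension _ -> true
  | Annotation (e, payload) ->
      requires_expansion e || requires_expansion payload
```

This matches the point made earlier that the ignorable/non-ignorable distinction can be left to syntax rather than hard-coded in the AST.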



Some examples:

   "Bisect"

         let f x =
           match List.map foo [x; a x; b x] with
           | [y1; y2; y3] -> tata
           | _ -> (@ Bisect.visit) assert false


         (@(Bisect.ignore)
           let unused = ()
         )


   "type-conv"

type t = {
   x : int (+ default 42);
   y : int (+ default 3) (+ sexp_drop_default);
   z : int (+ default 3) (+ sexp_drop_if z_test);
} (+ sexp)


   "map/fold generators"

type variable = string
  and term =
   | Var of variable
   | Lam of variable * term
   | App of term * term


class map = (# (Generate.map : term))


   "lwt"

let(+lwt) x = ...
and y = ...
in
try(+lwt) ...
with Killed -> ...


   "Bolt"

let funct n =
   (#LOG) "funct(%d)" n LEVEL DEBUG;
(*or: (#LOG "funct(%d)" n LEVEL DEBUG); *)
   for i = 1 to n do
     print_endline "..."
   done

   "bitstring"

let bits = Bitstring.bitstring_of_file "/bin/ls" in
match(&bitmatch) bits with
| (# [ 0x7f, 8; "ELF", 24, string; (* ELF magic number *)
        e_ident, 12*8, bitstring;    (* ELF identifier *)
        e_type, 16, littleendian;    (* object file type *)
        e_machine, 16, littleendian  (* architecture *)
   ]) ->
   printf "This is an ELF binary, type %d, arch %d\n"
     e_type e_machine;

(* here I've used (&...) instead of (+...) because the interpretation of 
the pattern really depends on this annotation *)
(*  note: without the 12*8 sub-expression, this would also have been
     a valid pattern and we could have avoided the extra (# ...) used
     here only to allow an expression in place of a pattern.  If we can
     somehow bring expressions and patterns closer in the syntax,
     we could avoid this extra little overhead. *)


   "PG'OCaml"

let fetch_users dbh =
   (#PGSQL dbh) "select id, name from users"
(*
   or: (#PGSQL) dbh "select id, name from users"
   or: (#PGSQL dbh "select id, name from users")
*)


  "Cass"

let button = (:css [
    .button {
      $Css.gradient ~low:color2 ~high:color1$;
      color: white;
      $Css.top_rounded$;
    }
  ])



-- Alain

