[wg-camlp4] On domain-specific foreign syntaxes

Gabriel Scherer gabriel.scherer at gmail.com
Thu Jan 31 13:48:31 GMT 2013


> When fragments of JSON/XML/etc are created programmatically, this is often done "piece by piece",
> with very small fragments involving a high density of anti-quotations.
> The benefits of using the foreign syntax here are not clear to me.

This is an argument that I would very much like to be convinced of,
but that does not really match my experience. I have found this style
of piece-by-piece code difficult to understand and maintain, and when
I find the time to do things properly I very much try to have a
central piece of code that reflects the foreign structure (as a
quasiquotation when possible, or a dense expression using combinators
otherwise), preceded by the auxiliary definitions. (This is not always
possible or easy when data needs to flow from one part of the
generated AST to others.)

As an example, you are probably familiar with the following piece of
code in the OCaml type-checker, that builds a non-trivial part of
typedtree in a semi-piecewise way to handle a corner case of optional
argument functions (and is probably going away someday), from L2784 to
L2813 here:
  https://github.com/ocaml/ocaml/blob/1b5d02346c8e477dbe38dd883d3b6f430924190d/typing/typecore.ml#L2784

This code is, quite frankly, painful to make sense of, and I
think this is in large part due to the fact that it builds the
typedtree piecewise (relatively; it could be even more fine-grained)
instead of building the entire expression at once, that is (in approximate
Camlp{4,5} syntax)
  <:texp<
     let $x,tty$ = $texp$ in
     (fun $y,targ$ -> $x$ $list:zip labels none$ $y,targ$)
  >>;
that said, a smart combinator library could produce an equally (or
more, or less, depending on who you ask) readable version, for
instance:
  tlet x tty (fun x ->
    tfun y targ (fun y ->
      tnapp x (zip labels none @ [y])))
and in this case the "domain-specific syntax" argument does not hold.
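
To make the combinator version concrete, here is a minimal sketch over a
toy AST rather than the real typedtree. Everything in it is an assumption
made for illustration: the `expr` type, the fresh-name helper, and the
signatures of `tlet`, `tfun`, `tnapp`, `zip` and `none` (in particular,
`tlet` takes the bound expression explicitly). None of these exist in the
compiler. The binders take an OCaml function as their body, so the
structure of the generated code stays visible in one central expression.

```ocaml
(* Toy AST standing in for typedtree; all names are hypothetical. *)
type ty = string

type expr =
  | Var of string
  | Const of string
  | Let of string * ty * expr * expr
  | Fun of string * ty * expr
  | App of expr * (string option * expr) list  (* optional label, argument *)

(* Fresh-name supply so binders introduced by the combinators never clash. *)
let counter = ref 0
let fresh base =
  incr counter;
  Printf.sprintf "%s_%d" base !counter

(* Higher-order binders: the body receives the bound variable as an expr. *)
let tlet base ty def body =
  let x = fresh base in
  Let (x, ty, def, body (Var x))

let tfun base ty body =
  let y = fresh base in
  Fun (y, ty, body (Var y))

let tnapp f args = App (f, args)

let none = Const "None"

(* Pair every label with the same argument expression. *)
let zip labels e = List.map (fun l -> (Some l, e)) labels

(* The central expression: let x = texp in fun y -> x ?l1:None ... y *)
let build texp tty targ labels =
  tlet "x" tty texp (fun x ->
    tfun "y" targ (fun y ->
      tnapp x (zip labels none @ [ (None, y) ])))

(* Naive printer for inspection; it does not parenthesise nested applications. *)
let rec show = function
  | Var x -> x
  | Const c -> c
  | Let (x, ty, d, b) ->
      Printf.sprintf "let %s : %s = %s in %s" x ty (show d) (show b)
  | Fun (x, ty, b) -> Printf.sprintf "(fun (%s : %s) -> %s)" x ty (show b)
  | App (f, args) ->
      let arg = function
        | (Some l, e) -> Printf.sprintf "?%s:%s" l (show e)
        | (None, e) -> show e
      in
      String.concat " " (show f :: List.map arg args)

let () = print_endline (show (build (Var "f") "tf" "targ" [ "a"; "b" ]))
```

The fresh names are a design choice of this sketch: they avoid accidental
capture when such fragments are nested, at the cost of less readable
generated identifiers.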


Anil's counterpoint (converting specialized embeddings to build-time
code generation) is interesting as well. He did not (out of modesty?) cite
his ocaml-orm-sqlite project (
https://github.com/avsm/ocaml-orm-sqlite ), which does much the
same thing in the database field. Maybe my "Use cases for syntax
extensions" wiki page could be augmented with "Non-use cases for syntax
extensions" examples?

On Thu, Jan 31, 2013 at 2:11 PM, Alain Frisch <alain.frisch at lexifi.com> wrote:
> On 01/31/2013 01:44 PM, Gabriel Scherer wrote:
>>
>> The point of quoting foreign languages is to translate them into a
>> form that exercises the OCaml type-checker to produce rich typing
>> information. With (parse_json "...") you have no static typing
>> information on the parsed value, while with a quotation you can expect
>> to get, say, a (< streetAddress : string > Json.t) expression.
>> Supporting antiquotations there allows you to write those foreign
>> values in a composable way, eg. using <:json< { streetAddress: $addr$
>> } >> with (addr : string Json.t).
>>
>> This is essentially a nice concrete syntax on top of the typed
>> combinators (point 1. in your message quoted below); indeed, good
>> extensions should embed as little domain knowledge as possible and defer
>> that to the pure-ocaml library providing the combinators.
>
>
> There are two very different topics here: syntax and type-checking.
>
>
> Clever type-checking of the DSL can be done with combinators, and
> when this is not possible, it makes sense to have some "pre-processors"
> doing clever stuff like adding type annotations.  When we write:
>
>  (:json { streetAddress: "21 2nd Street" })
>
> or even:
>
>  (:json M["streetAddress", S "21 2nd Street"])
>
> this can be expanded to something more clever than just untyped
> constructors.  But there is no reason to force using the concrete syntax of
> the foreign language (which is not really true, because you want to allow
> anti-quotations which are not part of that syntax).  The syntax of OCaml is
> rich enough that a lot of other languages can be encoded in it.  The
> downside is that we need to make it clear syntactically that a fragment will
> be interpreted in a special way.  The upside is that no "parsing technology"
> is involved, which simplifies the design (choice of a syntax for
> anti-quotations), the implementation (writing a parser and calling back the
> OCaml parser on anti-quotations, with careful tracking of locations) and the
> user experience (support from editors, learning details of a concrete syntax
> [you might want to write JSON producers on an abstract level, and benefit
> from clever type-checking, but without knowing exactly how strings have to
> be escaped in concrete JSON]).
>
>
>> Furthermore, the bare approach of using combinators directly is not
>> always desirable in situations where you want the code to be used by
>> non-domain-experts. "Templates" in web programming are an entire
>> cottage industry made of preprocessing HTML-looking quasiquotations,
>> to help the role separation between designers and programmers.
>
>
> When non-programmers have to write some HTML code, this code rarely ends up
> in OCaml source code.  Usually, it stays in its own HTML file, which is then
> processed (statically or dynamically) by the (OCaml) code.
>
> When fragments of JSON/XML/etc are created programmatically, this is often
> done "piece by piece", with very small fragments involving a high density of
> anti-quotations.  The benefits of using the foreign syntax here are not
> clear to me.
>
>
>
>> This is maybe less visible for non-programmable data description
>> languages such as JSON, but I suspect this still holds. For example,
>> if you distribute a protocol library built on top of JSON and bridging
>> over many programming languages, you want the same maintainer to be
>> able to quickly reflect changes to the protocol schema over all
>> implementations. This is vastly easier if they respect the domain-specific
>> syntax instead of each having an embedding following the concrete
>> syntax of the corresponding language.
>
>
> I understand the theoretical argument, but I don't think it would apply in
> practice to many OCaml projects.
>
>
> Alain

