[wg-camlp4] My uses of syntax extension

Alain Frisch alain at frisch.fr
Mon Jan 28 10:57:28 GMT 2013

Hi John,

Thanks for sharing this information!

On 01/25/2013 12:03 AM, Harrison, John R wrote:
> 1. In the code for my book (http://www.cl.cam.ac.uk/~jrh13/atp/) I use
>     items entered in <<quotations>>, which I simply want to treat as a
>     string to get passed to my parser. This is essentially trivial in
>     camlp5, though I think it is less so in camlp4.
>     let quotexpander s =
>       .... "default_parser \""^(String.escaped s)^"\"";;
>     Quotation.add "" (Quotation.ExStr (fun x -> quotexpander));;

It is planned to extend OCaml grammar directly with generic quotations 
(in place of expressions, patterns, etc).  The type-checker would fail 
on them, but -ppx filters will have a chance to rewrite them to 
something the compiler understands.  It is not completely clear yet 
which syntax (delimiters, lexing rules) will be chosen, though. 
Delimiters will probably be different from Camlp4, to make it possible 
to combine Camlp4 and -ppx.

*If* you don't care too much about having to escape some characters in 
the content of the quotations, a poor-man version is to rely on standard 
strings.  For instance, a -ppx filter can detect the construction:


(i.e. the bang character followed by a string literal) and rewrites it 
to whatever it wants.  But given your example (where the quotation 
expands to a simple function call on the string literal), I assume that 
you use quotations precisely for them lexical properties.  Is that right?

> 2. For the HOL Light prover (http://code.google.com/p/hol-light/),
>     backward compatibility with other HOLs (where ML is used not only
>     as the implementation language but also as the toplevel) dictates
>     that I do various things to make OCaml look like old-style ML:
>         * Add some alphanumeric infixes like "o" and "THEN"
>         * Use "it" as the default binding for the previous expression
>           (actually, I would prefer the toplevel to do this anyway).
>         * Use quotations, but within the characters `....` not <<...>>
>         * Completely change the OCaml lexical conventions so that
>           all-uppercase words like INST and REWRITE_TAC are treated as
>           ordinary identifiers.
>     The subversion of the lexical conventions is the most tiresome
>     aspect. (It also needs to be switchable in order to allow easy
>     interfacing with more conventional OCaml code.) In camlp5, I do
>     this by copying and editing the existing lexer, which makes it
>     very brittle and necessitates frequent changes as camlp5's
>     internals evolve. I would very much like a better solution.

This is really about proposing a concrete syntax (even lexical 
conventions) different from regular OCaml, a little bit in the same vein 
as camlp4's revised syntax.  I believe this is better left outside the 
scope of -ppx, which is about providing syntactic facilities (code 
generation, macros, etc) rather than changing the concrete syntax.

To support fully custom syntax, the "-pp" flag will always be available. 
  You can implement a custom parser with 
camlp4/camlp5/ocamlyacc/menhir/etc and then use -ppx rewriters on the 
result, if needed (so that your users will be able to use all the nice 
-ppx rewriter even if your concrete syntax is quite different from 
OCaml).  In a sense, -ppx improve the situation here, because I suspect 
that changing OCaml's grammar means destroying/shadowing some rules in 
Camlp'4 OCaml grammar, and those rules might be relied upon by "syntax 
extensions".  With -ppx, we would have a stricter separation between the 
concrete syntax, and additional syntactic tools (-ppx).

Btw, did you consider forking OCaml's lexer/parser instead of relying on 
camlp5 plus hacks for the lexical part?


More information about the wg-camlp4 mailing list