[wg-camlp4] My uses of syntax extension
Alain Frisch
alain at frisch.fr
Mon Jan 28 10:57:28 GMT 2013
Hi John,
Thanks for sharing this information!
On 01/25/2013 12:03 AM, Harrison, John R wrote:
> 1. In the code for my book (http://www.cl.cam.ac.uk/~jrh13/atp/) I use
> items entered in <<quotations>>, which I simply want to treat as a
> string to get passed to my parser. This is essentially trivial in
> camlp5, though I think it is less so in camlp4.
>
> let quotexpander s =
> .... "default_parser \""^(String.escaped s)^"\"";;
>
> Quotation.add "" (Quotation.ExStr (fun x -> quotexpander));;
It is planned to extend OCaml grammar directly with generic quotations
(in place of expressions, patterns, etc). The type-checker would fail
on them, but -ppx filters will have a chance to rewrite them to
something the compiler understands. It is not completely clear yet
which syntax (delimiters, lexing rules) will be chosen, though.
Delimiters will probably be different from Camlp4, to make it possible
to combine Camlp4 and -ppx.
*If* you don't care too much about having to escape some characters in
the content of the quotations, a poor-man version is to rely on standard
strings. For instance, a -ppx filter can detect the construction:
!"....."
(i.e. the bang character followed by a string literal) and rewrites it
to whatever it wants. But given your example (where the quotation
expands to a simple function call on the string literal), I assume that
you use quotations precisely for them lexical properties. Is that right?
> 2. For the HOL Light prover (http://code.google.com/p/hol-light/),
> backward compatibility with other HOLs (where ML is used not only
> as the implementation language but also as the toplevel) dictates
> that I do various things to make OCaml look like old-style ML:
>
> * Add some alphanumeric infixes like "o" and "THEN"
>
> * Use "it" as the default binding for the previous expression
> (actually, I would prefer the toplevel to do this anyway).
>
> * Use quotations, but within the characters `....` not <<...>>
>
> * Completely change the OCaml lexical conventions so that
> all-uppercase words like INST and REWRITE_TAC are treated as
> ordinary identifiers.
>
> The subversion of the lexical conventions is the most tiresome
> aspect. (It also needs to be switchable in order to allow easy
> interfacing with more conventional OCaml code.) In camlp5, I do
> this by copying and editing the existing lexer, which makes it
> very brittle and necessitates frequent changes as camlp5's
> internals evolve. I would very much like a better solution.
This is really about proposing a concrete syntax (even lexical
conventions) different from regular OCaml, a little bit in the same vein
as camlp4's revised syntax. I believe this is better left outside the
scope of -ppx, which is about providing syntactic facilities (code
generation, macros, etc) rather than changing the concrete syntax.
To support fully custom syntax, the "-pp" flag will always be available.
You can implement a custom parser with
camlp4/camlp5/ocamlyacc/menhir/etc and then use -ppx rewriters on the
result, if needed (so that your users will be able to use all the nice
-ppx rewriter even if your concrete syntax is quite different from
OCaml). In a sense, -ppx improve the situation here, because I suspect
that changing OCaml's grammar means destroying/shadowing some rules in
Camlp'4 OCaml grammar, and those rules might be relied upon by "syntax
extensions". With -ppx, we would have a stricter separation between the
concrete syntax, and additional syntactic tools (-ppx).
Btw, did you consider forking OCaml's lexer/parser instead of relying on
camlp5 plus hacks for the lexical part?
Alain
More information about the wg-camlp4
mailing list