[wg-camlp4] Time for a summary?
Leo White
lpw25 at cam.ac.uk
Wed Feb 6 18:11:12 GMT 2013
> Maybe a single post summarizing the main proposals and points of
> contention?
I will attempt a summary of the parts of the discussion that I remember. In
no particular order the following has been discussed:
1. Current uses of camlp4
There were many examples of current uses of camlp4 given. Gabriel put a
summary of them on a wiki:
https://github.com/gasche/ocaml-syntax-extension-discussion/wiki/Use-Cases
Anil also posted a list of all the extensions used by OPAM packages:
https://github.com/avsm/opam-camlp4-analysis/wiki
I don't think anyone has systematically gone through these yet, but I
think that it is important, before any concrete proposal is made by this
working group, to make clear which of these extensions we intend to
support and how we intend to support them.
2. New syntaxes needed for ppx:
There seems to be general agreement that, for ppx to replace camlp4 for
its most common uses, at least the following three kinds of syntax are
needed:
a) A "template" syntax along the lines of (:longid expr). This syntax
could be used as an expression, a pattern, a structure item, etc.
For example Sedlex could be:
(:sedlex
match buf with
| number -> ..
| letter, Star ('A'..'Z' | 'a'..'z' | digit) -> ..
)
b) An "attribute" syntax along the lines of (@ expr). This syntax allows
you to attach expressions to other expressions, patterns, etc. Unlike
the template syntax, the type-checker can silently ignore any of these
that don't get translated. For example Bisect could be:
let f x =
match List.map foo [x; a x; b x] with
| [y1; y2; y3] -> tata
| _ -> (@ Bisect.visit) assert false
(@ Bisect.ignore)
let unused = ()
It has also been suggested that there might need to be different syntax
for "post-fix" and "pre-fix". Gabriel suggested that whether an
attribute in a particular syntactic position is pre-fix or post-fix be
decided on a case by case basis.
c) A quotation syntax like {x{ .. }x} (where x is any operator symbol)
that could be used for quoting non-OCaml syntax. For example sedlex
with a more compact notation for regular expressions:
(:sedlex
match lexbuf with
| {{ xml_letter+ }} -> ...
| {{ "with" }} -> ...
| ...
)
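To make the difference between templates and attributes concrete, here is a
minimal sketch on a toy AST. It is not the real Parsetree, and every name in
it (expr, finalize, Unexpanded_template) is hypothetical. It only illustrates
the rule described in (b): an attribute that no extension translated is
dropped silently, while a leftover template is an error.

```ocaml
(* Toy AST for illustration only; not the compiler's Parsetree. *)
type expr =
  | Int of int
  | Add of expr * expr
  | Template of string * expr   (* stands for (:id expr) *)
  | Attr of string * expr       (* stands for an expr carrying (@ id) *)

exception Unexpanded_template of string

(* Assumed behaviour of the front end on nodes that no ppx rewrote:
   attributes are discarded, templates are rejected. *)
let rec finalize = function
  | Int n -> Int n
  | Add (a, b) -> Add (finalize a, finalize b)
  | Attr (_, e) -> finalize e
  | Template (id, _) -> raise (Unexpanded_template id)
```

With this, finalize (Attr ("Bisect.visit", Int 1)) simply yields Int 1,
whereas finalize (Template ("sedlex", Int 0)) raises.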
There has also been some agreement, and no strong objection, to including
some abbreviated forms of the above syntaxes. Some of the suggested forms
include:
a) Template + quotation:
{:id x{ str }x} == (:id {x{ str }x})
b) Template + let:
let:id x = .. in ... == (:id let x = .. in ... )
c) Template + match:
match:id exp with .. == (:id match exp with .. )
d) Type-conv style attribute:
type t = ... with foo, bar( expr ) == type t = ... (@foo) (@bar expr)
e) Anonymous template:
(# expr) == (:(*no id*) expr)
3. Anti-quotations
There has been some discussion about how to support anti-quotations:
- Fabrice suggested using a standardised format for anti-quotations, but
some people were against that because they use "$" in their camlp4
quotation extensions.
I suggested that if we provided functions which took a predicate
function and then parsed an OCaml phrase up until the next *unnested*
location that made the predicate true, then we could support general
anti-quotations for ppx.
- Hongbo suggested that AST lifting was necessary for supporting
anti-quotations, citing this paper:
http://dl.acm.org/citation.cfm?id=1291211
I suggested that it isn't necessarily needed, and that if people wanted
it then it could be provided using a type-conv style extension or
run-time types.
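As a rough illustration of the predicate-based idea above, the following
sketch scans a string for the first *unnested* position at which a predicate
holds. It is a simplification under stated assumptions: it works on raw
characters, tracks only (), [] and {} for nesting, and ignores strings and
comments, which a real version built on the OCaml lexer would have to handle.
The names find_unnested and split_antiquot are made up for this example.

```ocaml
(* Hypothetical helper: index of the first position at nesting depth 0,
   at or after [start], where [stop s i] holds; None if there is none.
   Only (), [] and {} count as nesting. *)
let find_unnested (stop : string -> int -> bool) (s : string) (start : int)
  : int option =
  let n = String.length s in
  let rec go i depth =
    if i >= n then None
    else if depth = 0 && stop s i then Some i
    else
      match s.[i] with
      | '(' | '[' | '{' -> go (i + 1) (depth + 1)
      | ')' | ']' | '}' -> go (i + 1) (max 0 (depth - 1))
      | _ -> go (i + 1) depth
  in
  go start 0

(* Split the text following an opening '$' at the first unnested '$':
   the first component is the OCaml phrase, the second is what follows. *)
let split_antiquot (s : string) : string * string =
  match find_unnested (fun s i -> s.[i] = '$') s 0 with
  | None -> (s, "")
  | Some i ->
    (String.sub s 0 i, String.sub s (i + 1) (String.length s - i - 1))
```

Because the predicate is supplied by the extension author, an extension
that already uses "$" for something else could stop on a different
delimiter without any change to the scanning function.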
4. On the use of "quotations"
There has also been some discussion about when it is a good idea to use a
camlp4-style quasi-quotation:
- Alain suggested that they should not be used for extensions that are
"mostly valid OCaml code". He pointed out that doing so causes you to
lose all editor and tool support.
Hongbo disagreed, saying that in Fan, extensions like sedlex would be
implemented as quasi-quotations and that editor support was fine.
After this there seems to have been some agreement that it would be
better to implement such extensions as "templates" (see section 2 of
this post) rather than quasi-quotations.
- Alain then went further, suggesting that quotations were not even
suitable for encoding foreign languages in OCaml.
Gabriel disagreed, saying that for domain-specific foreign languages
(e.g. SQL) quotations allow you to have a domain expert maintain the
foreign code even if they didn't know OCaml.
Other people also said that they liked using quotations for foreign
syntax.
5. Alternatives to ppx
There have been some proposals for alternatives and variations to using
ppx for extensions:
a) Hongbo suggests that his Fan project is a better alternative to camlp4
than ppx:
https://github.com/bobzhang/Fan
b) Xavier Clerc suggested that attributes might be declared and typed, as
in Java. So before an attribute could be used:
(@ MyAnnot {a=1, b="two"})
"MyAnnot" would have to be declared with fields "a" (of type "int") and
"b" (of type "string"). He points out that this may protect the user
from some typos and other obvious errors.
Alain suggested that this would be a much more complicated system, and
might be too restrictive for some uses of attributes.
c) Sebastien Mondet suggested that run-time types or dyntype might be
sufficient without ppx for many extensions.
Markus Mottl pointed out that there are runtime performance
implications for using those solutions.
d) Jeremy suggested that rather than implementing extensions as AST
transformers that operate over the whole AST they might be implemented
as transformers only for the part of the AST that they were to operate
on. So an extension used like this:
(:perform
x <-- m;
y <-- n;
return (x y))
would be implemented as a function with type:
val perform : Parsetree.expression -> Parsetree.expression
He suggests that this may help with scoping, safety and
compositionality.
Alain pointed out that some legitimate uses of ppx don't work nicely
with the constraint that they can operate only on a marked fragment. He
also pointed out that handling extensions as those extensions are found
within the source forces a top-down expansion order, which is not
necessarily desirable.
e) I suggested an extension to Jeremy's proposal. This involves giving the
expansion functions in their own kind of file (".mlq" - compiled to
".cmq"), referring to them using a namespace mechanism, and then having
the compiler itself perform the expansion of the extensions.
There are more details in my blog post:
http://www.lpw25.net/2013/02/05/camlp4-alternative-part-2.html
Since such a solution would take a while to implement, and since moving
an extension from ppx to it would be trivial, I proposed using ppx for
the short/medium term.
I suggested that such a mechanism might improve tooling and make it
easier for average users to use extensions.
Both Gabriel and Alain suggested that the use of namespaces might be
unnecessary and too heavyweight. Alain also questioned the need for
special ".mlq" files.
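For readers who want a picture of the per-fragment approach in (d), here is
a small sketch on a toy AST. It does not use the real Parsetree, and the
registry, the driver and the "twice" extension are all hypothetical. Each
expander has the shape expr -> expr, mirroring the type of perform given
above, and the driver calls it only on the fragment marked with its name.

```ocaml
(* Toy AST for illustration only; not the compiler's Parsetree. *)
type expr =
  | Var of string
  | App of expr * expr
  | Template of string * expr   (* stands for (:id expr) *)

(* Hypothetical registry mapping a template id to its expander. *)
let expanders : (string, expr -> expr) Hashtbl.t = Hashtbl.create 8

(* Driver: expands marked fragments as they are met, so the outermost
   template is handled first and its result is then expanded in turn.
   Templates with no registered expander are left in place. *)
let rec expand = function
  | Var x -> Var x
  | App (f, a) ->
    let f' = expand f in
    let a' = expand a in
    App (f', a')
  | Template (id, body) ->
    (match Hashtbl.find_opt expanders id with
     | Some f -> expand (f body)
     | None -> Template (id, expand body))

(* A made-up extension: (:twice e) becomes the application of e to e. *)
let () = Hashtbl.replace expanders "twice" (fun e -> App (e, e))
```

Note that this driver fixes a top-down expansion order, which is the
constraint Alain raised; a whole-AST transformer is free to pick its own
traversal order.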
There are probably other parts of the discussion that I've forgotten. I'm
sure someone will fill in any important details I've missed.