[wg-camlp4] Pending issues
Alain Frisch
alain.frisch at lexifi.com
Mon Feb 11 14:45:52 GMT 2013
Dear all,
I think we have made good progress in discussing extensions to the OCaml
grammar with generic extension points (which can be used as anchors for
-ppx rewriters), and how existing "camlp4 extensions" could be
re-implemented based on them.
Here is a list of questions on which I'd like to hear the opinion of
participants.
0. Scope of extensions covered by ppx + extension points
Do people see useful camlp4 extensions which couldn't be covered in a
satisfactory way with the ppx + extension points approach? The only
such cases I've seen mentioned on this list are pervasive changes to the
syntax, where the goal is precisely to customized the concrete syntax.
1. Nature of extension points
There seems to be a consensus that we should distinguish between
"ignorable" attributes and extension markers, to be expanded by
rewriters (the type-checker fails on them). Concretely, this amounts to
extending the OCaml Parsetree with cases like:
| Ptyp_attribute of core_type * attribute
| Ptyp_extension of extension
| Ppat_attribute of pattern * attribute
| Ppat_extension of extension
| Pmod_attribute of module_expr * attribute
| Pmod_extension of extension
...
Leo suggests to take:
type attribute = expression
and extension = longident * expression option
I'm now convinced that it's a good idea to make the "extension name"
explicit (and not encoded in the expression itself), if only to allow
nice error messages when the type-checker fails on such a node. I'm
wondering if we shouldn't be more symmetric and do the same for
attributes. Opinions? Also, I'm not sure that the expression should be
optional (if the idea is that the extension marker "expands" its
content, without touching its context "too much", we need a non-trivial
content). One can always have a syntax which support "empty argument"
represented as the "()" expression internally.
2. Concrete syntax for extension points
Proposals are welcome. We must choose an unambiguous syntax, which
allows to add an attribute on a structure item, an expression, etc.
Do we want both prefix and suffix syntaxes for attributes on all
syntactic categories? Do we want a special (lighter) syntax to attach
attributes on specific constructions?
3. Quotations
The more I think about it, the more I believe this concept is orthogonal
to "extensions". Leo disagrees and would like to have a combined syntax
for the case where the argument of an extension is a quotation. What do
other people think? (For me, a quotation is only a way to introduce a
string literal without "suffering" from the lexical conventions of OCaml.)
For the concrete syntax of quotations, it was suggested to use a form
where the closing delimiter would be defined by the opening one. This
makes it possible to "quote" an arbitrary string without having to
define a way to escape a potential occurrence of the closing delimiter
(just pick a different one). Example: {xxx{blablabla}xxx}
Do people familiar with implementation of editor modes believe it will
be easy to support such syntax in emacs or vim modes?
4. What to do with attributes in the type-checker
Attributes are designed in a way which make it trivial for the
type-checker to ignore them, which is a good property. It is not much
more difficult to keep in the Typedtree so that they appear in
.cmt/.cmti files. This could support interesting tools, like an
"ocamldoc-like" tool based on those files (instead of having to rarse
the source file again). What do people think about it? Of course, this
does not need to be implemented right away.
A related question is related to the current work on runtime types.
Attributes on type declarations could be kept in the runtime
representation of the declarations, allowing libraries to interpret them
as they want. (LexiFi's version of OCaml has been extended with
attributes on type declarations precisely to do that.) Having
attributes defined as general expressions, however, means that those
libraries would need to link with compiler-libs, or at least be compiled
against some of its .cmi, in order to be able to analyze the Parsetree.
I don't see it is as a big problem, and it would also be possible to
restrict which expressions are reflected in runtime types (e.g. to
structured constants). Comments are welcome!
5. Fork Parsetree or clean it up?
Hongbo has proposed that we introduce a different representation of the
Parsetree on which -ppx rewriters would be applied. Concretely, we
could fork parsetree.mli into parsetree.mli/ast.mli, clean up parsetree
(see below) and adapt parser.mly accordingly, and then implement a
translation pass from Parsetree to Ast before running the type-checker
on Ast. The alternative is to clean up Parsetree directly and adapt the
type-checker to these changes. This avoids an extra intermediate
language and the mapping, but this could add a little bit of extra
complexity to the type-checker (and it might be more difficult to
convince core developers). What do people think?
Changes (to be discussed):
- Avoid some desugaring currently done in the parsetree (interval
patterns, let open M in... = M.(...) ).
- Avoid some weird encodings (Pexp_when, mandatory Ptyp_poly at some
places, etc).
- Make the Parsetree more future proof and self-documented by using
records instead of tuples, sum types instead of booleans, etc.
- Remove prefixes on constructors/labels (although I've got some
negative feedback from other core developers on this point).
Alain
More information about the wg-camlp4
mailing list