[wg-camlp4] Structured comments, shallow embeddings and deep quasiquotations

Hongbo Zhang hongboz at seas.upenn.edu
Wed Feb 6 00:27:25 GMT 2013


Hi Jeremy,
   I like your ideas ;-). Fan adopts the same approach that you proposed.
Instead of
    (@deriving ["sexp"; "json"])
   type t = F of int | G of s
    and s = H of (t * t)

I used the notation
{:ocaml|
type t = F of int | G of s
and s = H of (t * t)
|}
{:derive| (sexp,json)|}
But the semantics are essentially the same.
Should we take a serious look at Fan? It is a much more advanced tool than
ppx, it does not require any change to the compiler, and porting other
camlp4-based libraries is much easier. I see a lot of benefits here.

If we change the compiler a lot for little benefit inside the compiler
itself, and that makes life too hard for more advanced tools, are we being
a bit short-sighted? I would much rather see other features, such as
run-time types, go into the compiler instead. My 2 cents.


On Sat, Feb 2, 2013 at 11:17 AM, Jeremy Yallop <yallop at gmail.com> wrote:

> On 1 February 2013 15:38, Gabriel Scherer <gabriel.scherer at gmail.com>
> wrote:
> > Finally, I think -ppx + arbitrary annotations, without any further
> > restriction, is too free-form for robust syntax extensions: one
> > important problem with Camlp4 is that it allowed syntax extension
> > writers to modify the syntax in very bad way that hurt robustness. As
> > already discussed, I have a strong distaste for extensions that only
> > piggyback on existing syntax (without adding any explicit marker); I
> > would feel safer if the extension *mechanism* disallowed such
> > unstructured extensions, or at least made them less rewarding to write
> > than the composable ones. (For example by only passing to the
> > extension writer the part(s) of the AST that have been annotated).
> > Unfortunately, I don't see how Bisect would fit any such restriction.
> > Maybe that's a problem best solved by socialization (writing
> > documentation on good practices, and yelling at people), but I sort of
> > doubt it -- I don't know how many times I've had to argue for *not*
> > globally changing the associativity of infix operators through Camlp4
> > in Batteries.
>
> I agree with Gabriel.  Actually, I think that a small tweak to the
> design of -ppx could address both this and a number of problems that
> others have raised during the discussion here.
>
> The -ppx approach applies one or more global transformations to the
> ASTs of OCaml source files; these transformations can be parameterized
> by attributes attached at particular points in the syntax tree.  This
> is a significant improvement over the Camlp4 approach, largely because
> it exchanges the (unnecessary) ability to change the concrete syntax
> for a number of valuable guarantees, which make extensions easier to
> write and code that uses extensions easier to understand.
>
> We can go further in this direction, and give up more (unnecessary)
> power in return for further guarantees.  As Gabriel says, since -ppx
> extensions can arbitrarily transform the AST, it's not possible to
> understand any part of a program that uses extensions without
> understanding every aspect of the behaviour of every extension.  We
> could, of course, seek to solve this by convention and social
> pressure, but there seems to be an emerging consensus that this isn't
> really satisfactory.  One of the nice things about functional
> programming is that you have strong guarantees (via parametricity,
> immutability, and so on) about the effects of calling a particular
> function. We should strive to make it possible to reason in the same
> manner about the effects of syntax extensions.
>
> There are other legitimate concerns with the current proposal.  As
> Xavier Clerc and others point out, attributes are apparently
> undeclared (i.e. global) and untyped.  Alain rightly notes that it
> seems to be difficult to introduce declarations and types for
> attributes without significant complexity.  Still, as OCaml
> programmers we're used to the benefits of precisely-scoped names and
> strongly-typed data, and it seems a shame to give these benefits up if
> we can find a way to keep them.
>
> Hongbo Zhang raises a further concern: when syntax extensions are
> global transformations on the whole file, the order in which
> extensions are applied becomes significant.  This is a fairly serious
> matter, I think: the semantics of code that uses syntax extensions is
> now dependent on external factors, since we need to look for the flags
> passed to OCaml in the build configuration in order to understand the
> source.
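
To make the ordering point concrete, here is a minimal sketch on a toy AST
(not the real Parsetree). The two whole-tree passes, fold_constants and
instrument, and the Mark node are hypothetical names invented for this
illustration; the only claim is that two global transformations need not
commute.

```ocaml
(* Toy expression AST, a stand-in for Parsetree (an assumption). *)
type expr =
  | Int of int
  | Add of expr * expr
  | Mark of expr  (* hypothetical coverage marker *)

(* Apply [f] to every node, children first. *)
let rec map_everywhere f e =
  let e' =
    match e with
    | Int _ -> e
    | Add (a, b) -> Add (map_everywhere f a, map_everywhere f b)
    | Mark a -> Mark (map_everywhere f a)
  in
  f e'

(* Hypothetical extension A: fold additions of two literals. *)
let fold_constants =
  map_everywhere (function
    | Add (Int a, Int b) -> Int (a + b)
    | e -> e)

(* Hypothetical extension B: wrap every addition in a marker. *)
let instrument =
  map_everywhere (function
    | Add _ as e -> Mark e
    | e -> e)

let source = Add (Int 1, Int 2)

(* A then B: the addition is folded away before B can mark it. *)
let fold_then_instrument = instrument (fold_constants source)

(* B then A: the marker is inserted first and survives the folding. *)
let instrument_then_fold = fold_constants (instrument source)

let () = assert (fold_then_instrument <> instrument_then_fold)
```

The same source yields Int 3 in one order and Mark (Int 3) in the other, so
the result depends on the order in which the passes are given.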
>
> I think we can address all these concerns with a small adjustment in
> perspective.  Instead of globally-scoped, untyped attributes processed
> by file-level externally-specified transformations, we might add a
> single node to the OCaml grammar for statically-executed AST
> rewriters.  Using the same syntax already proposed for attributes, we
> might write, for example:
>
>    (@deriving ["sexp"; "json"])
>    type t = F of int | G of s
>     and s = H of (t * t)
>
> or
>
>     (@perform)
>        (x <-- m;
>         y <-- n;
>         return (x y))
>
> In order for this to be valid code, 'deriving' and 'perform' should
> resolve to functions of appropriate types:
>
>     val deriving :
>       string list -> Parsetree.structure_item -> Parsetree.structure_item
>
>     val perform : Parsetree.expression -> Parsetree.expression
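
As a rough illustration of what a rewriter with the shape of 'perform' could
do, here is a sketch on a toy AST rather than Parsetree.expression. The
constructors and the choice to desugar into calls to a function named "bind"
are assumptions made for the example, not something the proposal specifies.

```ocaml
(* Toy expression AST, a stand-in for Parsetree.expression (an assumption). *)
type expr =
  | Var of string
  | App of expr * expr
  | Fun of string * expr
  | Seq of expr * expr     (* e1; e2 *)
  | Bind of string * expr  (* x <-- e *)

(* Rewrite [x <-- m; rest] into [bind m (fun x -> rest)].
   The target name "bind" is hypothetical. *)
let rec perform e =
  match e with
  | Seq (Bind (x, m), rest) ->
    App (App (Var "bind", m), Fun (x, perform rest))
  | Seq (m, rest) ->
    (* A statement without a binder ignores its result. *)
    App (App (Var "bind", m), Fun ("_", perform rest))
  | _ -> e

(* x <-- m; y <-- n; return (x y) *)
let input =
  Seq (Bind ("x", Var "m"),
       Seq (Bind ("y", Var "n"),
            App (Var "return", App (Var "x", Var "y"))))

(* bind m (fun x -> bind n (fun y -> return (x y))) *)
let output = perform input
```

The function has type expr -> expr, so it can only return a replacement for
the expression it was handed.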
>
> Either during parsing or in a post parsing phase, the ASTs following
> '@deriving ["sexp"; "json"]' and '@perform' are passed to those
> functions and the results are inserted in place into the AST.
> Gabriel's concern is addressed, because there's no way for @perform
> (say) to access other parts of the AST: its effects are purely local.
> Xavier's concern is addressed, since AST rewriters, unlike attributes,
> are declared and typed (and hence scoped). Hongbo's concern is
> addressed, since composition is explicit:
>
>     (@deriving ["sexp"])
>     (@nonrec)
>     type t = C of t
>
> (Here '@deriving ["sexp"]' is applied to the result of applying '@nonrec'.)
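
The locality and explicit composition described here can be sketched on a
toy AST. Everything in the block is an assumption for illustration: the
Rewrite node stands in for '(@name) e', and the rewriters are looked up by
string in an association list, whereas the proposal would resolve them as
ordinary typed OCaml values.

```ocaml
(* Toy expression AST (an assumption, not Parsetree). *)
type expr =
  | Int of int
  | Neg of expr
  | Add of expr * expr
  | Rewrite of string * expr  (* stands in for (@name) e *)

(* Hypothetical rewriters, each a plain function on the annotated node. *)
let rewriters : (string * (expr -> expr)) list =
  [ ("negate", fun e -> Neg e);
    ("double", fun e -> Add (e, e)) ]

(* Expand annotations in place. A rewriter receives only the subtree it
   annotates, so it cannot inspect or change its siblings. *)
let rec expand e =
  match e with
  | Int _ -> e
  | Neg a -> Neg (expand a)
  | Add (a, b) -> Add (expand a, expand b)
  | Rewrite (name, body) ->
    (* Inner annotations are expanded first, so (@f) (@g) e is f (g e). *)
    let body' = expand body in
    (match List.assoc_opt name rewriters with
     | Some f -> f body'
     | None -> failwith ("unbound rewriter: " ^ name))

(* 1 + ((@double) (@negate) 2) *)
let program = Add (Int 1, Rewrite ("double", Rewrite ("negate", Int 2)))

(* 1 + ((-2) + (-2)): the left operand is untouched. *)
let expanded = expand program
```

Swapping the two annotations gives Neg (Add (Int 2, Int 2)) instead, and the
order is visible in the source rather than in the build flags.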
>
> It should be possible to write almost all extensions in this manner.
> A variant of the stream parser syntax fits easily:
>
>     (@parser)
>        ([ `If; x = expr; `Then; y = expr; `Else; z = expr ] => "if";
>         [ `Let; `Ident x; `Equal; x = expr; `In; y = expr ] => "let")
>
> as does Anil's cstruct extension:
>
>     (@cstruct ~endianness:little)
>     type pcap_header = {
>        uint32_t magic_number;   (* magic number *)
>        uint16_t version_major;  (* major version number *)
>        ...
>     }
>
> Other extensions such as ifdef, js_of_ocaml, and pgsql could be
> handled in the same sort of way.
>
> Jeremy.
>
> [I'm deliberately avoiding the interesting but orthogonal questions of
> custom lexical syntax, and benign annotations for tools here.]



-- 
-- Regards, Hongbo