[wg-camlp4] My uses of syntax extension

Tue Jan 29 09:05:09 GMT 2013

I've given it some thought, but I'm not convinced that it really makes 
sense to introduce an extra intermediate representation of source code 
between the parser and the input of the type-checker.  Let's look at 
where the Parsetree representation is actually used:

- Producers:

    the official parser
    ast_mapper and ppx rewriters
    camlp4
    untypeast

- Consumers:

    the type checker
    ast_mapper and ppx rewriters
    depend (could be rewritten to use ast_mapper)
    ocamlprof (could be rewritten to use ast_mapper)

(I've omitted more exotic consumers: pprintast, eqparsetree, addlabels)

The only important producer is the official parser and the only 
important consumer is the type checker, so I don't clearly see the 
benefit of introducing an extra representation in between.  The 
Parsetree could be improved, in particular to do less desugaring in the 
parser itself, and I don't see why we could not do that directly.  It 
would not increase the complexity of the type checker significantly.
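
As a rough illustration of what "less desugaring in the parser" could 
mean, here is a toy sketch.  It is not the real Parsetree: the type 
expr and its constructors are invented for this example.  The only 
thing taken from the actual compiler is that a.(i) is currently turned 
into a call to Array.get by the parser.

```ocaml
(* Toy AST, purely hypothetical; not the compiler's Parsetree. *)
type expr =
  | Ident of string
  | Apply of expr * expr list
  | Index of expr * expr  (* surface node kept for a.(i) *)

(* The rewriting the parser does today, moved to a later pass.
   A consumer such as the type checker could call this itself,
   or handle Index directly. *)
let rec desugar = function
  | Ident _ as e -> e
  | Apply (f, args) -> Apply (desugar f, List.map desugar args)
  | Index (a, i) -> Apply (Ident "Array.get", [desugar a; desugar i])

let () =
  (* a.(i) as the parser would produce it without desugaring *)
  let surface = Index (Ident "a", Ident "i") in
  assert (desugar surface
          = Apply (Ident "Array.get", [Ident "a"; Ident "i"]))
```

With a node like Index kept in the tree, tools that work on the 
surface syntax still see a.(i), while the extra case in the consumer 
is a single line.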

Do you see "big" issues with the Parsetree, which could not be solved 
without the introduction of a different representation?

Alain

On 01/28/2013 10:23 PM, Hongbo Zhang wrote:
>
> After you have done all of that, I am worried that you may find you
> have just re-implemented a new P4.
>
> *I agree that the complexity of the extensible parser should be
>   removed, but there is a compromise here.*
>
> The ppx *shares the same intermediate Ast* (it could be improved with
> more meaningful tag names) with camlp4 but uses the built-in parser.
>
> The benefits are twofold:
>      1. You get the meta filter automatically, so quasi-quotation comes
> for free.
>      2. camlp4 can work well with ppx; the change is incremental, and
> most existing libraries still work in both cases.
>
> Introducing an intermediate Ast also gives you some freedom. The
> parsetree does a lot of syntax desugaring (to name a few: range
> patterns, bigarray, string access, array access). You may change
> the representation of the parsetree, but that introduces complexity in
> other cases. So my suggestion is that we separate the extensible parser
> part from camlp4, which is absolutely doable, and then share the same
> intermediate Ast.
>
> On Mon, Jan 28, 2013 at 4:01 PM, Leo White <lpw25 at cam.ac.uk> wrote:
>
>         3 parsers actually.
>         The original parser, the parser to patt, the parser to expr.
>
>
>     The original parser isn't necessary for the quotation, and of course
>     the other two parsers are only both needed if you want the quotation
>     to work as both a pattern and an expression.
>
>     Anyway, my point is that you do not *need* to implement quotations
>     by parsing to a value and then converting to an AST fragment, in
>     fact it is often easier not to.
>
>     That is not to say that some people might not want to implement
>     their quotations by parsing to a value and then converting to an AST
>     fragment. In fact, if they want to use an external parser (over
>     which they have little control) to parse their quotations then they
>     may have to do it that way.
>
>     For these cases I think there are three options:
>
>     1. They can implement conversion functions by hand.
>
>     2. They could use some kind of type-conv style extension to
>     automatically produce the conversion functions. I'm sure such an
>     extension would be welcomed by other extension authors.
>
>     3. If/when some kind of run-time type representation is added to the
>     language, it could be used to create a generic conversion function.
>
> --
> -- Regards, Hongbo