[wg-camlp4] Reconciling Merlin and ppx

Yaron Minsky yminsky at janestreet.com
Mon Nov 4 14:01:34 GMT 2013


On Mon, Nov 4, 2013 at 8:02 AM, Frédéric Bour <defree at gmail.com> wrote:
> On 02/11/2013 18:21, Yaron Minsky wrote:
>>
>> I've been using Merlin more and more for personal projects,
>> and have started to wonder about how we're going to make Merlin work
>> effectively in a ppx world.  In some ways, ppx simplifies Merlin's
>> job, since there's now just one concrete syntax to parse.
>>
>> But there are still problems to solve.  In particular, you will still
>> need to be able to apply AST-transformers to the partial ASTs that
>> Merlin generates in order to run it through the type-inference stage,
>> or to determine the environment.
>
> Those partial ASTs are still represented with the same type as OCaml
> "normal" ASTs.
> There should be no difference for the transformers at this stage.
>
>
>> Obviously, Merlin could just reimplement a bunch of common AST
>> transformers, as it does now for some common camlp4 extensions.  But
>> this seems like a situation that is too painful to maintain in the
>> long term.
>
> Indeed, ppx seems to be the opportunity to stop hardcoding extensions in
> merlin.
>
>
>> A better idea, perhaps, would be to have some libraries or standards
>> about how to present an AST transformer so that it's suitable for
>> running against a partial AST.  I don't really know the details of how
>> Merlin works, but I understand it does a first pass to parse
>> expressions into top-level units, the parsing of which is done
>> independently.  AST transformers that operate on those units would
>> naturally gain some level of incrementality.  Another, perhaps more
>> intrusive approach would be to write PPX transformers against
>> Merlin's version of the AST.
>
> The work done by the transformer when invoked by merlin or by the toplevel
> should be the same: the AST is given chunk-by-chunk, each chunk being passed
> for transformation.

Does this work for sub-modules as well?  It's not rare in my
experience to do quite a bit of programming within a sub-module.  In a
top-level, you'd have to do that as one top-level chunk, which seems
like the wrong solution for merlin.

> Under the assumption that, for a given transformer t and without imposing an
> order of transformation:
> t (def1;; def2) ~= t(def1) ;; t(def2)
> using this transformer in merlin should not require specific work.
>
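To make the assumption above concrete, here is a minimal sketch in OCaml. It uses a toy AST rather than the compiler's real Parsetree, and every name in it (`item`, `Ext`, the `getter` extension, `expand_item`, `transform`) is hypothetical and chosen only for illustration. It shows a transformer that rewrites each definition independently, so transforming a whole file gives the same result as transforming its chunks one at a time and concatenating.

```ocaml
(* Toy stand-in for a top-level definition; NOT the real Parsetree type. *)
type item =
  | Let of string * string   (* let <name> = <body> *)
  | Ext of string * string   (* hypothetical extension node: <name>, <payload> *)

(* Rewrite one item without looking at its neighbours or at any global state.
   The "getter" extension is an invented example. *)
let expand_item = function
  | Ext ("getter", field) -> [ Let ("get_" ^ field, "fun r -> r." ^ field) ]
  | item -> [ item ]

(* The transformer t: apply expand_item to every item, in order. *)
let transform (items : item list) : item list =
  List.concat (List.map expand_item items)

(* Two chunks, as merlin or the toplevel might hand them over separately. *)
let chunk1 = [ Let ("x", "1"); Ext ("getter", "name") ]
let chunk2 = [ Ext ("getter", "age"); Let ("y", "2") ]

let whole = transform (chunk1 @ chunk2)                (* t (def1;; def2) *)
let piecewise = transform chunk1 @ transform chunk2    (* t(def1) ;; t(def2) *)

(* For this stateless transformer the two results coincide. *)
let () = assert (whole = piecewise)
```

A transformer of this shape should not care whether it is fed a whole file or one chunk at a time, which is the property the assumption asks for.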
> That being said, I don't know much about ppx. Here are the cases I see that
> could be problematic in general:
> - the transformer relies on some global state, that has to be threaded
> across different invocations.
> - the toplevel works only sequentially, and backtracks only if the
> current definition fails to type. merlin can roll back to an arbitrary point
> in a file, which could break the transformer's assumptions if it keeps state.
> - merlin also splits definitions inside modules, which means that pieces of
> the AST could be extracted and transformed either separately or not in the
> expected order (e.g. the signature constraining a structure can be extracted
> by merlin and transformed as if it were an independent declaration, then
> reinjected in the AST).
>
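The global-state and rollback cases in the list above can be sketched as follows. This is again a toy model in OCaml, not real ppx code: the `item` type, the fresh-name counter, and the functions `expand_stateful` and `expand_threaded` are all hypothetical. The sketch assumes a transformer that generates fresh names from a global counter, and contrasts it with a variant where the state is passed in explicitly so that the caller can restore it after a rollback.

```ocaml
(* Toy stand-in for a definition; NOT the real Parsetree type. *)
type item =
  | Let of string * string   (* let <name> = <body> *)
  | Ext of string            (* hypothetical extension node carrying a body *)

(* Hidden global state: number of extension nodes expanded so far. *)
let counter = ref 0

(* Stateful version: the output depends on every earlier invocation. *)
let expand_stateful = function
  | Ext body ->
    incr counter;
    Let (Printf.sprintf "tmp_%d" !counter, body)
  | item -> item

(* Threaded version: the state goes in as an argument and comes back
   alongside the result, so the caller decides which state to resume from. *)
let expand_threaded n = function
  | Ext body -> (n + 1, Let (Printf.sprintf "tmp_%d" (n + 1), body))
  | item -> (n, item)

let chunk = Ext "compute ()"

(* Simulate a rollback: the same chunk is transformed a second time. *)
let first = expand_stateful chunk
let again = expand_stateful chunk   (* same input, different generated name *)

(* With explicit state, re-running from the saved state is reproducible. *)
let saved = 0
let (_, a) = expand_threaded saved chunk
let (_, b) = expand_threaded saved chunk

let () = assert (first <> again)
let () = assert (a = b)
```

Whether real transformers can be written in the threaded style is an open question here; the point is only that hidden state makes the result depend on the invocation history, which a tool that rolls back to arbitrary points cannot easily reproduce.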
>
>>
>> I don't quite know what the right solution is, but when 4.02 comes out
>> we're all going to be writing a bunch of new AST transformers, and if
>> we really want Merlin to be a success, we should have a plan for how
>> those transformers will integrate with it.
>>
>> y
>
>

