[Merlin-discuss] further integration of merlin at Jane Street

Tue Aug 27 11:01:13 BST 2013

> Here are some thoughts on further integrating merlin into the Jane
> Street environment.  I preface this by saying I don't understand the
> current merlin/emacs integration, so feel free to elucidate me.

More documentation should be coming soon.

> To state what I think is a goal:
>
>   A programmer should be able to go to any position in any OCaml
>   source file on their machine, press a button, and autocomplete at
>   that point (or be told that autocompletion isn't possible because
>   the necessary libraries haven't been compiled).
>
> The first thing I think this implies is that we need to keep a merlin
> installed for each ocaml that we install (i.e. /janelibs/ocaml-*/).
> This is needed so that the merlin is compatible with the source and
> object files for the project.  The merlin editor integration can
> determine the correct merlin to run by looking at the .omake-ocaml-bin
> at the root of the repo as generated by the build system.  Presumably
> our deployment of merlin would put "ocamlmerlin" and related
> executables in the same bin as ocamlc, ocamlopt, etc., or in some
> fixed directory relative to that bin.
>

Having one merlin per ocaml version seems to be the way to go.

We have been discussing this issue on
[https://github.com/def-lkb/merlin/issues/44]: we could support
different versions by always using the latest typer, converting
cmi/cmt files, and relying on backward compatibility.

But this approach is likely to be fragile, at least at the beginning,
and adds a transformation pass that may need further caching.

To select the right merlin binary, we could make the build system
generate a symlink to the executable at the same time it produces the
".merlin" file.
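
As a rough illustration of the symlink idea, here is a minimal sketch of
what an editor integration might do. The link name ".merlin-bin" is
hypothetical (nothing in merlin or the build system defines it), and
walking up the tree is only an assumption for the case where the
current directory has no ".merlin" of its own.

```python
import os

MERLIN_FILE = ".merlin"
# Hypothetical name for the symlink the build system would generate.
MERLIN_LINK = ".merlin-bin"


def find_merlin_binary(start_dir):
    """Walk up from start_dir to the nearest directory holding a .merlin
    file and return the resolved target of the symlink next to it, or
    None if no such directory or link exists."""
    current = os.path.abspath(start_dir)
    while True:
        if os.path.isfile(os.path.join(current, MERLIN_FILE)):
            link = os.path.join(current, MERLIN_LINK)
            if os.path.islink(link):
                return os.path.realpath(link)
            return None
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return None
        current = parent
```

The editor would then launch whatever path this returns, and fall back
to its default merlin when the result is None.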

> Next, we need merlin to understand the libraries that are in scope in
> a given directory.  I think this is a fairly straightforward matter of
> having jenga output a .merlin for each directory that contains OCaml
> files.  That .merlin would be complete, i.e. not require any recursion
> up the directory tree.  The contents of the .merlin would be the
> OCAML_LIBRARIES plus the libraries as recursively required by
> OCAML_INTERFACES (i.e. whatever the build system is already making
> available when compiling source files in that directory).

Yes.
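
For concreteness, a sketch of what the build system could emit per
directory, assuming the "S" (source path) and "B" (build path)
directives of the ".merlin" format. The directory names in the example
call are made up, and how the build system computes the list of
libraries is left out.

```python
def render_dot_merlin(source_dirs, build_dirs):
    """Return the text of a .merlin file: one 'S' line per source
    directory and one 'B' line per directory of compiled artifacts."""
    lines = ["S " + d for d in source_dirs]
    lines += ["B " + d for d in build_dirs]
    return "\n".join(lines) + "\n"


# Example with hypothetical paths: the directory itself plus one library.
print(render_dot_merlin(["."], [".", "../core/lib"]), end="")
```

Because every path is listed explicitly, such a file is complete on its
own and needs no lookup further up the directory tree.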

> Next, I speculate that for performance reasons, we need a merlin
> server to cache information so that it can quickly reconstruct the
> environment at a given program point.  It might cache:
>
>   * the environment for each library
>   * the environment that is the union of libraries imported for a
>     directory
>   * even more, perhaps the imported libraries plus all of the files in
>     a directory
>
> A natural architecture for this is to have a merlin server process for
> each project (i.e. per jengaroot).  When one first visits a project,
> the merlin process would be created.  As one queries for
> autocompletion in files in the project, that merlin process answers
> the queries, using and updating its cache.

Did you observe a slowdown for some workload?

In the current version, cmi files are cached when first read
(inheriting the behavior of the OCaml compiler), and on receiving
"SIGUSR1" merlin reloads all cmi files with a newer mtime. The build
system sends this signal to merlin when a build finishes.
This is fast enough that it doesn't feel slow at all for my particular
workflow (this is subjective, of course).
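
The caching scheme can be sketched as follows. This is not merlin's
code, only a small model of the same idea: the class and function names
are invented for illustration, and file contents stand in for parsed
cmi data.

```python
import os
import signal


class MtimeCache:
    """Cache file contents, remembering the mtime seen at load time."""

    def __init__(self):
        self._entries = {}  # path -> (mtime, contents)

    def _load(self, path):
        mtime = os.stat(path).st_mtime
        with open(path, "rb") as f:
            data = f.read()
        self._entries[path] = (mtime, data)
        return data

    def get(self, path):
        """Return cached contents, reading the file only on first use."""
        if path in self._entries:
            return self._entries[path][1]
        return self._load(path)

    def reload_stale(self):
        """Re-read every cached file whose mtime is newer than recorded.
        Returns the list of reloaded paths."""
        reloaded = []
        for path, (mtime, _) in list(self._entries.items()):
            if os.stat(path).st_mtime > mtime:
                self._load(path)
                reloaded.append(path)
        return reloaded


def install_reload_handler(cache):
    """Refresh the cache whenever the process receives SIGUSR1 (Unix only)."""
    signal.signal(signal.SIGUSR1, lambda signum, frame: cache.reload_stale())
```

Only files that were already read are checked, so the cost of a reload
grows with the number of cached files rather than with the size of the
project. A cached file that has since been deleted would raise in
reload_stale; a real implementation would have to handle that case.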

The slowest thing is retyping the current buffer, where most of the
time is spent on "open Core.Std". This has been optimized in the
experimental branch, which we plan to merge soon.

> To support multiple simultaneous projects, one needs to manage a set
> of merlin processes (with possibly different merlin versions).  It is
> unclear whether the process management should be written separately
> for each editor, or whether we should write a "meta-merlin" in OCaml
> that presents a unified command-line interface that could then be used
> by any editor.  At Jane Street, we need to support emacs and vim, so
> there is at least some advantage to a meta-merlin written in OCaml.
> But if the process management is simple enough in each editor, then
> perhaps an OCaml meta-merlin isn't worth it.
>
> We already have an analog of meta-merlin at Jane Street -- it's called
> omake-server, which manages a collection of per-project build
> processes.  I chatted with Pete about the possibility of implementing
> meta-merlin.  Roughly, we think that having omake-server has worked
> out well, although there are a number of ways in which the
> implementation is too complicated.  So, if we write meta-merlin, we'd
> like to start with a new codebase, with heavy involvement from Pete to
> avoid some of the design mistakes of omake-server.
>
>
> That's probably enough for now.  Hopefully people who understand the
> merlin architecture can speak up so we can all understand it clearly,
> and we can then decide how to proceed to get a merlin that works
> smoothly for all Jane Street devs.

The current implementation can make use of multiple processes (managed
by the editor). But the setup you suggest seems much more complicated,
and I hardly see the benefits.
Is this only for performance reasons?
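
To give an idea of how little the editor-side management involves, here
is a sketch of one process per project root. The class is invented for
illustration; the binary name "ocamlmerlin" comes from the discussion
above, and talking to it over stdin/stdout pipes is an assumption of
the sketch.

```python
import subprocess


class ProjectProcesses:
    """Keep at most one live process per project root, started lazily."""

    def __init__(self, spawn):
        self._spawn = spawn  # callable: project root -> process handle
        self._by_root = {}

    def get(self, root):
        """Return the process for root, (re)starting it if it is missing
        or has already exited."""
        proc = self._by_root.get(root)
        if proc is None or proc.poll() is not None:
            proc = self._spawn(root)
            self._by_root[root] = proc
        return proc


def spawn_merlin(root, binary="ocamlmerlin"):
    """Start one merlin process in root, with pipes for requests and replies."""
    return subprocess.Popen(
        [binary], cwd=root, stdin=subprocess.PIPE, stdout=subprocess.PIPE
    )
```

An editor plugin would create ProjectProcesses(spawn_merlin) once and
call get with the project root of the current buffer. Supporting
several merlin versions would only mean passing a different binary path
for each root.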

Frédéric