[ocaml-platform] on the need and design of OCaml namespaces

Christophe TROESTLER Christophe.Troestler at umons.ac.be
Tue Feb 26 22:26:04 GMT 2013


On Tue, 26 Feb 2013 12:30:15 +0000, Leo White wrote:
> 
> Ignoring the implementation issues for now, consider the run-time
> semantics of the module system.

Thanks for that explanation.  I have some questions though.

> At run-time a module is a record. Initialising a module involves
> initialising every component of the module and placing them in this
> record. Initialising these components can involve executing arbitrary
> code; in fact the execution of an OCaml program is simply the
> initialisation of all its modules.
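
To make the quoted point concrete, here is a minimal sketch of the
current eager behaviour.  The names Lib, A, B and the log reference
are hypothetical and only serve the illustration.

```ocaml
(* Hypothetical library with two submodules.  [log] records the order
   in which top-level code runs. *)
let log : string list ref = ref []

module Lib = struct
  module A = struct
    let () = log := "A" :: !log   (* top-level side effect *)
    let x = 1
  end
  module B = struct
    let () = log := "B" :: !log   (* runs even if B is never used *)
    let y = A.x + 1
  end
end

(* Only Lib.A is accessed here, yet B's top-level code has already run. *)
let () =
  Printf.printf "%d; initialised: %s\n" Lib.A.x
    (String.concat "," (List.rev !log))
```

Both submodules are initialised before the last line runs, whether or
not the program refers to them.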

How about doing that initialization in a more lazy fashion?
Initializing a module would only create the top level components:
toplevel values and modules¹, the modules being only "pointers to
NULL".  When accessing a submodule, the compiler would take note to
initialize that submodule (and link the corresponding cmo/cmx for
packed modules²).  If the module is used in a functor or as a first
class value, then all submodules present in the signature are
initialized (this could be on the spot — especially for the toplevel —
or moved to the place where the module is initialized if that makes a
difference).  To avoid runtime checks, some static analysis is
required to make sure that, when a value is used, the corresponding
submodule has been properly initialized but that may not be too hard.
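
A rough way to picture this is to emulate it by hand.  The sketch
below assumes Lazy.t and first-class modules as stand-ins for what the
compiler would do; it pays for a run-time check on each access, which
is precisely what the static analysis above is meant to avoid.  All
names are hypothetical.

```ocaml
(* Assumption: laziness is modelled by hand with Lazy.t; the proposal
   would have the compiler arrange this statically instead. *)
let log : string list ref = ref []

module type A = sig val x : int end
module type B = sig val y : int end

(* The "pack": each submodule slot starts as an unevaluated thunk,
   playing the role of the "pointer to NULL". *)
module Lib = struct
  let a : (module A) Lazy.t = lazy (
    log := "A" :: !log;
    (module struct let x = 1 end : A))
  let b : (module B) Lazy.t = lazy (
    log := "B" :: !log;
    let module A = (val (Lazy.force a) : A) in   (* B needs A *)
    (module struct let y = A.x + 1 end : B))
end

(* Accessing A initialises A only; B is left untouched. *)
let x = let module A = (val (Lazy.force Lib.a) : A) in A.x

let () =
  Printf.printf "x = %d; initialised: %s\n" x
    (String.concat "," (List.rev !log))
```

Forcing Lib.b later would initialise B, and A is not initialised a
second time since its thunk has already been forced.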

There are two problems that I can see with this but I think namespaces
have the same ones if they are to be convenient.  The first is about
the initialization code in submodules.  As you say, loading a library
should execute the "toplevel code" of each submodule.  Since this is
rather infrequent, when building the module record, one may add all
module paths containing submodules with initialization code.  That way
the current semantics are preserved and for huge packed modules many
components should still go away.  However, one could argue that if one
does not reference the module at all, its initialization code should
not be executed (like the behavior one gets when adding cm[x]a on the
command line).

The second problem is that some toplevel values may need some submodules
to be initialized (because they use these submodules).  Some analysis
is thus required to add these submodules during the record
initialization.  The same is true when one initializes a submodule that
was "NULL" before: other submodules may need to be initialized first.
This analysis may not be too hard to do; indeed, it is already
basically present in ocamldep.
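
As an illustration of the kind of analysis meant here, the sketch
below computes which submodules must be initialised, and in which
order, before a given one can be used.  The dependency table is
hypothetical hand-written data (in practice it would come from an
ocamldep-like pass), and the graph is assumed to be acyclic.

```ocaml
(* Hypothetical dependency table: submodule -> submodules it uses. *)
let deps = [ "A", []; "B", ["A"]; "C", ["B"]; "D", [] ]

(* Return the submodules to initialise, dependencies first, so that
   [root] becomes usable.  Assumes the dependency graph is acyclic. *)
let init_order root =
  let rec visit done_ m =
    if List.mem m done_ then done_
    else
      let ds = try List.assoc m deps with Not_found -> [] in
      let done_ = List.fold_left visit done_ ds in
      m :: done_
  in
  List.rev (visit [] root)

(* Prints "A B C": D is not needed by C and is left uninitialised. *)
let () = print_endline (String.concat " " (init_order "C"))
```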

¹ As an added bonus, one does not have to load the entire module
signature (≈4Mb for Core) — only keep pointers to the signatures of
submodules — which should speed up compilation.  This is mostly an
orthogonal issue but it could benefit from the same laziness framework.

² This could be done almost for free if packing renamed the
modules to some internal names, gather them in a cm[x]a (and not a
cmo/cmx as of now) and add a module aliasing these internal names to
the desired names for the outside world.
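
A sketch of what such a renaming could look like at the source level.
The internal names Lib__A and Lib__B are hypothetical; in a real build
they would be separate compilation units gathered in the cm[x]a, not
modules in one file.  Whether the linker could then drop unused units
is an assumption about the proposed toolchain.

```ocaml
(* Hypothetical internal names that packing would generate. *)
module Lib__A = struct let x = 1 end
module Lib__B = struct let y = Lib__A.x + 1 end

(* The alias module exposing the desired names to the outside world. *)
module Lib = struct
  module A = Lib__A
  module B = Lib__B
end

(* Users keep writing Lib.A and Lib.B as with a packed module. *)
let () = Printf.printf "%d %d\n" Lib.A.x Lib.B.y
```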

> Any attempt to overcome the problems with pack, whilst still
> maintaining the illusion that the "pack" is a normal module, would
> result (at the very least) in one of the following unhealthy
> situations:
> 
> - The module type of the "pack" module would depend on which of its
>   components were accessed by the program.
> 
> - Any use of the "pack" module other than as a simple container
>   (e.g. "module CS = Core.Std") could have a dramatic effect on what was
>   linked into the program and potentially on the semantics of the
>   program.

Maybe one of these problems occurs with the rough proposal described
above, but it is not immediately clear to me.  Could you say which?

> Namespaces are basically modules that can only be used as a simple
> container. This means that they do not need a corresponding record at
> run-time (or any other run-time representation). This avoids the
> problems with pack as well as enabling other useful features
> (e.g. open definitions).

I agree that openness may be a desired feature.  Gabriel however said
that it is not essential.  Moreover, one could imagine that a module
wanting to be added to a hierarchy (say "Data") could be repacked
into the module Data at installation time.  Granted this is not very
nice but this is just intended to show that there are short term
possibilities.

Best,
C.

