[ocaml-platform] on the need and design of OCaml namespaces

Gabriel Scherer gabriel.scherer at gmail.com
Wed Feb 27 06:58:41 GMT 2013


I don't like your lazy initialization scheme because it changes the
observable semantics of the OCaml language when modules have
global side-effects. Currently, the effects are evaluated in the
linking order of the modules, not the use order.
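
A minimal sketch of what "linking order, not use order" means. It uses
nested modules in one file as a stand-in for separately linked
compilation units, which is an assumption made for brevity; the names
`log`, `record`, `A` and `B` are hypothetical.

```ocaml
(* Records the order in which module initialisers run. *)
let log : string list ref = ref []
let record name = log := name :: !log

(* A comes first in definition (here: "linking") order. *)
module A = struct
  let () = record "A"   (* global side effect at initialisation *)
  let value = 1
end

module B = struct
  let () = record "B"
  let value = 2
end

(* B is used before A, yet both were already initialised, A first. *)
let sum = B.value + A.value

let init_order = List.rev !log
```

Here `init_order` follows the order in which the modules are defined,
whatever the order in which their values are later used.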

Note that the main difficulty of your proposal is to track
dependencies at a structure item level, rather than at a compilation
unit level as is currently done. Laziness is arguably bad and in any
case not needed if you have this finer granularity -- but as I argued
this would have large, and possibly unpleasant, implementation
implications.

On Tue, Feb 26, 2013 at 11:26 PM, Christophe TROESTLER
<Christophe.Troestler at umons.ac.be> wrote:
> On Tue, 26 Feb 2013 12:30:15 +0000, Leo White wrote:
>>
>> Ignoring the implementation issues for now, consider the run-time
>> semantics of the module system.
>
> Thanks for that explanation.  I have some questions though.
>
>> At run-time a module is a record. Initialising a module involves
>> initialising every component of the module and placing them in this
>> record. Initialising these components can involve executing arbitrary
>> code; in fact the execution of an OCaml program is simply the
>> initialisation of all its modules.
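
[Illustration of the "module as a record" view quoted above. This is
only a source-level analogy and not a description of the compiler's
actual run-time layout; `COUNTER`, `Counter`, `counter_record` and
`as_record` are hypothetical names.]

```ocaml
module type COUNTER = sig
  val start : int
  val next : unit -> int
end

let initialised = ref false

module Counter : COUNTER = struct
  (* Arbitrary code executed while the module is being initialised. *)
  let () = initialised := true
  let start = 10
  let state = ref start
  let next () = incr state; !state
end

(* A rough record analogue: one field per exported component. *)
type counter_record = { start : int; next : unit -> int }

(* Unpack a first-class module and copy its components into a record. *)
let as_record (module C : COUNTER) = { start = C.start; next = C.next }

let r = as_record (module Counter)
```

By the time `r` is built, the initialisation code of `Counter` has
already run, and the record fields share state with the module.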
>
> How about doing that initialization in a more lazy fashion?
> Initializing a module would only create the top level components:
> toplevel values and modules¹, the modules being only "pointers to
> NULL".  When accessing a submodule, the compiler would take note to
> initialize that submodule (and link the corresponding cmo/cmx for
> packed modules²).  If the module is used in a functor or as a first
> class value, then all submodules present in the signature are
> initialized (this could be on the spot — especially for the toplevel —
> or moved to the place where the module is initialized if that makes a
> difference).  To avoid runtime checks, some static analysis is
> required to make sure that, when a value is used, the corresponding
> submodule has been properly initialized but that may not be too hard.
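
[A rough emulation of the lazy scheme quoted above, using `Lazy` and
first-class modules. The proposal is about compiler support with
static analysis instead of run-time checks, so this is only a sketch
under that caveat; `S`, `a`, `b`, `c` and `value_of` are hypothetical
names.]

```ocaml
let log : string list ref = ref []
let record name = log := name :: !log

module type S = sig val value : int end

(* Each submodule slot starts unevaluated, like a "pointer to NULL". *)
let a : (module S) Lazy.t =
  lazy (record "A"; (module struct let value = 1 end : S))
let b : (module S) Lazy.t =
  lazy (record "B"; (module struct let value = 2 end : S))
let c : (module S) Lazy.t =
  lazy (record "C"; (module struct let value = 3 end : S))

(* Accessing a component forces initialisation of its submodule. *)
let value_of slot =
  let (module M : S) = Lazy.force slot in
  M.value

(* Sequenced explicitly: operand evaluation order is unspecified. *)
let vb = value_of b
let va = value_of a
let sum = vb + va

let init_order = List.rev !log        (* order of first use *)
let c_initialised = Lazy.is_val c     (* c was never accessed *)
```

In this emulation the side effects run in use order (`B` before `A`)
and the initialiser of `c` never runs at all, which is the observable
change in semantics discussed at the top of this message.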
>
> There are two problems that I can see with this but I think namespaces
> have the same ones if they are to be convenient.  The first is about
> the initialization code in submodules.  As you say, loading a library
> should execute the "toplevel code" of each submodule.  Since this is
> rather infrequent, when building the module record, one may add all
> module paths containing submodules with initialization code.  That way
> the current semantics are preserved and for huge packed modules many
> components should still go away.  However, one could argue that if one
> does not reference the module at all, its initialization code should
> not be executed (like the behavior one gets when adding cm[x]a on the
> command line).
>
> The second problem is that some toplevel values may need some submodules
> to be initialized (because they use these submodules).  Some analysis
> is thus required to add these submodules during the record
> initialization.  The same is true when one initializes a submodule that
> was "NULL" before — other submodules may need to be initialized first.
> This analysis may not be too hard to do; indeed it is already
> basically present in ocamldep.
>
> ¹ As an added bonus, one does not have to load the entire module
> signature (≈4Mb for Core) — only keep pointers to the signatures of
> submodules — which should speed up compilation.  This is mostly an
> orthogonal issue but it could benefit from the same laziness framework.
>
> ² This could be done almost for free if packing would rename the
> modules to some internal names, gather them in a cm[x]a (and not a
> cmo/cmx as of now) and add a module aliasing these internal names to
> the desired names for the outside world.
>
>> Any attempt to overcome the problems with pack, whilst still
>> maintaining the illusion that the "pack" is a normal module, would
>> result (at the very least) in one of the following unhealthy
>> situations:
>>
>> - The module type of the "pack" module would depend on which of its
>>   components were accessed by the program.
>>
>> - Any use of the "pack" module other than as a simple container
>>   (e.g. "module CS = Core.Std") could have a dramatic effect on what was
>>   linked into the program and potentially on the semantics of the
>>   program.
>
> Maybe one of these problems occurs with the rough proposal described
> above but it is not immediately clear to me.  Could you say which?
>
>> Namespaces are basically modules that can only be used as a simple
>> container. This means that they do not need a corresponding record at
>> run-time (or any other run-time representation). This avoids the
>> problems with pack as well as enabling other useful features
>> (e.g. open definitions).
>
> I agree that openness may be a desired feature.  Gabriel however said
> that it is not essential.  Moreover, one could imagine that a module
> wanting to be added to a hierarchy (say "Data") could be repacked
> in the module Data at installation time.  Granted this is not very
> nice but this is just intended to show that there are short term
> possibilities.
>
> Best,
> C.

