[ocaml-platform] on the need and design of OCaml namespaces

Leo White lpw25 at cam.ac.uk
Fri Feb 22 14:06:22 GMT 2013


>  - Performance:  looking up and opening files takes time, especially 
>under bad OS such as Windows.

There are a number of possible solutions to spurious opens. The simplest 
is to look for "Core.Std.Mutex" only in directories that contain the 
special "Core.Std" .cmi file (mentioned previously as a solution to typos 
in open statements). These special .cmi files could also be extended to 
include a list of the modules in the current directory that belong to that 
namespace, which would prevent spurious reads entirely.
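
As a rough sketch of the marker-file idea, the lookup could be modelled as 
below. The record type, the directory paths and the function name are 
hypothetical and only illustrate the filtering; this is not how the compiler 
is implemented.

```ocaml
(* Hypothetical model: a search-path directory may carry a marker naming a
   namespace together with the list of that namespace's members found there. *)
type dir = {
  path : string;
  marker : (string * string list) option;  (* namespace, member modules *)
}

(* Directories in which [modname] of namespace [ns] should be read.
   Directories without a matching marker are never opened. *)
let candidates ~ns ~modname search_path =
  List.filter
    (fun d ->
      match d.marker with
      | Some (n, members) -> n = ns && List.mem modname members
      | None -> false)
    search_path

(* Example search path (made-up directories). *)
let search_path = [
  { path = "/lib/stdlib"; marker = None };
  { path = "/lib/core"; marker = Some ("Core.Std", ["Mutex"; "Thread"; "Date"]) };
  { path = "/lib/other"; marker = Some ("Other", ["Mutex"]) };
]

let () =
  candidates ~ns:"Core.Std" ~modname:"Mutex" search_path
  |> List.iter (fun d -> print_endline d.path)
```

With the member list in the marker, a directory that holds some other 
mutex.cmi is skipped without being read.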

>  - It prevents from putting .cmi files from many libraries in the same 
>directory, which is sometimes useful (to simplify deployment; to control 
>precisely the set of .cmi available for a given file; to improve 
>performance by avoiding repeated lookups in many directories).

I think that this ability is of dubious value, and giving it up is not 
really a big loss.

>  - Spurious dependencies: technically, since the compiler will open 
>them, all x.cmi files in the search path should be considered as 
>dependencies for a module which refers to X.  This is necessary to have 
>a correct notion of dependency for the build system (formally, each 
>x.cmi could become the "correct one" if its namespace changes in the 
>source file; and since all these files are opened, they should not be 
>overwritten in parallel).  This complexifies the build system, 
>especially for parallel builds, and creates a risk of dependency cycles.

The solutions above should solve this problem.

Even without those solutions, there is no need for a proper dependency, 
since changing the namespace would result in two files with the same 
name and namespace. This is basically an error, so dependencies should 
not be expected to be correct. It would not be difficult for an 
OCaml-specific build system to detect the existence of two files with the 
same name and namespace and raise an error.
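
A minimal sketch of such a check, assuming the build system already knows 
the namespace and module name of every compiled interface on the search 
path. The record type, the exception and the file names are hypothetical.

```ocaml
(* Hypothetical description of a compiled interface found on the search path. *)
type unit_info = { ns : string; name : string; file : string }

(* namespace, module name, first file, second file *)
exception Duplicate_module of string * string * string * string

(* Raise [Duplicate_module] if two files claim the same namespace and name. *)
let check_unique units =
  let seen = Hashtbl.create 16 in
  List.iter
    (fun u ->
      match Hashtbl.find_opt seen (u.ns, u.name) with
      | Some other -> raise (Duplicate_module (u.ns, u.name, other, u.file))
      | None -> Hashtbl.add seen (u.ns, u.name) u.file)
    units

let () =
  let units = [
    { ns = "Core_std"; name = "Mutex"; file = "core/mutex.cmi" };
    { ns = "Other"; name = "Mutex"; file = "other/mutex.cmi" };
    { ns = "Core_std"; name = "Mutex"; file = "stale/mutex.cmi" };
  ] in
  match check_unique units with
  | () -> print_endline "no duplicates"
  | exception Duplicate_module (ns, name, f1, f2) ->
    Printf.printf "error: %s.%s defined by both %s and %s\n" ns name f1 f2
```

Two files named mutex.cmi are accepted as long as their namespaces differ; 
only the pair that agrees on both name and namespace is reported.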

There is possibly the need for some kind of partial dependency for parallel 
builds. This is more like a lock than a dependency, so there should be no 
question of circular dependencies. I'm not really familiar with how 
parallel file accesses work on different file systems, but perhaps the 
compiler could lock ".cmi" files before reading and writing. This might be 
a good idea more generally for cases where dependencies have not been 
correctly calculated.

>That's why I've proposed to allow specifying mapping between references 
>to external modules in dedicated files.  We could have a file 
>core_std.ns (probably shipped with Core) with this content:
>
>Mutex = Core_std_mutex
>Thread = Core_std_thread
>Date = Core_std_date
>
>and just a reference in the source code (or on the command-line):
>
>   open namespace Core_std
>
>which would load core_std.ns and use the corresponding module renaming 
>in the rest of the module.

There is very little difference between that suggestion and having 
a core_std.ns file containing:

  Mutex;
  Thread;
  Date;

and using that as a (partial) declaration of a Core_std namespace, except 
that your suggestion requires giving every file a unique long filename. So 
I don't really see the particular benefit of using long filenames.
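
To make the comparison concrete, here is a small sketch that reads both 
forms of core_std.ns and checks that they expose the same names. The parsing 
rules (one entry per line, "=" or a trailing ";") are assumptions read off 
the two examples, not a specification of either format.

```ocaml
(* Mapping form: one "Alias = Long_module_name" entry per line. *)
let parse_mapping lines =
  List.filter_map
    (fun line ->
      match String.split_on_char '=' line with
      | [alias; target] -> Some (String.trim alias, String.trim target)
      | _ -> None)
    lines

(* Declaration form: one "Member;" entry per line. *)
let parse_members lines =
  List.filter_map
    (fun line ->
      match String.trim line with
      | "" -> None
      | s when s.[String.length s - 1] = ';' ->
        Some (String.trim (String.sub s 0 (String.length s - 1)))
      | _ -> None)
    lines

let () =
  let mapping =
    parse_mapping
      ["Mutex = Core_std_mutex"; "Thread = Core_std_thread"; "Date = Core_std_date"]
  in
  let members = parse_members ["Mutex;"; "Thread;"; "Date;"] in
  (* Both files make the same short names visible; the mapping form
     additionally requires a unique long name for each of them. *)
  assert (List.map fst mapping = members)
```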

There is also a more general problem with any solution like this, which 
tries to define namespaces (or sets of aliases) in a single file. It is 
difficult to use the namespace from inside the modules that are within the 
namespace. For example, if I use:

  open namespace Core_std

from within mutex.ml then it will attempt to open itself.

This is particularly problematic for language extensions, because they want 
to generate code like:

  Core_std.Mutex.lock

but this breaks if the extension is used within another module in Core_std.
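
The self-reference can be illustrated with the alias table from the quoted 
core_std.ns example. This is only a model: the function name is 
hypothetical, and compilation units are represented by plain strings.

```ocaml
(* Alias table that "open namespace Core_std" would load, copied from the
   quoted core_std.ns example. *)
let core_std = [
  "Mutex", "Core_std_mutex";
  "Thread", "Core_std_thread";
  "Date", "Core_std_date";
]

(* Aliases that point back at the unit currently being compiled. *)
let self_references ~current_unit aliases =
  List.filter (fun (_, target) -> target = current_unit) aliases

let () =
  (* Compiling the unit behind Mutex: one alias refers to the unit itself,
     whose interface does not exist yet. *)
  assert (self_references ~current_unit:"Core_std_mutex" core_std
          = ["Mutex", "Core_std_mutex"]);
  (* Compiling an unrelated client: no alias is self-referential. *)
  assert (self_references ~current_unit:"Client" core_std = [])
```

Because the table lives in a single file, every member of the namespace 
that opens it sees an alias to itself.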

The solution to these problems is to have membership of a namespace encoded 
in the module itself.


