[ocaml-platform] on the need and design of OCaml namespaces

Tue Feb 26 10:12:04 GMT 2013

Christophe TROESTLER <Christophe.Troestler at umons.ac.be> wrote:
> It seems to me that the openness of namespaces is the only feature I
> have seen mentioned that modules do not have.  But is the openness of
> namespaces something considered useful?  What problem does this solve?

If you have a Haskell-ish view of module hierarchies as functional
classification rather than provenance, e.g.
  Data.List
  Data.Array
  Data.Array.Mutable
  Foreign.ForeignPtr
  Control.Concurrent.IO
  ...

then having a "merge {Data.{List,Array.Mutable}} with
{Data.{Array,String}}" is important. This is something open structures
naturally have, and that is not a good fit for closed structures. The
discussion for namespaces (before it landed on this list) insisted on
openness as a distinctive aspect for a while, but then we realized
that, at the moment of compilation of a single module, all the
information about the compilation environment is known, so you can
have a closed view of the world -- even if the world may change
between compilations. This idea that "once everything is decided you
are in a closed world again" allows us to present (open) namespaces as
(closed) modules at the source code level if deemed desirable -- see
http://gallium.inria.fr/~scherer/namespaces/pack_et_functor_pack.html

Summary: "open" is not essential, but "open merge" is a useful
primitive to have (even in a closed world). You can always locally
assume that the world is closed, and (locally) closed structures are
simpler to deal with.
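
To make the "merge in a closed world" idea concrete, here is a minimal
sketch using only ordinary modules. The names Provider_a, Provider_b
and the functions inside them are hypothetical and chosen for
illustration only; this is not the design proposed in the linked note,
just a hand-written merge of two closed structures that both
contribute to Data.

```ocaml
(* Hypothetical provider of Data.{List, Array.Mutable}. *)
module Provider_a = struct
  module List = struct
    let singleton x = [ x ]
  end
  module Array = struct
    module Mutable = struct
      let make n x = Stdlib.Array.make n x
    end
  end
end

(* Hypothetical provider of Data.{Array, String}. *)
module Provider_b = struct
  module Array = struct
    let length = Stdlib.Array.length
  end
  module String = struct
    let shout s = Stdlib.String.uppercase_ascii s
  end
end

(* Once both providers are known, the merge can be written down as a
   plain closed module. Array has to be merged one level deeper,
   because both providers contribute to it. *)
module Data = struct
  module List = Provider_a.List
  module String = Provider_b.String
  module Array = struct
    include Provider_b.Array
    module Mutable = Provider_a.Array.Mutable
  end
end

let () =
  let a = Data.Array.Mutable.make 3 'x' in
  Printf.printf "%d %s\n" (Data.Array.length a) (Data.String.shout "ok")
```

The point of the sketch is that the merge has to be spelled out by
hand here, and redone whenever a provider is added; an "open merge"
primitive would do this step for you.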

> With no doubt, I understand even less about the compiler internals
> than you do.  Nonetheless, shouldn't these questions receive a
> definitive answer before we speak about namespaces?  I have heard
> neither that these things are hard to do.  And, if they are, it would
> be interesting to understand what features of modules hamper their
> efficiency as simple containers.  That way, it seems to me, either the
> technical problems will get solved or what kind of "stripped modules"
> namespaces must be will emerge naturally.

The reason why this isn't done more is that the people who know the
compiler internals well don't want to be bothered with the
namespace discussion (except Alain, who knows these issues very well
and is active, if maybe a bit conservatively so, in the
discussions), and these questions are full of tedious implementation
details that even the implementors don't keep all in mind at the same
time. Some partial answers:

- The reason why packing everything in a single .cmo (compilation
unit) bloats linking results is that dependencies are handled at a
granularity of the compilation unit. Relying (dynamically) on the
module Foo in your source unit will result in a mention of Foo's
internal name in your compilation unit, and Foo will need to be linked
as a whole. To reduce the granularity in a principled way, one could
decide to track dependencies at the definition/structure-item level
rather than the whole-unit level, but that would be a fairly invasive
implementation change that has so far been resisted, with unclear (and
potentially scary) implications, for example for compilation time.

- One should keep in mind that while bytecode compiled files (.cmo and
.cmi) are easy to deal with, native objects (.cmx and friends) are a
pain because of portability issues. The reason why we have this
two-step "-for-pack" then "-pack" process for packing is that
implementors found that (at the time) MacOS systems couldn't be relied
on to manipulate and change .cmx files. That's the reason why you need
to prepare for packing in advance (-for-pack) instead of simply
packing regularly compiled modules. This is the kind of limitation
you have to work with in linker-related settings.
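
To illustrate the first point (unit-level dependency granularity), here
is a toy model, not a description of the actual linker. The unit
names, item names and sizes are made up, and dependencies between
units are ignored; the only rule modelled is "if you use one item of a
unit, the whole unit is linked".

```ocaml
(* A compilation unit with the items it defines and a made-up size
   for each item. *)
type comp_unit = { unit_name : string; items : (string * int) list }

let unit_size u = List.fold_left (fun acc (_, size) -> acc + size) 0 u.items

(* Unit-level linking: any unit providing at least one used item is
   linked as a whole. *)
let linked_size units used =
  units
  |> List.filter (fun u ->
         List.exists (fun (item, _) -> List.mem item used) u.items)
  |> List.fold_left (fun acc u -> acc + unit_size u) 0

(* Three separately compiled units (hypothetical). *)
let separate_units =
  [ { unit_name = "Foo"; items = [ ("Foo.f", 10) ] };
    { unit_name = "Bar"; items = [ ("Bar.g", 200) ] };
    { unit_name = "Baz"; items = [ ("Baz.h", 3000) ] } ]

(* The same items packed into a single compilation unit. *)
let packed_unit =
  [ { unit_name = "Pack";
      items = List.concat (List.map (fun u -> u.items) separate_units) } ]

let () =
  Printf.printf "separate: %d, packed: %d\n"
    (linked_size separate_units [ "Foo.f" ])
    (linked_size packed_unit [ "Foo.f" ])
```

With separate units, using only Foo.f links 10 size units; with
everything packed into one unit, the same use links all 3210. Tracking
dependencies at the item level would amount to replacing the filter on
units by a filter on items.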

On Mon, Feb 25, 2013 at 11:37 PM, Christophe TROESTLER
<Christophe.Troestler at umons.ac.be> wrote:
> On Mon, 25 Feb 2013 16:50:33 -0500, Yaron Minsky wrote:
>>
>> On Mon, Feb 25, 2013 at 3:43 PM, Christophe TROESTLER
>> <Christophe.Troestler at umons.ac.be> wrote:
>> > On Mon, 25 Feb 2013 14:16:03 -0500, Yaron Minsky wrote:
>> >>
>> >> On Mon, Feb 25, 2013 at 1:04 PM, Daniel Bünzli
>> >> <daniel.buenzli at erratique.ch> wrote:
>> >> >
>> >> >
>> >> > Le vendredi, 22 février 2013 à 22:51, Xavier Clerc a écrit :
>> >> >
>> >> >> So, as of today, we have :
>> >> >> - "archives" (cma / cmxa) allowing to gather modules but without
>> >> >> naming (at the language level) the gathering ;
>> >> >> - "packs" allowing to gather modules into a module.
>> >> >> I regard namespaces as gathering modules into a named entity but
>> >> >> without creating a module. Hence, it is a new beast, different from
>> >> >> archives and packs.
>> >> >
>> >> > So basically a new concept is introduced because "pack" is not
>> >> > technically satisfying. That's not the way I would like the language
>> >> > I program in to be designed. I'd rather see the problems pack has
>> >> > fixed which I'm sure could be done by allowing archives to be named
>> >> > at the language level as a module.
>> >>
>> >> You might be right, but I think there's a deep issue here that
>> >> shouldn't be dismissed so lightly.  The argument is that modules are
>> >> simply too powerful to be used as the complete solution to namespace
>> >> management.  Deciding that the only principled approach is to always
>> >> pick the most powerful, most general purpose primitive is attractive,
>> >> but not always sane...
>> >
>> > That's an interesting take on this.  Would you care to elaborate on
>> > why a module approach may not be sane?  Is it from a semantic or an
>> > implementation point of view?
>>
>> To be clear: I'm not an expert on the internals of the compiler, and
>> am mostly repeating claims made by others who are.
>>
>> But my understanding is roughly this: we want namespaces to behave
>> differently than modules currently do: in particular, we need to be
>> able to depend on only a subset of a namespace, and to track
>> dependencies within the different components of a namespace.
>>
>> One could imagine building these features into modules directly, but
>> this is hampered by the fact there is a rich set of operators on
>> modules, for example, you can apply a functor to a module.
>>
>> It's of course possible that either (a) one could naturally add these
>> features to modules directly and thus neatly avoid the need for
>> another language feature; or (b) that one could have two classes of
>> modules whose implementations differ under the skin but that present
>> themselves almost identically to users.
>>
>> But I am unaware of anyone who understands the compiler internals who
>> believes either (a) or (b) is reasonably easy to do.
>
> With no doubt, I understand even less about the compiler internals
> than you do.  Nonetheless, shouldn't these questions receive a
> definitive answer before we speak about namespaces?  I have heard
> neither that these things are hard to do.  And, if they are, it would
> be interesting to understand what features of modules hamper their
> efficiency as simple containers.  That way, it seems to me, either the
> technical problems will get solved or what kind of "stripped modules"
> namespaces must be will emerge naturally.
>
> Best,
> C.