[ocaml-platform] on the need and design of OCaml namespaces

Wed Feb 27 16:31:28 GMT 2013

Let me try to summarize the current situation about the argument
between Alain, Leo and myself.  I think Leo and I are roughly on the
same page, but I may be missing things.

- MAKING LONG NAMES AVAILABLE.  Alain prefers to have unambiguous long
  names that are usable in a first class way.  I find this mildly
  distasteful, but would be OK with it as long as it was well hidden
  from the user by default.  Long names shouldn't show up by default
  in source files, error messages or documentation.  I view this as
  quite important for usability of namespaces.

- SOURCE-LEVEL OPENS.  Alain would prefer to have namespace
  manipulations restricted to the command line, and therefore the
  build system.  He thinks of namespaces as something that should be
  used pretty rarely (or at least, there should be very few
  namespaces), and it's therefore OK to push them to the outside.

  Leo and I both believe this is a big mistake.  We expect opens to
  happen fairly commonly, and for there to be many different libraries
  that are organized as namespaces.

  Alain doubts that there would be many module-name clashes.  I
  disagree on this point as does Leo.  We use packed modules
  pervasively (for /every/ library), and as a result, we have lots of
  little namespaces, and lots of repeated names within them (names
  like Common, Protocol, Spec, Config, etc.)

  My biggest objection to having opens be at the build system level is
  that it makes your code more ambiguous.  When you do namespace
  manipulations, you very much want to see what's happening by
  inspecting the source.  We have a vigorous code review system here,
  and I don't want to start adding code review of the build rules to
  it, and this change would require that.

  Alain's claim that opens are a bad thing also seems wrong to me.
  Opens should be rare, but all of our proposals involve the
  equivalent of opening a namespace.  Alain is not saying we should
  have none of that (after all, we're all glad that Pervasives is
  opened!).  But what Alain is proposing is to make opening a
  namespace silent at the source level.  This strikes me as a grave
  error.
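
  To make the clash concern concrete, here is a minimal sketch that uses
  ordinary nested modules as stand-ins for packed libraries.  Lib_a, Lib_b
  and everything inside them are invented for illustration; this is plain
  OCaml as it exists today, not any of the namespace proposals.

```ocaml
(* Hypothetical stand-ins for two packed libraries that both happen to
   contain a module with the short name Config. *)
module Lib_a = struct
  module Config = struct
    let name = "Lib_a.Config"
  end
  module Protocol = struct
    let version = 1
  end
end

module Lib_b = struct
  module Config = struct
    let name = "Lib_b.Config"
  end
end

open Lib_a
open Lib_b

(* Both libraries export Config; the later open shadows the earlier one. *)
let resolved = Config.name

(* Protocol only exists in Lib_a, so it is still reached through open Lib_a. *)
let version = Protocol.version

(* A fully qualified path is unambiguous whatever the order of the opens. *)
let explicit = Lib_a.Config.name

let () =
  print_endline resolved;
  print_endline explicit
```

  With the two opens in the source, a reviewer can at least see which
  Config wins.  If the same opens lived only in the build rules, nothing
  in this file would say so.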

- NAMESPACES WITH VALUES.  I have argued for allowing the opening of a
  namespace to also implicitly open some modules, thus essentially
  adding values to the search path in addition to modules.  I would be
  sad to lose this feature, but I don't think it's absolutely
  essential.  It would merely add boilerplate.  Roughly speaking,
  every time a user of Core writes

     open namespace Core#Std

  instead of

     open namespace Core#Std
     open Core#Std.Common

  they're making a mistake.  I'd like to avoid this error, and I don't
  really know what the objection to the feature is, but in the worst
  case, we can add a syntax extension to work around this problem,
  using a -ppx transformer to add the open ourselves.
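
  The boilerplate can be sketched in today's OCaml by letting an ordinary
  module play the role of the namespace.  Std_ns, Int_list and the |!
  operator below are hypothetical stand-ins chosen for this sketch; the
  "open namespace Core#Std" syntax above is proposed syntax and is not
  what this code uses.

```ocaml
(* Hypothetical stand-in for a namespace: a module that contains only
   modules, so opening it brings module names into scope but no values. *)
module Std_ns = struct
  module Common = struct
    (* An invented pipe-style operator, standing in for commonly used values. *)
    let ( |! ) x f = f x
  end
  module Int_list = struct
    let sum xs = List.fold_left ( + ) 0 xs
  end
end

(* Plays the role of opening the namespace: modules become visible. *)
open Std_ns

let via_module = Int_list.sum [1; 2; 3]

(* Values still need a second open; this is the line users forget. *)
open Std_ns.Common

let via_value = [1; 2; 3] |! Int_list.sum

let () = Printf.printf "%d %d\n" via_module via_value
```

  Dropping the second open makes the use of |! fail to compile, which is
  the mistake the implicit open (or a -ppx rewriter) would prevent.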

On Wed, Feb 27, 2013 at 8:09 AM, Alain Frisch <alain.frisch at lexifi.com> wrote:
> On 02/27/2013 12:51 PM, Leo White wrote:
>>
>> Without an open statement your proposal does not scale well. The main
>> need for namespaces arises out of the fact that people choose the same
>> short names for their modules. This will still be the case for the names
>> people choose within their ".ns" files. As soon as I want to use two
>> libraries that contain modules with the same short name, your proposal
>> will have the meaning of the short name decided by the order of
>> command-line arguments. This seems very fragile.
>
>
> I want to see concrete examples!  I don't believe that it is common, in the
> same unit, to access two modules called "List" from two different libraries.
> Maybe you want to use both String.Set and Int.Set, but you're not going to
> "open" String and Int anyway.  You may also want to use Xmllight.parse and
> Xmlm.parse (although it is not clear that this will happen in the same
> module), but you're not going to open Xmllight or Xmlm globally in your
> unit.
>
>
>> I also think that open statements (even at the top of a file) are a very
>> good thing.
>
>
> If put at the top of the file, I don't see them as fundamentally different
> from command-line arguments, and the meaning of the short name is decided by
> the order of open statements, which is also quite fragile.
>
> And "open statements" (for modules, not even namespaces) are generally
> considered a dangerous feature, because they are the source of technical
> problems (dependency analysis), because they make the code harder to read
> and refactor, and because they make modules more fragile (if a module B is
> extended with more components, they can hide components with the same name
> from another module A which is opened before B in some client code).  Making
> "opens" more local is a way to reduce those problems.  I'm thus surprised by
> your claim that open statements are a very good thing.
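
The fragility described in the paragraph above can be reproduced with
plain modules.  This is only a sketch: A, B_v1, B_v2 and the client
modules are invented names, with B_v1 and B_v2 standing for two releases
of the same module B.

```ocaml
module A = struct
  let normalize s = String.lowercase_ascii s
  let parse s = "A.parse " ^ s
end

(* First release of B: it exports nothing called parse. *)
module B_v1 = struct
  let render s = "B.render " ^ s
end

(* Later release of B: it is extended with its own parse. *)
module B_v2 = struct
  let render s = "B.render " ^ s
  let parse s = "B.parse " ^ s
end

(* Client code written against the first release. *)
module Client_v1 = struct
  open A
  open B_v1
  let result = render (parse (normalize "X"))
end

(* The same client code, unchanged, built against the later release:
   parse now silently resolves to B_v2.parse because B is opened after A. *)
module Client_v2 = struct
  open A
  open B_v2
  let result = render (parse (normalize "X"))
end

let () =
  print_endline Client_v1.result;
  print_endline Client_v2.result
```

The client source is identical in both cases, yet its meaning changes.
Writing A.parse explicitly, or keeping the opens local to the expression
that needs them, avoids the surprise.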
>
>
>
>> They show which libraries the file is going to use.
>
>
> This seems to confirm that what you're aiming at is really a way to specify
> in the source code which libraries are used.  But then we should push the
> reasoning further and ensure this specification is the only one required to
> use a library.  Why should we have to pass -I / -pp / -ppx flags to the
> compiler (and specify again libraries at link time) when the information is
> already part of the source code?
>
>
>
>> Currently, a file using a library that does not use pack will simply
>> launch straight into using modules with short names that give no
>> indication of their origin.
>
>
> Many of the third-party libraries I use export a single module, which I
> never "open", and whose name is unique enough to avoid clashes (e.g.
> Postgresql, Sqlite3, Xmlm).  Some libraries use prefixes (Nethtml, with an
> internal Nethtml_scanner module; Lwt comes with Lwt_util, Lwt_condition,
> Lwt_mutex, etc).  It would have been crazy, indeed, for Nethtml to have an
> internal "Scanner" module, or for Lwt to expose a "Mutex" module.  I'm fine
> with this situation, but I can understand that in some cases, it would make
> sense, for instance, to alias Lwt_mutex to Mutex in a given project.
>
> So I don't agree with the opinion that -pack is currently the only way to
> avoid clashes!  It would be useful to get some statistics about the use of
> -pack in OPAM.
>
>
>
>> I must look in the build system to find out what they refer to. By
>> encouraging people to use namespaces, these files will instead start
>> with "open namespace Foo", and it will be obvious what libraries they
>> are using.
>
>
> So if you have a program like:
>
>  open namespace Foo
>  open namespace Bar
>
>  (* ... several hundreds of lines ... *)
>
>   .... Baz.parse ...
>
> then, yes, you know that this program uses the Foo and Bar
> libraries/namespaces, but you have no idea where Baz comes from.  This is
> fine, as long as you don't have clashes of "short" names, i.e. Foo and Bar
> are different enough to not provide top-level components of the same name.
>
>
>> As OPAM (and the platform) become more widely used I expect there to be
>> many more small convenience libraries. This will increase the number of
>> libraries that are being used within a single project, and it will be
>> useful to know from the source files themselves which files use which
>> libraries.
>
>
> I like this idea of putting more information in the source code about a
> file's "requirements" (let's say, which libraries are used).  This could be
> interpreted directly by the compiler (if there is a well-defined convention
> on how libraries are located and invoked) or by a driver such as ocamlfind.
> Currently, library dependencies are specified on the command-line (hence in
> the build system) and I agree it makes sense to allow specifying that in the
> source code.
>
>
>> I understand that for backwards compatibility it is useful to be able to
>> use files via both a namespaced name and a traditional name. However,
>> when not worried about backwards compatibility, we should instead be
>> focussed on providing a coherent story about how OCaml components are
>> named and how these names are managed. For me, this means (for a
>> non-backwards compatible library) only providing (or at least actively
>> encouraging) access to components through namespaces.
>
>
> I understand your point and I think it makes sense if "namespaces" are
> indeed to be added to the language.
>
>
>>> Worse: if we use "ocamldep -modules", this resolution has to be done
>>> by the build system, so this complex logic (which depends on the
>>> location in the source file) will have to be re-implemented in omake
>>> and other build systems around.  It is an important property that
>>> "ocamldep -modules" does not need to look for the existence of
>>> compiled units on the current tree.
>>
>>
>> Firstly, I would like to make clear that this would be no problem for
>> build systems that used makefile formatted ocamldep output.
>
>
> Indeed.  However, "ocamldep -modules" has been added for good reasons and it
> is the recommended way to use omake.  (And I don't know about ocamlbuild.)
> omake is used by some of the largest OCaml code bases around.
>
>
>> "ocamldep -modules" has always produced an over-estimate of the modules
>> that a file uses and then allowed the build system to figure out the
>> rest.
>
>
> "ocamldep -modules" does not produce more false dependencies than
> "ocamldep".  I'd rather say that "ocamldep" (without -modules) misses some
> dependencies (for generated files which are not yet present when ocamldep is
> executed).
>
>
>
>> OCamlDep would simply treat any namespace for which it could find
>> no ".ns" file as an implicit namespace.
>
>
> I have to admit that I'm a little bit lost and I don't really know which
> "namespace proposal" we are talking about (mine has only ".ns" files, no
> other notion of namespaces).
>
>
>
> Alain