[ocaml-platform] An alternative proposal for namespaces
alain.frisch at lexifi.com
Thu Mar 21 06:42:11 GMT 2013
On 3/20/2013 10:07 PM, Leo White wrote:
> This is the strategy I have referred to as "regular ocamldep with
> generated search path files", it works just as well with simple
> namespaces. The only difference is that the build system generates the
> search path file to give to ocamldep, rather than making the user write
> it by hand.
So the "good" mode for using ocamldep would be to have the build system
generate a big search path file for each call to ocamldep?
- How does the build system generate this search path file? I guess
it has to know about the "simple namespaces" convention. Does it? And
does it know about the "-name" arguments scattered around in many
subdirectories? Concretely, I don't see how this would work under
omake, for instance.
- How would this work for non-namespaced modules? Can you represent
them with search path files as in your proposal? I thought that search
path files only defined namespaced names.
- If you assume that the build system can generate a search path file
to avoid calling ocamldep (and thus the compiler as well) with any -I
directory, what's the point of supporting -I directories any more in the
compiler and tools?
> As a side note, I think that "ocamldep -modules" should continue to be a
> purely syntactic version that ignores the search path. It is regular
> ocamldep that should be used for this purpose.
I propose that "ocamldep -modules" ignores the search path directories,
but knows about the definition of namespaces. Otherwise, you need to
invent a new convention to report possible namespaces together with each
>>> I'm not particularly worried about hypothetical build systems. If you
>>> want to implement such a build system then you should really add hooks
>>> into the OCaml compiler. This argument also assumes that catching
>>> "Sys.file_exists" is fine but catching "Sys.readdir" is impossible.
>> No, this argument does not assume that. But catching Sys.readdir is useless, since you don't know which files the
>> compiler is interested in. The tool would have to assume that the dependency is on the entire directory, which is of
>> course way too weak.
> The tool would only have to know what files it could produce, but it
> should already know that to answer Sys.file_exists queries.
That's not the way it works: Sys.file_exists succeeds not only for files
that the build system can produce but also for files which are already there.
The build system I was referring to worked like this:
- The project is specified by a list of build commands, each of which is
annotated with a list of target files (assumed to be created by the
command).
- The build is triggered by asking to build one target.
- To build one target X, the system picks a command which lists X as a
target.
- If the exact same command has already been run previously (this
information is kept in a persistent cache), the system checks that pre-
and post-conditions attached to that run are still valid in the current
state of the file system. If yes, the command does not need to be run
again.
- The recorded conditions are: the content (before execution) of any
file opened for reading and the content (after execution) of any file
opened for writing; the presence or absence (before execution) of any
file checked for existence (stat) during the command.
- When running a command, the tool records those conditions, and if the
command checks a file for existence or opens a file for reading, the tool
tries to build this file as a target, recursively.
The results of system calls are not modified; they are just intercepted
to allow recording and intermediate compilation of other files. The
tool makes the assumption that the behavior of the build commands
depends only on file existence/absence and content, not on extra meta-data
(such as mtimes, environment variables, or the system date).
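To make the cache check concrete, here is a minimal sketch in OCaml of
how the recorded conditions could be re-validated. This is not the code
of the actual tool: the names (FS, condition, holds, up_to_date) are
made up for illustration, the file system is modelled as an in-memory
map from paths to contents, and contents are compared directly where a
real tool would presumably compare digests.

```ocaml
(* Toy model of the file system: path -> content (an assumption of this
   sketch; a real tool would look at the actual file system). *)
module FS = Map.Make (String)

(* Conditions recorded while a command ran (hypothetical names). *)
type condition =
  | Read of string * string     (* path, content before execution *)
  | Written of string * string  (* path, content after execution *)
  | Existed of string           (* existence check succeeded *)
  | Absent of string            (* existence check failed *)

(* Does a recorded condition still hold in the current state? *)
let holds fs = function
  | Read (path, content) | Written (path, content) ->
      FS.find_opt path fs = Some content
  | Existed path -> FS.mem path fs
  | Absent path -> not (FS.mem path fs)

(* A cached run can be skipped only if every recorded condition holds. *)
let up_to_date fs conditions = List.for_all (holds fs) conditions

let () =
  let fs =
    FS.empty
    |> FS.add "a.ml" "let x = 1"
    |> FS.add "a.cmo" "<bytecode>"
  in
  let recorded =
    [ Read ("a.ml", "let x = 1");
      Absent "a.mli";
      Written ("a.cmo", "<bytecode>") ]
  in
  (* Nothing changed: the command need not be re-run. *)
  assert (up_to_date fs recorded);
  (* Creating a.mli invalidates the recorded absence check. *)
  assert (not (up_to_date (FS.add "a.mli" "val x : int" fs) recorded))
```

Note that recording the absence of a.mli is what makes the cache notice
a newly created interface file, which a dependency on file contents
alone would miss.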
A simpler variant of the system was specified with an ordered set of
commands to be executed in sequence, with the same cache behavior. The
benefit is that you don't need to tell the system about which files can
be generated by each command.
And as said, even if you don't believe that this is a viable approach for a
robust build system, the same approach can be used to add extra checks
to existing build systems, to verify that they don't miss dependencies.
> This assumes that a C compiler won't read a directory (say to cache its
> contents) in order to check for the existence of a file. It is not
> exactly the most robust basis for a build system, which is probably why
> it is only a hypothetical build system.
Well, it worked very well for ocaml + gcc. I could build non-trivial
code bases (CDuce + all its library dependencies), with extremely
precise dynamic dependency analysis and without having to use ocamldep.
> This kind of behaviour already exists in OCaml. Consider this piece of code:
> type t = Bar.t (* Bar only contains type definitions *)
> If you rename bar.mli to baz.mli but don't remove bar.cmi then it will
> continue to compile until you run "make clean".
Yes, and I see it as a problem. I would actually prefer a system where
one must pass explicitly to the compiler the list of files it can use,
but this is not possible because of backward compatibility. Since
namespaces change the way OCaml interacts with the file system anyway and
we have this nice notion of explicitly listing available units in
well-defined files (which can be used by other tools), I think it's a
good opportunity to fix the existing problems (partially).
Moreover, since the names of compiled units and of source files will be
less tightly coupled for namespaced files (because of "-name" or
"-namespace"), the chances of facing difficult-to-track error messages
will become higher.
> I really don't think that preventing a very unlikely scenario, which can
> already happen anyway, is a good reason to make namespaces significantly
> less convenient for the average user.
I don't think it will. The average user who creates a library to be
used by others will need to pick a good namespace name and list which
files constitute the library. At this point, writing an explicit
.mlpath file does not add any burden (the same source of information can
be used to define the content of the library).