[opam-devel] [ANN] opam-fmt 1.0

Gabriel Scherer gabriel.scherer at gmail.com
Sun Nov 22 20:55:23 GMT 2015


> - Any chance you could use cmdliner instead of Arg?  Fairly minor, but
>   all the other plugins use it and it's nice to have the same behaviours
>   for OPAM plugins where possible.

I'm fond of Daniel's design work, so I would gladly move to
Cmdliner -- Arg was just what I could easily use "in anger" for
a first try.

> - How does this behave on pre-1.2 files?  I think it's about time that
>   we deprecate pre-1.2 so that we can have clean metadata standards
>   about the new fields such as dev-repo.

opam-fmt updates older opam files to its own support version -- and
refuses to work on newer files. Two things:

I think it would be a better design to have a family of scripts
opam-fmt-1.0, opam-fmt-1.1, opam-fmt-1.2 etc. and a "mother script"
opam-fmt that calls the right reformatter according to the file
version¹. However, that requires changes in the packaging of opam-lib,
so that the package for distinct versions can be installed
simultaneously (they would be separate packages
opam-lib-VER, and ocamlfind packages as well).

¹: another option would be for opam-lib to also support pairs of parsing
and printing functions for older format version specifically, but that
is an invasive choice to make in a codebase. Right now there is a tiny
bit of logic to know which fields are 1.0 or 1.1-specific, but this
would be much more ambitious.

There are various warnings implemented in opam-lib that could be emitted
during the processing of files by opam-fmt -- they may be already
available depending on the OPAMDEBUG variable or something, but an
explicit support in the interface could be nice. When I tested
reformatting opam-repository, I observed that a large part of its opam
files raise such warnings (so indeed there seems to be a metadata
problem in the repository today).

> - Having a bot regularly go through and reformat the entire repository
>   also seems quite legit.  The alternative would be to reformat at the
>   merge point, but this would require a staging branch.

I'm not sure what you call "merge point"; my idea was to put the burden
of reformatting onto users submitting PRs against the
repository. (Regular reformatting are a sensible idea, but they run in
the problem of loss of information, whether distributed manual
reformatting keeps humans closer in the loop)

> - I would really like opam-fmt to be lossless, so preserving comments
>   seems like an extension that we should implement.  Ideally everyone
>   can just call it on their packages without thinking about it.

I have mixed feelings about trying to be lossless. At the very least,
one should recognize that setting this as a design goal would impose
a significant burden on the developers of the parsing/printing functions
in opam-lib.

Some human choices (alignment of string fields for example) are rather
difficult and fragile to recognize -- and they could complexify the
codebase. Even for comments, right now you cannot tell to which
configuration item an element is attached. There are several ways around
this, which are interesting to consider but also involve a fair amount
of work:

- You could use ocamldoc-like placement rules: "always after the
  relevant field" (a first comment would be a file-global comment), or
  "either before or after, but an empty line between the comment and
  a non-relevant field"; this seems painful and not-that-easy to
  implement.

- You could move to a docstring-like (or attribute-like) syntax where
  comments are explicitly attached to an AST node; from a language
  design point of view this would be my preference, but it may require
  a change in concrete syntax.

- Finally, the choices you can make in this design space depend a lot on
  whether reformatting will be performed by humans or by bots. If your
  comments-attachment rules are obscure, humans have the opportunity to
  reformat, see that they got them wrong, and reiterate. Bots will just
  put stuff at the wrong place.

I think that the people that maintain this corner of opam today are
those that will pay the greater cost if "lossless" becomes a design
goal, so it should be their choice to make.


In the meantime, it would be interesting to have a look at how opam
files in the repository actually use comments. With

  find packages -name 'opam' | xargs grep --color=always " #"

I see 75 occurences of comments, 38 of which are just "TODO fixme". The
37 others seem rather interesting, below are a few representative examples:

packages/arakoon/arakoon.1.8.6/opam:
>   "lwt" { = "2.4.8" }  # 2.4.9 had an incompatible API change

Having textual exaplanations for choice of bound is a reasonable
use-case for attributes.

packages/camlp4/camlp4.4.01/opam:
>   build: [] # dummy package

This could be replaced by a dedicated note/comment field.

packages/frama-c/frama-c.20150201/opam:
>   "lablgtk" { >= "2.18.2" } #for ocaml >= 4.02.1

I don't understand the semantics of this one.

packages/git/git.1.6.0/opam:
>   depopts: [
>     # --enable-mirage
>     "mirage-types-lwt"
>     "mirage-http"
>     "mirage-flow"
>     "channel"
>     # --enable-unix
>     "cohttp"
>     "conduit"
>     "base-unix"
>   ]

This usage is very interesting, it seems to call for a hierarchy inside
the "depopts" list (and "dependencies" as well, I suppose), with
annotations on sub-groups of dependencies.

packages/gsl/gsl.1.18.2/opam:
>   depends: [
>     "base-bigarray"
>     "camlp4"
>     "ocamlfind" {>= "1.3.1"}
>     # Included from _opam file
>     "conf-gsl"
>   ]

I don't know what this comment means.

packages/lz4/lz4.1.0.0/opam:
> depexts: [
>   [["debian"] ["liblz4-dev"]]
>   # [["ubuntu"] ["liblz4-dev"]] reenable when CI updates its Ubuntu
>   [["source"] ["https://.../install.sh"]]
> ]

Again, this would require annotations.

packages/ppx_deriving/ppx_deriving.0.3/opam:
> build: [
>   # If there is no native dynlink, we can't use native builds
>   "ocaml" "pkg/build.ml" "native=true"
>                          "native-dynlink=true"
> ]

packages/frama-c-e-acsl/frama-c-e-acsl.0.5/opam:
> build: [
>   ["ocaml" "run_autoconf_if_needed.ml"] #when used in pinned mode the configure *can* not yet be generated
>   ["./configure" "--prefix" prefix]
>   [make]
> ]

packages/clangml/clangml.0.5.2/opam:
> depexts: [
>   [["debian"] ["libboost-dev" "llvm-3.4-dev" "clang-3.4" "libclang-3.4-dev" "binutils-dev"]]
>   [["ubuntu"] ["libboost-dev" "llvm-3.4-dev" "clang-3.4" "libclang-3.4-dev" "binutils-dev"]]
>   [["gentoo"] ["dev-libs/boost" "sys-devel/llvm-3.4.1-r2" "sys-devel/clang-3.4.0-r100" "sys-devel/binutils"]]
>   # archlinux has no package providing llvm and clang 3.4.1
>   [["archlinux"] ["boost" "binutils"]]
> ]

packages/mtime/mtime.0.8.1/opam:
> depends: [ "ocamlfind"
>            "js_of_ocaml" # FIXME should become a deptopt
> ]

On Sun, Nov 22, 2015 at 8:09 PM, Anil Madhavapeddy <anil at recoil.org> wrote:
> Thanks for this Gabriel! Quick notes:
>
> - I would really like opam-fmt to be lossless, so preserving comments
>   seems like an extension that we should implement.  Ideally everyone
>   can just call it on their packages without thinking about it.
>
> - Having a bot regularly go through and reformat the entire repository
>   also seems quite legit.  The alternative would be to reformat at the
>   merge point, but this would require a staging branch.
>
> - Any chance you could use cmdliner instead of Arg?  Fairly minor, but
>   all the other plugins use it and it's nice to have the same behaviours
>   for OPAM plugins where possible.
>
> - How does this behave on pre-1.2 files?  I think it's about time that
>   we deprecate pre-1.2 so that we can have clean metadata standards
>   about the new fields such as dev-repo.
>
> regards,
> Anil
>
>> On 21 Nov 2015, at 21:53, Gabriel Scherer <gabriel.scherer at gmail.com> wrote:
>>
>> Hi opam-devel,
>>
>> As part of the discussion in
>>
>>  bulk addition of 'ocamlbuild {build}' dependencies
>>  https://github.com/ocaml/opam-repository/pull/5140
>>
>> it became apparent that performing bulk updates on opam-repository is
>> made harder by the fact that the parse-print roundtrip does not
>> preserve human-formatted opam files. For my proposed patch I tried to
>> separate the reformatting of opam file (to follow the opam-lib printer
>> convention) from the actual changes in two separate commits, but that
>> means more work for script authors, and also creates patches that are
>> harder to review. (If (re)formatting was controlled by the maintainer
>> of the OPAM packages instead of authors of bulk updates, we would be
>> more confident in its correctness.)
>>
>> In order to move that discussion forward (how to maintain opam
>> metadata in a way that is easy for both human and scripts to work
>> with?), I propose the opam-fmt script that can be found at
>>  https://github.com/gasche/opam-fmt/
>>
>> I wrote it in the last few days and there are probably some rough
>> edges. Once I feel that it should work, I may try to package it on the
>> opam-repo (in the meantime, clone then pin).
>>
>> This suggests one possible way forward: publicize opam-fmt, encourage
>> users to reformat their opam files using it, adapt the opam-repository
>> QA to call `opam fmt --check` on opam files proposed in PR to enforce
>> it, and after some time (once we trust it works as expected thanks to
>> the human guinea pigs) reformat all older opam files to get a perfect
>> round-trip on all files of the repository.
>> It is not clear to me that this is the best way forward. (For example
>> it means that, in the current state of the opam file parsing/printing
>> code, comments in opam files would always be erased by reformatting.
>> Should we remove comments from the valid syntax of opam files, or
>> attach them to configuration lines to re-print them correctly later,
>> or maybe refuse to work on files with comments?) Opam developers and
>> repository maintainers may decide that the cost of caring about
>> reformatting outweigh the benefits in terms of scriptability.
>>
>> What do you think?
>> _______________________________________________
>> opam-devel mailing list
>> opam-devel at lists.ocaml.org
>> http://lists.ocaml.org/listinfo/opam-devel
>>
>


More information about the opam-devel mailing list