[opam-devel] [ANN] opam-fmt 1.0

Thomas Gazagnaire thomas at gazagnaire.org
Mon Nov 23 12:11:22 GMT 2015


Thanks Gabriel, that's very useful!

For the various parsers, it might make sense to simply fork (or copy/paste) bits of opam parsers from various versions of opam and try to fix these parsers to:

- keep comments
- support some kind of alignments? 

Thomas


> On 22 Nov 2015, at 20:55, Gabriel Scherer <gabriel.scherer at gmail.com> wrote:
> 
>> - Any chance you could use cmdliner instead of Arg?  Fairly minor, but
>>  all the other plugins use it and it's nice to have the same behaviours
>>  for OPAM plugins where possible.
> 
> I'm fond of Daniel's design work, so I would gladly move to
> Cmdliner -- Arg was just what I could easily use "in anger" for
> a first try.
> 
>> - How does this behave on pre-1.2 files?  I think it's about time that
>>  we deprecate pre-1.2 so that we can have clean metadata standards
>>  about the new fields such as dev-repo.
> 
> opam-fmt updates older opam files to its own support version -- and
> refuses to work on newer files. Two things:
> 
> I think it would be a better design to have a family of scripts
> opam-fmt-1.0, opam-fmt-1.1, opam-fmt-1.2 etc. and a "mother script"
> opam-fmt that calls the right reformatter according to the file
> version¹. However, that requires changes in the packaging of opam-lib,
> so that the package for distinct versions can be installed
> simultaneously (they would be separate packages
> opam-lib-VER, and ocamlfind packages as well).
> 
> ¹: another option would be for opam-lib to also support pairs of parsing
> and printing functions for older format version specifically, but that
> is an invasive choice to make in a codebase. Right now there is a tiny
> bit of logic to know which fields are 1.0 or 1.1-specific, but this
> would be much more ambitious.
> 
> There are various warnings implemented in opam-lib that could be emitted
> during the processing of files by opam-fmt -- they may be already
> available depending on the OPAMDEBUG variable or something, but an
> explicit support in the interface could be nice. When I tested
> reformatting opam-repository, I observed that a large part of its opam
> files raise such warnings (so indeed there seems to be a metadata
> problem in the repository today).
> 
>> - Having a bot regularly go through and reformat the entire repository
>>  also seems quite legit.  The alternative would be to reformat at the
>>  merge point, but this would require a staging branch.
> 
> I'm not sure what you call "merge point"; my idea was to put the burden
> of reformatting onto users submitting PRs against the
> repository. (Regular reformatting are a sensible idea, but they run in
> the problem of loss of information, whether distributed manual
> reformatting keeps humans closer in the loop)
> 
>> - I would really like opam-fmt to be lossless, so preserving comments
>>  seems like an extension that we should implement.  Ideally everyone
>>  can just call it on their packages without thinking about it.
> 
> I have mixed feelings about trying to be lossless. At the very least,
> one should recognize that setting this as a design goal would impose
> a significant burden on the developers of the parsing/printing functions
> in opam-lib.
> 
> Some human choices (alignment of string fields for example) are rather
> difficult and fragile to recognize -- and they could complexify the
> codebase. Even for comments, right now you cannot tell to which
> configuration item an element is attached. There are several ways around
> this, which are interesting to consider but also involve a fair amount
> of work:
> 
> - You could use ocamldoc-like placement rules: "always after the
>  relevant field" (a first comment would be a file-global comment), or
>  "either before or after, but an empty line between the comment and
>  a non-relevant field"; this seems painful and not-that-easy to
>  implement.
> 
> - You could move to a docstring-like (or attribute-like) syntax where
>  comments are explicitly attached to an AST node; from a language
>  design point of view this would be my preference, but it may require
>  a change in concrete syntax.
> 
> - Finally, the choices you can make in this design space depend a lot on
>  whether reformatting will be performed by humans or by bots. If your
>  comments-attachment rules are obscure, humans have the opportunity to
>  reformat, see that they got them wrong, and reiterate. Bots will just
>  put stuff at the wrong place.
> 
> I think that the people that maintain this corner of opam today are
> those that will pay the greater cost if "lossless" becomes a design
> goal, so it should be their choice to make.
> 
> 
> In the meantime, it would be interesting to have a look at how opam
> files in the repository actually use comments. With
> 
>  find packages -name 'opam' | xargs grep --color=always " #"
> 
> I see 75 occurences of comments, 38 of which are just "TODO fixme". The
> 37 others seem rather interesting, below are a few representative examples:
> 
> packages/arakoon/arakoon.1.8.6/opam:
>>  "lwt" { = "2.4.8" }  # 2.4.9 had an incompatible API change
> 
> Having textual exaplanations for choice of bound is a reasonable
> use-case for attributes.
> 
> packages/camlp4/camlp4.4.01/opam:
>>  build: [] # dummy package
> 
> This could be replaced by a dedicated note/comment field.
> 
> packages/frama-c/frama-c.20150201/opam:
>>  "lablgtk" { >= "2.18.2" } #for ocaml >= 4.02.1
> 
> I don't understand the semantics of this one.
> 
> packages/git/git.1.6.0/opam:
>>  depopts: [
>>    # --enable-mirage
>>    "mirage-types-lwt"
>>    "mirage-http"
>>    "mirage-flow"
>>    "channel"
>>    # --enable-unix
>>    "cohttp"
>>    "conduit"
>>    "base-unix"
>>  ]
> 
> This usage is very interesting, it seems to call for a hierarchy inside
> the "depopts" list (and "dependencies" as well, I suppose), with
> annotations on sub-groups of dependencies.
> 
> packages/gsl/gsl.1.18.2/opam:
>>  depends: [
>>    "base-bigarray"
>>    "camlp4"
>>    "ocamlfind" {>= "1.3.1"}
>>    # Included from _opam file
>>    "conf-gsl"
>>  ]
> 
> I don't know what this comment means.
> 
> packages/lz4/lz4.1.0.0/opam:
>> depexts: [
>>  [["debian"] ["liblz4-dev"]]
>>  # [["ubuntu"] ["liblz4-dev"]] reenable when CI updates its Ubuntu
>>  [["source"] ["https://.../install.sh"]]
>> ]
> 
> Again, this would require annotations.
> 
> packages/ppx_deriving/ppx_deriving.0.3/opam:
>> build: [
>>  # If there is no native dynlink, we can't use native builds
>>  "ocaml" "pkg/build.ml" "native=true"
>>                         "native-dynlink=true"
>> ]
> 
> packages/frama-c-e-acsl/frama-c-e-acsl.0.5/opam:
>> build: [
>>  ["ocaml" "run_autoconf_if_needed.ml"] #when used in pinned mode the configure *can* not yet be generated
>>  ["./configure" "--prefix" prefix]
>>  [make]
>> ]
> 
> packages/clangml/clangml.0.5.2/opam:
>> depexts: [
>>  [["debian"] ["libboost-dev" "llvm-3.4-dev" "clang-3.4" "libclang-3.4-dev" "binutils-dev"]]
>>  [["ubuntu"] ["libboost-dev" "llvm-3.4-dev" "clang-3.4" "libclang-3.4-dev" "binutils-dev"]]
>>  [["gentoo"] ["dev-libs/boost" "sys-devel/llvm-3.4.1-r2" "sys-devel/clang-3.4.0-r100" "sys-devel/binutils"]]
>>  # archlinux has no package providing llvm and clang 3.4.1
>>  [["archlinux"] ["boost" "binutils"]]
>> ]
> 
> packages/mtime/mtime.0.8.1/opam:
>> depends: [ "ocamlfind"
>>           "js_of_ocaml" # FIXME should become a deptopt
>> ]
> 
> On Sun, Nov 22, 2015 at 8:09 PM, Anil Madhavapeddy <anil at recoil.org> wrote:
>> Thanks for this Gabriel! Quick notes:
>> 
>> - I would really like opam-fmt to be lossless, so preserving comments
>>  seems like an extension that we should implement.  Ideally everyone
>>  can just call it on their packages without thinking about it.
>> 
>> - Having a bot regularly go through and reformat the entire repository
>>  also seems quite legit.  The alternative would be to reformat at the
>>  merge point, but this would require a staging branch.
>> 
>> - Any chance you could use cmdliner instead of Arg?  Fairly minor, but
>>  all the other plugins use it and it's nice to have the same behaviours
>>  for OPAM plugins where possible.
>> 
>> - How does this behave on pre-1.2 files?  I think it's about time that
>>  we deprecate pre-1.2 so that we can have clean metadata standards
>>  about the new fields such as dev-repo.
>> 
>> regards,
>> Anil
>> 
>>> On 21 Nov 2015, at 21:53, Gabriel Scherer <gabriel.scherer at gmail.com> wrote:
>>> 
>>> Hi opam-devel,
>>> 
>>> As part of the discussion in
>>> 
>>> bulk addition of 'ocamlbuild {build}' dependencies
>>> https://github.com/ocaml/opam-repository/pull/5140
>>> 
>>> it became apparent that performing bulk updates on opam-repository is
>>> made harder by the fact that the parse-print roundtrip does not
>>> preserve human-formatted opam files. For my proposed patch I tried to
>>> separate the reformatting of opam file (to follow the opam-lib printer
>>> convention) from the actual changes in two separate commits, but that
>>> means more work for script authors, and also creates patches that are
>>> harder to review. (If (re)formatting was controlled by the maintainer
>>> of the OPAM packages instead of authors of bulk updates, we would be
>>> more confident in its correctness.)
>>> 
>>> In order to move that discussion forward (how to maintain opam
>>> metadata in a way that is easy for both human and scripts to work
>>> with?), I propose the opam-fmt script that can be found at
>>> https://github.com/gasche/opam-fmt/
>>> 
>>> I wrote it in the last few days and there are probably some rough
>>> edges. Once I feel that it should work, I may try to package it on the
>>> opam-repo (in the meantime, clone then pin).
>>> 
>>> This suggests one possible way forward: publicize opam-fmt, encourage
>>> users to reformat their opam files using it, adapt the opam-repository
>>> QA to call `opam fmt --check` on opam files proposed in PR to enforce
>>> it, and after some time (once we trust it works as expected thanks to
>>> the human guinea pigs) reformat all older opam files to get a perfect
>>> round-trip on all files of the repository.
>>> It is not clear to me that this is the best way forward. (For example
>>> it means that, in the current state of the opam file parsing/printing
>>> code, comments in opam files would always be erased by reformatting.
>>> Should we remove comments from the valid syntax of opam files, or
>>> attach them to configuration lines to re-print them correctly later,
>>> or maybe refuse to work on files with comments?) Opam developers and
>>> repository maintainers may decide that the cost of caring about
>>> reformatting outweigh the benefits in terms of scriptability.
>>> 
>>> What do you think?
>>> _______________________________________________
>>> opam-devel mailing list
>>> opam-devel at lists.ocaml.org
>>> http://lists.ocaml.org/listinfo/opam-devel
>>> 
>> 
> _______________________________________________
> opam-devel mailing list
> opam-devel at lists.ocaml.org
> http://lists.ocaml.org/listinfo/opam-devel



More information about the opam-devel mailing list