[opam-devel] [ANN] opam-fmt 1.0
    Gabriel Scherer 
    gabriel.scherer at gmail.com
       
    Mon Nov 23 12:18:33 GMT 2015
    
    
  
One idea I just had when re-reading the scavenged comments is to just
consider comment lines as independent items. If you add ordering
information to the internal opam representation to keep the items in the
order, then I suspect that all the examples above would be reformatted in a
way that is acceptable to the original author. One final bit of
sophistication would be to store (in the comment items) a boolean for
whether they started on their own line, or are to be inserted at the end of
the previous line.
Of course this fake structure is less true to the real document structure
than properly placed attribute nodes would be. It seems easier to implement
(and thus a decent compromise to make), but then maybe not: keeping an
ordering may be non-trivial for data whose current internal representation
is not a set of things (for example dependency formulas are actually full
logic formulas; but then if we keep to Conjunctive Normal Forms they can
have a list-like representation).
On Mon, Nov 23, 2015 at 1:11 PM, Thomas Gazagnaire <thomas at gazagnaire.org>
wrote:
> Thanks Gabriel, that's very useful!
>
> For the various parsers, it might make sense to simply fork (or
> copy/paste) bits of opam parsers from various versions of opam and try to
> fix these parsers to:
>
> - keep comments
> - support some kind of alignments?
>
> Thomas
>
>
> > On 22 Nov 2015, at 20:55, Gabriel Scherer <gabriel.scherer at gmail.com>
> wrote:
> >
> >> - Any chance you could use cmdliner instead of Arg?  Fairly minor, but
> >>  all the other plugins use it and it's nice to have the same behaviours
> >>  for OPAM plugins where possible.
> >
> > I'm fond of Daniel's design work, so I would gladly move to
> > Cmdliner -- Arg was just what I could easily use "in anger" for
> > a first try.
> >
> >> - How does this behave on pre-1.2 files?  I think it's about time that
> >>  we deprecate pre-1.2 so that we can have clean metadata standards
> >>  about the new fields such as dev-repo.
> >
> > opam-fmt updates older opam files to its own support version -- and
> > refuses to work on newer files. Two things:
> >
> > I think it would be a better design to have a family of scripts
> > opam-fmt-1.0, opam-fmt-1.1, opam-fmt-1.2 etc. and a "mother script"
> > opam-fmt that calls the right reformatter according to the file
> > version¹. However, that requires changes in the packaging of opam-lib,
> > so that the package for distinct versions can be installed
> > simultaneously (they would be separate packages
> > opam-lib-VER, and ocamlfind packages as well).
> >
> > ¹: another option would be for opam-lib to also support pairs of parsing
> > and printing functions for older format version specifically, but that
> > is an invasive choice to make in a codebase. Right now there is a tiny
> > bit of logic to know which fields are 1.0 or 1.1-specific, but this
> > would be much more ambitious.
> >
> > There are various warnings implemented in opam-lib that could be emitted
> > during the processing of files by opam-fmt -- they may be already
> > available depending on the OPAMDEBUG variable or something, but an
> > explicit support in the interface could be nice. When I tested
> > reformatting opam-repository, I observed that a large part of its opam
> > files raise such warnings (so indeed there seems to be a metadata
> > problem in the repository today).
> >
> >> - Having a bot regularly go through and reformat the entire repository
> >>  also seems quite legit.  The alternative would be to reformat at the
> >>  merge point, but this would require a staging branch.
> >
> > I'm not sure what you call "merge point"; my idea was to put the burden
> > of reformatting onto users submitting PRs against the
> > repository. (Regular reformatting are a sensible idea, but they run in
> > the problem of loss of information, whether distributed manual
> > reformatting keeps humans closer in the loop)
> >
> >> - I would really like opam-fmt to be lossless, so preserving comments
> >>  seems like an extension that we should implement.  Ideally everyone
> >>  can just call it on their packages without thinking about it.
> >
> > I have mixed feelings about trying to be lossless. At the very least,
> > one should recognize that setting this as a design goal would impose
> > a significant burden on the developers of the parsing/printing functions
> > in opam-lib.
> >
> > Some human choices (alignment of string fields for example) are rather
> > difficult and fragile to recognize -- and they could complexify the
> > codebase. Even for comments, right now you cannot tell to which
> > configuration item an element is attached. There are several ways around
> > this, which are interesting to consider but also involve a fair amount
> > of work:
> >
> > - You could use ocamldoc-like placement rules: "always after the
> >  relevant field" (a first comment would be a file-global comment), or
> >  "either before or after, but an empty line between the comment and
> >  a non-relevant field"; this seems painful and not-that-easy to
> >  implement.
> >
> > - You could move to a docstring-like (or attribute-like) syntax where
> >  comments are explicitly attached to an AST node; from a language
> >  design point of view this would be my preference, but it may require
> >  a change in concrete syntax.
> >
> > - Finally, the choices you can make in this design space depend a lot on
> >  whether reformatting will be performed by humans or by bots. If your
> >  comments-attachment rules are obscure, humans have the opportunity to
> >  reformat, see that they got them wrong, and reiterate. Bots will just
> >  put stuff at the wrong place.
> >
> > I think that the people that maintain this corner of opam today are
> > those that will pay the greater cost if "lossless" becomes a design
> > goal, so it should be their choice to make.
> >
> >
> > In the meantime, it would be interesting to have a look at how opam
> > files in the repository actually use comments. With
> >
> >  find packages -name 'opam' | xargs grep --color=always " #"
> >
> > I see 75 occurences of comments, 38 of which are just "TODO fixme". The
> > 37 others seem rather interesting, below are a few representative
> examples:
> >
> > packages/arakoon/arakoon.1.8.6/opam:
> >>  "lwt" { = "2.4.8" }  # 2.4.9 had an incompatible API change
> >
> > Having textual exaplanations for choice of bound is a reasonable
> > use-case for attributes.
> >
> > packages/camlp4/camlp4.4.01/opam:
> >>  build: [] # dummy package
> >
> > This could be replaced by a dedicated note/comment field.
> >
> > packages/frama-c/frama-c.20150201/opam:
> >>  "lablgtk" { >= "2.18.2" } #for ocaml >= 4.02.1
> >
> > I don't understand the semantics of this one.
> >
> > packages/git/git.1.6.0/opam:
> >>  depopts: [
> >>    # --enable-mirage
> >>    "mirage-types-lwt"
> >>    "mirage-http"
> >>    "mirage-flow"
> >>    "channel"
> >>    # --enable-unix
> >>    "cohttp"
> >>    "conduit"
> >>    "base-unix"
> >>  ]
> >
> > This usage is very interesting, it seems to call for a hierarchy inside
> > the "depopts" list (and "dependencies" as well, I suppose), with
> > annotations on sub-groups of dependencies.
> >
> > packages/gsl/gsl.1.18.2/opam:
> >>  depends: [
> >>    "base-bigarray"
> >>    "camlp4"
> >>    "ocamlfind" {>= "1.3.1"}
> >>    # Included from _opam file
> >>    "conf-gsl"
> >>  ]
> >
> > I don't know what this comment means.
> >
> > packages/lz4/lz4.1.0.0/opam:
> >> depexts: [
> >>  [["debian"] ["liblz4-dev"]]
> >>  # [["ubuntu"] ["liblz4-dev"]] reenable when CI updates its Ubuntu
> >>  [["source"] ["https://.../install.sh"]]
> >> ]
> >
> > Again, this would require annotations.
> >
> > packages/ppx_deriving/ppx_deriving.0.3/opam:
> >> build: [
> >>  # If there is no native dynlink, we can't use native builds
> >>  "ocaml" "pkg/build.ml" "native=true"
> >>                         "native-dynlink=true"
> >> ]
> >
> > packages/frama-c-e-acsl/frama-c-e-acsl.0.5/opam:
> >> build: [
> >>  ["ocaml" "run_autoconf_if_needed.ml"] #when used in pinned mode the
> configure *can* not yet be generated
> >>  ["./configure" "--prefix" prefix]
> >>  [make]
> >> ]
> >
> > packages/clangml/clangml.0.5.2/opam:
> >> depexts: [
> >>  [["debian"] ["libboost-dev" "llvm-3.4-dev" "clang-3.4"
> "libclang-3.4-dev" "binutils-dev"]]
> >>  [["ubuntu"] ["libboost-dev" "llvm-3.4-dev" "clang-3.4"
> "libclang-3.4-dev" "binutils-dev"]]
> >>  [["gentoo"] ["dev-libs/boost" "sys-devel/llvm-3.4.1-r2"
> "sys-devel/clang-3.4.0-r100" "sys-devel/binutils"]]
> >>  # archlinux has no package providing llvm and clang 3.4.1
> >>  [["archlinux"] ["boost" "binutils"]]
> >> ]
> >
> > packages/mtime/mtime.0.8.1/opam:
> >> depends: [ "ocamlfind"
> >>           "js_of_ocaml" # FIXME should become a deptopt
> >> ]
> >
> > On Sun, Nov 22, 2015 at 8:09 PM, Anil Madhavapeddy <anil at recoil.org>
> wrote:
> >> Thanks for this Gabriel! Quick notes:
> >>
> >> - I would really like opam-fmt to be lossless, so preserving comments
> >>  seems like an extension that we should implement.  Ideally everyone
> >>  can just call it on their packages without thinking about it.
> >>
> >> - Having a bot regularly go through and reformat the entire repository
> >>  also seems quite legit.  The alternative would be to reformat at the
> >>  merge point, but this would require a staging branch.
> >>
> >> - Any chance you could use cmdliner instead of Arg?  Fairly minor, but
> >>  all the other plugins use it and it's nice to have the same behaviours
> >>  for OPAM plugins where possible.
> >>
> >> - How does this behave on pre-1.2 files?  I think it's about time that
> >>  we deprecate pre-1.2 so that we can have clean metadata standards
> >>  about the new fields such as dev-repo.
> >>
> >> regards,
> >> Anil
> >>
> >>> On 21 Nov 2015, at 21:53, Gabriel Scherer <gabriel.scherer at gmail.com>
> wrote:
> >>>
> >>> Hi opam-devel,
> >>>
> >>> As part of the discussion in
> >>>
> >>> bulk addition of 'ocamlbuild {build}' dependencies
> >>> https://github.com/ocaml/opam-repository/pull/5140
> >>>
> >>> it became apparent that performing bulk updates on opam-repository is
> >>> made harder by the fact that the parse-print roundtrip does not
> >>> preserve human-formatted opam files. For my proposed patch I tried to
> >>> separate the reformatting of opam file (to follow the opam-lib printer
> >>> convention) from the actual changes in two separate commits, but that
> >>> means more work for script authors, and also creates patches that are
> >>> harder to review. (If (re)formatting was controlled by the maintainer
> >>> of the OPAM packages instead of authors of bulk updates, we would be
> >>> more confident in its correctness.)
> >>>
> >>> In order to move that discussion forward (how to maintain opam
> >>> metadata in a way that is easy for both human and scripts to work
> >>> with?), I propose the opam-fmt script that can be found at
> >>> https://github.com/gasche/opam-fmt/
> >>>
> >>> I wrote it in the last few days and there are probably some rough
> >>> edges. Once I feel that it should work, I may try to package it on the
> >>> opam-repo (in the meantime, clone then pin).
> >>>
> >>> This suggests one possible way forward: publicize opam-fmt, encourage
> >>> users to reformat their opam files using it, adapt the opam-repository
> >>> QA to call `opam fmt --check` on opam files proposed in PR to enforce
> >>> it, and after some time (once we trust it works as expected thanks to
> >>> the human guinea pigs) reformat all older opam files to get a perfect
> >>> round-trip on all files of the repository.
> >>> It is not clear to me that this is the best way forward. (For example
> >>> it means that, in the current state of the opam file parsing/printing
> >>> code, comments in opam files would always be erased by reformatting.
> >>> Should we remove comments from the valid syntax of opam files, or
> >>> attach them to configuration lines to re-print them correctly later,
> >>> or maybe refuse to work on files with comments?) Opam developers and
> >>> repository maintainers may decide that the cost of caring about
> >>> reformatting outweigh the benefits in terms of scriptability.
> >>>
> >>> What do you think?
> >>> _______________________________________________
> >>> opam-devel mailing list
> >>> opam-devel at lists.ocaml.org
> >>> http://lists.ocaml.org/listinfo/opam-devel
> >>>
> >>
> > _______________________________________________
> > opam-devel mailing list
> > opam-devel at lists.ocaml.org
> > http://lists.ocaml.org/listinfo/opam-devel
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ocaml.org/pipermail/opam-devel/attachments/20151123/9154db40/attachment-0001.html>
    
    
More information about the opam-devel
mailing list