[wg-camlp4] Time for a summary?

Wed Feb 6 18:11:12 GMT 2013

> Maybe a single post summarizing the main proposals and points of
> contention?

I will attempt a summary of the parts of the discussion that I remember. In 
no particular order the following has been discussed:

1. Current uses of camlp4

 Many examples of current uses of camlp4 were given. Gabriel put a 
 summary of them on a wiki:

  https://github.com/gasche/ocaml-syntax-extension-discussion/wiki/Use-Cases

 Anil also posted a list of all the extensions used by OPAM packages:

  https://github.com/avsm/opam-camlp4-analysis/wiki

 I don't think anyone has systematically gone through these yet, but I 
 think that it is important, before any concrete proposal is made by this 
 working group, to make clear which of these extensions we intend to 
 support and how we intend to support them.

2. New syntaxes needed for ppx

 There seems to be general agreement that, for ppx to replace camlp4 for 
 its most common uses, at least the following three kinds of syntax are 
 needed:

 a) A "template" syntax along the lines of (:longid expr). This syntax 
    could be used as an expression, a pattern, a structure item, etc.
    For example Sedlex could be:
       (:sedlex
          match buf with
          | number -> ..
          | letter, Star ('A'..'Z' | 'a'..'z' | digit) -> ..
       )
 b) An "attribute" syntax along the lines of (@ expr). This syntax allows 
    you to attach expressions to other expressions, patterns, etc. Unlike 
    the template syntax, the type-checker can silently ignore any of these 
    that don't get translated. For example Bisect could be:

      let f x =
        match List.map foo [x; a x; b x] with
        | [y1; y2; y3] -> tata
        | _ -> (@ Bisect.visit) assert false

      (@ Bisect.ignore)
      let unused = ()

    It has also been suggested that there might need to be different syntax 
    for "post-fix" and "pre-fix". Gabriel suggested that whether an 
    attribute in a particular syntactic position is pre-fix or post-fix be 
    decided on a case by case basis.

 c) A quotation syntax like {x{ .. }x} (where x is any operator symbol) 
    that could be used for quoting non-OCaml syntax. For example sedlex 
    with a more compact notation for regular expressions:

      (:sedlex
         match lexbuf with
         | {{ xml_letter+ }} -> ...
         | {{ "with" }} -> ...
         | ...
      )
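
 The difference between (a) and (b) can be illustrated with a minimal 
 sketch over a toy AST. The type "expr", the exception 
 "Unexpanded_template" and the function "strip" are hypothetical names, 
 and raising an exception for a leftover template is only one way the 
 compiler might reject it:

```ocaml
(* Toy expression type; a stand-in for the real Parsetree, not its API. *)
type expr =
  | Int of int
  | Add of expr * expr
  | Attr of string * expr      (* (@ name) e *)
  | Template of string * expr  (* (:name e)  *)

(* Hypothetical error for a template that no extension translated. *)
exception Unexpanded_template of string

(* What could happen after all extensions have run: leftover attributes
   are dropped silently, leftover templates are rejected. *)
let rec strip (e : expr) : expr =
  match e with
  | Int _ -> e
  | Add (a, b) -> Add (strip a, strip b)
  | Attr (_, body) -> strip body
  | Template (id, _) -> raise (Unexpanded_template id)
```

 Here an untranslated (@ Bisect.visit) simply disappears, while an 
 untranslated (:sedlex ..) stops compilation.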

 There has also been some agreement, and no strong objection, to including 
 some abbreviated forms of the above syntaxes. Some of the suggested forms 
 include:

 a) Template + quotation: 
     {:id x{ str }x}  ==  (:id {x{ str }x})

 b) Template + let: 
      let:id x = .. in ...  ==  (:id let x = .. in ... )

 c) Template + match:
      match:id exp with ..  ==  (:id match exp with .. )

 d) Type-conv style attribute:
      type t = ... with foo, bar( expr ) == type t = ... (@foo) (@bar expr)

 e) Anonymous template:
      (# expr)  ==  (:(*no id*) expr)
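
 The equalities in (b) and (c) amount to a small desugaring step. A 
 minimal sketch, assuming a toy AST in which patterns are plain strings; 
 the constructors "Let_id" and "Match_id" and the function "desugar" are 
 hypothetical names:

```ocaml
(* Toy AST: Let_id and Match_id model the abbreviated forms
   "let:id x = e1 in e2" and "match:id e with ..". *)
type expr =
  | Var of string
  | Let of string * expr * expr
  | Match of expr * (string * expr) list
  | Template of string * expr
  | Let_id of string * string * expr * expr
  | Match_id of string * expr * (string * expr) list

(* Rewrite every abbreviated form into the plain template form. *)
let rec desugar (e : expr) : expr =
  match e with
  | Var _ -> e
  | Let (x, e1, e2) -> Let (x, desugar e1, desugar e2)
  | Match (s, cases) -> Match (desugar s, desugar_cases cases)
  | Template (id, body) -> Template (id, desugar body)
  | Let_id (id, x, e1, e2) ->
      Template (id, Let (x, desugar e1, desugar e2))
  | Match_id (id, s, cases) ->
      Template (id, Match (desugar s, desugar_cases cases))

and desugar_cases cases =
  List.map (fun (p, body) -> (p, desugar body)) cases
```

 After this step an extension only ever has to recognise the (:id ..) 
 form.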

3. Anti-quotations

 There has been some discussion about how to support anti-quotations:

 - Fabrice suggested using a standardised format for anti-quotations, but 
   some people were against that because they use "$" in their camlp4 
   quotation extensions.

   I suggested that if we provided functions which took a predicate 
   function and then parsed an OCaml phrase up until the next *unnested* 
   location that made the predicate true, then we could support general 
   anti-quotations for ppx.

 - Hongbo suggested that AST lifting was necessary for supporting 
   anti-quotations, citing this paper:

     http://dl.acm.org/citation.cfm?id=1291211

   I suggested that it isn't necessarily needed, and that if people wanted 
   it then it could be provided using a type-conv style extension or 
   run-time types.
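
 The predicate-based idea from the first point could look roughly like 
 this. It is only a character-level sketch: "find_unnested" is a 
 hypothetical name, it tracks nesting of brackets and string literals 
 only, and a real version would go through the OCaml lexer so that 
 comments and character literals are handled too:

```ocaml
(* Hypothetical helper, not part of any existing library: scan [s] from
   [start] and return the index of the first character that satisfies
   [pred] while not nested inside brackets or a string literal. *)
let find_unnested (pred : char -> bool) (s : string) (start : int)
  : int option =
  let n = String.length s in
  let rec go i depth in_string =
    if i >= n then None
    else
      let c = s.[i] in
      if in_string then
        (* Skip escaped characters; a closing quote ends the literal. *)
        if c = '\\' then go (i + 2) depth true
        else go (i + 1) depth (c <> '"')
      else if depth = 0 && pred c then Some i
      else
        match c with
        | '"' -> go (i + 1) depth true
        | '(' | '[' | '{' -> go (i + 1) (depth + 1) false
        | ')' | ']' | '}' -> go (i + 1) (max 0 (depth - 1)) false
        | _ -> go (i + 1) depth false
  in
  go start 0 false
```

 Called with (fun c -> c = '$') on the text f (a $ b) + "$" $ rest, it 
 skips the "$" inside the parentheses and the one inside the string 
 literal, and stops at the last "$".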

4. On the use of "quotations"

 There has also been some discussion about when it is a good idea to use a 
 camlp4-style quasi-quotation:

 - Alain suggested that they should not be used for extensions that are 
   "mostly valid OCaml code". He pointed out that doing so causes you to 
   lose all editor and tool support.

   Hongbo disagreed, saying that in Fan extensions like sedlex would be 
   implemented as quasi-quotations and that editor support was fine.

   After this there seems to have been some agreement that it would be 
   better to implement such extensions as "templates" (see section 2 of 
   this post) rather than quasi-quotations.

 - Alain then went further, suggesting that quotations were not even 
   suitable for encoding foreign languages in OCaml.

   Gabriel disagreed, saying that for domain-specific foreign languages 
   (e.g. SQL) quotations allow you to have a domain expert maintain the 
   foreign code even if they don't know OCaml.

   Other people also said that they liked using quotations for foreign 
   syntax.

5. Alternatives to ppx

 There have been some proposals for alternatives and variations to using 
 ppx for extensions:

 a) Hongbo suggests that his Fan project is a better alternative to camlp4 
    than ppx:

      https://github.com/bobzhang/Fan

 b) Xavier Clerc suggested that attributes might be declared and typed, as 
    in Java. So before an attribute could be used:

      (@ MyAnnot {a=1, b="two"})

    "MyAnnot" would have to be declared with fields "a" (of type "int") and 
    "b" (of type "string"). He points out that this may protect the user 
    from some typos and other obvious errors.

    Alain suggested that this would be a much more complicated system, and 
    might be too restrictive for some uses of attributes.

 c) Sebastien Mondet suggested that run-time types or dyntype might be 
    sufficient without ppx for many extensions.

    Markus Mottl pointed out that there are runtime performance 
    implications for using those solutions.

 d) Jeremy suggested that rather than implementing extensions as AST 
    transformers that operate over the whole AST they might be implemented 
    as transformers only for the part of the AST that they were to operate 
    on. So an extension used like this:

      (:perform
          x <-- m;
          y <-- n;
          return (x y))

    would be implemented as a function with type:

      val perform : Parsetree.expression -> Parsetree.expression

    He suggests that this may help with scoping, safety and 
    compositionality.

    Alain pointed out that some legitimate uses of ppx don't work nicely 
    with the constraint that they can operate only on a marked fragment. He 
    also pointed out that handling extensions as those extensions are found 
    within the source forces a top-down expansion order, which is not 
    necessarily desirable.

 e) I suggested an extension to Jeremy's proposal. This involves putting 
    the expansion functions in their own kind of file (".mlq" - compiled 
    to ".cmq"), referring to them using a namespace mechanism, and then 
    having the compiler itself perform the expansion of the extensions.

    There are more details in my blog post:

      http://www.lpw25.net/2013/02/05/camlp4-alternative-part-2.html

    Since such a solution would take a while to implement, and since moving 
    an extension from ppx to it would be trivial, I proposed using ppx for 
    the short/medium term.

    I suggested that such a mechanism might improve tooling and make it 
    easier for average users to use extensions.

    Both Gabriel and Alain suggested that the use of namespaces might be 
    unnecessary and too heavyweight. Alain also questioned the need for 
    special ".mlq" files.
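
 Jeremy's proposal in (d) can be sketched over a toy AST. Here "expr" 
 stands in for Parsetree.expression, and the translation of 
 "x <-- m; rest" into "bind m (fun x -> rest)" is an assumption about 
 what a "perform" extension would do; it was not spelled out in the 
 thread:

```ocaml
(* Toy expression type standing in for Parsetree.expression. *)
type expr =
  | Var of string
  | App of expr * expr
  | Lam of string * expr
  | Seq of expr * expr     (* e1; e2  *)
  | Bind of string * expr  (* x <-- e *)

(* The expander only ever sees the fragment inside (:perform ..);
   it never traverses the rest of the program. *)
let rec perform (e : expr) : expr =
  match e with
  | Seq (Bind (x, m), rest) ->
      App (App (Var "bind", m), Lam (x, perform rest))
  | other -> other

(* The fragment from the example in (d). *)
let fragment =
  Seq (Bind ("x", Var "m"),
       Seq (Bind ("y", Var "n"),
            App (Var "return", App (Var "x", Var "y"))))

let expanded = perform fragment
```

 Because the function's input is just the marked fragment, two such 
 extensions cannot interfere with each other's code, which is the 
 scoping and compositionality benefit mentioned above.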

There are probably other parts of the discussion that I've forgotten. I'm 
sure someone will fill in any important details I've missed.