[wg-camlp4] Request for feedback

Gabriel Scherer gabriel.scherer at gmail.com
Thu Mar 7 17:07:18 GMT 2013


> It is true that there could be some discontinuity: at some point, you can
> shoehorn your "DSL" in the syntax of OCaml expressions, but when you extend
> it, you might not find a nice way to continue doing so and you might be
> forced to switch to a very different embedding.

I was thinking of changes to the extension mechanism itself. There
have been sparse but regular calls for a more structured representation
of annotations (e.g. using a few literal categories to do some form of
structure-checking uniformly on all annotations), and if we found out
in the future that we really want that, it could be disruptive to
existing quotation usages. In the discussion you suggested a dedicated
{{...}} syntax for quotations; something like that seems less likely
to need to evolve in ways that forbid precious existing use cases.

(Don't jump into implementing {{...}} yet! There are equally compelling
reasons to limit the number of new concepts and appreciate the
expressivity of the core you have, so this is more thinking out loud
about possible arguments and counter-arguments.)
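
To make the "structured representation" worry concrete, here is a minimal
sketch of what such a restriction could look like. Nothing like it was
specified in this thread: the type `payload`, its constructors, and the
rules in `well_formed` are all hypothetical, chosen only for illustration.

```ocaml
(* Hypothetical payload shape: annotations limited to a few literal
   categories instead of arbitrary OCaml syntax. *)
type payload =
  | Int of int
  | Str of string
  | Ident of string
  | Tuple of payload list

(* Hypothetical uniform check, applied to every annotation whatever its
   name: identifiers must be non-empty, and a tuple needs at least two
   well-formed components. *)
let rec well_formed = function
  | Int _ | Str _ -> true
  | Ident id -> id <> ""
  | Tuple ps -> List.length ps >= 2 && List.for_all well_formed ps
```

Under a scheme of this kind, a payload that is a whole `match` expression
(as in the bitstring quotation) has no representation at all, which is the
sort of disruption to existing quotation usages I have in mind.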

> Do you mean:
>
> [%bit ... %bit]
> or:
> [%bit ... bit%]
> or:
> [%bit ... %]
>
> or something else?

I was thinking of [%bit ... %]. On the other hand, it's not clear we
would want to have two syntaxes, one for "short annotations" and one
for "long annotations", while one suffices semantically. Also, for
extensions that are really quotations, that is, of the form

  [%name
     ...long expr...
  ]

we could use the following pattern by convention

[%name(
   ... long expr ...
)]

or even

[%name begin
   ... long expr ...
end]


In a perfect world, I would dream of a long-syntax with symmetric
delimiters [% foo bar %], and a short-syntax *without delimiters*
reserved to the specific cases where the extension is attached to a
keyword: for%lwt, let%monad, etc., so we can have both convenient
short-scale marks and readable, structured, symmetric larger-scale
delimiters. However, non-delimited syntaxes such as for%lwt open a new
parsing can of worms, so it's reasonable to choose [% foo bar ] as an
awkward compromise that is not too painful at small scale, even if
maybe not optimal at large scale.


On Thu, Mar 7, 2013 at 5:55 PM, Alain Frisch <alain.frisch at lexifi.com> wrote:
> On 03/07/2013 05:23 PM, Gabriel Scherer wrote:
>>
>> I'm wondering how much I should trust them :p Have you actually tried
>> to parse the most daring ones with your parser? If you have a
>> testsuite, it may be interesting to add them to it.
>
>
> Yes, I've tried most of them.  Most of them work. A few are rejected,
> but only because more work is needed on the parser (e.g. for[@id] is not yet
> supported, even though it would have taken less time to add support than to
> write this sentence; and class expressions do not yet support extension
> nodes).
>
>
>> While I admire the cleverness of using the fact that attributes are
>> OCaml syntax to get the "quotation" use cases for free, it also gives
>> me mixed feelings:
>> - I'm not sure how robust it is to future design changes
>
>
> It is true that there could be some discontinuity: at some point, you can
> shoehorn your "DSL" in the syntax of OCaml expressions, but when you extend
> it, you might not find a nice way to continue doing so and you might be
> forced to switch to a very different embedding.  I don't believe that this
> will be a serious problem in practice, in particular because one can always
> use an extension node with a string literal content in order to use custom
> concrete syntax locally in most syntactic categories (or directly a string
> if the category accepts it, i.e. expressions and patterns).  As an example,
> let's have a look at the bitstring example:
>
> let bits = Bitstring.bitstring_of_file "/bin/ls" in
> [%bit match bits with
> | [ 0x7f, 8; "ELF", 24, string;  (* ELF magic number *)
>     e_ident, Mul(12,8), bitstring;    (* ELF identifier *)
>     e_type, 16, littleendian;    (* object file type *)
>     e_machine, 16, littleendian  (* architecture *)
>   ] ->
>   printf "This is an ELF binary, type %d, arch %d\n"
>     e_type e_machine
> ]
>
> One can imagine that in an early version, the size field could only be
> specified as an int literal.  When the author decides to let his users write
> "12*8", he can decide between:
>
>  - Creating some "DSL" to describe those formulas, with a syntax compatible
> with patterns, as I did in the example above (Mul(12, 8))
>
>   e_ident, Mul(12,8), bitstring;    (* ELF identifier *)
>
>  - Using a string to represent the formula (but then one must create a new
> parser... or call the OCaml parser!):
>
>   e_ident, "12 * 8", bitstring;
>
>  - Using an extension node to allow injecting an expression without
> requiring a parser:
>
>   e_ident, [%sz 12 * 8], bitstring;
>
>   (Here we don't care about the "id" of the extension node.  Is it worth
> creating a syntax for name-less extension nodes?)
>
>
>
>
>> - For multi-line quotations, I would appreciate a slightly more
>> explicit syntax (possibly the symmetric delimiter to help users see
>> where the quotation end).
>
>
> Do you mean:
>
> [%bit ... %bit]
> or:
> [%bit ... bit%]
> or:
> [%bit ... %]
>
> or something else?
>
> Note that one can also write:
>
> begin[@bit]
> ....
> end
>
> The only difference is that if we compile without the "bitstring expander",
> we will probably get a type-error which might be a little harder to
> understand.  I was thinking about supporting:
>
> begin[%bit]
> ....
> end
>
> as an equivalent to:
>
> [%bit ....]
>
> but this would be ambiguous (function application with [%bit] in function
> position).
>
>
> Alain
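
The first two options in the quoted message for the size field (a
pattern-compatible "DSL" such as Mul(12,8), or a string such as "12 * 8")
can be sketched in plain OCaml as follows. This is only an illustration of
the trade-off, not how the bitstring expander works: the type `size` and
the functions `eval_size` and `size_of_string` are hypothetical names, and
the string parser handles nothing beyond a literal or a single product.

```ocaml
(* Hypothetical size DSL whose constructors are also valid OCaml patterns. *)
type size =
  | Lit of int          (* a plain literal, e.g. 24 *)
  | Mul of int * int    (* a product, written Mul (12, 8) *)

(* Interpret a size description as a number of bits. *)
let eval_size = function
  | Lit n -> n
  | Mul (a, b) -> a * b

(* The string alternative needs its own parser. This one accepts only
   "n" or "a * b" and returns None on anything else. *)
let size_of_string s =
  match String.split_on_char '*' s with
  | [ n ] -> Option.map (fun n -> Lit n) (int_of_string_opt (String.trim n))
  | [ a; b ] ->
    (match int_of_string_opt (String.trim a),
           int_of_string_opt (String.trim b) with
     | Some a, Some b -> Some (Mul (a, b))
     | _ -> None)
  | _ -> None
```

Every new operator means another constructor in the first approach and
more parser code in the second, whereas the third option ([%sz 12 * 8])
reuses the OCaml expression parser directly.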

