[wg-camlp4] Meta Programming from the view of the implementation

Wed Jan 30 09:12:03 GMT 2013

On 01/30/2013 03:34 AM, Hongbo Zhang wrote:
> let rec token enc =  {:lex|
>     "<utf8>" -> begin enc := Ulexing.Utf8; token enc lexbuf end
>    | "<latin1>" -> begin enc := Ulexing.Latin1; token enc lexbuf end
>    | xml_letter+ -> Printf.sprintf "word(%s)" (Ulexing.utf8_lexeme lexbuf)
>    | number -> "number"
>    | eof -> exit 0
>    | [1234-1246] -> "bla"
>    | "(" ->  begin
>        Ulexing.rollback lexbuf; (* Puts the lexeme back into the buffer *)
>        {| "(" [^ '(']* ")" -> Ulexing.utf8_lexeme lexbuf |} lexbuf
>        (* Note the use of an inline lexer *)
>    end
>    | "(*" -> begin comment lexbuf; "comment" end
>    | ' ' -> "whitespace"
>    | _ -> "???" |}
> and comment = {:lex|
>     "*)" -> ()
>    | eof -> failwith "comment"
>    | _ -> let _lexeme = Ulexing.lexeme lexbuf in
>      comment lexbuf |}

This looks very bad to me:

1. You lose all support from your editor (indentation, coloring,
parentheses matching).  Am I the only one who finds this really
problematic?  With quotations, my emacs looks like notepad...
Indentation and coloring do find typos on the fly; automatic indentation
makes it easy to copy/paste code from one context to another.

2. Quotations "with OCaml code in them" do not combine well with other
AST rewriters.  If you have a -ppx filter implementing, say, macro
expansion (in patterns and expressions), would you apply it before or
after the one expanding the {:lex| ... |} quotations?  Probably both.
(And you won't be able to benefit from macros on patterns for the
"regexps".)

3. How do you implement the expander?  Somehow, you need to parse the 
content of the quotation (stored as a string in the AST), which involves 
non-trivial stuff, like a parser being able to parse OCaml code mixed
with something else.  Personally, I don't know how to implement the 
quotation expander with the parsing technologies I'm familiar with.
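
As a rough illustration of points 2 and 3, here is a toy sketch.  It
does not use the real Parsetree: the expr type, the FOO macro and both
expanders are made up for the example.  The only assumption taken from
the text is that a quotation keeps its contents as an unparsed string,
so a rewriter that runs before the quotation expander has nothing to
traverse inside it.

```ocaml
(* Toy AST (an assumption, not the real Parsetree): a quotation carries
   its body as an unparsed string. *)
type expr =
  | Var of string
  | App of expr * expr
  | Quotation of string * string  (* quotation name, raw contents *)

(* Hypothetical macro expander: rewrites the variable FOO into bar.
   It cannot look inside a Quotation, whose body is just a string. *)
let rec expand_macros = function
  | Var "FOO" -> Var "bar"
  | Var _ as e -> e
  | App (f, x) -> App (expand_macros f, expand_macros x)
  | Quotation _ as q -> q

(* Hypothetical quotation expander: a stand-in for a real parser, which
   here simply treats the trimmed body as a single variable name. *)
let rec expand_quotations = function
  | Var _ as e -> e
  | App (f, x) -> App (expand_quotations f, expand_quotations x)
  | Quotation (_, body) -> Var (String.trim body)

let input = App (Var "FOO", Quotation ("lex", " FOO "))

(* Macros first: the FOO hidden in the quotation is never expanded. *)
let macros_first = expand_quotations (expand_macros input)

(* Quotations first: both occurrences of FOO are expanded. *)
let quotations_first = expand_macros (expand_quotations input)
```

The two orders give different trees for the same input, so the order
of the two passes is observable.  Doing the job properly would need
the stand-in parser replaced by one that understands OCaml code mixed
with the embedded language.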

For me quotations (in expression position) are useful only for one
thing: escaping from the lexical conventions of OCaml string literals.
Otherwise, strings are just fine.  There are very few cases where this
actually matters.  A decent example might be something like Pa_tyxml,
which allows writing XML code in XML syntax within OCaml sources.  But
even there, I'm not absolutely sure that quotations are the best
solution: would it really be so bad to use normal OCaml strings (with an
attribute):

   (@xml)"<div>xyz</div>"

or OCaml syntax interpreted specially:

   (@xml)(div "xyz")

instead of the current:

  <:html5< <div>xyz</div> >>

But maybe for some cases (embedding a language with a lot of double
quotes and backslashes), escaping from the lexical conventions of OCaml
string literals might really be useful.  This could actually be the
case for standard string literals, with no special syntactic processing.
So why not simply address this need by allowing an alternative syntax
for string literals, where no special character is interpreted?  The
opening delimiter would define the closing delimiter.  For instance:

   {xxx{ ..... }xxx}

(here I've assumed that { } are hard-coded, but xxx could be replaced
by, say, any sequence of identifier/digit/operator characters).  In the
AST, this could be represented as a normal string literal (or maybe we 
should keep the "xxx" annotation, to avoid having to put an extra 
attribute in front of the literal).
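
A minimal sketch of how such a literal could be scanned, to make the
"opening delimiter defines the closing delimiter" rule concrete.  The
function names are hypothetical, and the tag is restricted here to
letters, digits and underscore, which is narrower than the set of
characters suggested above.  Nothing between the delimiters is
interpreted.

```ocaml
(* Characters allowed in the tag: an assumption made for this sketch. *)
let is_tag_char = function
  | 'a' .. 'z' | 'A' .. 'Z' | '0' .. '9' | '_' -> true
  | _ -> false

(* [find_sub s sub from] returns the index of the first occurrence of
   [sub] in [s] at or after position [from], if there is one. *)
let find_sub s sub from =
  let n = String.length s and m = String.length sub in
  let rec go i =
    if i + m > n then None
    else if String.sub s i m = sub then Some i
    else go (i + 1)
  in
  go from

(* [read_raw_literal s] expects [s] to start with "{tag{".  It returns
   [Some (tag, contents, next)], where [contents] is the text up to the
   first "}tag}" taken verbatim and [next] is the index just past the
   closing delimiter.  It returns [None] on malformed input. *)
let read_raw_literal s =
  let n = String.length s in
  if n = 0 || s.[0] <> '{' then None
  else begin
    let rec tag_end i =
      if i < n && is_tag_char s.[i] then tag_end (i + 1) else i
    in
    let j = tag_end 1 in
    if j >= n || s.[j] <> '{' then None
    else begin
      let tag = String.sub s 1 (j - 1) in
      let closing = "}" ^ tag ^ "}" in
      match find_sub s closing (j + 1) with
      | None -> None
      | Some k ->
        let contents = String.sub s (j + 1) (k - j - 1) in
        Some (tag, contents, k + String.length closing)
    end
  end
```

With this rule, a body containing double quotes, backslashes or even
"}" is returned unchanged, as long as it does not contain the full
closing sequence "}xxx}".  Choosing a different tag is then enough to
embed any text.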

I've yet to be convinced that quotations really have other interesting 
uses than escaping from lexical conventions of OCaml string literals...

Alain