[wg-camlp4] Raw representation of literals in the parsetree

Jeremie Dimino jdimino at janestreet.com
Thu May 2 12:35:11 BST 2013


Hi,

We recently felt the need to write a new extension that would check
integer literals.  The goal is to verify that ones containing
underscores match a specific regular expression, namely that digits
are grouped in threes (for example 1_000_000).
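
As a rough illustration of the check described above, here is a minimal
sketch that works on the raw text of a literal.  It assumes plain
decimal literals only (no 0x prefix, no l/L/n suffix), and the names
is_digit, all_digits and well_grouped are made up for this example.

```ocaml
(* Hypothetical helpers, not part of any existing extension. *)
let is_digit c = c >= '0' && c <= '9'

(* True if [g] is non-empty and made only of decimal digits. *)
let all_digits g =
  let ok = ref (g <> "") in
  String.iter (fun c -> if not (is_digit c) then ok := false) g;
  !ok

(* True if [s] has no underscore and is all digits, or if its
   underscores split it into a leading group of 1 to 3 digits
   followed by groups of exactly 3 digits. *)
let well_grouped s =
  match String.split_on_char '_' s with
  | [] -> false
  | [ whole ] -> all_digits whole
  | first :: rest ->
    String.length first <= 3
    && all_digits first
    && List.for_all (fun g -> String.length g = 3 && all_digits g) rest
```

With this sketch, "1_000_000" and "1000000" are accepted, while
"10_00" and "1__000" are rejected.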

We can do that for instance with a camlp4 token filter, but we wanted
to try with ppx.  It is currently not possible to do it by looking at
the parsetree since constants are already evaluated (integers are
represented by an OCaml int) and the raw form is lost.

One solution would be to add the raw representation of constants in
the parsetree. For instance by changing Asttypes.constant to:

,----
| type constant_value =
|     Const_int of int
|   | Const_char of char
|   | Const_string of string
|   | Const_float of string
|   | Const_int32 of int32
|   | Const_int64 of int64
|   | Const_nativeint of nativeint
|
| type constant = constant_value * string
`----
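
To make the idea concrete, here is a sketch of how a rewriter might
consume such a pair.  The two type definitions are copied from the
proposal above; groups_of_three and check_constant are hypothetical
names, and the sketch only looks at Const_int, since the raw text of
the other integer kinds would carry a suffix (l, L, n) that it does not
handle.

```ocaml
(* Types as proposed in this message. *)
type constant_value =
    Const_int of int
  | Const_char of char
  | Const_string of string
  | Const_float of string
  | Const_int32 of int32
  | Const_int64 of int64
  | Const_nativeint of nativeint

type constant = constant_value * string

(* Hypothetical check on the raw text: if there are underscores, the
   first group has 1 to 3 characters and every other group exactly 3. *)
let groups_of_three raw =
  match String.split_on_char '_' raw with
  | [] | [ _ ] -> true
  | first :: rest ->
    let n = String.length first in
    n >= 1 && n <= 3 && List.for_all (fun g -> String.length g = 3) rest

(* Only plain int literals are checked; every other constant is
   accepted unchanged in this sketch. *)
let check_constant ((value, raw) : constant) : (unit, string) result =
  match value with
  | Const_int _ ->
    if groups_of_three raw then Ok ()
    else Error (Printf.sprintf "badly grouped integer literal: %s" raw)
  | Const_char _ | Const_string _ | Const_float _
  | Const_int32 _ | Const_int64 _ | Const_nativeint _ -> Ok ()
```

The evaluated value is still available as the first component, so
existing code that only needs the int would just ignore the string.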

I believe it could also be useful for other rewriters, especially ones
dealing with strings since they would be able to compute the correct
location inside a string. And maybe also for printing: Pprintast could
use the representation chosen by the programmer instead of a
standardized one.

Do people think that this is a reasonable thing to add to the OCaml
parsetree? If so, I can make the modification.

Jeremie

