[wg-camlp4] Raw representation of literals in the parsetree

Jeremie Dimino jdimino at janestreet.com
Thu May 23 16:26:16 BST 2013


We discussed it further with Alain, but it turns out it doesn't work
very well.  A constant may be composed of several tokens (a negative
literal such as -1 is lexed as a minus sign followed by an integer),
so it is not possible to have the exact raw representation.  And for
strings one can use the new quoted strings, which are already treated
as raw strings.

Another possibility would be to modify the lexer so that it doesn't
'evaluate' tokens immediately and let the parser do it.  At least it
would avoid this kind of thing:

  # max_int;;
  - : int = 4611686018427387903
  # 4611686018427387904;;
  - : int = -4611686018427387904
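
A minimal sketch of that idea, under assumptions: the token type
INT_RAW and the function eval_int are hypothetical names, not part of
the actual OCaml lexer or parser.  The lexer would only record the
lexeme, and the conversion (here done with the standard
int_of_string_opt) would happen later, where a failure can be
reported instead of silently wrapping.

```ocaml
(* Hypothetical token type: the lexer stores the lexeme as written
   instead of an already-converted int. *)
type token = INT_RAW of string

(* Conversion done later, for example in a parser action.
   [int_of_string_opt] returns [None] when the text is not a valid
   int on the current platform, which includes decimal literals that
   exceed [max_int]. *)
let eval_int = function
  | INT_RAW lexeme ->
    (match int_of_string_opt lexeme with
     | Some n -> Ok n
     | None ->
       Error ("integer literal out of range or malformed: " ^ lexeme))
```

With this shape, a decimal literal larger than max_int would come
back as an Error carrying the original text, and the raw lexeme stays
available to whoever consumes the token.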

On Thu, May 2, 2013 at 2:00 PM, Gabriel Scherer
<gabriel.scherer at gmail.com> wrote:
> I think it is very reasonable, and would be a good fit for Alain's branch.
>
> On Thu, May 2, 2013 at 1:35 PM, Jeremie Dimino <jdimino at janestreet.com> wrote:
>> Hi,
>>
>> We recently felt the need to write a new extension that would check
>> integer literals.  The goal is to verify that literals containing
>> underscores match a specific regular expression, namely that digits
>> are grouped by 3.
>>
>> We can do that for instance with a camlp4 token filter, but we wanted
>> to try with ppx.  It is currently not possible to do it by looking at
>> the parsetree since constants are already evaluated (integers are
>> represented by an OCaml int) and the raw form is lost.
>>
>> One solution would be to add the raw representation of constants in
>> the parsetree. For instance by changing Asttypes.constant to:
>>
>> ,----
>> | type constant_value =
>> |     Const_int of int
>> |   | Const_char of char
>> |   | Const_string of string
>> |   | Const_float of string
>> |   | Const_int32 of int32
>> |   | Const_int64 of int64
>> |   | Const_nativeint of nativeint
>> |
>> | type constant = constant_value * string
>> `----
>>
>> I believe it could also be useful for other rewriters, especially ones
>> dealing with strings since they would be able to compute the correct
>> location inside a string. And maybe also for printing: Pprintast could
>> use the representation chosen by the programmer instead of a
>> standardized one.
>>
>> Do people think that this is a reasonable thing to add to the OCaml
>> parsetree? If yes I can do the modification.
>>
>> Jeremie
>> _______________________________________________
>> wg-camlp4 mailing list
>> wg-camlp4 at lists.ocaml.org
>> http://lists.ocaml.org/listinfo/wg-camlp4
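
For reference, the check described in the quoted message (digits
grouped by 3 when underscores are used) could look roughly like the
sketch below, given access to the raw lexeme.  This is only an
illustration: well_grouped is a hypothetical name, and it assumes
plain decimal literals without prefixes (0x, 0b, ...) or suffixes
(l, L, n).  It accepts what the regular expression
[0-9]{1,3}(_[0-9]{3})+ accepts, plus literals with no underscore.

```ocaml
let is_digit c = '0' <= c && c <= '9'

(* True if [s] is non-empty and made only of decimal digits. *)
let all_digits s =
  let ok = ref (s <> "") in
  String.iter (fun c -> if not (is_digit c) then ok := false) s;
  !ok

(* True if [lexeme] has no underscore, or if its underscores split it
   into a first group of 1 to 3 digits followed by groups of exactly
   3 digits. *)
let well_grouped lexeme =
  match String.split_on_char '_' lexeme with
  | [] -> false (* not produced by split_on_char; kept for exhaustiveness *)
  | [ whole ] -> all_digits whole
  | first :: rest ->
    all_digits first
    && String.length first <= 3
    && List.for_all (fun g -> all_digits g && String.length g = 3) rest
```

So well_grouped "1_000_000" holds while well_grouped "10_00" does
not.  The function works on the text as written, which is exactly the
information that is lost once the literal has been converted to an
int.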

-- 
Jeremie

