[containers-users] Possible additions to Containers and Friends

peter frey pjfrey at sympatico.ca
Mon Feb 26 23:52:24 GMT 2018


Simon occasionally includes code from some other part of the libraries 
to avoid requiring, say, Gen to access Sequence or Containers; I don't 
remember offhand which.  In the case of some tiny piece of code that's 
sensible. (And so far that is all I have provided.)

The standard library now has a type Uchar.t, which Uutf uses 
consistently. If one is not dealing with the many exigencies that Uutf 
deals with, that seems a bit of overkill.  Alain Frisch used ints in 
Ulex, and the only other usage of uchar I remember is in Camomile.

Uutf accepts a smaller range of codes, namely that of today's Utf8.
That's 0 to 0x10FFFF, i.e. (1024 * 1024) + (64 * 1024) codes, from which 
the surrogate block 0xD800 to 0xDFFF is also excluded.
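
For reference, a minimal sketch of that range check on plain ints. It 
assumes the range meant above is that of Unicode scalar values (0 to 
0x10FFFF without the surrogate block 0xD800 to 0xDFFF); is_scalar_value 
and scalar_count are names made up for this illustration, while 
Uchar.is_valid is the standard library function.

```ocaml
(* Hypothetical helper: true for Unicode scalar values, i.e.
   0x0000..0xD7FF and 0xE000..0x10FFFF (surrogates excluded). *)
let is_scalar_value (c : int) : bool =
  (c >= 0x0000 && c <= 0xD7FF) || (c >= 0xE000 && c <= 0x10FFFF)

(* Cross-check against the standard library on a few boundary values. *)
let () =
  List.iter
    (fun c -> assert (is_scalar_value c = Uchar.is_valid c))
    [ -1; 0; 0xD7FF; 0xD800; 0xDFFF; 0xE000; 0x10FFFF; 0x110000 ]

(* Number of accepted codes: 1_112_064. *)
let scalar_count = (0xD7FF + 1) + (0x10FFFF - 0xE000 + 1)
```

With ints the check is a couple of comparisons, which is roughly the 
argument for not pulling in a dedicated type when nothing more is needed.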

Originally Utf8 encoded all possible codes in the positive int32 range.
I prefer to revert to the old standard (and call it Utf31), since this 
allows me to encode alphabets that are larger than, but include, the 
utf8 range.  (This may not work with js_of_ocaml, but not all 
applications involve the web.)
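
A rough sketch of what such a Utf31 encoder could look like, following 
the original 1-to-6-byte Utf8 layout. The names encode_utf31 and 
utf31_to_string are hypothetical, and the code assumes a 64-bit OCaml, 
where a native int can hold 0x7FFFFFFF (which is also why 32-bit integer 
targets are a problem).

```ocaml
(* Hypothetical helper: append the encoding of [c], a code in
   0 .. 0x7FFFFFFF, to [buf] using the original 1-to-6-byte scheme. *)
let encode_utf31 (buf : Buffer.t) (c : int) : unit =
  if c < 0 || c > 0x7FFFFFFF then invalid_arg "encode_utf31";
  (* Emit [n] continuation bytes (10xxxxxx) carrying the low 6*n bits. *)
  let continuation n =
    for i = n - 1 downto 0 do
      Buffer.add_char buf (Char.chr (0x80 lor ((c lsr (6 * i)) land 0x3F)))
    done
  in
  (* Emit a lead byte with the given prefix, then [n] continuation bytes. *)
  let lead prefix n =
    Buffer.add_char buf (Char.chr (prefix lor (c lsr (6 * n))));
    continuation n
  in
  if c < 0x80 then Buffer.add_char buf (Char.chr c)
  else if c < 0x800 then lead 0xC0 1
  else if c < 0x10000 then lead 0xE0 2
  else if c < 0x200000 then lead 0xF0 3
  else if c < 0x4000000 then lead 0xF8 4
  else lead 0xFC 5

(* Convenience wrapper returning the encoded bytes as a string. *)
let utf31_to_string (c : int) : string =
  let buf = Buffer.create 6 in
  encode_utf31 buf c;
  Buffer.contents buf
```

For codes up to 0x10FFFF the output has the same bytes as ordinary utf8 
(the sketch does not reject surrogates); only larger codes use the 5- and 
6-byte forms.
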

When it comes to comparing utf8 chars (codes; ints), consider the following:

utop # ((=));;
- : 'a -> 'a -> bool = <fun>        (* right after loading utop *)
utop # open Containers;;
utop # ((=));;
- : int -> int -> bool = <fun>      (* (=) is now shadowed by an int-only version *)
utop #

(I noticed this also in Jane Street's "Sequence".  Possibly they want 
to avoid 'polymorphic' comparisons, i.e. comparisons that examine the 
internal structure.)

Did you intend to do that?
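
To make the effect concrete without depending on Containers, here is a 
small sketch. The module Shadowing is hypothetical and only imitates a 
library that rebinds (=) at type int when opened; same_code and 
same_codes are made-up names. Stdlib.(=) assumes OCaml 4.07 or later 
(older compilers call that module Pervasives).

```ocaml
(* Hypothetical stand-in for a library that specialises (=) to int.
   This is not Containers itself, just an imitation of the effect. *)
module Shadowing = struct
  let ( = ) : int -> int -> bool = Stdlib.( = )
end

open Shadowing

(* Codes held as ints: the int-only (=) is all that is needed. *)
let same_code (a : int) (b : int) : bool = a = b

(* Structural equality is still reachable by qualifying it explicitly;
   writing [a = b] here would be a type error after the open. *)
let same_codes (a : int list) (b : int list) : bool = Stdlib.( = ) a b
```

So comparing codes kept as ints is unaffected; only comparisons on other 
types need the qualified operator or a type-specific equality function.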

Peter


On 2018-02-25 05:12 AM, SP wrote:
> On Sat, Feb 24, 2018 at 12:21:52PM -0600, Simon Cruanes wrote:
>> We could build on uutf, it's relatively small and doesn't have too many
>> deps. However, I also don't think utf8 is that complicated that we
>> couldn't just redo the codepoint<-> byte conversions in a simpler
>
> Make it uutf compatible then, so one can either use uutf for full
> functionality or use a few basic converters provided in Containers.
>


