[containers-users] Possible additions to Containers and Friends
peter frey
pjfrey at sympatico.ca
Mon Feb 26 23:52:24 GMT 2018
Simon occasionally includes code from some other part of the libraries
to avoid requiring, say, Gen to access Sequence or Containers; I don't
remember offhand. In the case of some tiny piece of code that's
sensible. (And so far that is all I have provided.)
Pervasives now has a type uchar, which Uutf uses consistently. If one
is not dealing with the many exigencies that Uutf deals with, that's a
bit of overkill? Alain Frisch used ints in Ulex, and the only other use
of uchar I remember is in Camomile.
Uutf accepts a smaller range of codes, namely Utf8.
That's 0 to (1024 * 16) + (1024 * 16) + (64 * 1024), where the 64k portion at
the end is also excluded from the 1Mb range before.
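
For comparison, the standard library's Uchar module accepts exactly the
Unicode scalar values: 0x0000 to 0xD7FF and 0xE000 to 0x10FFFF, with the
surrogate block excluded. A minimal sketch that probes those boundaries;
the name `accepted` is only an illustration, not part of any library:

```ocaml
(* Hypothetical helper: does the standard library accept [code] as a
   Unicode scalar value? Uchar.is_valid is the stdlib predicate. *)
let accepted (code : int) : bool = Uchar.is_valid code

(* Print the verdict for a few boundary values. *)
let () =
  List.iter
    (fun code -> Printf.printf "U+%04X accepted: %b\n" code (accepted code))
    [ 0x0000; 0xD7FF; 0xD800; 0xDFFF; 0xE000; 0x10FFFF; 0x110000 ]
```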
Originally Utf8 encoded all possible codes in the positive int32 range.
I prefer to revert to the old standard (and call it Utf31), since this
allows me to encode alphabets that are larger than, but include, the
utf8 range. (This may not work with js_of_ocaml, but not all
applications involve the web.)
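
To make the idea concrete, here is a rough sketch of such an encoder,
following the original 1-to-6-byte UTF-8 layout. The name `utf31_encode`
is made up for this mail and is not in Containers or Uutf. It assumes a
64-bit OCaml, where int has 63 bits; with 32-bit ints the upper bound does
not fit.

```ocaml
(* Hypothetical helper: encode [code] (0 .. 0x7FFFFFFF) with the original
   1-to-6-byte UTF-8 layout. Assumes a 64-bit OCaml (63-bit int). *)
let utf31_encode (code : int) : string =
  if code < 0 || code > 0x7FFFFFFF then invalid_arg "utf31_encode";
  let buf = Buffer.create 6 in
  (* Emit a lead byte tagged with [lead], then [n] continuation bytes,
     each carrying 6 payload bits, most significant first. *)
  let emit lead n =
    Buffer.add_char buf (Char.chr (lead lor (code lsr (6 * n))));
    for i = n - 1 downto 0 do
      Buffer.add_char buf (Char.chr (0x80 lor ((code lsr (6 * i)) land 0x3F)))
    done
  in
  if code < 0x80 then Buffer.add_char buf (Char.chr code)
  else if code < 0x800 then emit 0xC0 1
  else if code < 0x10000 then emit 0xE0 2
  else if code < 0x200000 then emit 0xF0 3
  else if code < 0x4000000 then emit 0xF8 4
  else emit 0xFC 5;
  Buffer.contents buf
```

Codes up to 0x10FFFF come out byte-for-byte as in current Utf8; only the
larger codes use the 5- and 6-byte forms, which a strict decoder would
reject.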
When it comes to comparing utf8 chars (codes; ints) consider the following:
utop # ((=));;
- : 'a -> 'a -> bool = <fun>      (just after loading utop)
utop # open Containers;;
utop # ((=));;
- : int -> int -> bool = <fun>    ((=) is now shadowed by an int-only version)
utop #
(I noticed this also in Jane Street's "Sequence". Possibly they want
to avoid 'polymorphic' comparisons, i.e. comparisons that examine the
internal structure.)
Did you intend to do that?
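
For what it's worth, a small sketch of how code can cope once (=) is
int-only. The module `Int_only` below merely imitates the shadowing and is
not the Containers source; on compilers older than 4.07, Stdlib would be
spelled Pervasives.

```ocaml
(* Hypothetical stand-in for a library that shadows (=) when opened;
   this is not the actual Containers code. *)
module Int_only = struct
  let ( = ) : int -> int -> bool = Stdlib.( = )
end

open Int_only

(* Code points held as ints still compare with the shadowed operator. *)
let same_code (a : int) (b : int) : bool = a = b

(* Other types need a dedicated equality function ... *)
let same_bytes (a : string) (b : string) : bool = String.equal a b

(* ... or an explicit, qualified request for the polymorphic one. *)
let same_list (a : int list) (b : int list) : bool = Stdlib.( = ) a b
```

With the shadowing in place, writing "a" = "b" directly is a type error,
which is presumably the point.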
Peter
On 2018-02-25 05:12 AM, SP wrote:
> On Sat, Feb 24, 2018 at 12:21:52PM -0600, Simon Cruanes wrote:
>> We could build on uutf, it's relatively small and doesn't have too many
>> deps. However, I also don't think utf8 is that complicated that we
>> couldn't just redo the codepoint<-> byte conversions in a simpler
>
> Make it uutf compatible then, so one can either use uutf for full
> functionality or use a few basic converters provided in Containers.
>