[ocaml-ctypes] Help with strings

john skaller skaller at internode.on.net
Tue Dec 26 23:45:23 GMT 2017


> On 27 Dec. 2017, at 10:04, Jeremy Yallop <yallop at gmail.com> wrote:
> 
> Dear John,

Hi! Sorry for the email mix-up; I’m subscribed twice now,
once at sourceforge and once at internode.

I have another question: what’s the best way to bind an “enum”?
If the codes are sequential, is there a built-in way to map the enum
into an OCaml variant?

I guess that would use a ppx thing…
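
For concreteness, a hand-written mapping without ppx might look like the sketch below. The C enum `color`, its codes, and the names `color_of_int` and `int_of_color` are made up for illustration, and the sketch assumes the enum is passed as a C int, which is common but not guaranteed by the C standard.

```ocaml
(* Hypothetical C declaration, assumed for illustration only:
     enum color { RED, GREEN, BLUE };    codes 0, 1, 2 *)
type color = Red | Green | Blue

(* C code -> OCaml variant; unknown codes are rejected loudly. *)
let color_of_int = function
  | 0 -> Red
  | 1 -> Green
  | 2 -> Blue
  | n -> invalid_arg (Printf.sprintf "color_of_int: unexpected code %d" n)

(* OCaml variant -> C code. *)
let int_of_color = function
  | Red -> 0
  | Green -> 1
  | Blue -> 2

(* With ctypes linked, the pair could be wrapped as a type description
   (assumes the enum has the size of int):
     let color = Ctypes.view ~read:color_of_int ~write:int_of_color Ctypes.int *)
```

The two functions have to be kept in sync by hand, which is the part a built-in or generated mapping would take care of.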

> On 25/12/2017, john skaller <skaller at internode.on.net> wrote:
>> So roughly .. when is it safe to use “string”?
>> At the moment the answer for me is “never” because I don’t understand the
>> memory management protocol.
> 
> It works like this:
> 
>  * passing a string using 'string' makes a copy in either direction.
> Furthermore,
> 
>    - the copy created when passing a string from OCaml to C lives for
> the lifetime of the C call.
>      (It's possible this will be strengthened in the future:
> https://github.com/ocamllabs/ocaml-ctypes/issues/556)
> 
>    - the copy created when passing a string from C to OCaml is a
> regular OCaml string, subject to usual GC behaviour.  Ctypes makes no
> attempt to deallocate the memory used by the original C string.

Right, thanks! That’s a definite spec; could you please add those
comments to the documentation?


> 
> So, for your three examples
> 
>>        char const *get_version();
>>        char *get_buffer();
>>        char *strdup(char *);
>> 
>> 
>> The first function returns a pointer to an immutable global object.
> 
> It's fine, and probably the best choice, to use 'string' for
> 'get_version'.  Each call to the function will create a fresh OCaml
> string.

Heh .. happily that’s the one function I have actually implemented and tested.

> 
>> The second to a mutable part of a buffer.
> 
> Mutability means that it's probably not useful to use `string` here,
> since 'string' will create a copy that won't track changes.  It would
> be better to write
> 
>   let get_buffer = foreign "get_buffer"
>      (void @-> returning (ptr char))

Gotcha!
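
To see why the copy made by 'string' does not track changes, here is a small sketch that needs only the Ctypes module. `make_buffer` is a hypothetical stand-in for the memory a function like get_buffer would return; no real C function is called.

```ocaml
open Ctypes

(* Hypothetical stand-in for a C-owned buffer: three chars holding the
   NUL-terminated string "ab". *)
let make_buffer () : char ptr =
  let buf = allocate_n char ~count:3 in
  buf <-@ 'a';
  (buf +@ 1) <-@ 'b';
  (buf +@ 2) <-@ '\000';
  buf

let () =
  let buf = make_buffer () in
  (* Going through the 'string' view copies the bytes up to the NUL. *)
  let snapshot = coerce (ptr char) string buf in
  (* A later write is visible through the pointer ... *)
  buf <-@ 'z';
  assert (!@ buf = 'z');
  (* ... but not in the copy taken earlier. *)
  assert (snapshot = "ab")
```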

> 
>> The third to a freshly malloc()ed copy of a NTBS.
> 
> Here 'string' is not the right thing for the *return type*, since (as
> you say) it prevents access to the pointer needed to free the returned
> memory.  Again, 'ptr char' is probably the best starting point.  On
> the other hand, 'string' is a reasonable choice for the *argument*,

Right.

I may have some more questions, hope you don’t mind.
Perhaps the answers will be useful to others starting out.

BTW: some of the code appears in several places
(Ctypes vs Ctypes_types.TYPE). Why is that?


> since Ctypes will automatically deallocate the copy of the string
> after the call.  So the following binding is reasonable:
> 
>    let strdup = foreign "strdup"
>      (string @-> returning (ptr char))
> 
> In this case it's possible to do a little better.  The 'ocaml_string'
> type description is an alternative to 'string' that avoids the copy:
> 
>    let strdup = foreign "strdup"
>      (ocaml_string @-> returning (ptr char))
> 
> In the general case, where the C function can call back into OCaml,
> 'ocaml_string' is not safe.  But it's safe in the common (first-order)
> case, which includes 'strdup'.

OK, so that’s a pointer to the live OCaml string, and we’re relying
on the string not being modified or moved during the C call.

The current rules say OCaml strings are immutable and that bytes should
be used for mutable data. However, when several threads are running,
OCaml normally uses a global lock to serialise them, and that lock is
normally released when calling C code. Releasing it would not be safe
with ocaml_string, so presumably the lock is not released in this case.
Is that correct?
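
As far as I can tell from the Foreign interface, 'foreign' takes an optional `?release_runtime_lock` argument that defaults to false, so a plain binding keeps the runtime lock held for the whole C call. The sketch below spells the default out explicitly. It binds libc's strlen purely as a convenient first-order function, and it assumes the `ctypes.foreign` package is linked.

```ocaml
open Ctypes
open Foreign

(* Assumption: ~release_runtime_lock:false is already the default; it is
   written out here only to make the choice visible. With the lock held,
   no other OCaml thread can run (and so trigger a GC that moves the
   string) while C is reading it. *)
let strlen =
  foreign ~release_runtime_lock:false "strlen"
    (ocaml_string @-> returning size_t)

let () =
  (* ocaml_string_start turns an OCaml string into the 'string ocaml'
     value that the ocaml_string type description expects. *)
  let n = strlen (ocaml_string_start "hello") in
  Printf.printf "%d\n" (Unsigned.Size_t.to_int n)
```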


> 
>> So roughly, if you wanted to do it right, the Ctypes “type” system is inadequate.
>> The “types” must contain mutability and ownership information.
>> For example
>> 
>>        immutable_lend_noincrementable_nonnull_pointer_char
> 
> You're quite right, both that Ctypes doesn't capture these properties
> in the type system, and that it's quite challenging to do so, since C
> is so flexible.  (For example, the 'realpath' function can return
> either a malloc-allocated buffer or a caller-supplied buffer,
> depending on the value of its second argument!)

Yeah. And what’s worse is that many libraries describe functionality
in the documentation and fail to describe the “ownership” rules.
Ctypes could do with a bit more of that IMHO.

EG:

val string : string typ
…
“To avoid problems with the garbage collector, values passed using Ctypes_types.TYPE.string 
are copied into immovable C-managed storage before being passed to C.”

That would best contain your explanation above regarding lifetimes.
As written it reads like “you just leaked the copy”, since “C-managed
storage” suggests the C heap and malloc(), which requires a call to
free(), and that isn’t what happens.

> 
> Regarding mutability, the 4.06.0 release of OCaml distinguishes
> mutable from immutable strings.  When Ctypes drops support for older
> OCaml versions it's likely that string mutability will be more clearly
> marked in Ctypes bindings, too.

I’m using 4.05 at the moment and you already get a complaint
when using mutable operations on string instead of bytes.
[IMHO the way this change was made was poorly thought out]

> 
>> Of course I can do that by passing a char ptr manually created
>> (C types can do that I think),
> 
> Indeed (see above).
> 
>> copying Ocaml string contents manually (how?),
> 
> A quick way is to use the 'coerce' function, which can convert between
> 'ptr char' and 'string' because 'string' is a view for 'ptr char'.
> Here's an example using the 'strdup' binding above.
> 
>    # let p = strdup "hello, world";;
>    val p : char Ctypes_static.ptr = (char*) 0x5601c2a90130
>    # coerce (ptr char) string p;;
>    - : string = "hello, world"
> 
> More generally the 'ocaml-memcpy' package provides functions that can
> copy between several different representations
> (https://github.com/yallop/ocaml-memcpy).

Thanks!

> 
>> and then manually free-ing it (C types can do that?)
> 
> A simple way is to bind 'free'!  Here's an example, freeing the memory
> returned from the call to 'strdup' above.
> 
>    # let free = foreign "free" (ptr char @-> returning void);;
>    val free : char Ctypes_static.ptr -> unit = <fun>
>    # free p;;
>    - : unit = ()
> 
> Kind regards,
> 
> Jeremy


Thanks heaps!
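
For reference, the strdup, coerce and free snippets quoted above can be put together roughly as follows. `round_trip` is a hypothetical helper name, and the sketch assumes the `ctypes.foreign` package is linked so that the libc symbols can be resolved.

```ocaml
open Ctypes
open Foreign

(* 'string' for the argument: ctypes copies it for the duration of the call.
   'ptr char' for the result: we keep the pointer so it can be freed. *)
let strdup = foreign "strdup" (string @-> returning (ptr char))
let free = foreign "free" (ptr char @-> returning void)

(* Hypothetical helper: duplicate [s] on the C heap, copy the result back
   into a fresh OCaml string, then release the C memory. *)
let round_trip (s : string) : string =
  let p = strdup s in
  if is_null p then failwith "strdup returned NULL";
  let copy = coerce (ptr char) string p in
  free p;
  copy

let () = print_endline (round_trip "hello, world")
```

The OCaml string returned by `round_trip` is an independent copy, so freeing `p` before returning does not affect it.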


—
john skaller
skaller at users.sourceforge.net
http://felix-lang.org


